<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the CLEF-2025 CheckThat! Lab Task 2 on Claim Normalization</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Megha Sundriyal</string-name>
          <email>meghas@iiitd.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tanmoy Chakraborty</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Preslav Nakov</string-name>
          <email>Preslav.Nakov@mbzuai.ae</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Technology Delhi</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Indraprastha Institute of Information Technology Delhi</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Mohamed bin Zayed University of Artificial Intelligence</institution>
          ,
          <addr-line>UAE</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present an overview of Task 2 of the CheckThat! lab at CLEF 2025, which focuses on claim normalization. The task asks systems to transform informal and often noisy social media posts into clear, concise, and verifiable statements known as normalized claims. These capture the core factual assertion of a post, making it much easier to verify and fact-check. The task is especially relevant in multilingual and low-resource contexts, where the diversity of languages and limited labelled data pose serious challenges. Task 2 was conducted in two distinct settings: (i) monolingual, where systems were trained and tested on the same language, and (ii) zero-shot, where models had to normalize claims in a new target language without any in-language training data. The monolingual track covered thirteen languages: English, German, French, Spanish, Portuguese, Hindi, Marathi, Punjabi, Tamil, Arabic, Thai, Indonesian, and Polish. The zero-shot setting introduced seven more languages: Dutch, Romanian, Bengali, Telugu, Korean, Greek, and Czech. This structure allowed us to evaluate both language-specific performance and cross-lingual generalization. In total, 18 teams participated in Task 2, submitting 1,226 valid runs across the two settings. The submissions were evaluated using the METEOR score. Many teams leveraged transformer-based models, multilingual embeddings, and retrieval-augmented strategies. In this paper, we outline the task setup, give details about the datasets, and provide a detailed summary of the diverse approaches adopted by the participating teams.</p>
      </abstract>
      <kwd-group>
        <kwd>Claim Normalization</kwd>
        <kwd>Social Media Posts</kwd>
        <kwd>Multilinguality</kwd>
        <kwd>Claims</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Social media has revolutionized global communication, removing geographical barriers and enabling
global knowledge exchange. However, it has also become a breeding ground for misinformation,
spreading false claims quickly across languages and cultures [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. These false claims jeopardize the
integrity of online discourse and public trust. For instance, they have affected various critical events,
including the 45th US Presidential Election [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the COVID-19 pandemic [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], the Russia–Ukraine
conflict [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], etc. While journalists and fact-checkers work tirelessly to ensure the accuracy of online
content, the sheer volume and linguistic diversity of social media posts make it difficult to identify
and debunk every single claim across diverse languages effectively [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In recent years, several studies
have examined the needs of fact-checkers and have identified tasks that could be automated to reduce
their manual effort and to improve the effectiveness of their work [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7, 8, 9, 10</xref>
        ]. These tasks include
looking for the source of evidence for verification [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], exploring other versions of misinformation [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ],
and searching within existing fact-checking datasets [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Social media posts are often written in vague, informal language, frequently mixing opinions,
rhetorical questions, and incomplete thoughts. This makes it difficult to extract clear, check-worthy
claims, defined as factual statements that can be verified or disproven [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Recently, Sundriyal et al.
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] introduced the task of claim normalization, which aims to simplify a given text containing a claim,
such as a long, noisy social media post, into a concise and precise statement.
      </p>
      <p>The task is a precursor to fact-checking, distilling the essence of the claim and removing any
unnecessary information, thereby increasing the efficiency and reliability of the fact-checking process.</p>
      <p>
        Despite efforts to combat misinformation across a variety of languages [
        <xref ref-type="bibr" rid="ref15 ref16 ref17">15, 16, 17</xref>
        ], research into
claim normalization has been predominantly English-centred. Task 2 of the CheckThat! lab at CLEF
2025 aims to bridge this gap by offering the task in a multilingual setting.
      </p>
      <p>
        The CheckThat! Lab aims to accelerate the development of tools and datasets that enable different
phases of the fact-checking pipeline. Since its beginning, the lab has organized several shared tasks that
represent real-world issues in misinformation detection and verification, with a focus on multilingual,
cross-domain, and practical applications. The 2025 edition of the lab included four tasks in monolingual,
multilingual, and cross-lingual settings, covering over 20 languages across these tasks [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. This paper
presents Task 2 on Claim Normalization, which addresses the problem of converting informal, noisy
social media posts into clear, concise, and verifiable claims. The task plays a vital role in bridging
unstructured content with structured fact-checking workflows, especially in multilingual and
low-resource settings.
      </p>
      <p>Task Description. In this year’s CheckThat! Lab, Task 2 addressed the growing need to extract
verifiable claims from the informal language found on social media. Unlike standard fact-checking
pipelines that rely on well-formed input, our task aimed to rewrite user-generated content—often
imprecise, opinionated, or fragmented—into clear, concise, and factual statements, the way that human
fact-checkers formulate the claims they are checking.</p>
      <p>The task is especially timely and relevant in multilingual and low-resource settings. To simulate
realistic fact-checking scenarios, Task 2 was conducted in two settings:
• Monolingual: In the monolingual setting, training and development datasets are provided for the
language used for testing. The model is trained, validated, and tested on the same language,
allowing it to learn language-specific structures and patterns. The languages included in this
setup are English, German, French, Spanish, Portuguese, Hindi, Marathi, Punjabi, Tamil, Arabic,
Thai, Indonesian, and Polish.
• Zero-shot: The zero-shot setting provides only the test data for the target language, without
any corresponding training or development data for it. The participants may train their models
using data from other languages or conduct zero-shot experiments with large language models
(LLMs), evaluating the performance in the target language without prior exposure. This setup
tests the model’s ability to generalize to unseen languages. The languages in this setting are
Dutch, Romanian, Bengali, Telugu, Korean, Greek, and Czech.</p>
      <p>In the following sections, we give details about our dataset, provide a detailed overview of the
participating systems, and discuss their approaches.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Fact-checking is critical for combating the spread of false claims. As manual fact-checking is
very time-consuming, researchers have worked on specific subtasks that can help human
fact-checkers. This encompasses a spectrum of tasks, including claim detection [
        <xref ref-type="bibr" rid="ref10 ref19">10, 19</xref>
        ], claim
check-worthiness assessment [
        <xref ref-type="bibr" rid="ref20 ref21">20, 21</xref>
        ], claim span identification [
        <xref ref-type="bibr" rid="ref17 ref8">8, 17</xref>
        ], claim verification [
        <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
        ], etc.
      </p>
      <p>
        The proliferation of false claims on social media platforms has led to the development of specialized
systems tailored for handling informal texts from these platforms [
        <xref ref-type="bibr" rid="ref24 ref25 ref26">24, 25, 26</xref>
        ]. These systems are designed
to quickly identify and debunk potentially misleading information, allowing for timely intervention
by human fact-checkers. Within the larger context of fact-checking, claim normalization has recently
emerged as an important novel research direction. Sundriyal et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] introduced the task of claim
normalization, which distils the key claim from long, noisy social media posts.
      </p>
      <p>
        Most existing methods aimed at combating misinformation have primarily focused on English
[
        <xref ref-type="bibr" rid="ref14 ref25 ref26">14, 26, 25</xref>
        ]. However, there has been a recent surge of interest in advancing
fact-checking techniques for various languages. Jaradat et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] developed ClaimRank, an online system
to identify sentences with credible claims in Arabic and English. Gupta and Srikumar [27] developed
X-FACT, a multilingual dataset for factual verification of real-world claims across 25 languages. Mittal
et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] released X-CLAIM, a multilingual dataset for claim span identification, consisting of 7,000
real-world claims collected from various social media platforms in five Indian languages and English. Pikuliak
et al. [28] introduced MultiClaim, a multilingual dataset for the retrieval of previously fact-checked claims.
They gathered 28k social media posts in 27 languages, 206k professional fact-checks in 39 languages,
and 31k connections between these two groups. Chang et al. [29] introduced a multilingual version
of the FEVER dataset. Over the past seven years, the CheckThat! Lab organized several multilingual
claim-related tasks as part of CLEF, gradually expanding language support and attracting an increasing
number of submissions [
        <xref ref-type="bibr" rid="ref9">30, 31, 32, 9, 33, 34</xref>
        ]. The most recent edition of the CheckThat! lab included
six tasks in fifteen languages, including Arabic, Bulgarian, English, Dutch, French, Georgian, German,
Greek, Italian, Polish, Portuguese, Russian, Slovene, Spanish, and code-mixed Hindi-English [34].
      </p>
      <p>
        Despite the growing interest in fact-checking across multiple languages, the task of claim
normalization has been largely unexplored beyond English [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. This narrow focus presents challenges as
multilingual social media platforms host content in multiple languages, and thus claims originate in
many languages. Moreover, linguistic nuances and cultural contexts complicate the task, emphasizing
the need for multilingual approaches. This motivated our multilingual claim normalization task this
year.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>Below, we describe the dataset for our task, which we call mCLAN.</p>
      <sec id="sec-3-1">
        <sec id="sec-3-1-1">
          <title>3.1. Data Compilation</title>
          <p>Inspired by the principle of dataset recycling advocated by Koch et al. [35], we identified and reused four datasets,
which we repurposed for the task of claim normalization. This reduces annotation effort as well as
subjective annotation biases. Below, we describe each dataset in detail:</p>
          <p>
            (a) CLAN [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ]: It contains 6,388 social media posts, each with normalized claims from various
fact-checking websites. Notably, every example in the dataset is in English. We use all the pairs of a
post and its corresponding normalized claim.
          </p>
          <p>(b) MultiClaim [28]: It contains multilingual fact-checking pairs obtained from 142 fact-checking
sites, making it the largest dataset of fact-checks released to date, encompassing 39 languages. Each
fact-checking article is represented in the dataset by its claim, title, publication date, and URL. However,
the entire text of the articles has not been published. In addition, the dataset includes relevant social
media posts with text, OCR of attached images (if any), publication date, social media platform, and
fact-checker rating for each post. We used this dataset to collect claims from fact-checking websites
and corresponding social media posts for our study. This allowed us to extract 21k post-claim pairs.
It is worth noting that we only use monolingual pairs from this dataset in our work.</p>
          <p>
            (c) X-Claim [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ]: This is a multilingual dataset labeled for claim spans and includes six languages,
primarily focusing on low-resource languages. The authors collected social media posts and
corresponding claims from several fact-checking websites. They used a variety of filtering rules to eliminate
posts containing videos, Instagram reels, or excessively short or long text. Using awesome-align [36],
they found word tokens in the post sentences that matched those in the normalized claim. The claim
span was then calculated as a sequence of word tokens that began with the first aligned word token
and ended with the last aligned word token in the sentence. Given that each example in this dataset
included social media posts and the corresponding claims obtained from the fact-check sites, we used
all the examples in the dataset: 5,840 post-claim pairs in six languages.</p>
          <p>(d) Twitter Dataset [37]: The authors proposed an abstractive text summarization dataset consisting
of noisy claims from Twitter and their gold summaries, aimed at efficiently detecting previously
fact-checked claims by generating crisp queries from abstractive summaries. They crawled Twitter for URLs from
fact-checking organizations such as Snopes, PolitiFact, The Quint, etc., resulting in a preliminary collection
of Tweet and Claim Review (a short summary of the claim written by the fact-checker) pairs. Pairs with
tweets in languages other than English were discarded, as were those with only image or video content.
They also ensured that each tweet included a claim and could be textually summarized to match the
corresponding Claim Review. The final dataset only included &lt;Social Media Content, Claim Review&gt;
pairs with both components in English. We used all the 567 pairs provided in this dataset.
          </p>
          <p>To ensure the data quality of the final compiled corpus, we randomly selected 50 examples from each
language and asked native speakers to verify the posts and the corresponding normalized claims. For
languages where we could not find native speakers, we used the Google Translate API to translate the
examples into English and cross-checked their quality. We consolidated all examples from these datasets
and performed a combined analysis. Table 1 shows a few examples from our mCLAN dataset in different
languages.</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.2. Data Statistics and Analysis</title>
          <p>Through data compilation, we obtained a total of 28,012 instances in twenty languages from all datasets.
To maintain uniformity, we used the train/dev/test splits from the original datasets. For languages with
a small number of instances, e.g., around 100, we only kept the test sets with no training data. Table 2
gives details about the final dataset and the train/dev/test splits.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <p>To better understand the distribution of languages, we analysed the dataset linguistically. With its
diverse vocabulary and flexible syntax, English is the primary language used on several social media
platforms [38]. Thus, our dataset is also primarily composed of English examples. While German
and Dutch are less dominant, they still benefit from a shared Latin script and similar grammatical
structure. The Indic languages in the dataset encompass Hindi, Marathi, Punjabi, and Bengali. Hindi
uses the Devanagari script. Marathi also uses the Devanagari script, albeit with some diferences in the
characters. The Gurmukhi script is used for Punjabi, and Bengali is written using the Bengali script.
Due to their diverse scripts and extensive use of diacritics, Indic languages pose unique computational
challenges. The dataset also includes two members of the Dravidian language family, Tamil and
Telugu, whose scripts derive from the ancient Brahmi script.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Submissions</title>
      <p>We received submissions from 18 teams, totalling 1,226 valid runs across all the languages; 12 of these
teams submitted their working notes. Table 3 lists all teams and their ranking for each language.</p>
      <p>Baseline and Evaluation Metric. We used mT5-large as our baseline. For the monolingual setting,
we fine-tuned the model using language-specific training data, translating the instruction “Identify
the central claim in the given post: &lt; &gt;” into the language of the test claim. This allowed the
model to operate directly in the target language. We used METEOR as an evaluation measure.</p>
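      <p>As a rough illustration of how the metric behaves, the following is a simplified, exact-match-only sketch of METEOR scoring: a recall-weighted unigram F-mean discounted by a fragmentation penalty over contiguous matched chunks. The full metric (e.g., NLTK’s implementation) additionally credits stem and synonym matches, so this sketch only approximates the scores reported here.</p>

```python
def simple_meteor(hypothesis, reference):
    """Exact-match-only sketch of METEOR: unigram F-mean with a
    fragmentation penalty (real METEOR also matches stems/synonyms)."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    # Greedy alignment: map each hypothesis token to the first
    # unused matching reference position.
    used = [False] * len(ref)
    align = []  # (hyp_index, ref_index) pairs, in hypothesis order
    for i, tok in enumerate(hyp):
        for j, rtok in enumerate(ref):
            if not used[j] and rtok == tok:
                used[j] = True
                align.append((i, j))
                break
    m = len(align)
    if m == 0:
        return 0.0
    precision = m / len(hyp)
    recall = m / len(ref)
    f_mean = 10 * precision * recall / (recall + 9 * precision)
    # A "chunk" is a maximal run of adjacent hypothesis tokens that
    # align to adjacent reference tokens; fewer chunks = better order.
    chunks = 1
    for (i1, j1), (i2, j2) in zip(align, align[1:]):
        if i2 != i1 + 1 or j2 != j1 + 1:
            chunks += 1
    penalty = 0.5 * (chunks / m) ** 3
    return f_mean * (1 - penalty)
```

      <p>For instance, a word-for-word match scores near 1.0, while a hypothesis with the right words in the wrong order is penalized through the chunk count.</p>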
      <p>Table 6 presents the results for the monolingual setup, while Table 4 reports the scores for the
zero-shot setup. Most of the teams outperformed the baseline, while dfkinit2b [39], DS@GT [40], TIFIN
[41], and AKCIT-FN [42] consistently ranked among the top-performers across most languages. Team
dfkinit2b [39] was ranked first in 6 out of 13 languages in the monolingual setting. In the zero-shot
setting, they were first across all seven unseen languages.</p>
      <sec id="sec-4-1">
        <title>4.1. Overview of the Systems</title>
        <p>Most teams used sequence-to-sequence generation strategies for claim normalization, typically relying
on transformer-based models. The most prevalent approach involved fine-tuning pretrained models
such as BART, T5, mBART, and LLaMA on monolingual data.</p>
        <p>Team dfkinit2b [39] participated in both settings, testing zero- and few-shot prompting with models
such as Gemma-3, Qwen-3, Qwen-2.5, Llama-3.3, and Mistral. They explored various prompts and used
cosine similarity to select demonstrations for few-shot learning. They also included adapter fine-tuning,
data pre-processing with language checks and emoji removal, and data augmentation via translation.
For the final submission, they ensembled top-performing model outputs by computing embedding
centroids with multilingual SentenceTransformers and selecting claims closest to these centroids.</p>
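        <p>The centroid-based selection step can be sketched as follows. The vectors below stand in for multilingual SentenceTransformer embeddings, and the function name is ours, not the team’s; the sketch only shows the selection logic.</p>

```python
import numpy as np

def pick_centroid_candidate(candidates, embeddings):
    """Given one candidate claim per system and a sentence embedding for
    each, return the candidate whose embedding lies closest (by cosine
    similarity) to the centroid of all candidate embeddings."""
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalize rows
    centroid = X.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    sims = X @ centroid  # cosine similarity of each candidate to the centroid
    return candidates[int(np.argmax(sims))]
```

        <p>Intuitively, the candidate nearest the centroid is the one most “agreed upon” by the ensemble, which filters out outlier generations.</p>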
        <p>
          Team DS@GT [40] embedded the unnormalized claims from the pooled train and development
datasets, as well as from the test set, using state-of-the-art embeddings for each language. For testing,
a GPT-4o mini model was prompted following the approach discussed in [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], using the top-3 most
similar examples from the train and development sets as in-context examples. The final response for the
monolingual task was derived by combining the best-matching answer from the train and development
sets, based on cosine similarity, and the output of the GPT-4 model. For zero-shot, they used a modified
version of CACN [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], essentially using the prompting method with standard examples.
        </p>
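        <p>A minimal sketch of such retrieval-based in-context example selection is shown below. The embeddings, pair format, and prompt wording are illustrative assumptions, not the team’s exact setup.</p>

```python
import numpy as np

def top_k_examples(query_vec, train_vecs, train_pairs, k=3):
    """Retrieve the k training (post, normalized_claim) pairs whose
    embeddings are most cosine-similar to the query post's embedding."""
    Q = np.asarray(query_vec, dtype=float)
    T = np.asarray(train_vecs, dtype=float)
    sims = (T @ Q) / (np.linalg.norm(T, axis=1) * np.linalg.norm(Q))
    order = np.argsort(-sims)[:k]  # indices of the k most similar examples
    return [train_pairs[i] for i in order]

def build_prompt(post, examples):
    """Assemble a few-shot prompt from retrieved examples (wording here
    is illustrative, not the team's template)."""
    lines = [f"Post: {p}\nNormalized claim: {c}" for p, c in examples]
    lines.append(f"Post: {post}\nNormalized claim:")
    return "\n\n".join(lines)
```

        <p>The retrieved pairs serve as in-context demonstrations, so the model sees normalizations of posts similar to the one being processed.</p>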
        <p>Team TIFIN [41] fine-tuned Qwen-14B using LoRA with 4-bit precision for efficiency. They
preprocessed the data by filtering meaningful post-claim pairs, removing duplicates, and creating a unified
multilingual dataset. Instruction-based fine-tuning incorporated Chain-of-Thought prompting with
5W1H questions to guide claim extraction. During inference, context resolution replaced partial posts
with complete ones, and few-shot prompting with similar examples improved claim structure. This
approach aimed to boost claim extraction accuracy and multilingual performance.</p>
        <p>Team AKCIT-FN [42] adopted a dual-strategy approach tailored to data availability. For the 13
supervised languages, they fine-tuned various language-specific and multilingual Small Language
Models (SLMs) such as PTT5, AraT5, and Varta T5. For the seven zero-shot languages, they used
prompting with Large Language Models (LLMs) such as the GPT series, Gemini, and Qwen 2.5. Their
methodology also included a data cleaning algorithm to remove repetitive content and trailing None
placeholders, as well as cross-split deduplication. Few-shot prompting experiments for monolingual
settings involved selecting examples randomly, based on difficulty (METEOR score), or using HDBSCAN
cluster prototypes for semantic diversity.</p>
        <p>Team Factiverse and IAI [43] focused on the monolingual setting, comparing four main approaches:
zero-shot prompting, fine-tuning, Fixed In-Context Learning (FICL), and Adaptive In-Context Learning
(AICL). For the ICL methods, they used a ChromaDB vector store with all-MiniLM-L6-v2 embeddings
to retrieve semantically similar examples from the training data based on cosine distance. While FICL
used a fixed number of top-K examples, the team’s novel AICL approach dynamically selected examples
by applying a cosine distance threshold, eliminating the need to pre-determine the number of shots.
They also explored data augmentation via machine translation for low-resource languages.</p>
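        <p>The AICL idea of thresholding on cosine distance, rather than fixing the number of shots in advance, can be sketched as follows; the threshold value and names are illustrative, not the team’s configuration.</p>

```python
import numpy as np

def adaptive_examples(query_vec, train_vecs, train_pairs, max_dist=0.4):
    """Adaptive In-Context Learning sketch: keep every training example
    whose cosine distance to the query is below a threshold, so the
    number of shots varies per query instead of being fixed (top-K)."""
    Q = np.asarray(query_vec, dtype=float)
    T = np.asarray(train_vecs, dtype=float)
    sims = (T @ Q) / (np.linalg.norm(T, axis=1) * np.linalg.norm(Q))
    dists = 1.0 - sims            # cosine distance
    nearest_first = np.argsort(dists)
    return [train_pairs[i] for i in nearest_first if dists[i] < max_dist]
```

        <p>A query with many close neighbours in the training data thus receives many demonstrations, while an unusual query may receive none.</p>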
        <p>The MMA team [44] focused on the monolingual setting, exploring several model architectures and
training strategies. Their approaches included fine-tuning a unified multilingual umt5 model on all
languages, as well as training separate umt5 models for each language. They also tested zero-shot
prompting with Qwen2.5 models and employed a parameter-eficient fine-tuning (PEFT) method using
LoRA, which involved a two-stage process of first extracting key points and then generating a claim from
those points. For Arabic, they conducted specific experiments by fine-tuning ara-t5 and augmenting
the training data with scraped post-claim pairs from the Google Fact Check Tools API.</p>
        <p>The UmuTeam [49] used a generative approach based on the Flan-T5-Base model for the Claim
Extraction and Normalization task. Their strategy varied based on the data setting: for the monolingual
scenarios, they fine-tuned a separate instance of Flan-T5-Base for each language, using only that
language’s specific training data to allow the models to specialize. For the zero-shot languages, they
fine-tuned a single Flan-T5-Base model on the concatenated training data from all other languages,
aiming to leverage cross-lingual transfer for generalization.</p>
        <p>The UNH team [45] only experimented with the English language. Their fine-tuning experiments
included fully fine-tuning a Flan-T5 Large model, using LoRA for a Flan-T5 Base model, and fine-tuning
a DeepSeek-Llama-8b model. Their prompting strategies involved few-shot prompting with
keyword-based example selection, iterative self-refinement to improve claim quality, and a Max Multi-Prompt
method that simulated choosing the best output from several targeted prompts.</p>
        <p>Saivineetha [50] focused on Hindi and Telugu. For Hindi, which was in the monolingual setting, they
performed Parameter-Efficient Fine-Tuning (PEFT) using QLoRA with 4-bit quantization on the Gemma
2 9B instruct model. The model was instruction fine-tuned on the provided Hindi dataset of posts and
normalized claims. For Telugu, which was in the zero-shot setting, they used zero-shot prompting with
the Gemma 3 12B instruct model, using a prompt template designed to convert unstructured Telugu
posts into normalized claims.</p>
        <p>The JU_NLP@M&amp;S team [48] framed the claim normalization task as a monolingual
sequence-to-sequence generation problem, centered on fine-tuning a BART-Large transformer model. Their
methodology included a preprocessing module for tokenization using byte-level BPE, padding inputs
to a fixed length, and truncating where necessary. Model training was conducted for 5 epochs using
Hugging Face’s Seq2SeqTrainer, employing mixed-precision (FP16) to optimize memory usage and a
learning rate of 3e-5. For inference, they used beam search with four beams to enhance the quality of
the generated claims.</p>
        <p>Team Investigators [46] focused on the claim normalization task by fine-tuning several models,
including LLaMA-3.2, BART, and T5, with a particular focus on the flan-t5-base model for the final
submission. Their methodology was primarily monolingual, with extensive experiments on the English
and Spanish datasets. Before training, they implemented a pre-processing pipeline to filter out records
that were not in the target language. For the zero-shot setting, they experimented with cross-lingual
transfer by training a model on the Spanish dataset and then evaluating it on the Korean test data.</p>
        <p>Team OpenFact [47] experimented with several decoder-only LLMs, including LLaMA 3.1,
DeepSeek-R1, and GPT-4.1-mini. Their methodology had three steps: (1) generating up to three initial claim
candidates, (2) iteratively refining each candidate using a self-reflection technique where the model
provides feedback on its output, and (3) using an LLM as a judge to select the best among the refined
candidates. They also performed supervised fine-tuning on the GPT-4.1-mini model using the cleaned
training data.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion of Approaches</title>
      <p>The participating teams in CheckThat! 2025 Task 2 tried several strategies for multilingual claim
normalization. These approaches can be analyzed along four major dimensions: model architecture,
fine-tuning vs. in-context learning paradigms, data handling, and performance across monolingual and
zero-shot settings. An overview of the approaches is given in Table 5.</p>
      <sec id="sec-5-1">
        <title>5.1. Model Architectures</title>
        <p>The primary area of divergence among the teams was their choice of model architecture. Some teams
handled the task as a typical sequence-to-sequence problem, using encoder-decoder models that excel
at summarization. For instance, the JU_NLP@M&amp;S team fine-tuned BART-Large for monolingual
text-to-text generation. UmuTeam, Investigators, and MMA explored variants of T5, including multilingual
models such as Flan-T5 and UMT5. In contrast, other teams used decoder-only large language models
to leverage their in-context learning and reasoning abilities. OpenFact evaluated models such as
LLaMA 3.1, DeepSeek-R1, and GPT-4.1-mini. Similarly, TIFIN and dfkinit2b chose Qwen models for their
multilingual performance and efficiency in fine-tuning. This distinction highlights the trade-off between
the recognised strengths of encoder-decoder models in generation tasks and the growing potential of
decoder-only models for flexible reasoning.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Adaptation Strategies</title>
        <p>Fine-tuning was a common choice among the participating teams. Several teams employed
parameter-efficient fine-tuning (PEFT) approaches, such as LoRA or QLoRA. For example, Saivineetha fine-tuned
Gemma 2 for Hindi, while TIFIN and dfkinit2b applied LoRA to Qwen models. OpenFact’s supervised
fine-tuning of GPT-4.1-mini was reported to be their most effective configuration. In contrast, teams
using decoder-only models emphasized in-context learning (ICL). DS@GT used retrieval-based ICL,
pulling the top-3 similar examples from the training set as dynamic prompts. dfkinit2b also adopted semantic
similarity-driven selection for few-shot prompts. Factiverse used Adaptive In-Context Learning (AICL),
which dynamically modifies the number of in-context examples depending on similarity thresholds.
TIFIN implemented a 5W1H prompting strategy, structuring claim-related information into six categories
(Who, What, Where, When, Why, and How) to guide model reasoning. OpenFact and UNH also used
self-refinement, where an LLM iteratively critiques and improves its outputs.</p>
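        <p>A 5W1H-style prompt of the kind described above might be assembled as follows; the wording is illustrative, not TIFIN’s actual template.</p>

```python
def build_5w1h_prompt(post):
    """Illustrative Chain-of-Thought prompt in the 5W1H style: the model
    first answers the six questions, then states the central claim."""
    questions = ["Who is involved?", "What is claimed?", "Where did it happen?",
                 "When did it happen?", "Why is it claimed?", "How did it happen?"]
    steps = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(questions))
    return (f"Post: {post}\n"
            f"First answer the following questions about the post:\n{steps}\n"
            "Then state the single central, verifiable claim in one sentence.")
```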
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Data Handling and Hybrid Methods</title>
        <p>Due to the noisy nature of social media data, preprocessing becomes crucial. OpenFact used
GPT-4.1-mini to filter out training instances that mismatched the ground truth. To augment
the training data, MMA scraped additional Arabic samples using Google’s Fact Check Tools API, while
Investigators used the Gemini API to generate synthetic examples. DS@GT created a retrieval-first
pipeline that reused existing normalizations when similar posts were found. dfkinit2b employed an
ensemble method to generate claims based on five different approaches. The output closest to the
centroid of all created embeddings was then chosen. This strategy worked well in both monolingual
and zero-shot settings.</p>
      </sec>
      <sec id="sec-5-4">
        <title>5.4. Adapting to Monolingual and Zero-Shot Scenarios</title>
        <p>In the monolingual setting, where training data for 13 languages was available, the participating teams
either trained language-specific models or used the data to retrieve information for ICL. Saivineetha,
for example, trained a dedicated Hindi model, while DS@GT and Factiverse retrieved similar examples
to construct prompts dynamically. In the zero-shot setting, the teams had to rely on cross-lingual
generalization. UmuTeam and MMA developed multilingual models based on merged monolingual data
and applied them to zero-shot languages. Other teams, such as TIFIN and DS@GT, used English-centric
prompts and relied on the inherent multilingual capacity of the LLMs to handle the target languages.
Among the most effective zero-shot strategies were dfkinit2b’s ensemble approach and OpenFact’s
fine-tuned GPT-4.1-mini, both of which performed consistently well across languages without labeled
data.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>We presented a detailed overview of Task 2 from the CheckThat! Lab at CLEF 2025. It focused on
claim normalization, the task of transforming informal and noisy social media content into clear,
concise, and verifiable statements. In total, 18 teams participated in the task. Most of the participants
used Transformer-based models, with a clear trend towards leveraging large language models from
the T5, Qwen, and Llama families. Common and effective strategies included parameter-efficient
fine-tuning, retrieval-augmented in-context learning, and sophisticated data preprocessing. The dual
setting for monolingual and zero-shot evaluation provided a valuable framework for assessing both
language-specific adaptation and cross-lingual generalization.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>In this study, we employed mT5-large as the baseline system. All experiments were carried out under
controlled conditions. To help with spell check suggestions, OpenAI GPT-4o was accessed through
a plugin on Overleaf. The authors thoroughly evaluated and edited all of the tool’s suggestions. No
generative AI tools were used to generate the content of the main manuscript. The authors take full
responsibility for the final content of the publication.</p>
      <p>Social Media, PNAS Nexus 3 (2024) pgae217.
[27] A. Gupta, V. Srikumar, X-Fact: A New Benchmark Dataset for Multilingual Fact Checking, in:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the
11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers),
Association for Computational Linguistics, Online, 2021, pp. 675–682.
[28] M. Pikuliak, I. Srba, R. Moro, T. Hromadka, T. Smoleň, M. Melišek, I. Vykopal, J. Simko, J. Podroužek,
M. Bielikova, Multilingual Previously Fact-Checked Claim Retrieval, in: H. Bouamor, J. Pino,
K. Bali (Eds.), Proceedings of the 2023 Conference on Empirical Methods in Natural Language
Processing, Association for Computational Linguistics, Singapore, 2023, pp. 16477–16500.
[29] Y.-C. Chang, C. Kruengkrai, J. Yamagishi, XFEVER: Exploring Fact Verification across Languages,
in: Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
(ROCLING 2023), 2023, pp. 1–11.
[30] P. Nakov, A. Barrón-Cedeno, T. Elsayed, R. Suwaileh, L. Màrquez, W. Zaghouani, P. Atanasova,
S. Kyuchukov, G. Da San Martino, Overview of the CLEF-2018 CheckThat! Lab on Automatic
Identification and Verification of Political Claims, in: Experimental IR Meets Multilinguality,
Multimodality, and Interaction: 9th International Conference of the CLEF Association, CLEF 2018,
Avignon, France, September 10-14, 2018, Proceedings 9, Springer, 2018, pp. 372–387.
[31] S. Shaar, A. Nikolov, N. Babulkov, F. Alam, A. Barrón-Cedeno, T. Elsayed, M. Hasanain, R. Suwaileh,
F. Haouari, G. Da San Martino, et al., Overview of CheckThat! 2020 English: Automatic
Identification and Verification of Claims in Social Media., CLEF (Working Notes) 2696 (2020).
[32] P. Nakov, G. Da San Martino, T. Elsayed, A. Barrón-Cedeño, R. Míguez, S. Shaar, F. Alam, F. Haouari,
M. Hasanain, W. Mansour, et al., Overview of the CLEF-2021 CheckThat! Lab on Detecting
Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News, in: Proceedings of the
12th International Conference of the CLEF Association: Information Access Evaluation Meets
Multilinguality, Multimodality, and Visualization, CLEF ’2021, Bucharest, Romania (online), 2021,
pp. 264–291.
[33] A. Barrón-Cedeño, F. Alam, A. Galassi, G. Da San Martino, P. Nakov, T. Elsayed, D. Azizov,
T. Caselli, G. S. Cheema, F. Haouari, et al., Overview of the CLEF–2023 CheckThat! Lab on
Checkworthiness, Subjectivity, Political Bias, Factuality, and Authority of News Articles and
their Source, in: International conference of the cross-language evaluation forum for European
languages, Springer, 2023, pp. 251–275.
[34] A. Barrón-Cedeño, F. Alam, J. M. Struß, P. Nakov, T. Chakraborty, T. Elsayed, P. Przybyła, T. Caselli,
G. Da San Martino, F. Haouari, et al., Overview of the CLEF-2024 CheckThat! Lab:
CheckWorthiness, Subjectivity, Persuasion, Roles, Authorities, and Adversarial Robustness, in:
International Conference of the Cross-Language Evaluation Forum for European Languages, Springer,
2024, pp. 28–52.
[35] B. Koch, E. Denton, A. Hanna, J. G. Foster, Reduced, Reused and Recycled: The Life of a Dataset
in Machine Learning Research, in: Thirty-fifth Conference on Neural Information Processing
Systems Datasets and Benchmarks Track (Round 2), 2021.
[36] Z.-Y. Dou, G. Neubig, Word Alignment by Fine-tuning Embeddings on Parallel Corpora, in:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational
Linguistics: Main Volume, 2021, pp. 2112–2128.
[37] V. Bhatnagar, D. Kanojia, K. Chebrolu, Harnessing Abstractive Summarization for Fact-Checked
Claim Detection, in: Proceedings of the 29th International Conference on Computational
Linguistics, 2022, pp. 2934–2945.
[38] A. Petrosyan, Most used languages online by share of websites 2024, 2024. Accessed: 01 June 2024.
[39] T. Anikina, I. Vykopal, S. Kula, R. K. Chikkala, N. Skachkova, J. Yang, V. Solopova, V. Schmitt,
S. Ostermann, dfkinit2b at CheckThat! 2025: Leveraging LLMs and Ensemble of Methods for
Multilingual Claim Normalization, in: [51], 2025.
[40] A. Pramov, J. Ma, B. Patel, DS@GT at CheckThat! 2025: A Simple Retrieval-First, LLM-Backed</p>
      <p>Framework for Claim Normalization, in: [51], 2025.
[41] M. Sharma, A. Suneesh, M. Jain, P. K. Rajpoot, P. Devadiga, B. Hazarika, A. Shrivastava, K.
Gurumurthy, A. B. Suresh, A. U. Baliga, TIFIN at CheckThat! 2025: Reasoning-Guided Claim
Normalization for Noisy Multilingual Social Media Posts, in: [51], 2025.
[42] F. L. N. Almada, K. D. P. Mariano, M. A. Dutra, V. E. d. S. Monteiro, J. R. S. Gomes, A. R. Galvão Filho,
A. d. S. Soares, Akcit-FN at CheckThat! 2025: Switching Fine-Tuned SLMs and LLM Prompting for
Multilingual Claim Normalization, in: [51], 2025.
[43] P. Amatya, V. Setty, Factiverse and IAI at CheckThat! 2025: Adaptive ICL for Claim Extraction, in:
[51], 2025.
[44] M. Saeed, M. Yasser, M. Torki, N. Elmakky, MMA at CheckThat! 2025: Multilingual Claim</p>
      <p>Normalization of Social-Media Posts, in: [51], 2025.
[45] J. Wilder, N. Kadapala, Y. Xu, M. Alsaadi, M. Rogers, P. Agrawal, A. Hassick, L. Dietz, UNH at</p>
      <p>Check That! 2025: Fine-tuning Vs Prompting, in: [51], 2025.
[46] S. M. A. Hashmi, S. Aamir, M. Anas, T. Usmani, F. Alvi, A. Samad, Investigators at CheckThat!
2025: Using LLMs to Improve Fact-Checking, in: [51], 2025.
[47] M. Sawiński, K. Węcel, E. Księżniak, OpenFact at CheckThat! 2025: Application of self-reflecting
and reasoning LLMs for fact-checking claim normalization, in: [51], 2025.
[48] M. Mondal, S. Saha, D. Saha, D. Das, JU_NLP@M&amp;S at CheckThat! 2025: Automated Claim
Extraction and Normalization for Misinformation Detection in Social Media Content, in: [51],
2025.
[49] T. B. Beltrán, R. Pan, J. A. García Díaz, R. Valencia García, UmuTeam at CheckThat! 2025:</p>
      <p>Language-specific versus multilingual models for Fact-Checking, in: [51], 2025.
[50] S. V. Baddepudi Venkata Naga Sri, Saivineetha at CheckThat! 2025: Exploring Fine-Tuning and</p>
      <p>Zero-Shot Approaches for Claim Normalization, in: [51], 2025.
[51] G. Faggioli, N. Ferro, P. Rosso, D. Spina (Eds.), Working Notes of CLEF 2025 - Conference and Labs
of the Evaluation Forum, CLEF 2025, Madrid, Spain, 2025.</p>
      <table-wrap>
        <label>Thai</label>
        <table>
          <thead>
            <tr><th>Team</th><th>METEOR</th></tr>
          </thead>
          <tbody>
            <tr><td>DS@GT</td><td>0.5859</td></tr>
            <tr><td>AKCIT-FN</td><td>0.3179</td></tr>
            <tr><td>dfkinit2b</td><td>0.2999</td></tr>
            <tr><td>Baseline</td><td>0.2015</td></tr>
            <tr><td>Factiverse and IAI</td><td>0.0965</td></tr>
            <tr><td>OpenFact</td><td>0.0872</td></tr>
            <tr><td>aryasuneesh</td><td>0.0464</td></tr>
            <tr><td>UmuTeam</td><td>0.0147</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <table-wrap>
        <label>Arabic</label>
        <table>
          <thead>
            <tr><th>Team</th><th>METEOR</th></tr>
          </thead>
          <tbody>
            <tr><td>dfkinit2b</td><td>0.5037</td></tr>
            <tr><td>DS@GT</td><td>0.5035</td></tr>
            <tr><td>MMA</td><td>0.4584</td></tr>
            <tr><td>OpenFact</td><td>0.4175</td></tr>
            <tr><td>TIFIN</td><td>0.3705</td></tr>
            <tr><td>AKCIT-FN</td><td>0.3277</td></tr>
            <tr><td>Factiverse and IAI</td><td>0.2457</td></tr>
            <tr><td>Baseline</td><td>0.2186</td></tr>
            <tr><td>UmuTeam</td><td>0.0003</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Muhammed T</surname>
          </string-name>
          , S. K. Mathew,
          <article-title>The Disaster of Misinformation: A Review of Research in Social Media</article-title>
          ,
          <source>International Journal of Data Science and Analytics</source>
          <volume>13</volume>
          (
          <year>2022</year>
          )
          <fpage>271</fpage>
          -
          <lpage>285</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Allcott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gentzkow</surname>
          </string-name>
          ,
          <article-title>Social Media and Fake News in the 2016 Election</article-title>
          ,
          <source>Journal of Economic Perspectives</source>
          <volume>31</volume>
          (
          <year>2017</year>
          )
          <fpage>211</fpage>
          -
          <lpage>236</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dalvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Sajjad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mubarak</surname>
          </string-name>
          , G. Da San Martino,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abdelali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Durrani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Darwish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Al-Homaid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaghouani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Danoe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Stolk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bruntink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, FactCheckers, Social Media Platforms, Policy Makers, and the Society</article-title>
          , in: M.-F. Moens,
          <string-name>
            <given-names>X.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Specia</surname>
          </string-name>
          , S. W.-t. Yih (Eds.),
          <source>Findings of the Association for Computational Linguistics: EMNLP</source>
          <year>2021</year>
          ,
          Association for Computational Linguistics
          , Punta Cana, Dominican Republic,
          <year>2021</year>
          , pp.
          <fpage>611</fpage>
          -
          <lpage>649</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          , G. Da San Martino,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Míguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kutlu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaghouani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mubarak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babulkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. S.</given-names>
            <surname>Kartal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Beltrán</surname>
          </string-name>
          ,
          <article-title>The CLEF-2022 CheckThat! Lab on Fighting the COVID-19 Infodemic and Fake News Detection</article-title>
          , in:
          <source>Proceedings of the 44th European Conference on IR Research: Advances in Information Retrieval, ECIR '22</source>
          , Springer-Verlag, Berlin, Heidelberg,
          <year>2022</year>
          , pp.
          <fpage>416</fpage>
          -
          <lpage>428</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>I.</given-names>
            <surname>Khaldarova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pantti</surname>
          </string-name>
          , Fake news,
          <source>Journalism Practice</source>
          <volume>10</volume>
          (
          <year>2016</year>
          )
          <fpage>891</fpage>
          -
          <lpage>901</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N. L.</given-names>
            <surname>Tsang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. L.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>How Fact-Checkers Delimit Their Scope of Practices and Use Sources: Comparing Professional and Partisan Practitioners</article-title>
          ,
          <source>Journalism</source>
          <volume>24</volume>
          (
          <year>2023</year>
          )
          <fpage>2232</fpage>
          -
          <lpage>2251</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Przybyła</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          , et al.,
          <article-title>The CLEF-2024 CheckThat! Lab: Check-Worthiness, Subjectivity, Persuasion, Roles, Authorities, and Adversarial Robustness</article-title>
          , in:
          <source>European Conference on Information Retrieval</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>449</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundriyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kulkarni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Pulastya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          , T. Chakraborty,
          <article-title>Empowering the Fact-checkers! Automatic Identification of Claim Spans on Twitter</article-title>
          , in: Y. Goldberg, Z. Kozareva, Y. Zhang (Eds.),
          <source>Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing</source>
          , Association for Computational Linguistics, Abu Dhabi, United Arab Emirates,
          <year>2022</year>
          , pp.
          <fpage>7701</fpage>
          -
          <lpage>7715</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          , G. Da San Martino,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Míguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kutlu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaghouani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mubarak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babulkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. S.</given-names>
            <surname>Kartal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Beltrán</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF-2022 CheckThat! Lab on Fighting the COVID-19 Infodemic and Fake News Detection</article-title>
          , in:
          <source>Proceedings of the 13th International Conference of the CLEF Association: Information Access Evaluation meets Multilinguality, Multimodality, and Visualization</source>
          , CLEF '
          <year>2022</year>
          , Bologna, Italy,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundriyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          , T. Chakraborty,
          <article-title>LESA: Linguistic Encapsulation and Semantic Amalgamation Based Generalised Claim Detection from Online Content</article-title>
          , in: P. Merlo, J. Tiedemann, R. Tsarfaty (Eds.),
          <source>Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics:</source>
          Main Volume,
          Association for Computational Linguistics
          , Online,
          <year>2021</year>
          , pp.
          <fpage>3178</fpage>
          -
          <lpage>3188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Thorne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Christodoulopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <article-title>FEVER: a Large-scale Dataset for Fact Extraction and VERification</article-title>
          , in: M. Walker, H. Ji, A. Stent (Eds.),
          <source>Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long Papers), Association for Computational Linguistics
          , New Orleans, Louisiana,
          <year>2018</year>
          , pp.
          <fpage>809</fpage>
          -
          <lpage>819</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kazemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Garimella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gaffney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hale</surname>
          </string-name>
          ,
          <article-title>Claim Matching Beyond English to Scale Global Fact-Checking</article-title>
          , in:
          <source>Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>4504</fpage>
          -
          <lpage>4517</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Shaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babulkov</surname>
          </string-name>
          , G. Da San Martino, P. Nakov,
          <article-title>That is a Known Lie: Detecting Previously Fact-Checked Claims</article-title>
          ,
          <source>in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>3607</fpage>
          -
          <lpage>3618</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundriyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>From Chaos to Clarity: Claim Normalization to Empower Fact-Checking</article-title>
          , in: H. Bouamor, J. Pino, K. Bali (Eds.),
          <source>Findings of the Association for Computational Linguistics: EMNLP</source>
          <year>2023</year>
          ,
          Association for Computational Linguistics
          , Singapore,
          <year>2023</year>
          , pp.
          <fpage>6594</fpage>
          -
          <lpage>6609</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>I.</given-names>
            <surname>Jaradat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gencheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          , P. Nakov,
          <article-title>ClaimRank: Detecting Check-Worthy Claims in Arabic and English</article-title>
          , in:
          <source>Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations</source>
          , Association for Computational Linguistics
          , New Orleans, Louisiana,
          <year>2018</year>
          , pp.
          <fpage>26</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Da San Martino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Galassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Nandi</surname>
          </string-name>
          , et al.,
          <article-title>The CLEF-2023 CheckThat! Lab: Checkworthiness, Subjectivity, Political Bias, Factuality, and Authority</article-title>
          , in:
          <source>European Conference on Information Retrieval</source>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>506</fpage>
          -
          <lpage>517</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundriyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media</article-title>
          , in:
          <source>Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>3887</fpage>
          -
          <lpage>3902</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dietze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hafid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Korre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Muti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schellhammer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Setty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundriyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Todorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>V.</surname>
          </string-name>
          ,
          <article-title>The CLEF-2025 CheckThat! Lab: Subjectivity, Fact-Checking, Claim Normalization, and Retrieval</article-title>
          , in:
          <string-name>
            <given-names>C.</given-names>
            <surname>Hauff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kazai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Nardini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Pinelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Silvestri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer Nature Switzerland, Cham,
          <year>2025</year>
          , pp.
          <fpage>467</fpage>
          -
          <lpage>478</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundriyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sengupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <article-title>DESYR: Definition and Syntactic Representation Based Claim Detection on the Web</article-title>
          , in:
          <source>Proceedings of the 30th ACM International Conference on Information &amp; Knowledge Management, CIKM '21</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2021</year>
          , pp.
          <fpage>1764</fpage>
          -
          <lpage>1773</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundriyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <article-title>Leveraging Rationality Labels for Explainable Claim Check-Worthiness</article-title>
          ,
          <source>IEEE Transactions on Artificial Intelligence</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>P.</given-names>
            <surname>Gencheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Koychev</surname>
          </string-name>
          ,
          <article-title>A context-aware approach for detecting worth-checking claims in political debates</article-title>
          , in:
          <source>Proc. of RANLP</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>267</fpage>
          -
          <lpage>276</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundriyal</surname>
          </string-name>
          , G. Malhotra,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sengupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <article-title>Document Retrieval and Claim Verification to Mitigate COVID-19 Misinformation</article-title>
          , in:
          <source>Proc. of Workshop on CONSTRAINT, ACL</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>66</fpage>
          -
          <lpage>74</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Glockner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Staliūnaitė</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Thorne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Vallejo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>AmbiFC: Fact-Checking Ambiguous Claims with Evidence</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>12</volume>
          (
          <year>2024</year>
          )
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hardalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chernyavskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Koychev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ilvovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>CrowdChecked: Detecting Previously Fact-Checked Claims in Social Media</article-title>
          , in:
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-H.</given-names>
            <surname>Chang</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing</source>
          (Volume
          <volume>1</volume>
          : Long Papers), Online only,
          <year>2022</year>
          , pp.
          <fpage>266</fpage>
          -
          <lpage>285</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>E. C.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ferrara</surname>
          </string-name>
          ,
          <article-title>FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs</article-title>
          , in:
          <source>Companion Proceedings of the ACM Web Conference 2024</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>883</fpage>
          -
          <lpage>886</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>C. P.</given-names>
            <surname>Drolsbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Solovev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Pröllochs</surname>
          </string-name>
          ,
          <article-title>Community Notes Increase Trust in Fact-Checking on Social Media</article-title>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>