<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Conference and Labs of the Evaluation Forum, September</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>DEFAULT at CheckThat! 2024: Retrieval Augmented Classification using Differentiable Top-K Operator for Rumor Verification based on Evidence from Authorities</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sayanta Adhikari</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Himanshu Sharma</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rupa Kumari</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shrey Satapara</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maunendra Desarkar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Technology Hyderabad</institution>
          ,
          <addr-line>Telangana, 502285</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>0</volume>
      <fpage>9</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>The paper describes Team DEFAULT's submission to CheckThat! 2024 Task 5 on Rumor Verification based on Evidence from Authorities. We present an approach for rumor verification on Twitter that focuses on integrating evidence from authoritative accounts to determine the veracity of rumors. We formulate rumor verification using evidence from authorities as a Retrieval-Augmented Classification (RAC) task, and propose an architecture and training regime designed to ensure seamless gradient flow. By re-parameterizing the Top-K operator and applying entropy-based smoothing, our method addresses the discontinuity introduced by retrieval, enhancing the accuracy of rumor verification. Using this classification-aware retrieval, the retriever achieves a Recall@5 of 0.778, outperforming the baseline and placing team DEFAULT third on the test data leaderboard for retrieval. For classification, our approach performs on par with the baseline.</p>
      </abstract>
      <kwd-group>
        <kwd>Rumor Verification</kwd>
        <kwd>Retrieval Augmented Classification</kwd>
        <kwd>Differential Top-K</kwd>
        <kwd>Optimal Transport</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In the present era, social media has become one of the most widely used mediums for information sharing
due to its ability to spread information quickly at a low cost. This has made online social media a
preferred choice for many individuals and organizations for propaganda-driven misinformation sharing
to influence public opinions and decisions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The spread of rumors and misinformation through
social media has become a significant concern. Verifying the veracity of rumors and combating the
dissemination of misinformation is crucial for maintaining the integrity of online discourse. This paper
proposes a novel approach, Retrieval-Augmented Classification (RAC), which combines document
retrieval and classification techniques to address the problem of rumor verification [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        The shared task “Rumor Verification using Evidence from Authorities” at the CheckThat! lab at
CLEF-2024 [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] follows a two-step approach. The first step involves
document retrieval, wherein authoritative tweets related to a rumor are analyzed to identify the most
relevant tweets. These sources, including reputable organizations or subject matter experts, can provide
valuable evidence supporting or refuting the rumor. The second step is classification, where the retrieved
evidence is leveraged to determine the rumor’s veracity, categorizing it as Supported (TRUE), Refuted
(FALSE), or Unverifiable (NEUTRAL).
      </p>
      <p>To illustrate the methodology, consider a scenario involving a rumor circulating on social media
about a potential disease outbreak. The RAC method would first retrieve relevant documents using
sophisticated algorithms. These documents would then be analyzed based on key features identified
by machine learning models. The rumor would be classified as true if these sources corroborate the
outbreak with compelling evidence. Conversely, if the sources refute the claim or lack sufficient evidence,
the rumor would be labelled as false or unverifiable, respectively.</p>
      <p>In traditional retrieval systems, the relevance of a document is determined solely by its similarity
to the query. However, for tasks like rumor verification, the evidence required to validate or refute
a claim may not necessarily resemble the claim itself. This discrepancy between the query and the
desired evidence can lead to suboptimal retrieval performance when using traditional similarity-based
techniques.</p>
      <p>To address this challenge, we proposed a classification-aware retrieval approach by providing an
alignment between the retriever and the classifier, resulting in better retrieval. To jointly train the
retriever and classifier, we removed discontinuity associated with the Top-K document selection for
retrieval by replacing it with Soft Top-K, which allows the gradients to flow between retrieval and
classification module, resulting in end-to-end training using a common loss function. Details about the
proposed approach and its performance, along with analyses, are discussed in the subsequent sections
of the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>
        Rumor verification and fact-checking are well-known NLP tasks that have attracted many researchers,
with considerable work spanning dataset collection and training
methods. Fact Checking [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] is one of the early works on claim verification, with claims collected from fact-checking
websites. Fact Extraction and Verification (FEVER) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is a well-known shared task
for fact verification.
      </p>
      <p>
        Liu et al. (2020) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] proposed an approach using a Kernel Graph Attention Network (KGAT). Bekoulis
et al. (2021) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] emphasized the importance of evidence-aware sentence selection, while Kruengkrai et
al. (2021) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] presented a multi-level attention model for integrating evidence. These studies provide
valuable insights for developing effective RAC systems for rumor verification using evidence from
authorities.
      </p>
      <p>
        Recent rumor verification research on retrieval augmented verification [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] integrates retrieval and
classification using a zero-shot approach by retrieving real-time web-scraped evidence and matching
claim texts using pretrained language models. Their graph-structured representation gathers evidence
automatically and highlights unverifiable claim parts. There has been some work on a comprehensive
rumor debunking system using an LLM (involving retrieval, discrimination, and guided generation)[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
Various systems have been developed to enhance the extraction and application of clinical trial
information. One such system is CliVER [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], an end-to-end system that uses retrieval-augmented techniques to
automatically retrieve clinical trial abstracts, extract pertinent sentences, and apply the PICO framework
to support or refute scientific claims. This system represents a significant advancement in integrating
artificial intelligence and clinical research methodologies, streamlining the process of evidence synthesis
and decision-making in clinical settings.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Preliminary</title>
      <sec id="sec-3-1">
        <title>3.1. Retrieval Augmented Classification</title>
        <p>
          Verifying facts or rumors is challenging due to the subjective nature of the task. It requires access to
contextual information regarding the domain from the current timeline. The verification task can be
reduced to evidence retrieval and claim verification based on the retrieved evidence. This aligns closely
with the domain of retrieval-augmented generation (RAG) [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], where the task is to generate an answer
in context with a retrieved document. Similarly, we posed rumor verification based on evidence from
authoritative sources as a RAC task where a class needs to be predicted based on the original claim and
retrieved evidence.
        </p>
        <p>RAC can be approached in two ways: 1) training the retriever and classifier independently (Independent
Training), and 2) training the retriever and classifier together (Joint Training). Independent
training allows each component to be trained separately and then combined. However, a major
drawback is the lack of alignment between the retrieval and classification processes, despite their forming a pipeline.
The classifier’s performance is inherently linked to the retrieval quality, contradicting the notion of
independence. The dependency between the retrieved relevant evidence and the classification of the
given rumor highlights the need for a joint training objective. Joint Training allows for alignment
between the retriever and classifier components, but the major challenge is the discontinuity between
these processes.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Differential Top-K</title>
        <p>
          To address the issue of discontinuity (Figure 1(a)), we referred to an Optimal Transport (OT) trick
for reparameterizing the Top-K function with entropy regularisation (to make it smooth) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. This
technique first formulates the extraction of the Top-K elements of a vector as an Optimal Transport
problem and then applies entropy regularisation to facilitate smooth gradient flow. We used the SOFT
(Scalable Optimal transport-based diFferenTiable) Top-K operator in place of the Top-K operator to get
the Top-K elements.
1. Problem Formulation: Consider the score vector s = {s_i}_{i=1}^N, containing the relevance score of each tweet
with respect to the rumor tweet, where N is the total number of tweets provided in the
timeline. The standard Top-K operator returns the indexes of the Top-K elements, which is equivalent to a
vector A = [A_1, ..., A_N], such that

A_i = 1 if s_i is one of the Top-K relevant scores with respect to the rumor tweet, and A_i = 0 otherwise.   (1)
        </p>
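        <p>As a concrete reference point, the hard Top-K indicator of Equation 1 can be computed directly. The following NumPy sketch (function and variable names are our own, purely illustrative) is the discontinuous operator that the rest of this section smooths:</p>

```python
import numpy as np

def topk_indicator(s, k):
    """Return the binary vector A with A[i] = 1 iff s[i] is among the
    k largest scores (Equation 1). Illustrative sketch only."""
    idx = np.argsort(-s)[:k]      # indices of the k largest scores
    a = np.zeros_like(s)
    a[idx] = 1.0
    return a

scores = np.array([0.1, 0.9, 0.4, 0.7])
print(topk_indicator(scores, 2))  # 1s at the positions of 0.9 and 0.7
```

        <p>Because this function is piecewise constant in the scores, its gradient is zero almost everywhere, which is precisely the discontinuity problem addressed below.</p>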
        <p>Using A, we can extract the Top-K elements of s. In the case of sorted Top-K, A is a matrix that,
when multiplied with the input s, provides the Top-K elements in sorted order.
2. Re-parameterizing the Top-K Operator as an OT Problem: Now, consider the probability distribution associated
with the score vector and the output support space Y = {0, 1} (0 to map all the Top-K
elements and 1 for the remaining, so m = 2): these are μ = (1/N) 1_N and ν = [K/N, (N − K)/N] respectively, where N is the
total number of timeline tweets and K represents the total number of evidence tweets that need to be
retrieved from the timeline.</p>
        <p>
          Γ* = argmin_{Γ ≥ 0} ⟨C, Γ⟩, s.t. Γ 1_m = μ, Γ^T 1_N = ν, with C, Γ* ∈ R^{N×m}.   (2)

Here, Γ_{i,j} represents the probability of mapping the input x_i of μ to the output y_j of ν, and C_{i,j}
represents the cost incurred to move from x_i to y_j. Thus, Γ represents a joint probability distribution
over the support X × Y.
3. Solution: Under the above conditions, the optimal transport plan Γ* is given (in closed form) by:

Γ*_{σ(i),1} = 1/N if i ≤ K, and 0 if K + 1 ≤ i ≤ N;   Γ*_{σ(i),2} = 0 if i ≤ K, and 1/N if K + 1 ≤ i ≤ N,   (3)

where σ is the sorting permutation, i.e., s_{σ(1)} &lt; s_{σ(2)} &lt; · · · &lt; s_{σ(N)}. Based on Γ*, we define A =
N Γ* [1, 0]^T. The matrix A is the mapping matrix that provides the positions of the Top-K elements.
4. Smoothing by Entropy Regularization: Employing entropy regularisation in the OT problem
yields a smoothed approximation. The OT optimization problem becomes:

Γ*_ε = argmin_{Γ ≥ 0} ⟨C, Γ⟩ + ε H(Γ), s.t. Γ 1_m = μ, Γ^T 1_N = ν, ε &gt; 0,

where H(Γ) = Σ_{i,j} Γ_{i,j} log Γ_{i,j} is the entropy regularizer. Based on this Γ*_ε, we define A_ε =
N Γ*_ε [1, 0]^T as the smoothed counterpart of the standard Top-K operator output (A in Equation 1).
Throughout our approach, we consider sorted Top-K. Using the Soft Top-K operator in place of the
Top-K operator lets us train the model end-to-end and thus helps align the retriever and the
classifier accordingly.
        </p>
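        <p>To make the construction concrete, the following NumPy sketch solves the entropy-regularized OT problem with Sinkhorn iterations and returns the smoothed indicator vector A_ε. The min-max normalization of the scores, the squared-distance cost to the anchors y = {1, 0}, the iteration count, and all names are our illustrative assumptions, not the paper's exact implementation.</p>

```python
import numpy as np

def soft_topk(s, k, eps=0.01, iters=200):
    """Soft Top-K via entropy-regularized optimal transport (Sinkhorn).

    Sketch of the SOFT operator described above: mass 1/N per score is
    transported to two anchors, k/N to the "top-k" anchor and (N-k)/N
    to the "rest" anchor. Cost and normalization are assumptions."""
    n = len(s)
    s = (s - s.min()) / (s.max() - s.min() + 1e-9)  # scores into [0, 1]
    y = np.array([1.0, 0.0])                        # anchors: top-k, rest
    C = (s[:, None] - y[None, :]) ** 2              # N x 2 cost matrix
    mu = np.full(n, 1.0 / n)                        # uniform input marginal
    nu = np.array([k / n, (n - k) / n])             # output marginal
    G = np.exp(-C / eps)                            # Gibbs kernel
    u = np.ones(n)
    for _ in range(iters):                          # Sinkhorn updates
        v = nu / (G.T @ u)
        u = mu / (G @ v)
    gamma = u[:, None] * G * v[None, :]             # transport plan Gamma*
    return n * gamma[:, 0]                          # smoothed indicator A

a = soft_topk(np.array([0.1, 0.9, 0.4, 0.7]), k=2)
print(np.round(a, 3))  # entries near 1 for the two largest scores
```

        <p>With a small ε the output approaches the hard indicator, while a larger ε gives a smoother relaxation through which gradients can flow end-to-end.</p>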
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>To perform RAC for rumor verification based on evidence from authorities, we propose a novel
architecture that can be trained end-to-end. It uses a transformer-encoder-based retriever
followed by a Top-K operator to extract relevant evidence, which then helps
the classifier classify the Query Tweet (x). As shown in Figure 1(a), this Top-K operator introduces a
discontinuity in the pipeline. To remove this discontinuity, we re-parameterized Top-K with a smoother
version, Soft Top-K (details provided in subsection 3.2). Figure 1(b) shows the final architecture along
with the different losses (defined in subsection 4.1) used for training.</p>
      <p>If the classifier cannot classify correctly, it cannot guide the retriever regarding the
relevance of the tweets, and vice versa. So, providing the models with no information about the downstream
task might lead to poor performance and sub-optimal convergence. To counter this effect, we propose
a training method for our architecture: we first independently train the classifier, then
jointly train both the retriever and classifier, and finally freeze the retriever and train the
classifier again to increase the classifier’s performance.</p>
      <p>We define a Retriever R, parameterized by θ_R. It computes embeddings for each document (the timeline
D) and the Query Tweet x. The similarity score between the embedding of a timeline tweet d_i and the embedding
of x is used to extract the relevant tweets; we denote the score for d_i as s_i. To extract the Top-K
relevant tweets, we pass the scores through the Soft Top-K function, which returns a matrix A that gives us
the indexes of the Top-K relevant documents. Multiplying the matrix A with D would extract the
Top-K relevant documents, but as we are using Soft Top-K, directly multiplying A with D leads to a change
in the token ids of the words; instead, we multiply A with the classifier embeddings corresponding to D to
get the Top-K document embeddings. We define the classifier C as a combination of two functions: f,
parameterized by θ_f, and g, parameterized by θ_g. Here, f represents the initial embedding layer of the
BERT model, and g represents the classification head. The classifier can be represented as a composition
of f and g, i.e., C = g ∘ f. The classifier verifies x in context with the embeddings of the extracted
evidences, E = {e_i}_{i=1}^K = A × f([x, D]; θ_f). Providing all this evidence together with x might overflow
the model’s context window. To deal with this problem, we compute logits for each piece of evidence,
l_i = g(x, e_i), i = 1, 2, · · · , K, and then perform a weighted aggregation of the logits using the relevance
scores w = {w_i}_{i=1}^K = A × s provided by the retriever for each piece of evidence. The probability
associated with the query tweet x based on the evidence set E, denoted p, is

p = (Σ_{i=1}^K w_i l_i) / (Σ_{i=1}^K w_i).   (4)</p>
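      <p>The weighted aggregation of the per-evidence logits can be sketched as follows; the shapes and the example numbers are illustrative assumptions:</p>

```python
import numpy as np

def aggregate_logits(logits, weights):
    """Relevance-weighted aggregation of per-evidence logits, in the
    spirit of Equation 4: p = sum_i w_i * l_i / sum_i w_i.
    logits has shape (K, num_classes); weights has shape (K,)."""
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * np.asarray(logits)).sum(axis=0) / w.sum()

# Two evidence tweets, three classes (SUPPORTS, REFUTES, NOT ENOUGH INFO):
logits = [[2.0, 0.5, 0.1],
          [0.0, 1.0, 0.5]]
out = aggregate_logits(logits, [0.9, 0.1])  # -> [1.8, 0.55, 0.14]
```

      <p>This keeps each evidence tweet in its own forward pass, so no single input has to fit all retrieved evidence into the model's context window.</p>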
      <sec id="sec-4-1">
        <title>4.1. Losses and Optimization Objective</title>
        <p>To train our model, we use a cross-entropy loss (L_CE) computed from the predicted probabilities p and the ground truth labels y:

L_CE = −(1/N) Σ_{i=1}^{N} Σ_{c=1}^{C} y_{i,c} log(p_{i,c})   (5)</p>
        <p>
where N denotes the total number of samples and C denotes the total number of classes. To provide
better guidance to the Retriever, we introduce a new loss term, called the Density Loss (L_D), over the
output of the Soft Top-K operator. The Soft Top-K operator returns a matrix A indicating the tweets
that must be considered. While forming the data, we already know those positions, so we
can provide a ground truth matrix A*. We compute the density loss as the mean cross-entropy of
each row of A against the corresponding row of A*. Mathematically,</p>
        <p>L_D = −(1/(N r)) Σ_{i=1}^{N} Σ_{j=1}^{r} Σ_{t} A*_i[j, t] log(A_i[j, t])   (6)

where A_i and A*_i represent the predicted and ground truth A matrices for query input i, and r
denotes the total number of rows in A. The final loss is an aggregation of these two losses. As both
losses are of the same scale, we add them with equal weight. Based on the defined losses, we define our
optimization problem as

argmin_{θ_R, θ_f, θ_g} L_CE + L_D.   (7)</p>
        <p>
In practice, we use the Adam optimizer to train this objective. For more details regarding the training
process, refer to Algorithm 1. The datasets required for training as per Algorithm 1 are the independent
classifier training dataset and the joint training dataset. Further details of these datasets are
provided in subsection 5.2.</p>
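        <p>The two losses and their equal-weight sum can be sketched as below; the small clamp inside the logarithm and the array shapes are our assumptions, and in practice the objective is minimized over the retriever and classifier parameters with Adam:</p>

```python
import numpy as np

def cross_entropy(y_true, p):
    """L_CE in the spirit of Equation 5: mean over samples of
    -sum_c y_c * log(p_c); shapes are (num_samples, num_classes)."""
    return -np.mean(np.sum(y_true * np.log(p + 1e-12), axis=1))

def density_loss(a_true, a_pred):
    """Density loss L_D (Equation 6): mean row-wise cross-entropy
    between ground truth and predicted Soft Top-K matrices. Treating
    each matrix as (rows, timeline_length) is an assumption."""
    return -np.mean(np.sum(a_true * np.log(a_pred + 1e-12), axis=1))

y = np.array([[1.0, 0.0], [0.0, 1.0]])  # one-hot labels
p = np.array([[0.9, 0.1], [0.2, 0.8]])  # predicted probabilities
total = cross_entropy(y, p) + density_loss(y, p)  # equal-weight sum (Eq. 7)
```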
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Setup</title>
      <sec id="sec-5-1">
        <title>5.1. Dataset Description</title>
        <p>
          We utilized the dataset from CLEF 2024 CheckThat! Lab, Task 5 [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], which includes Twitter data curated for
rumor verification in English and Arabic. Our experiments focused on the English dataset, comprising
96 training and 32 validation samples. Each sample contains an id (unique identifier), rumor (tweet
text), timeline (tweets from authorities during the rumor’s timeframe), label (veracity: SUPPORTS,
REFUTES, or NOT ENOUGH INFO), and evidence (tweets from authorities aiding classification). We
augmented the dataset to increase the sample size.
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Data Augmentation and Training</title>
        <sec id="sec-5-2-1">
          <title>5.2.1. Independent Training</title>
          <p>Retriever: For independent training of the Retriever, we use contrastive training [15]. Each rumor
tweet is paired with an evidence tweet as a positive sample and  (3 in our experiments) non-evidence
tweets from the timeline as negative samples. These triplets train the model with contrastive loss
functions [15, 16]. We create multiple samples with randomly chosen negative tweets for robustness
and exclude samples without any evidence tweets. We have considered multiple score functions for
scoring the similarity between the tweets: (a) Euclidean Distance between the representation vectors,
(b) Cosine similarity between the two representation vectors, and (c) MaxSim similarity proposed in the
paper of ColBERT [15]. We initialize the retriever with the colbert-ir/colbertv2.0 checkpoint weights from
Hugging Face. To further finetune the model, we use a batch size of 1 (fixed), 5 epochs (chosen using early
stopping), a learning rate of 5e−5 with MaxSim as the similarity score, and the contrastive loss provided in
[15].</p>
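          <p>A minimal sketch of MaxSim scoring in the spirit of ColBERT [15]: for each query token, take the maximum cosine similarity over document tokens and sum over query tokens. The token-embedding shapes and random inputs are stand-ins, not the actual model outputs:</p>

```python
import numpy as np

def maxsim(q_emb, d_emb):
    """ColBERT-style MaxSim over token embeddings of shape
    (num_tokens, dim): max cosine similarity per query token,
    summed over query tokens."""
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    d = d_emb / np.linalg.norm(d_emb, axis=1, keepdims=True)
    sim = q @ d.T                    # pairwise cosine similarities
    return sim.max(axis=1).sum()     # max over doc tokens, sum over query

rng = np.random.default_rng(0)
q, d = rng.normal(size=(4, 8)), rng.normal(size=(6, 8))
score = maxsim(q, d)
```

          <p>Unlike pooled-vector cosine similarity, this token-level matching rewards documents that cover each query token individually, which is the finer granularity discussed in the results.</p>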
          <p>Classifier: For independent training of the Classifier, we create tweet pairs. Each rumor tweet is paired
with an evidence tweet and labelled according to the original data label ("SUPPORTS" or "REFUTES")
or paired with a non-evidence tweet and labelled as "NOT ENOUGH INFO." This process ensures a
balanced class distribution in the final training dataset. After initializing with pretrained weights, we
used a batch size of 2 (fixed), 7 epochs (chosen using early stopping) and a learning rate of 1e−5 to
fine-tune the classifier.</p>
        </sec>
        <sec id="sec-5-2-2">
          <title>5.2.2. Joint Training</title>
          <p>For joint training, each rumor tweet is paired with a document set of size  (64 in our experiments).
The document set includes all, some, or none of the evidence tweets, filled to  with non-evidence
timeline tweets. Document sets with evidence are labelled based on the original data point, while those
without evidence are labelled "NOT ENOUGH INFO." We shuffle the document sets to avoid bias from
tweet order and ensure a balanced class distribution in the final dataset. To train the model, we used a
batch size of 1 (fixed), with a learning rate of 1e−5, a K value of 5 (given), and an epsilon value for Soft
Top-K of 0.01 (fixed). We train the model for 5 epochs (using early stopping).</p>
        </sec>
        <sec id="sec-5-2-3">
          <title>5.2.3. Our Approach</title>
          <p>As joint training starts from pretrained weights, it is difficult for the classifier to guide the
retriever and vice-versa. In our approach, we first independently train the classifier on the independent
training dataset with hyperparameters similar to those provided in subsubsection 5.2.1 for 5 epochs (chosen
using early stopping). Then, we finetune the whole architecture (Retriever + Classifier) end-to-end on
the dataset presented in subsubsection 5.2.2, using the hyperparameters provided there. After this, we
again finetuned the classifier with a frozen retriever to further boost the classifier’s performance. For
this stage, we used a batch size of 1 (fixed) with a learning rate of 1e−5 (fixed) for 5 epochs.</p>
          <p>Table 1 provides statistics of the dataset obtained after augmentation; we used this data to train
our model. As our input data consists of tweets, we preprocessed each tweet by removing links. We
replaced each emoji in a tweet with its text translation using the ‘emoji’ Python package [17]. We used
a single NVIDIA Tesla 32GB V100 GPU to train our models. Training the whole model on the dataset
took around an hour.</p>
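          <p>The link-removal step can be sketched with the standard library as below; the regex pattern is our assumption. The emoji conversion step uses the ‘emoji’ package (its demojize function), which we omit here to keep the sketch dependency-free:</p>

```python
import re

def preprocess_tweet(text):
    """Strip URLs from a tweet, as in the preprocessing described
    above. The paper additionally converts emojis to text with the
    'emoji' package (emoji.demojize); omitted here."""
    return re.sub(r"https?://\S+", "", text).strip()

clean = preprocess_tweet("Outbreak reported https://t.co/abc123")
# -> "Outbreak reported"
```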
        </sec>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Evaluation Metrics</title>
        <p>The primary measure for evaluating evidence retrieval is Mean Average Precision (MAP). Under
this metric, systems receive no credit for retrieving tweets related to unverifiable rumors. Another
important evaluation metric is Recall@5 (R@5), which measures the proportion of relevant tweets
retrieved among the top 5 retrieved tweets. We use the Macro-F1 (M-F1) score for classification
evaluation, which averages the F1 score (the harmonic mean of precision and recall) over all classes.
Additionally, we consider a Strict Macro-F1 score, where the correctness of a rumor label is contingent
upon at least one retrieved authority evidence being correct.</p>
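        <p>Recall@K as described above can be sketched as follows; the tweet ids are hypothetical:</p>

```python
def recall_at_k(retrieved, relevant, k=5):
    """Recall@K: fraction of the relevant tweets that appear in the
    top-K retrieved list. Illustrative sketch."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)

r = recall_at_k(["t3", "t7", "t1", "t9", "t2"], {"t1", "t2", "t4"})
# two of the three relevant tweets retrieved -> 2/3
```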
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Results</title>
      <p>Table 2 provides results for the different experiments we performed. We can conclude from
the results that MaxSim similarity performs better than the other similarities we considered. We also
observed that models initialized with ColBERT pretrained weights performed better than those initialized
with BERT pretrained weights. This is expected, as ColBERT is specifically trained for information retrieval
and matches individual tokens of the two texts (claim and candidate evidence tweet) instead of comparing
their overall pooled vectors; inspection at finer granularity helps it identify matches better. We can also
see that Joint Training performs better than Independent Training, and that our proposed training
curriculum performs better than both purely Joint and purely Independent Training. We also observe
that using different pretrained models for the retriever and the classifier reduces performance. Overall,
ColBERT-B with our Approach performs best among all our approaches. It beats KGAT’s retriever
performance by a huge margin, but its classification performance is below that of KGAT.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>We present a joint training framework to simultaneously optimize an evidence retriever and a rumor
classifier in an end-to-end fashion. We show that our approach performs better than both independent
and joint training applied individually. From the results, we can conclude that our approach retrieves
relevant tweets accurately and extracts at least one relevant tweet for every rumor claim, as Macro-F1 and
Strict Macro-F1 are identical for ColBERT-B with our Approach.</p>
      <p>Also, the results show the importance of joint training. Using the Soft Top-K operation as a
differentiable approximation of the standard Top-K operation not only removes the discontinuity but also
enhances the model’s performance. Further, we conclude that Soft Top-K-based reparameterization with
independent training followed by joint training leads to better performance. Moreover, we observe
that the classifier-guided retriever boosts the retriever’s performance, such that it outperforms the
baseline by a huge margin, whereas the classifier’s performance is on par with the baseline.</p>
      <p>[15] O. Khattab, M. Zaharia, ColBERT: Efficient and effective passage search via contextualized late
interaction over BERT, in: Proceedings of the 43rd International ACM SIGIR Conference on Research
and Development in Information Retrieval, 2020, pp. 39–48.
[16] I. Malkiel, D. Ginzburg, O. Barkan, A. Caciularu, Y. Weill, N. Koenigstein, MetricBERT: Text
representation learning via self-supervised triplet training, in: ICASSP 2022 - 2022 IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 1–5.
doi:10.1109/ICASSP43922.2022.9746018.
[17] emoji — pypi.org, https://pypi.org/project/emoji/, 2024. [Accessed 31-05-2024].
[18] J. D. M.-W. C. Kenton, L. K. Toutanova, BERT: Pre-training of deep bidirectional transformers for
language understanding, in: Proceedings of NAACL-HLT, 2019, pp. 4171–4186.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Varshney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Vishwakarma</surname>
          </string-name>
          ,
          <article-title>A review on rumour prediction and veracity assessment in online social network</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>168</volume>
          (
          <year>2021</year>
          )
          <article-title>114208</article-title>
          . URL: https://www.sciencedirect.com/science/article/pii/S0957417420309362. doi:10.1016/j.eswa.2020.114208.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. M. S.</given-names>
            <surname>Khoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. L.</given-names>
            <surname>Chieu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <article-title>Coupled hierarchical transformer for stance-aware rumor verification in social media conversations</article-title>
          ,
          <source>Association for Computational Linguistics</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Przybyła</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          ,
          The CLEF-2024 CheckThat! Lab:
          <article-title>Check-worthiness, subjectivity, persuasion, roles, authorities, and adversarial robustness</article-title>
          , in:
          <string-name>
            <given-names>N.</given-names>
            <surname>Goharian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lipani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>McDonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>449</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF-2024 CheckThat! Lab Task 5 on Rumor Verification using Evidence from Authorities</article-title>
          , in:
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García Seco de Herrera</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</source>
          , CLEF 2024, Grenoble, France,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Riedel</surname>
          </string-name>
          ,
          <article-title>Fact checking: Task definition and dataset construction</article-title>
          , in:
          <source>Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Thorne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Cocarascu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Christodoulopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <article-title>The fact extraction and VERification (FEVER) shared task</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Thorne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Cocarascu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Christodoulopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mittal</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the First Workshop on Fact Extraction and VERification (FEVER)</source>
          ,
          Association for Computational Linguistics, Brussels, Belgium,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          . URL: https://aclanthology.org/W18-5501. doi:10.18653/v1/W18-5501.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Fine-grained fact verification with kernel graph attention network</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Schluter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tetreault</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics
          , Online,
          <year>2020</year>
          , pp.
          <fpage>7342</fpage>
          -
          <lpage>7351</lpage>
          . URL: https://aclanthology.org/2020.acl-main.655. doi:10.18653/v1/2020.acl-main.655.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Bekoulis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Papagiannopoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Deligiannis</surname>
          </string-name>
          ,
          <article-title>Understanding the impact of evidence-aware sentence selection for fact checking</article-title>
          , in:
          <string-name>
            <given-names>A.</given-names>
            <surname>Feldman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Da San Martino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Leberknight</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda</source>
          , Association for Computational Linguistics, Online,
          <year>2021</year>
          , pp.
          <fpage>23</fpage>
          -
          <lpage>28</lpage>
          . URL: https://aclanthology.org/2021.nlp4if-1.4. doi:10.18653/v1/2021.nlp4if-1.4.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Kruengkrai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yamagishi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>A multi-level attention model for evidence-based fact checking</article-title>
          , in:
          <string-name>
            <given-names>C.</given-names>
            <surname>Zong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          (Eds.),
          <source>Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021</source>
          , Association for Computational Linguistics, Online,
          <year>2021</year>
          , pp.
          <fpage>2447</fpage>
          -
          <lpage>2460</lpage>
          . URL: https://aclanthology.org/2021.findings-acl.217. doi:10.18653/v1/2021.findings-acl.217.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A. U.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Llabrés</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Valveny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Karatzas</surname>
          </string-name>
          ,
          <article-title>Retrieval augmented verification: Unveiling disinformation with structured representations for zero-shot real-time evidence-guided factchecking of multi-modal social media posts</article-title>
          ,
          <source>arXiv preprint arXiv:2404.10702</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>The future of combating rumors? Retrieval, discrimination, and generation</article-title>
          ,
          <year>2024</year>
          . arXiv:2403.20204.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soroush</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Nestor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Idnay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bernard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <article-title>Retrieval augmented scientific claim verification</article-title>
          ,
          <source>JAMIA Open</source>
          <volume>7</volume>
          (
          <year>2024</year>
          )
          ooae021. doi:10.1093/jamiaopen/ooae021.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Piktus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Karpukhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Küttler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-t.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocktäschel</surname>
          </string-name>
          , et al.,
          <article-title>Retrieval-augmented generation for knowledge-intensive NLP tasks</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>9459</fpage>
          -
          <lpage>9474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Pfister</surname>
          </string-name>
          ,
          <article-title>Differentiable top-k with optimal transport</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>