<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Multi-Modal Fact Verification Dataset</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shreyash Mishra</string-name>
          <email>shreyash.m19@iiits.in</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S Suryavardan</string-name>
          <email>suryavardan.s19@iiits.in</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amrit Bhaskar</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Parul Chopra</string-name>
          <email>parulcho@andrew.cmu.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aishwarya Reganti</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Parth Patwa</string-name>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amitava Das</string-name>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tanmoy Chakraborty</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amit Sheth</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Asif Ekbal</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chaitanya Ahuja</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Vancouver, Canada</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>AI Institute, University of South Carolina</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Arizona State University</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Carnegie Mellon University</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>IIIT Delhi</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>IIIT Sri City</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>IIT Patna</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>University of California Los Angeles</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff7">
          <label>7</label>
          <institution>Wipro AI labs</institution>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <abstract>
        <p>Combating fake news is one of the most pressing societal challenges. It is difficult to expose false claims before they create a lot of damage. Automatic fact/claim verification has recently become a topic of interest among diverse research communities. Forums such as FEVER and FNC [1, 2] foster work on automatic fact-checking of text. Research efforts and datasets on textual fact verification exist, but there is much less attention on multi-modal or cross-modal fact verification. In order to draw the attention of the research community towards understanding multimodal misinformation, we release a multimodal fact checking dataset named FACTIFY. It is notably the largest multimodal fact verification public dataset, consisting of 50K data points covering news from India and the US. FACTIFY contains images, textual claims, and reference textual documents and images, labeled with three broad categories, namely support, no-evidence, and refute.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In recent years, automatic fact checking has emerged as an important problem in the AI
community, as the dangers of fraudulent claims masquerading as declarations of reality have
become common. Although the birth of this problem goes back to the early years of the printing
press, it has attracted increasing interest with the rise of social media. The rapid distribution
of news across numerous media sources has resulted in the fast spread of erroneous
and fake content. It is tough to uncover misleading statements before they cause significant
harm. According to statistics [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], about 67% of the American population believes that fake news
produces a lot of uncertainty, and 10% of them knowingly propagate fake news. On the contrary,
only 26% of respondents said they feel confident in their ability to recognize bogus news.
      </p>
      <p>
        The scarcity of available training data has been a fundamental obstacle in automated
fact-checking. Recently, significant progress has been made with the release of two of the largest
datasets, FEVER [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and LIAR [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], among several others. LIAR contains 12.8K claims along
with their metadata (i.e., the speaker of the claim, the speaker's political affiliations, and the medium
through which the claim was first published) collected from real fact-checking websites such as
PolitiFact. Substantial advances have been made since the release of LIAR. A significantly
larger dataset, FEVER, includes evidence and extensive metadata to contextualize the claims
even more. FEVER consists of 185K claims manually curated from Wikipedia.
Although FEVER is a large dataset, it was purpose-made for research, which limits its ability
to capture patterns in real-world data. We release a multimodal fact checking dataset,
called FACTIFY, which helps resolve this problem, as it consists of original samples
with no post-processing or manual data creation involved. Additionally, the visual cues that
accompany textual claims help a system detect fake content with greater confidence.
The dataset is released at https://competitions.codalab.org/competitions/35153 and the baselines
are available at https://github.com/Shreyashm16/Factify.
      </p>
      <p>
        Although there are research initiatives [
        <xref ref-type="bibr" rid="ref5 ref6 ref7">5, 6, 7</xref>
        ] and datasets [
        <xref ref-type="bibr" rid="ref1 ref4">1, 4</xref>
        ] on textual fact verification,
there is less focus on multi-modal or cross-modal fact verification. The majority of present
fact-checking research relies on unimodal techniques, synthetic data production, and limited
annotated datasets. We therefore believe that FACTIFY can serve as a stepping stone towards
novel multimodal fact verification systems. The dataset contains images, textual claims, and reference
textual documents/images. The task is to assign support, no-evidence, or refute labels to given
claims; each of these categories is explained in the next section. The first two categories are
further sub-divided into text and multimodal components; thus, in total, each data sample
is labeled with one of five classes. We chose Twitter handles of popular news channels
from two large nations, the US and India. The dataset therefore entirely consists of real
samples gathered from different social media news handles popular in India and the US.
      </p>
      <p>To summarize, in this paper, we release a novel multimodal fact-checking dataset that can
be used as a benchmark for researchers. We also propose unimodal and multimodal baseline
models for our dataset. The paper is organised as follows: The proposed task is described in
Section 2. Related work is described in Section 3. Data collection and data distribution are
explained in Section 4 while Section 5 demonstrates the baseline model. Section 6 shows the
results of our baseline models. Finally, we summarise our task along with the further scope and
open-ended pointers in Section 7.
</p>
    </sec>
    <sec id="sec-1-2">
      <title>2. The Factify Task</title>
      <p>To detect multimodal fake news, we model the task as multimodal entailment. We assume
that each data point contains a reliable source of information, called “document”, and its
associated image and another source whose validity must be assessed, called the “claim” which
also contains a respective image. The goal is to identify if the claim entails the document.
Since we are interested in a multimodal scenario with both image and text, entailment has two
verticals, namely textual entailment and visual entailment and their respective combinations.
This data format is a stepping stone for the fact checking problem where we have one reliable
source of news and want to identify the fake/real claims given a large set of multimodal claims.
Therefore, the task essentially is: given a textual claim, claim image, document and document
image, the system has to classify the data sample into one of five categories: Support_Text,
Support_Multimodal, Insufficient_Text, Insufficient_Multimodal and Refute. The images are also
accompanied by text obtained by running OCR. The descriptions of the labels are as follows:
• Support_Text: the claim text is similar to or entailed by the document text, but the images of the document and claim
are not similar.
• Support_Multimodal: both the claim text and image are similar to those of the document.
• Insufficient_Text: both the text and images of the claim are neither supported nor refuted
by the document, although the claim text may have words in common with the
document text.
• Insufficient_Multimodal: the claim text is neither supported nor refuted by the
document, but the images are similar to those of the document.
• Refute: the images and/or text from the claim and document are completely
contradictory, i.e., the claim is false/fake.</p>
    </sec>
    <sec id="sec-2">
      <title>3. Related Work</title>
      <p>Over the last few years, various fact checking and fact verification datasets have been published.
The majority of them are text-based and only a few are multi-modal. The textual
datasets can broadly be grouped into two categories based on the information they provide.</p>
      <p>
        The first category includes datasets that aim to predict the veracity based on the claim alone.
LIAR [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] contains 12.8k manually labeled claims from PolitiFact with 6 fine-grained labels and
metadata such as speaker name. CREDBANK [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] focuses on checking credibility by providing
tweets related to 1k events with manual credibility annotation. The Lie Detector dataset [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
approaches the task with 600 'true' and 'deceptive' text samples. Another such dataset
uses Claim Matching [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and has 2k pairs of multi-lingual text with labels based on text pair
similarity. A dataset on Covid-19 fake news is provided by [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        The second category includes datasets where the claim is accompanied with documents
annotated with labels indicating whether the document supports the claim or is unrelated to
it. A very well known dataset of this type is FEVER [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. It contains 185k samples with a claim
and a supporting document from Wikipedia, but, these claims were manually generated and
then altered before being classified as ’ Support’, ’Refute’ or ’NotEnoughInfo’. MultiFC [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] is a
multi-domain dataset of size 35k with claims and rich metadata from 26 different websites. It has
a wide range of labels preserved from these websites such as ’correct’, ’incorrect’, ’mis-attributed’
and ’not the whole story’.
      </p>
      <p>
        Textual datasets are no longer enough in the social media age. It is important to consider
both the image and text when detecting fake news. Fakeddit [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] is a multi-modal dataset
providing an image associated with a text. The image can be used as evidence for the text or
vice-versa. Each of its 1 million samples has both high-level and fine-grained labels. It is similar
to an image-caption dataset, which could result in a disjoint claim and image. FakeNewsNet
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] contains 23k articles with context and spatio-temporal information focused on fake news
source and mitigation. The data and their labels have been obtained from fact checking websites
such as Politifact and GossipCop. A dataset of fact-checked images shared on WhatsApp
during the 2018 Brazilian and 2019 Indian Elections [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] provides two sets of 135 and 897
images containing misinformation from Brazil and India, respectively. These fact-checked fake
images from WhatsApp are supported by data from fact checking websites and manual expert
annotations. Table 1 summarizes datasets and their statistics. To the best of our knowledge,
Factify is the largest real-world multimodal fake news detection dataset. The dataset has five
categories based on the entailment of the text and image pairs. It supports the automation of
fact checking using an entailment approach.
      </p>
      <p>Table 1: summary of fact-checking datasets (number of claims, number of labels, and data modality) for LIAR [4], CREDBANK [8], The Lie Detector [9], Claim matching beyond English [10], FEVER [1], MultiFC [12], Fakeddit [13], the Covid-19 Fake News dataset [11], FakeNewsNet [14], the WhatsApp fact-checking dataset [15], and Factify (ours).</p>
    </sec>
    <sec id="sec-3">
      <title>4. Data</title>
      <sec id="sec-3-1">
        <title>4.1. Data Collection</title>
        <p>We collected date-wise tweets from Twitter handles of Indian and US news sources: Hindustan
Times (https://twitter.com/htTweets) and ANI (https://twitter.com/ANI) for India, and ABC (https://twitter.com/ABC)
and CNN (https://twitter.com/CNN) for the US, chosen based on accessibility, popularity and posts per
day. Moreover, these Twitter handles are known for their objective and impartial approach.
From each tweet, we extracted the tweet text and the tweet image(s). Then, for each tweet, we
did the following:
• For each tweet of account A, we retrieved similar tweets from account B. Similarity is measured
on the basis of text: text similarity is measured using Sentence BERT first, and then the
extent of common words is measured as a second metric.
• Next, the image similarity for the corresponding images of the tweet pair was
calculated. Image similarity is measured using histogram similarity and cosine similarity on a
pre-trained ResNet50 model.
• According to the scores for each of these measures, the tweet pair is classified into
4 categories: Support_Multimodal, Support_Text, Insufficient_Multimodal and
Insufficient_Text. The various thresholds used for classification are listed in
Figure 3.
• From this tweet pair, we selected a tweet (say tweet B) and obtained the URL of the
corresponding article published on the source's website from the tweet text. We then
replaced the tweet text with the article contents after scraping it (the document in the dataset). We
do this to mimic the real-world fact checking process, i.e., manually comparing claims
with documents or articles.</p>
        <p>• The image OCR text was obtained using the Google Cloud Vision API (https://cloud.google.com/vision/docs/ocr).</p>
        <p>The final description of each attribute in the dataset is as follows:
• Claim: tweet A text
• Claim_image: tweet A image
• Claim_ocr: tweet A image OCR
• Document: tweet B article text
• Document_image: tweet B image
• Document_ocr: tweet B image OCR
• Category
Figure 2 explains the five classes in our dataset.</p>
        <p>For appropriate classification of the dataset, two similarity measures were computed.
Sentence Comparison: We use 2 methods to check similarity between sentences:
• Sentence BERT: Sentence BERT [16] is a modification of the BERT model that uses siamese
and triplet network structures to obtain sentence embeddings. These sentence embeddings
can be compared with each other to get a similarity score. We use
cosine similarity as the textual similarity metric. We use Sentence BERT (SBERT) over
the pre-trained BERT and RoBERTa models mainly because it is much faster without
compromising accuracy. For our application, we used the
'paraphrase-MiniLM-L6-v2' pre-trained model (https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L6-v2).
For each text pair, we derive the corresponding embeddings
using the SBERT model and compute their cosine similarity. We manually decide on a
threshold value T1 for cosine similarity and classify the text pair accordingly. If the cosine
similarity score is greater than T1, the pair is classified into the support category. On the
other hand, if the cosine similarity score is lower than T1, the news may or may not be
the same (the evidence at hand is insufficient to judge whether the news is the same or not).</p>
        <p>Hence it is sent for another check before being classified into the Insufficient category.</p>
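        <p>The T1 gate described above can be sketched as follows. This is a minimal illustration: the vectors are toy stand-ins for SBERT sentence embeddings (the real pipeline derives them from the 'paraphrase-MiniLM-L6-v2' model), and the threshold value is an assumed number, since the paper's manually chosen T1 is not stated.</p>

```python
import math

# Toy sketch of the T1 cosine-similarity gate. The vectors are hypothetical
# stand-ins for SBERT sentence embeddings; T1 is an assumed value.
T1 = 0.7

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def first_gate(emb_claim, emb_doc):
    """Classify as support above T1, else send the pair to the common-words check."""
    if cosine_similarity(emb_claim, emb_doc) > T1:
        return "support"
    return "needs_common_word_check"

print(first_gate([1.0, 0.0, 1.0], [1.0, 0.1, 0.9]))  # prints support
```

        <p>Pairs that fall below T1 are not classified immediately; they flow into the NLTK common-words check described next.</p>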
        <p>• NLTK: If the cosine similarity of the sentence pair is below T1, we use the NLTK library
[17] to check for common words between the two sentences. Only if the common words score
is above a different manually decided threshold T2 is the news pair classified
into the insufficient category. Common words are checked to ensure that the
classification task is challenging. To check for common words, both texts in the pair
are preprocessed, including stemming and stopword removal. The processed
texts are then checked for common and similar words, and the corresponding scores
are determined. If the common words score is greater than T2, the pair is classified as
Insufficient; otherwise the pair is dropped.</p>
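        <p>A minimal sketch of this second check is given below. The tiny stopword list and crude suffix stemmer are simplified stand-ins for NLTK's stopword corpus and stemmers, the overlap score is one plausible formulation (the paper does not give its exact formula), and T2 is an assumed value.</p>

```python
# Simplified sketch of the common-words gate. STOPWORDS and stem() are crude
# stand-ins for NLTK components; T2 and the Jaccard-style score are assumptions.
STOPWORDS = {"the", "a", "an", "is", "in", "of", "to", "and"}
T2 = 0.4

def stem(word):
    # Very crude suffix stripping, standing in for an NLTK stemmer.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    tokens = [w for w in text.lower().split() if w not in STOPWORDS]
    return {stem(w) for w in tokens}

def common_words_score(text_a, text_b):
    a, b = preprocess(text_a), preprocess(text_b)
    return len(a.intersection(b)) / max(len(a.union(b)), 1)

def second_gate(text_a, text_b):
    """Keep the pair as Insufficient only if it shares enough vocabulary."""
    if common_words_score(text_a, text_b) > T2:
        return "insufficient"
    return "dropped"
```

        <p>For example, a pair like "the minister is visiting delhi" / "minister visiting delhi today" survives the gate, while two texts with no shared content words are dropped, keeping the Insufficient classes non-trivial.</p>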
        <p>Image Comparison: We use two metrics for determining whether images are similar:
• Histogram Similarity: the images are converted to normalized histograms and
similarity is measured using the correlation metric.
• Cosine Similarity: the images are converted to feature vectors using a pre-trained ResNet50
model, and these feature vectors are used to calculate the cosine similarity score. Manually
decided thresholds, as described in Figure 3, are used to judge whether the text and image
pairs are similar.</p>
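        <p>The histogram-correlation metric can be sketched as below. Real images would be read with an imaging library; here two tiny grayscale "images" are given directly as pixel lists, and similarity is the Pearson correlation of their normalized intensity histograms, which matches the correlation metric named above.</p>

```python
# Sketch of histogram-based image similarity on toy grayscale pixel lists.
def histogram(pixels, bins=4, max_val=256):
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_val] += 1
    total = float(len(pixels))
    return [c / total for c in counts]  # normalized histogram

def correlation(h1, h2):
    # Pearson correlation between the two normalized histograms.
    n = len(h1)
    mean1, mean2 = sum(h1) / n, sum(h2) / n
    num = sum((a - mean1) * (b - mean2) for a, b in zip(h1, h2))
    den1 = sum((a - mean1) ** 2 for a in h1) ** 0.5
    den2 = sum((b - mean2) ** 2 for b in h2) ** 0.5
    return num / (den1 * den2)

img_a = [10, 20, 200, 210, 30, 220]
img_b = [12, 25, 205, 215, 28, 222]
print(correlation(histogram(img_a), histogram(img_b)))  # prints 1.0
```

        <p>Two images with the same intensity distribution score 1.0 even if individual pixels differ slightly, which is why the pipeline pairs this metric with ResNet50 feature similarity.</p>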
        <p>The text pairs are first classified into either the Support or Insufficient category, and then further
sub-classified into Support_Text/Support_Multimodal or Insufficient_Text/
Insufficient_Multimodal based on the similarity of the image pairs. If the corresponding images for
the texts are similar, they can be used to judge whether the news is the same. The
category where both the images and the texts are similar is called Support_Multimodal. The
category where the images are similar but the texts are not is called Insufficient_Multimodal.
If the corresponding images for the texts are not similar, they cannot be used to judge
whether the news is the same. The category where neither the images nor the texts are
similar is called Insufficient_Text. The category where the texts are similar but the images are
not is called Support_Text.</p>
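        <p>The sub-classification logic above reduces to a small decision rule over the two similarity scores. The threshold values below are hypothetical stand-ins for the manually chosen thresholds listed in Figure 3; the Refute class is handled separately via scraped fact-check articles.</p>

```python
# Decision rule mapping a (text, image) similarity pair to one of the four
# constructed categories. T_TEXT and T_IMAGE are assumed threshold values.
T_TEXT = 0.75
T_IMAGE = 0.80

def label_pair(text_sim, image_sim):
    text_similar = text_sim >= T_TEXT
    image_similar = image_sim >= T_IMAGE
    if text_similar and image_similar:
        return "Support_Multimodal"
    if text_similar:
        return "Support_Text"   # texts match, images do not
    if image_similar:
        return "Insufficient_Multimodal"  # images match, texts do not
    return "Insufficient_Text"  # neither matches

print(label_pair(0.9, 0.1))  # prints Support_Text
```
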
        <p>For the refute category, we scrape several reliable fact-check websites, namely Vishvas News, Times of
India Fact Check, India Today Fact Check, AFP India, AFP USA, AltNews, BOOM, Factly, NewsChecker,
NewsMobile and WebQoof (URLs listed below). For each article on these websites, we collect the claim (the sentence
that states the fake news), the document (the text that proves the claim is false), the claim image (the fake news
image, possibly a screenshot of the fake post), and the document image (an image that is proof of the fakeness of the claim).
The dataset is released at https://competitions.codalab.org/competitions/35153.</p>
      </sec>
      <sec id="sec-3-2">
        <title>4.2. Data Statistics And Analysis</title>
        <p>In order to understand its nature and distribution, we provide a preliminary analysis of the
Factify dataset. The dataset has a total of 50000 samples, with an equal number of samples in each of the 5
categories. The dataset has a Train-Val-Test split of 70:15:15, as shown in Table 2.</p>
        <p>To identify and predict the veracity of the claim, a common method is to collate a given
claim and the corresponding news article or document. We analyze the word occurrence and
distribution of the claims in Figure 4. We can observe that most fake news is related to politics
and religion.</p>
        <p>7https://www.vishvasnews.com
8https://timesofindia.indiatimes.com/times-fact-check
9https://www.indiatoday.in/fact-check
10https://factcheck.afp.com/afp-india
11https://factcheck.afp.com/afp-usa
12https://www.altnews.in/
13https://www.boomlive.in/fact-check
14https://factly.in/category/english/
15https://newschecker.in/
16https://newsmobile.in/
17https://www.thequint.com/news/webqoof</p>
        <p>Table 2: category-wise split of the dataset. Each of the five categories (Support_Multimodal, Support_Text, Insufficient_Multimodal, Insufficient_Text and Refute) has 7000 train, 1500 validation and 1500 test samples, i.e., 10000 per category; in total there are 35000 train, 7500 validation and 7500 test samples, 50000 overall.</p>
        <p>The claims in the dataset are mostly associated with politics and governance. Claims from
both the USA and India mention political parties and leaders, as shown by the top 20 most
frequent entities listed in Table 3. The data also captures other past or present affairs such as
"Covid-19".</p>
        <p>We show the number of unique n-grams for the Factify dataset in Table 4. This shows the
lexical diversity of the dataset.</p>
        <p>[Figure panels: (a) Support, (b) Insufficient, (c) Refute]</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Baseline model</title>
      <p>We explore 2 different settings to establish baselines, i.e., text-only &amp; multimodal. The goal
is to measure the difference between using only the prime modality, text, and then
augmenting it with image information to gauge the performance boost.</p>
      <p>Text Only Model: This model (shown in Figure 5) ignores the information given by the image.
Instead of focusing on the multimodal aspect of the data, it focuses only on the textual
aspect. To do so, the model creates sentence embeddings of the claim and document attributes using
a pretrained Sentence BERT model [16], ’paraphrase-MiniLM-L6-v2’. Then, cosine similarity
is measured on the embeddings. This score is used as the only feature for the dataset, and
classification is performed using traditional machine learning classifiers like Support Vector
Machine and Decision Tree.</p>
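      <p>A minimal sketch of this single-feature setup is shown below, with a hand-rolled decision stump standing in for the scikit-learn classifiers and a binary simplification of the label set; the similarity scores are hypothetical, not drawn from the dataset.</p>

```python
# Stand-in for the text-only baseline: a one-feature decision stump trained
# on the claim-document cosine similarity. The paper uses standard classifiers
# (SVM, Decision Tree) on this same single feature; scores here are synthetic.
def fit_stump(scores, labels):
    """Pick the similarity threshold that maximizes training accuracy."""
    best_t, best_acc = 0.0, 0.0
    for t in sorted(set(scores)):
        preds = ["support" if s >= t else "insufficient" for s in scores]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Hypothetical cosine-similarity scores for claim-document pairs.
scores = [0.91, 0.85, 0.40, 0.35, 0.88, 0.20]
labels = ["support", "support", "insufficient", "insufficient", "support", "insufficient"]
print(fit_stump(scores, labels))  # prints 0.85
```

      <p>With only one scalar feature, any classifier effectively learns one or more such cut points, which is why this baseline is deliberately weak.</p>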
      <p>Multi-Modal Model: Information shared online is very often of multi-modal nature. Images
can change the context of a claim and lead to misinformation. Thus, it is important that we
consider both the image and text when classifying the claims. As it is an entailment based
approach, features from both claim and document image-text pairs must be extracted. This
is done using the pre-trained ResNet50 model [18]. The cosine similarity score is computed
between the claim and document image features. The cosine similarity for the text
embeddings is computed in the same way as in the textual baseline model. The model diagram is shown in
Figure 6. The final classification F1 scores are shown in Table 5 below for different classifiers
trained on these two scores as attributes. There is an improvement in performance compared
to the text-only model. The baselines are available at https://github.com/Shreyashm16/Factify.</p>
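      <p>The multimodal baseline thus reduces each sample to two numbers, the text cosine similarity and the image cosine similarity, before classification. The sketch below illustrates this with a toy nearest-centroid classifier in place of the scikit-learn models; the centroid values are hypothetical and the Refute class is omitted for brevity.</p>

```python
# Each sample becomes a two-feature vector: [text_similarity, image_similarity].
def make_features(text_sim, image_sim):
    return [text_sim, image_sim]

# Hypothetical class centroids in the (text_sim, image_sim) plane; a toy
# nearest-centroid rule stands in for the trained classifiers of Table 5.
CENTROIDS = {
    "Support_Multimodal": [0.9, 0.9],
    "Support_Text": [0.9, 0.2],
    "Insufficient_Multimodal": [0.3, 0.9],
    "Insufficient_Text": [0.3, 0.2],
}

def classify(features):
    def dist(c):
        return sum((f - x) ** 2 for f, x in zip(features, c))
    return min(CENTROIDS, key=lambda k: dist(CENTROIDS[k]))

print(classify(make_features(0.85, 0.15)))  # prints Support_Text
```
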
    </sec>
    <sec id="sec-5">
      <title>6. Results</title>
      <p>The results obtained for each of the settings described above are presented in Table 5. We experiment
with various classification models for both the text and multimodal settings. In the text-only
setting, our best performing model, a decision tree, achieves an F1 score of 41.3% on the
test set, while the multimodal setting achieves a best performance of 53.09%. Note that
there is an improvement of more than 9% when image features are used, which suggests
that task performance relies heavily on multi-modal information. However, we use quite
naive approaches to establish baselines, to encourage more innovative approaches, and there
is huge scope for improvement. The results also indicate that off-the-shelf models do not
perform very well on the task, since the best performing model achieves only 53.09%. More
comprehensive approaches, such as using vision-language pre-trained models, training on other
related datasets/tasks and fine-tuning on Factify, and innovative attention and fusion techniques,
should boost performance. We leave such methods as future work.</p>
      <p>Table 5: F1 scores of Logistic Regression, KNN, SVM, Decision Tree and Random Forest classifiers in the text-only and multimodal settings.</p>
    </sec>
    <sec id="sec-6">
      <title>7. Conclusion and Future Work</title>
      <p>In this work, we take a leap towards developing machine learning techniques for multimodal
fact verification by releasing a large real-world dataset with cues from two modalities, namely
text and image. We also release unimodal and multimodal baselines to emphasize the
difficulty of the problem and the scope for improvement. However, our work only scratches the
surface, and many follow-up research directions can be pursued. In the current dataset, we
assume that claims have a binary class, i.e., either fake or true, but there can be cases where
a claim is partially true or fake. We aim to incorporate such classes in our future work.
We also envision understanding deeper relationships between text and image with the help of
attention methods in the future.
</p>
      <p>elections, 2020. arXiv:2005.02443.
[16] N. Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using siamese BERT-networks, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2019. URL: https://arxiv.org/abs/1908.10084.
[17] S. Bird, E. Klein, E. Loper, Natural Language Processing with Python: analyzing text with the natural language toolkit, O'Reilly Media, Inc., 2009.
[18] P. Kasnesis, R. Heartfield, X. Liang, et al., Transformer-based identification of stochastic information cascades in social networks using text and image similarity, Applied Soft Computing, 2021.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Thorne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Christodoulopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <article-title>Fever: a large-scale dataset for fact extraction and verification</article-title>
          , arXiv preprint arXiv:1803.05355 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hanselowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>PVS</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schiller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Caspelherr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          , C. M. Meyer, I. Gurevych,
          <article-title>A retrospective analysis of the fake news challenge stance-detection task</article-title>
          ,
          <source>in: Proceedings of the 27th International Conference on Computational Linguistics</source>
          , Association for Computational Linguistics, Santa Fe, New Mexico, USA,
          <year>2018</year>
          , pp.
          <fpage>1859</fpage>
          -
          <lpage>1874</lpage>
          . URL: https://aclanthology.org/C18-1158.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Watson</surname>
          </string-name>
          ,
          <article-title>Fake news in the U.S. - statistics &amp; facts</article-title>
          ,
          <source>Statista</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>W. Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>"Liar, liar pants on fire": A new benchmark dataset for fake news detection</article-title>
          ,
          <source>arXiv preprint arXiv:1705.00648</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hanselowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sorokin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schiller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schulz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>UKP-Athene: Multi-sentence textual entailment for claim verification</article-title>
          ,
          <source>arXiv preprint arXiv:1809.01479</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Fine-grained fact verification with kernel graph attention network</article-title>
          ,
          <source>arXiv preprint arXiv:1910.09796</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Patwa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhardwaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Guptha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kumari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>PYKL</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ekbal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <article-title>Overview of CONSTRAINT 2021 shared tasks: Detecting English COVID-19 fake news and Hindi hostile posts</article-title>
          ,
          <source>in: Proceedings of the First Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situation (CONSTRAINT)</source>
          , Springer,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mitra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gilbert</surname>
          </string-name>
          ,
          <article-title>CredBank: A large-scale social media corpus with associated credibility annotations</article-title>
          ,
          <source>in: ICWSM</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Strapparava</surname>
          </string-name>
          ,
          <article-title>The lie detector: Explorations in the automatic recognition of deceptive language</article-title>
          ,
          <source>in: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, ACLShort '09</source>
          ,
          Association for Computational Linguistics, USA,
          <year>2009</year>
          , pp.
          <fpage>309</fpage>
          -
          <lpage>312</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kazemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Garimella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gafney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Hale</surname>
          </string-name>
          ,
          <article-title>Claim matching beyond English to scale global fact-checking</article-title>
          ,
          <year>2021</year>
          . arXiv:2106.00853.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Patwa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pykl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Guptha</surname>
          </string-name>
          , G. Kumari,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ekbal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <article-title>Fighting an infodemic: COVID-19 fake news dataset</article-title>
          ,
          <source>in: Combating Online Hostile Posts in Regional Languages during Emergency Situation (CONSTRAINT) 2021</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          . URL: http://dx.doi.org/10.1007/978-3-030-73696-5_3. doi:10.1007/978-3-030-73696-5_3.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>I.</given-names>
            <surname>Augenstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. Chaves</given-names>
            <surname>Lima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Grue</given-names>
            <surname>Simonsen</surname>
          </string-name>
          ,
          <article-title>MultiFC: A real-world multi-domain dataset for evidence-based fact checking of claims</article-title>
          ,
          <source>in: EMNLP</source>
          , Association for Computational Linguistics,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>K.</given-names>
            <surname>Nakamura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>r/Fakeddit: A new multimodal benchmark dataset for fine-grained fake news detection</article-title>
          ,
          <source>arXiv preprint arXiv:1911.03854</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>K.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mahudeswaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>FakeNewsNet: A data repository with news content, social context and spatiotemporal information for studying fake news on social media</article-title>
          ,
          <year>2019</year>
          . arXiv:1809.01286.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J. C. S.</given-names>
            <surname>Reis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>de Freitas Melo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Garimella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Almeida</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Eckles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Benevenuto</surname>
          </string-name>
          ,
          <article-title>A dataset of fact-checked images shared on WhatsApp during the Brazilian and Indian elections</article-title>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>