<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>S Suryavardan</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shreyash Mishra</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Parth Patwa</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Megha Chakraborty</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anku Rani</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aishwarya Reganti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aman Chadha</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amitava Das</string-name>
          <email>amitava@mailbox.sc.edu</email>
          <xref ref-type="aff" rid="aff4">4</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amit Sheth</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manoj Chinnakotla</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Asif Ekbal</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Srijan Kumar</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <kwd-group>
          <kwd>Fake News</kwd>
          <kwd>Fact Verification</kwd>
          <kwd>Multimodality</kwd>
          <kwd>Dataset</kwd>
          <kwd>Machine Learning</kwd>
          <kwd>Entailment</kwd>
        </kwd-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Carnegie Mellon University</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Georgia Tech</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>IIT Patna</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Stanford</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of South Carolina</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>Washington</institution>
          ,
          <addr-line>DC</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The internet gives the world an open platform to express views and share stories. While this is very valuable, it also makes fake news one of our society's most pressing problems. The manual fact-checking process is time-consuming, which makes it challenging to disprove misleading assertions before they cause significant harm. This is driving interest in automatic fact or claim verification. Some existing datasets aim to support the development of automated fact-checking techniques [1, 2]; however, most of them are text-based, and multi-modal fact verification has received relatively scant attention. In this paper, we provide a multi-modal fact-checking dataset called FACTIFY 2, improving on FACTIFY 1 by using new data sources and adding satire articles. FACTIFY 2 has 50,000 new data instances. Similar to FACTIFY 1, we have three broad categories - support, no-evidence, and refute - with sub-categories based on the entailment of visual and textual data. We also provide a BERT and Vision Transformer based baseline, which achieves a 65% F1 score on the test set. The baseline code and the dataset will be made available at https://github.com/surya1701/Factify-2.0.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>With social media platforms taking center stage as news mediums, sifting facts from fake news has become a cause for concern. Fake news articles typically manifest as fabricated stories with no verifiable facts, sources, or quotes. Sometimes these stories are propaganda that is intentionally designed to mislead the reader, or are written as “clickbait” for economic incentives. The technological ease of copying, pasting, clicking, and sharing content online has helped misinformation and disinformation proliferate. This has caused several challenges in events like Covid-19 [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ], elections [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], etc. In some cases, stories are designed to provoke an emotional response and are placed on certain sites to entice readers into sharing them widely. In other cases, “fake news” articles may be generated and disseminated by “bots”: computer algorithms designed to act like people sharing information, but able to do so quickly and automatically [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Although there are a few large-scale efforts to identify fake news, like FEVER [
like FEVER [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and LIAR [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], these datasets do not account for the evolution of fake news in the real world. Another hindrance to fake news detection on social media platforms is that online information is very diverse, covering a large number of subjects, which adds to the complexity of the task. Oftentimes, the truth and intent of a statement are difficult for computers to verify alone, so efforts must rely on collaboration between humans and technology, à la the human-in-the-loop setting [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Additionally, visual cues that support textual claims can help a system detect fake content with greater confidence. These concerns were addressed in the previous iteration, FACTIFY 1, which released a dataset for multimodal fact verification containing images, textual claims, and reference textual documents/images. It proposed a multimodal entailment task that tags these claims against the verified document/image using three classes, i.e., support, no-evidence, and refute; each of these categories is explained in the next section. The first two categories are further sub-divided into text and multimodal components, so, in total, each data sample is labeled with one of five choices. The data was obtained from the Twitter handles of popular news channels from two large nations - the US and India. FACTIFY 2 is the latest iteration, in which we release 50k new instances, including satirical articles, which present fake news in a different manner.
      </p>
      <p>The paper is organized as follows: related work is described in Section 2. The proposed task is described in Section 3. Data collection and data distribution are explained in Section 4, while Section 5 describes the baseline model. Section 6 shows the results of our baseline models. Finally, we summarise our task along with the further scope and open-ended pointers in Section 7.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Text-based datasets: In recent years, a number of textual datasets for fact-checking and fact verification have been released. The LIAR [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] dataset contains 13k statements from PolitiFact
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] annotated into 6 fine-grained labels. FEVER provides 185k manually annotated instances of Wikipedia claims and associated supporting documents, categorised as Support, Refute, or NotEnoughInfo. Patwa et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] released a dataset of 10k tweets/articles on Covid-19 annotated
as true or false. A dataset for evidence extraction, document retrieval, stance detection, and
claim validation is proposed in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] create a dataset to differentiate fake news from satire.
The PUBHEALTH [13] data has 12k public health claims along with explanations by journalists
to support the fact-check labels. Other datasets include [14, 15, 16, 17]. Common methods to detect text-based fake news involve the use of CNNs [18], RNNs [19, 20], BERT [21, 22, 23], etc.
      </p>
      <p>Multimodal datasets: Text-only datasets are inadequate in the social media era. It is crucial to go beyond them and consider additional modalities like image and video to detect fake news. The Fakeddit [24] dataset contains one million text+image instances taken from Reddit and labeled into 6 fine-grained classes for fake news detection. FakeNewsNet provides spatiotemporal and visual data along with news and social context for analysing and detecting fake news. It contains Twitter user data such as location, replies, retweets, timestamps, etc., for about 20k multimodal articles from PolitiFact and GossipCop. A multimodal fact-checking dataset called MOCHEG [25] consists of 21,184 claims, each of which is given a veracity label (support, refute, or not enough information) and an explanation statement. A video dataset consisting of 180 verified and 200 debunked videos is provided by [26]. Some other datasets are [27, 28, 29].</p>
      <p>Modelling approaches to this task are varied and unique in their use of classifiers, adversarial training, attention, etc. SpotFake [30] derives textual and visual representations from BERT and VGG, respectively, before concatenating them for classification. EANN [31] trains a fake news classifier adversarially by adding an event discriminator that ensures the input data is event-invariant, so that newly emerging events can also be verified. CARMN [32] proposes a multichannel convolutional neural network that mitigates the influence of noisy information potentially generated by crossmodal attention fusion, by extracting textual feature representations from the original data and the fused textual information simultaneously. Other methods include the use of a BERT-based CapsNet [33], cross-modal similarity [34], and variational autoencoders [35], among others [36, 37, 38, 39].</p>
      <p>Factify 1: FACTIFY [40] is one of the largest public multimodal fact-verification datasets, with 50k data points covering news from India and the US. Images, claim texts, and reference texts are all part of FACTIFY. Samples are categorised into three primary groups - Support, Insufficient, and Refute - with sub-groups dependent on the entailment of visual and textual data. FACTIFY 2 follows a similar pattern and releases an additional 50k instances, which incorporate data from satirical articles and new data sources.</p>
      <p>For Factify 1, researchers used methods like BERT [41], RoBERTa [42], and BigBird [43] for textual features, and ResNet [43], DeiT [44], EfficientNet [45], and VGG [42] for visual features. Please refer to [46] for details of all the methods.</p>
    </sec>
    <sec id="sec-3">
      <title>3. The Factify Task</title>
      <p>Fact verification is a difficult task to completely automate, especially in the case of multimodal data, given the inherent challenges of holistically evaluating both the vision and text modalities to ascertain the veracity of a claim. To this end, we model fake news verification as a multimodal entailment task, such that the veracity of both the text and the image is verified.</p>
      <p>The formulation of the task is similar to the previously presented Factify 1. Each sample
contains a claim that has to be verified or fact checked. Each claim is accompanied by a supporting
document that is to be used to determine the veracity through a comparison or entailment based
approach. The claim and document are multi-modal, i.e., they have textual and visual data, enabling multi-modal entailment for fact verification. Each sample has pairs of text, image, and OCR output.</p>
      <p>
Figure 1 shows example panels for each class: (a) Support_Multimodal, (b) Support_Text, (d) Insufficient_Multimodal, (e) Insufficient_Text, and (f) Refute.
We define the following five categories to describe the entailment of the claim and document:
Support_Text, Support_Multimodal, Insufficient_Text, Insufficient_Multimodal, and
Refute. The specific description of these categories is as follows:
• Support_Text: the textual data for the claim and document are entailed but their images
are not entailed.
• Support_Multimodal: the textual data is entailed and the images are also similar for the
claim and document.
• Insufficient_Text: the textual data is not entailed but the claim and document may
have several common words, and the images are not entailed.
• Insufficient_Multimodal: the claim and document text are not entailed but they may
have common words and the images are also entailed in this case.
• Refute: The document text and image both contradict or refute the claim text and image,
thus, indicating that the given claim is false.</p>
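      <p>As a minimal sketch, the five-way labelling above can be expressed as a decision over two binary entailment judgements plus a refute check (the function and argument names below are ours, purely illustrative):</p>
```python
def factify_label(text_entailed, image_entailed, refuted=False):
    """Map binary text/image entailment judgements to the five FACTIFY classes."""
    if refuted:
        # Document text and image both contradict the claim.
        return "Refute"
    if text_entailed and image_entailed:
        return "Support_Multimodal"
    if text_entailed:
        # Text is entailed, but the images are not.
        return "Support_Text"
    if image_entailed:
        # Images are entailed, but the texts are not.
        return "Insufficient_Multimodal"
    # Neither modality is entailed (possibly some common words).
    return "Insufficient_Text"
```
      <p>For example, a claim whose text is entailed by the document but whose image differs maps to Support_Text.</p>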
      <p>Some examples from the dataset are given in Figure 1.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Data</title>
      <sec id="sec-4-1">
        <p>In this section, we describe the data collection and data analysis.</p>
        <sec id="sec-4-1-1">
          <title>4.1. Data Collection</title>
          <p>The collection process includes two separate pipelines: (i) to collect real news articles for
support and no-evidence classes, and (ii) to collect fake news articles for no-evidence and refute
classes. The end goal was to curate a dataset with text and image for both claims and their
corresponding supporting documents.</p>
          <p>The first part of the collection was similar to FACTIFY 1. We collected tweets date-wise from renowned Twitter news handles, namely Hindustan Times and ANI for India, and ABC and CNN for the USA. The nature and format of these handles, as well as their tweets, aided our objective of collecting real news claims and articles. To improve the diversity and functionality of the dataset, we compared tweets across the news handles to identify tweets that reported the same or similar news. For this, we followed steps similar to FACTIFY 1: we compare tweet texts using Sentence BERT [47] and use a threshold to categorise whether they report the same news or not. Specifically, we use the pre-trained paraphrase-MiniLM-L6-v2 [48] variant of Sentence BERT (SBERT) [49] instead of alternatives such as BERT or RoBERTa, owing to its rich sentence embeddings yielding superior performance [47] while being much more time-efficient. If the news is not the same, we compare common words using the NLTK library [50] to categorise the tweets as similar or dissimilar. This helps define the support and no-evidence categories, respectively, as described in Section 3. The similarity between the images in the compared tweet pairs is also used to further categorise the data based on visual entailment. Thresholds were set for image similarity to categorise images as entailed or not, based on two metrics: cosine similarity between ResNet50 embeddings, and histogram similarity. With this collected data, we treated the tweet from one handle as the claim and the news article associated with the tweet from the other handle as the supporting document.</p>
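          <p>The text-pairing logic described above can be sketched as follows. In the actual pipeline the embeddings come from SBERT (paraphrase-MiniLM-L6-v2) and tokenisation is done with NLTK; here we use plain Python, and the threshold values are illustrative assumptions, not the ones used to build the dataset:</p>
```python
import math

# Illustrative thresholds (assumed; not the dataset-construction values).
SAME_NEWS_THRESHOLD = 0.80      # cosine similarity of sentence embeddings
COMMON_WORDS_THRESHOLD = 0.30   # Jaccard overlap of token sets

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def jaccard_overlap(tokens_a, tokens_b):
    sa, sb = set(tokens_a), set(tokens_b)
    return len(sa.intersection(sb)) / len(sa.union(sb))

def categorise_pair(emb_a, emb_b, tokens_a, tokens_b):
    """Categorise a tweet pair as the same news, similar news, or dissimilar news."""
    if cosine_similarity(emb_a, emb_b) >= SAME_NEWS_THRESHOLD:
        return "same"        # same news reported by both handles
    if jaccard_overlap(tokens_a, tokens_b) >= COMMON_WORDS_THRESHOLD:
        return "similar"     # many common words: support category
    return "dissimilar"      # few common words: no-evidence category
```
          <p>A similar thresholding is applied to the image pairs, using cosine similarity between ResNet50 embeddings and histogram similarity.</p>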
          <p>The second part is the collection from several different websites. A part of the data for the refute category was collected from fact-checking websites, similar to FACTIFY 1. We scraped data from Snopes [51], Factly [52], and Boom [53]. These websites provide a well-defined claim and a document disproving the given claim. We also added an additional data source in this iteration of the task: satirical articles that are fake in nature but are written in a way that seems real to the reader. While the websites we scraped, i.e., Fauxy [54] and EmpireNews [55], specify that their articles are not true, we added them to the support category. This is because, as aforementioned, the articles support their claim despite the claim being fake in nature. To make the claim multi-modal, we scraped images by searching for the headline of the article. We also manually annotated some articles collected from the search results of these headlines to add data to the no-evidence and refute categories, in cases where the articles were about these satirical claims.</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>4.2. Data Statistics And Analysis</title>
          <p>The second iteration of FACTIFY has the same categories as FACTIFY 1, with 50,000 data samples.
The samples are equally divided among all five categories with a split of 70:15:15 into train,
validation, and test sets respectively.</p>
          <p>Keywords can be vital when identifying or predicting the veracity of a given claim. By analyzing the claims and their documents, we find the most frequently occurring words, shown in Figure 2. Most of the words relate to politics, indicating the bias in the news articles.</p>
          <p>The political inclination of the dataset is reiterated by the word clouds for the support and no-evidence categories in Figure 3. However, in the same figure, the refute category has a more general distribution of words, with several words related to social media present in its word cloud. [A table of per-category sample counts (Support_Multimodal, Support_Text, Insufficient_Multimodal, Insufficient_Text, Refute, Total) appears here.] We further present unique n-gram examples for the FACTIFY 2 dataset in Table 2 to show the lexical diversity of the dataset.</p>
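          <p>Frequent-word and n-gram statistics like those behind Figures 2-3 and Table 2 can be computed with a few lines of standard tooling; a stdlib-only sketch (in practice NLTK handles tokenisation):</p>
```python
from collections import Counter

def top_ngrams(texts, n=1, k=5):
    """Return the k most frequent word n-grams across a list of claim texts."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()      # naive whitespace tokenisation
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common(k)

# Example: bigrams over two toy claims.
result = top_ngrams(["the pm visits delhi", "the pm visits mumbai"], n=2, k=2)
```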
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Baseline model</title>
      <p>Several media are regularly used in online information exchange. Pictures have the power to misrepresent a claim and propagate erroneous information. We must therefore consider both the image and the text in order to appropriately classify the claims. Since ours is an entailment-based approach, features must be obtained from the claim and document image-text pairings. The visual features are obtained from the pre-trained Vision Transformer (ViT) model [56]. Thanks to the positional embedding of image patches carried out by ViT, the model can surpass conventional CNNs in terms of computation and accuracy. Using a pre-trained Sentence BERT model (specifically, the stsb-mpnet-base-v2 variant), we generate sentence embeddings of the claim and document texts. The Sentence-BERT embedding is concatenated with the pooled output from the ViT model. After passing through an MLP, the combined features are then classified. The multi-modal features are employed, with modifications to the MLP, for all three of the sub-tasks. The model architecture is displayed in Figure 4. The code will be made available at https://github.com/surya1701/Factify-2.0.</p>
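      <p>The fusion step of the baseline can be sketched in plain Python. The feature sizes and the untrained random weights below are illustrative assumptions on our part; the actual model uses pre-trained ViT and SBERT encoders and a trained MLP head:</p>
```python
import random

random.seed(0)
DIM = 768  # illustrative embedding size for both encoders

# Stand-ins for pre-trained encoder outputs:
vit_pooled = [random.gauss(0, 1) for _ in range(DIM)]  # ViT pooled image features
sbert_text = [random.gauss(0, 1) for _ in range(DIM)]  # SBERT sentence embedding

def mlp_classify(features, hidden=32, num_classes=5):
    """Two-layer MLP head over the concatenated multimodal features."""
    w1 = [[random.gauss(0, 0.01) for _ in range(hidden)] for _ in features]
    w2 = [[random.gauss(0, 0.01) for _ in range(num_classes)] for _ in range(hidden)]
    # ReLU hidden layer.
    h = [max(0.0, sum(f * w1[i][j] for i, f in enumerate(features)))
         for j in range(hidden)]
    logits = [sum(h[j] * w2[j][c] for j in range(hidden)) for c in range(num_classes)]
    return logits.index(max(logits))  # predicted class index (0..4)

fused = vit_pooled + sbert_text      # concatenation: 1536-d multimodal feature
pred = mlp_classify(fused)
```
      <p>The same fused representation, with adjustments to the MLP head, serves all of the sub-tasks.</p>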
    </sec>
    <sec id="sec-6">
      <title>6. Results</title>
      <p>Baseline results in Table 3 show Macro F1 scores for the multi-modal modelling approaches mentioned below. Using ViT to extract visual features and Sentence-BERT for the textual features, the baseline model scores 0.6499. We also compare the baseline model (ViT + SBERT-MPNet) with other methods, such as ViT + SBERT-RoBERTa, in which the SBERT-RoBERTa model is used in place of SBERT-MPNet for generating text embeddings. For ResNet50 + SBERT-RoBERTa and ResNet50 + SBERT-MPNet, a simple ResNet50 model is used to extract visual features. The improvement from using the Vision Transformer over the ResNet model signifies the importance of images for the task.</p>
      <p>[Table 3 lists the methods compared: Resnet50 + SBERT-RoBERTa, Resnet50 + SBERT-MPNet, ViT + SBERT-RoBERTa, and ViT + SBERT-MPNet.]</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion and Future Work</title>
      <p>By publishing a sizable real-world dataset containing inputs from two modalities, namely text and image, we take a significant step towards creating machine learning approaches for multimodal fact verification. To underline the difficulties of the issue and the scope for improvement, we conduct data analysis and release multimodal baselines. However, there are many additional research possibilities that can be explored, since our work merely scratches the surface. One potential research direction could be to enrich the dataset with reasoning about why a particular piece of news is fake. Another possibility is to use synthetic data that matches the general data distribution, thus adding complexity to the refute category.</p>
      <p>
[13] N. Kotonya, F. Toni, Explainable automated fact-checking for public health claims,
in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language
Processing (EMNLP), Association for Computational Linguistics, Online, 2020, pp.
7740–7754. URL: https://aclanthology.org/2020.emnlp-main.623. doi:10.18653/v1/2020.emnlp-main.623.
[14] R. Mihalcea, C. Strapparava, The lie detector: Explorations in the automatic recognition
of deceptive language, in: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers,
ACLShort ’09, Association for Computational Linguistics, USA, 2009, p. 309–312.
[15] A. Kazemi, K. Garimella, D. Gaffney, S. A. Hale, Claim matching beyond english to scale
global fact-checking, 2021. arXiv:2106.00853.
[16] T. Mitra, E. Gilbert, Credbank: A large-scale social media corpus with associated credibility
annotations, in: ICWSM, 2015.
[17] I. Augenstein, C. Lioma, D. Wang, L. Chaves Lima, C. Hansen, C. Hansen, J. Grue Simonsen,
Multifc: A real-world multi-domain dataset for evidence-based fact checking of claims, in:
EMNLP, Association for Computational Linguistics, 2019.
[18] H. Saleh, A. Alharbi, S. H. Alsamhi, Opcnn-fake: Optimized convolutional neural network
for fake news detection, IEEE Access 9 (2021) 129471–129489. doi:10.1109/ACCESS.2021.
3112806.
[19] O. Ajao, D. Bhowmik, S. Zargari, Fake news identification on twitter with hybrid cnn
and rnn models, in: Proceedings of the 9th International Conference on Social Media and
Society, SMSociety ’18, Association for Computing Machinery, New York, NY, USA, 2018, p.
226–230. URL: https://doi.org/10.1145/3217804.3217917. doi:10.1145/3217804.3217917.
[20] J. A. Nasir, O. S. Khan, I. Varlamis, Fake news detection: A hybrid cnn-rnn based deep
learning approach, International Journal of Information Management Data Insights 1
(2021) 100007. URL: https://www.sciencedirect.com/science/article/pii/S2667096820300070.
doi:https://doi.org/10.1016/j.jjimei.2020.100007.
[21] R. K. Kaliyar, A. Goswami, P. Narang, Fakebert: Fake news detection in social media
with a bert-based deep learning approach, Multimedia tools and applications 80 (2021)
11765–11788.
[22] P. Patwa, M. Bhardwaj, V. Guptha, G. Kumari, S. Sharma, S. PYKL, A. Das, A. Ekbal,
S. Akhtar, T. Chakraborty, Overview of constraint 2021 shared tasks: Detecting english
covid-19 fake news and hindi hostile posts, in: Proceedings of the First Workshop on
Combating Online Hostile Posts in Regional Languages during Emergency Situation
(CONSTRAINT), Springer, 2021.
[23] A. Glazkova, M. Glazkov, T. Trifonov, g2tmn at constraint@AAAI2021: Exploiting
CTBERT and ensembling learning for COVID-19 fake news detection, in: Combating Online
Hostile Posts in Regional Languages during Emergency Situation, Springer International
Publishing, 2021, pp. 116–127. doi:10.1007/978-3-030-73696-5_12.
[24] K. Nakamura, S. Levy, W. Y. Wang, r/fakeddit: A new multimodal benchmark dataset for
fine-grained fake news detection, arXiv preprint arXiv:1911.03854 (2019).
[25] B. M. Yao, A. Shah, L. Sun, J.-H. Cho, L. Huang, End-to-end multimodal fact-checking and
explanation generation: A challenging dataset and models, arXiv preprint arXiv:2205.12487
(2022).
[26] O. Papadopoulou, M. Zampoglou, S. Papadopoulos, Y. Kompatsiaris, A corpus of debunked
and verified user-generated videos, Online Information Review 43 (2019) 72–88.
[27] J. C. S. Reis, P. de Freitas Melo, K. Garimella, J. M. Almeida, D. Eckles, F. Benevenuto,
A dataset of fact-checked images shared on whatsapp during the brazilian and indian
elections, 2020. arXiv:2005.02443.
[28] S. Jindal, R. Sood, R. Singh, M. Vatsa, T. Chakraborty, Newsbag: A multimodal benchmark
dataset for fake news detection, in: CEUR Workshop Proc., volume 2560, 2020, pp. 138–145.
[29] D. Zlatkova, P. Nakov, I. Koychev, Fact-checking meets fauxtography: Verifying claims
about images, in: Proceedings of the 2019 Conference on Empirical Methods in Natural
Language Processing and the 9th International Joint Conference on Natural Language
Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong,
China, 2019, pp. 2099–2108. URL: https://aclanthology.org/D19-1216. doi:10.18653/v1/
D19-1216.
[30] S. Singhal, R. R. Shah, T. Chakraborty, P. Kumaraguru, S. Satoh, Spotfake: A multi-modal
framework for fake news detection, in: 2019 IEEE Fifth International Conference on
Multimedia Big Data (BigMM), 2019, pp. 39–47. doi:10.1109/BigMM.2019.00-44.
[31] Y. Wang, F. Ma, Z. Jin, Y. Yuan, G. Xun, K. Jha, L. Su, J. Gao, Eann: Event adversarial neural
networks for multi-modal fake news detection, 2018, pp. 849–857. doi:10.1145/3219819.
3219903.
[32] C. Song, N. Ning, Y. Zhang, B. Wu, A multimodal fake news detection model based on
crossmodal attention residual and multichannel convolutional neural networks, Information
Processing &amp; Management 58 (2020). doi:10.1016/j.ipm.2020.102437.
[33] B. Palani, S. Elango, V. Viswanathan K, Cb-fake: A multimodal deep learning framework
for automatic fake news detection using capsule neural network and bert, Multimedia
Tools Appl. 81 (2022) 5587–5620. URL: https://doi.org/10.1007/s11042-021-11782-3. doi:10.1007/s11042-021-11782-3.
[34] X. Zhou, J. Wu, R. Zafarani, Safe: Similarity-aware multi-modal fake news detection,
in: Advances in Knowledge Discovery and Data Mining: 24th Pacific-Asia Conference,
PAKDD 2020, Singapore, May 11–14, 2020, Proceedings, Part II, Springer, 2020, pp. 354–367.
[35] D. Khattar, J. Singh, M. Gupta, V. Varma, Mvae: Multimodal variational autoencoder for
fake news detection, 2019, pp. 2915–2921. doi:10.1145/3308558.3313552.
[36] V. Pérez-Rosas, B. Kleinberg, A. Lefevre, R. Mihalcea, Automatic detection of fake news,
2017. arXiv:1708.07104.
[37] J. Ma, W. Gao, K.-F. Wong, Rumor detection on Twitter with tree-structured recursive
neural networks, in: Proceedings of the 56th Annual Meeting of the Association for
Computational Linguistics (Volume 1: Long Papers), Association for Computational
Linguistics, Melbourne, Australia, 2018, pp. 1980–1989. URL: https://aclanthology.org/P18-1184.
doi:10.18653/v1/P18-1184.
[38] S. R. Sahoo, B. B. Gupta, Multiple features based approach for automatic fake news
detection on social networks using deep learning, Applied Soft Computing 100 (2021)
106983.
[39] Z. Guo, M. Schlichtkrull, A. Vlachos, A survey on automated fact-checking, Transactions
of the Association for Computational Linguistics 10 (2022) 178–206.
[40] S. Mishra, S. Suryavardan, A. Bhaskar, P. Chopra, A. Reganti, P. Patwa, A. Das,
T. Chakraborty, A. Sheth, A. Ekbal, et al., Factify: A multi-modal fact verification dataset,
in: Proceedings of the First Workshop on Multimodal Fact-Checking and Hate Speech
Detection (DE-FACTIFY), 2022.
[41] A. Dhankar, O. Zaiane, F. Bolduc, Uofa-truth at factify 2022: A simple approach to
multi-modal fact-checking (2022).
[42] Y. Zhuang, Y. Zhang, Yet at factify 2022: Unimodal and bimodal roberta-based models for
fact checking, in: Proceedings of De-Factify: Workshop on Multimodal Fact Checking and
Hate Speech Detection, CEUR, 2022.
[43] J. Gao, H.-F. Hofmann, S. Oikonomou, D. Kiskovski, A. Bandhakavi, Logically at factify
2022: Multimodal fact verification (2022).
[44] W.-Y. Wang, W.-C. Peng, Team yao at factify 2022: Utilizing pre-trained models and
co-attention networks for multi-modal fact verification (2022).
[45] N. Hulke, B. R. Siva, A. Raj, A. A. Saifee, Tyche at factify 2022: Fusion networks for
multi-modal fact-checking (2021).
[46] P. Patwa, S. Mishra, S. Suryavardan, A. Bhaskar, P. Chopra, A. Reganti, A. Das,
T. Chakraborty, A. Sheth, A. Ekbal, et al., Benchmarking multi-modal entailment for
fact verification, in: Proceedings of De-Factify: Workshop on Multimodal Fact Checking
and Hate Speech Detection, CEUR, 2022.
[47] N. Reimers, I. Gurevych, Sentence-bert: Sentence embeddings using siamese bert-networks, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2019. URL: https://arxiv.org/abs/1908.10084.
[48] paraphrase-minilm-l6-v2, https://huggingface.co/sentence-transformers/
paraphrase-MiniLM-L6-v2, 2019. Accessed: 2022.
[49] N. Reimers, I. Gurevych, Sentence-bert: Sentence embeddings using siamese bert-networks,
arXiv preprint arXiv:1908.10084 (2019).
[50] S. Bird, E. Klein, E. Loper, Natural language processing with Python: analyzing text with the natural language toolkit, O'Reilly Media, Inc., 2009.
[51] Snopes, https://www.snopes.com/, 1994. Accessed: 2022.
[52] Factly, https://factly.in/category/english/, 2016. Accessed: 2022.
[53] Boomlive, https://www.boomlive.in/fact-check, 2014. Accessed: 2022.
[54] Fauxy, https://thefauxy.com/, 2018. Accessed: 2022.
[55] Empirenews, https://empirenews.net/, 2014. Accessed: 2022.
[56] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M.
Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., An image is worth 16x16 words:
Transformers for image recognition at scale, arXiv preprint arXiv:2010.11929 (2020).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Thorne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Christodoulopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <article-title>Fever: a large-scale dataset for fact extraction and verification</article-title>
          ,
          <source>arXiv preprint arXiv:1803.05355</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>W. Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>"Liar, liar pants on fire": A new benchmark dataset for fake news detection</article-title>
          ,
          <source>arXiv preprint arXiv:1705.00648</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Karimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gambrell</surname>
          </string-name>
          ,
          <article-title>Hundreds die of poisoning in Iran as fake news suggests methanol cure for virus</article-title>
          , 2020. URL: https://www.timesofisrael.com/hundreds-die-of-poisoning-in-iran-as-fake-news-suggests-methanol-cure-for-virus/.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gandhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kothari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shankar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Patwa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sukumaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chharia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Adhikesaven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rathod</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Nandutu</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. TV</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Misra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Murali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saxena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jakimowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Iyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mehra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Radunsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Katiyar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>James</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dalal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Advani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dhaliwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Raskar</surname>
          </string-name>
          ,
          <article-title>Challenges of equitable vaccine distribution in the covid-19 pandemic</article-title>
          , 2022. arXiv:2012.12263.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Morales</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Barbar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gandhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Landage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vats</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kothari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shankar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sukumaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mathur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Misra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saxena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Patwa</surname>
          </string-name>
          , S. T. V.,
          <string-name>
            <given-names>M.</given-names>
            <surname>Arseni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Advani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jakimowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Katiyar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mehra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Iyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Murali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mahindra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dmitrienko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gangavarapu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Penrod</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Raskar</surname>
          </string-name>
          ,
          <article-title>Covid-19 tests gone rogue: Privacy, efficacy, mismanagement and misunderstandings</article-title>
          , 2021. arXiv:2101.01693.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Muhammed T</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Mathew</surname>
          </string-name>
          ,
          <article-title>The disaster of misinformation: a review of research in social media</article-title>
          ,
          <source>International Journal of Data Science and Analytics</source>
          <volume>13</volume>
          (
          <year>2022</year>
          )
          <fpage>271</fpage>
          -
          <lpage>285</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Desai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mooney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Oehrli</surname>
          </string-name>
          ,
          <article-title>Fake news, lies and propaganda: How to sort fact from fiction</article-title>
          , Subjects: News &amp; Current Events. The University of Michigan Library (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ghorbani</surname>
          </string-name>
          ,
          <article-title>An overview of online fake news: Characterization, detection, and discussion</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>57</volume>
          (
          <year>2020</year>
          )
          <fpage>102025</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Politifact</surname>
          </string-name>
          , https://www.politifact.com,
          <year>2007</year>
          . Accessed:
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Patwa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pykl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Guptha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kumari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ekbal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <article-title>Fighting an infodemic: Covid-19 fake news dataset</article-title>
          ,
          <source>in: Combating Online Hostile Posts in Regional Languages during Emergency Situation (CONSTRAINT) 2021</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          . URL: http://dx.doi.org/10.1007/978-3-030-73696-5_3. doi:10.1007/978-3-030-73696-5_3.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hanselowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Stab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schulz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>A richly annotated corpus for different tasks in automated fact-checking</article-title>
          ,
          <source>in: Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)</source>
          , Association for Computational Linguistics, Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>493</fpage>
          -
          <lpage>503</lpage>
          . URL: https://aclanthology.org/K19-1046. doi:10.18653/v1/K19-1046.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Golbeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mauriello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Auxier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Bhanushali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bonk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Bouzaghrane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Buntain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chanduka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cheakalos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Everett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Falak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gieringer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Graney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Hoffman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Huth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Mirano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. T.</given-names>
            <surname>Mohn</surname>
            <suffix>IV</suffix>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mussenden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Nelson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mcwillie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shetye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shrestha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Steinheimer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Subramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Visnansky</surname>
          </string-name>
          ,
          <article-title>Fake news vs satire: A dataset and analysis</article-title>
          ,
          <source>in: Proceedings of the 10th ACM Conference on Web Science</source>
          , WebSci '18, Association for Computing Machinery, New York, NY, USA,
          <year>2018</year>
          , p.
          <fpage>17</fpage>
          -
          <lpage>21</lpage>
          . URL: https://doi.org/10.1145/3201064.3201100.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>