<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>NLP-MisInfo 2023: SEPLN 2023 Workshop on NLP applied to Misinformation</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Google Snippets and Twitter Posts: Examining Similarities to Identify Misinformation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Saud Althabiti</string-name>
          <email>salthabiti@kau.edu.sa</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammad Ammar Alsalka</string-name>
          <email>m.a.alsalka@leeds.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Atwell</string-name>
          <email>e.s.atwell@leeds.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>King Abdulaziz University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Leeds</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Spanish Society for Natural Language Processing</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>Despite numerous efforts to address the persistent issue of fake news, its proliferation continues due to the vast volume of information circulating on social media platforms, which poses a significant challenge to manual fact-checking. To explore a potential solution, this study investigates the applicability of Google search and its results as a practical tool for detecting fake news on platforms like Twitter. The research focuses on comparing Google search result snippets with tweets to assess their similarity and to determine whether such similarity can serve as an indicator of misinformation. However, the study reveals that the observed similarity between tweets and snippets does not necessarily correlate with news credibility. Consequently, alternative techniques, such as retrieving complete news articles and assessing sources, may be necessary to effectively tackle the challenge of fake news detection on social media. This research highlights the limitations of relying solely on snippet similarity and suggests the importance of considering comprehensive content analysis and source credibility in future work to combat misinformation.</p>
      </abstract>
      <kwd-group>
        <kwd>Misinformation detection</kwd>
        <kwd>Google snippets</kwd>
        <kwd>Automatic fact checking</kwd>
        <kwd>Cosine similarity</kwd>
        <kwd>Sentence similarity</kwd>
        <kwd>sBERT</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Many individuals use social networking sites to obtain news and information. These platforms, like Twitter, Facebook, and others, have become popular news and information sources for many people [1]–[4]. This is because these media allow users to easily access and share information and often offer real-time updates on events and issues [5]. Further, many news organizations and journalists use social media to disseminate their stories and updates, which makes it easy for people to access news and information on these platforms [6]. However, it is important to note that not all information on these media is trustworthy [7].</p>
      <p>Misleading information, also known as "fake news," is a type of information presented as factual,
but is actually false or intentionally deceptive. It is often circulated on social media platforms and can
be challenging to differentiate from legitimate news sources [8]. This can be detrimental, as it can lead
individuals to make decisions based on inaccurate information [5]. Therefore, it is vital for individuals
to evaluate the information they come across on social media critically and to verify its accuracy from
multiple sources before acknowledging it or sharing it with others.</p>
      <p>There have been considerable efforts by social media platforms, researchers, and other organizations to enhance the quality of information on the internet and reduce the spread of misleading information [2]. For instance, many networking sites have implemented policies and algorithms to detect and remove false content. In addition, they often provide users with tools to flag or report this type of content [1]. Besides, researchers and fact-checking organizations have developed techniques for detecting and debunking incorrect information. Furthermore, they often work with social media platforms to assist them in identifying and removing such content [9]–[12]. Also, many initiatives and organizations focus on educating people on how to critically evaluate the information they encounter online [13], so there are considerable efforts underway to improve the online information environment and reduce the spread of false or misleading information.</p>
      <p>NLP-MisInfo 2023: SEPLN 2023 Workshop on NLP applied to Misinformation, held as part of SEPLN 2023: 39th International Conference. Copyright for this paper by its authors.</p>
      <p>Nevertheless, plenty of deliberately fabricated or manipulated rumors have been propagated on the
internet, raising concern that many naive users believe them without checking their authenticity [14].
Individuals and organizations use this common way to spread propaganda, promote political agendas,
or simply cause confusion [15], [16]. Unfortunately, in various cases, this type of information can be
challenging to detect because it is often designed to look like legitimate news or information, and it
may be shared by seemingly reputable sources [2]. This is why it's essential for users to be critical of
the information they encounter online and to verify its accuracy from multiple sources before accepting
it as accurate and making a decision or drawing a conclusion.</p>
      <p>One way to manually fact-check information is to use a search engine, such as Google, to look for information or sources that support or contradict the posted information and thereby evaluate the credibility of a source or piece of information (see Figure 1).</p>
      <p>This experiment's main objective is to ascertain the effectiveness of Google's search for detecting
misinformation published on social networking sites and to develop a feature-based approach for
comparing Google's results (snippets) to tweets to determine their similarity.</p>
      <p>Our contributions comprise two main components. First, we present the first Arabic snippets dataset,
which was created by collecting snippets from Google search results based on a previously published
tweets dataset. Second, we introduce a new methodology for evaluating the similarity between tweets
and snippets extracted from Google.</p>
      <p>The remainder of the paper is organized as follows. We present related works in Section 2 and explain our methodology in Section 3. Section 4 discusses and analyzes our observations. Finally, we conclude our work in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>Several techniques have been used to detect fake news on social media platforms. These include utilizing machine learning algorithms to classify patterns in the language and style of the content and checking the information source to verify its credibility. Other methods consult fact-checking organizations to verify the accuracy of the information or use community feedback and reporting tools to flag false content [2], [17]–[21]. Additionally, some platforms, such as Facebook, have developed measures like warning labels or reduced visibility for content identified as misleading or potentially untrustworthy [4].</p>
      <p>An essential step in the process of detecting fake news is feature extraction. It involves determining the data's most relevant and informative characteristics that can be used to differentiate between real and fake news. These features can then be fed to a classification model for training and prediction [22], [23]. For instance, some studies have used NLP techniques to extract features such as the existence of particular words or phrases and the sentiment or emotion expressed in the text [24]–[26]. These features can be combined with other content-based and source-based features to improve the accuracy of fake news detection models. Also, some researchers have used machine learning and transformer-based algorithms to examine the responses or reactions to a particular post to assess its credibility [27], [28]. For example, a posted tweet that has many negative comments may be more likely to be fake news. Similarly, a widely conveyed and discussed tweet is more likely to be genuine.</p>
      <p>Some researchers have turned to approaches such as comparing the news to other sources of information. One way to do this is to search for a given story on Google and see if multiple reputable sources have reported it. The study [29] suggests that using the cosine similarity score, calculated after conducting topic modeling on a collection of texts, can improve the accuracy of classification. In one such process, the cosine similarity between headlines and contents was calculated as a new feature by comparing normalized TF-IDF vectors [30]. In another experiment [31], they developed a method for translating news titles into multiple languages, searching for related articles using the Google Search API, and evaluating the similarity between the initial news and the search results. Likewise, a study [32] proposed a similar approach involving a comparison of user-submitted articles with those from reliable sources. Also, in a paper published by [33], they tested their system on a set of 100 news items and found that a matching value of at least 70 percent for more than three articles indicated credibility. Study [34], for example, calculated the highest similarity score and presented it with its associated title to users so they could compare and check credibility against the most related source. On the other hand, we aim to assess the effectiveness of Google's search results, which contain sources, titles, and snippets, in detecting misinformation on social media by comparing extracted snippets to tweets and evaluating their similarity. Table 1 briefly compares these studies to our proposed methodology. The news articles column represents both the title and the contents of the used dataset, while snippets are the results from Google searches. To the best of our knowledge, this is the first study investigating the efficacy of similarity between Google snippets and tweets to detect fake news.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
    </sec>
    <sec id="sec-4">
      <title>3.1. Tweets dataset</title>
      <p>Since our primary goal in this study is to compare news from social media with Google snippets, we used a published dataset from Twitter. Specifically, we chose the ArCOV19-Rumors dataset, which contains Arabic Twitter posts about COVID-19 misinformation [35]. It includes 138 verified claims, mostly from reliable fact-checking online sources, and 9.4K tweets that relate to those claims. The tweets have been annotated with information about their truthfulness in order to support research on detecting false information. The collection covers the period from January 27 to the end of April 2020. We used this dataset to collect similar news from Google, as described in Subsection 3.3.</p>
    </sec>
    <sec id="sec-5">
      <title>3.2. Sentence and cosine similarity</title>
      <p>Sentence similarity is the degree to which two texts are semantically alike or equal in meaning [36]. In contrast, cosine similarity is a computation of similarity between two non-zero vectors of an inner product space that calculates the cosine of the angle between them [37]. In natural language processing, cosine similarity is often used to measure the similarity between a pair of texts by treating each piece of text as a vector [38].</p>
      <p>sBERT is a variant of BERT (Bidirectional Encoder Representations from Transformers) [39]. It is specifically developed to process and understand the meaning of individual sentences rather than entire documents or paragraphs. This makes it particularly useful for experiments that need a deep understanding of the meaning and context of individual sentences. The sBERT model can be fine-tuned for specific NLP tasks by training it on labelled data. It has been shown to perform well on a variety of NLP tasks, including sentiment analysis, text classification, and language translation [39], [40].</p>
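      <p>As a rough illustration of the cosine measure described above, the following sketch computes the cosine similarity between two embedding vectors. In practice, the vectors would come from a sentence encoder such as sBERT; this helper is an illustrative assumption, not the paper's exact code.</p>

```python
import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between two non-zero vectors [37]:
    # dot(u, v) / (|u| * |v|); 1.0 means identical directions, 0.0 orthogonal.
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```

      <p>With the sentence-transformers library, the input vectors could be produced by, e.g., SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2').encode(texts), though that specific model choice is our assumption.</p>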
    </sec>
    <sec id="sec-6">
      <title>3.3. System description and collecting snippets</title>
      <p>We utilized the ArCOV19-Rumors dataset and followed these steps to conduct our experiment:</p>
      <p>Pre-processing: Because of URLs embedded within tweets, a Google search does not return any results when searching for a particular tweet. This necessitates the removal of unwanted data, such as URLs, usernames, and hashtags. However, pre-processing yielded many repeated tweets, so we also removed duplicates. The reason is that the dataset was collected based on 138 claims from Twitter, so many similar tweets were collected with different hashtags and usernames. As a result, only 2821 tweets are included in the experiment after this step.</p>
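      <p>The pre-processing step above can be sketched as follows; the exact cleaning rules (which characters to strip, how duplicates are matched) are our assumptions, since the paper does not list them precisely.</p>

```python
import re

def clean_tweet(text):
    # Remove URLs and @usernames, drop the '#' marker while keeping the
    # hashtag's word, then collapse whitespace.
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    text = text.replace("#", " ")
    return re.sub(r"\s+", " ", text).strip()

def preprocess(tweets):
    # Clean every tweet and keep only the first occurrence of each cleaned
    # text, mirroring the duplicate-removal step described above.
    seen, kept = set(), []
    for tweet in tweets:
        cleaned = clean_tweet(tweet)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            kept.append(cleaned)
    return kept
```

      <p>Applied to the ArCOV19-Rumors tweets, a pipeline of this shape would reduce the collection to the 2821 unique cleaned tweets used in the experiment.</p>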
      <p>Google search: The second step is to harvest additional data. We automatically query each tweet in the Google search engine and scrape the results using requests-HTML. Each extracted result contains the website title, the link or source, and a snippet, which is the brief text that appears in Google's search results to describe a website's content. These queries returned responses for 2267 tweets, so we only experimented with tweets for which responses were retrieved.</p>
      <p>Machine translation: We aim to investigate various open-source libraries, some of which do not
support the Arabic language, such as spaCy. Therefore, we automatically translated all tweets and the
retrieved titles and snippets of each query using the Google translator library. In this paper, we refer to
the translated tweet from Arabic to English as ET and the translated Google snippets as ES.</p>
      <p>Calculating similarities: We utilized the pre-trained sBERT model to compute the embeddings of an English tweet (ET) and English snippets (ES). Then, we calculate the cosine similarity between the two embeddings (ETi and ESij), where i represents a particular tweet and j represents a retrieved snippet queried by tweet i. In addition, we also calculate the sentence similarity between ETi and ESij using spaCy. As a result, the mean value of the computed similarities between each tweet and its corresponding snippets, as in (1), is considered a new feature.</p>
      <p>S̄i = (1/ni) Σ(j=1..ni) cos(ETi, ESij)    (1)</p>
      <p>where ni is the number of snippets retrieved for tweet i. The final dataset, named 'TweetsWithSnippets', can be found on GitHub2 and includes the following columns:</p>
      <p>● Tweet ID</p>
      <p>● Label: True or False tweet.</p>
      <p>● Tweet text: Original tweet from the ArCOV19-Rumors.</p>
      <p>● Cleaned text: Tweet text after pre-processing.</p>
      <p>● Snpt_titles: A list of titles retrieved from Google results in both languages.</p>
      <p>● Snpt_links: A list of URLs retrieved from Google results.</p>
      <p>● Snpt: A list of snippets retrieved from Google results in both languages.</p>
      <p>● SenSi_ET-ES: A list of the calculated sentence similarities between ET and ES.</p>
      <p>● CoSi_ET-ESTxt: A list of the calculated cosine similarities between ET and ES.</p>
      <p>2 https://github.com/althabiti/TweetsWithSnippets</p>
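      <p>Equation (1) amounts to averaging the per-snippet cosine similarities for each tweet; a minimal sketch (using plain Python lists of embedding vectors, an assumption for illustration) is:</p>

```python
import numpy as np

def cosine_similarity(u, v):
    # cos(ETi, ESij) for two embedding vectors
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mean_snippet_similarity(tweet_emb, snippet_embs):
    # Equation (1): the mean of cos(ETi, ESij) over the ni snippets
    # retrieved for tweet i, used as a new per-tweet feature.
    sims = [cosine_similarity(tweet_emb, s) for s in snippet_embs]
    return sum(sims) / len(sims)
```
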
    </sec>
    <sec id="sec-7">
      <title>4. Results and discussion</title>
      <p>After calculating the average similarity between ET and ES, we analyzed whether these extracted features could be used to verify the credibility of the news. The Figures below show the average cosine similarities and sentence similarities between the true news items and their snippets (blue line) and between the false news items and their snippets (red line). However, upon examining Figure 4 and Figure 5, we observed fluctuating lines in both the true and fake news samples, with no clear indication that one group consistently displayed higher or lower similarities than the other. Therefore, it cannot be concluded that the similarities between a tweet and its related snippets can be used to predict the credibility of news, and further research is required to explore alternative approaches to detect fake news.</p>
      <p>Following that, we manually analyzed several examples to ascertain the reasons for the dissimilarity in many cases, categorizing whether each snippet conveys the same information as the searched tweet. We found that cosine similarity helps to detect whether two texts discuss the same matter but cannot determine the semantic similarity or the entire meaning of a sentence. Based on the randomly selected examples, a score of 50% or above may indicate that the texts discuss the same topic, making it possible to benefit from those snippets in the experiment; we therefore eliminated all snippets with a cosine similarity below 50%. In addition, snippets with very high similarity percentages indicate that the tweet is copied from elsewhere, or vice versa, as shown in Table 2. The following Tables provide examples of true and false tweeted information, the retrieved Google snippets, sentence similarity using spaCy (SS), cosine similarity (CS), and whether each snippet conveys and relates to the searched tweet.</p>
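      <p>The 50% cut-off described above can be sketched as a simple filter over the per-snippet cosine scores (the pairing of scores with snippets is our assumption about the data layout):</p>

```python
def filter_snippets(snippets, cosine_scores, threshold=0.5):
    # Keep only snippets whose cosine similarity with the tweet reaches the
    # threshold; lower scores suggest the snippet discusses a different
    # topic, while very high scores may indicate a copied tweet.
    return [snippet for snippet, score in zip(snippets, cosine_scores)
            if score >= threshold]
```
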
      <p>[Table 2 (excerpt): Google snippets (in Arabic) with sentence similarity (SS) and cosine similarity (CS) scores and our analyses. Example rows: a snippet reporting that numerous scientific studies found eating garlic daily can fight cold symptoms and the novel coronavirus by 63% (SS 0.78, CS 0.69); a snippet asking whether eating garlic can prevent infection with the new coronavirus and noting that, although garlic is a healthy food with some antimicrobial properties, there is no evidence for this (SS 0.9, CS 0.89; related, disagrees); and an unrelated snippet about state support services during the coronavirus (COVID-19) crisis (SS 0.83, CS 0.3; not related).]</p>
      <p>label: True. Tweet text (English translation): The Ministry of Health denies its demand to close the border crossings with Iran.</p>
      <sec id="sec-7-1">
        <title>Our analyses</title>
        <p>[Table: For the true tweet above, the retrieved snippets were analyzed as related and agreeing.]</p>
      </sec>
      <sec id="sec-7-2">
        <title>Snippets from Google</title>
        <p>[Table: The Arabic snippets retrieved from Google all quote the Ministry of Health's denial, on Thursday, of its reported demand to close the border crossings with Iran, citing the ministry's media office via the Iraqi News Agency; they also note that the Iranian Ministry of Health had recorded coronavirus infections leading to two deaths so far, while the Iranian authorities took immediate measures to prevent the spread of the virus.]</p>
        <p>label: False. Tweet text (English translation): This picture was taken this Friday morning of an Emirates Airlines plane taking off from Dubai to Beijing, China, with only one passenger on board of Yemeni nationality, going to buy Arta goods.</p>
        <p>[Table: Two near-identical Arabic snippets repeat the tweet's text (SS 0.98, CS 0.98 and SS 0.98, CS 0.94); our analyses mark both as copied tweets.]</p>
        <p>Additional samples have been considered and manually explored. The following cases, found in most examples, seem to be the main challenges preventing this experiment from taking advantage of the extracted features.</p>
        <p>● Not all tweets convey a story, such as "The fact that an Egyptian young man died of Corona in China via news".
● Many tweets talk about the same claim. Because the dataset is built from 138 claims, many tweets were collected that state or negate the same claim.
● Some tweets are unclear and therefore challenging to compare with Google snippets. For example, a tweet may be a question meant to negate specific news, such as "Did you believe the rumor of the evacuation of Oman for Yemeni students?"
● The result is a copied tweet.
● A tweet might convey more than one claim. In some cases, one claim is correct and the other is false.
● Snippets sometimes agree with fake news. Although we hypothesized that snippets should agree with correct information and disagree with rumors, this was not the case in many examples.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>5. Conclusion and future work</title>
      <p>Although there are many studies combating fake news, the problem still exists, as the large volume of information on social media makes it challenging to fact-check such news manually. Our aim in this research is to study whether Google search and its results can serve as a valuable tool for detecting misinformation on Twitter and similar platforms. The experiment involved comparing snippets, the brief summaries of a web page that appear below its URL and title in Google Search, with tweets to check their similarity; our research question was whether the similarity between a tweet and the retrieved snippets could be used as a feature to help detect fake news. The study found that the similarity between tweets and snippets does not show a clear difference between true and false news; thus, this approach alone is not enough to predict news credibility. We also manually examined random samples to identify possible reasons, discussed in the results and discussion section. Additional research is required to investigate alternative methods for extracting more features. One possibility is to retrieve complete news articles and their sources, instead of relying only on brief descriptions or snippets. Moreover, we aim to explore different approaches rather than relying exclusively on similarities. For instance, one potential method is to extract abstractive summaries from the newly gathered news articles and assess their impact as supplementary information for detecting misinformation.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgements</title>
      <p>We would like to express our sincere gratitude to the Ministry of Education in Saudi Arabia, King Abdulaziz University, and the University of Leeds for their support. We also wish to thank the reviewers who devoted their time and expertise to reviewing this paper, offering valuable feedback and insights.</p>
    </sec>
    <sec id="sec-10">
      <title>References</title>
      <p>[7] M. Viviani and G. Pasi, “Credibility in social media: opinions, news, and health information—a survey,” Wiley Interdiscip Rev Data Min Knowl Discov, vol. 7, no. 5, p. e1209, 2017.</p>
      <p>[8] X. Zhou and R. Zafarani, “A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities,” ACM Comput Surv, 2020, doi: 10.1145/3395046.</p>
      <p>[9] R. Oshikawa, J. Qian, and W. Y. Wang, “A survey on natural language processing for fake news detection,” in LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings, 2020.</p>
      <p>[10] X. Zeng, A. S. Abumansour, and A. Zubiaga, “Automated fact-checking: A survey,” Lang Linguist Compass, vol. 15, no. 10, p. e12438, 2021.</p>
      <p>[11] S. Kumar, S. Kumar, P. Yadav, and M. Bagri, “A Survey on Analysis of Fake News Detection Techniques,” in Proceedings - International Conference on Artificial Intelligence and Smart Systems, ICAIS 2021, 2021, doi: 10.1109/ICAIS50930.2021.9395978.</p>
      <p>[12] M. K. Elhadad, K. Fun Li, and F. Gebali, “Fake News Detection on Social Media: A Systematic Survey,” in 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2019 - Proceedings, 2019, doi: 10.1109/PACRIM47961.2019.8985062.</p>
      <p>[13] N. M. Lee, “Fake news, phishing, and fraud: a call for research on digital media literacy education beyond the classroom,” Commun Educ, vol. 67, no. 4, pp. 460–466, 2018.</p>
      <p>[14] S. K. Uppada, K. Manasa, B. Vidhathri, R. Harini, and B. Sivaselvan, “Novel approaches to fake news and fake account detection in OSNs: user social engagement and visual content centric model,” Soc Netw Anal Min, vol. 12, no. 1, p. 52, 2022.</p>
      <p>[15] Y. M. Rocha, G. A. de Moura, G. A. Desidério, C. H. de Oliveira, F. D. Lourenço, and L. D. de Figueiredo Nicolete, “The impact of fake news on social media and its influence on health during the COVID-19 pandemic: A systematic review,” J Public Health (Bangkok), pp. 1–10, 2021.</p>
      <p>[16] S. A. Khan, M. H. Alkawaz, and H. M. Zangana, “The use and abuse of social media for spreading fake news,” in 2019 IEEE International Conference on Automatic Control and Intelligent Systems (I2CACIS), IEEE, 2019, pp. 145–148.</p>
      <p>[17] “Fake News Detection on Social Media: A Data Mining Perspective,” International Journal of Innovative Technology and Exploring Engineering, 2020, doi: 10.35940/ijitee.i7098.079920.</p>
      <p>[18] Z. Guo, M. Schlichtkrull, and A. Vlachos, “A survey on automated fact-checking,” Trans Assoc Comput Linguist, vol. 10, pp. 178–206, 2022.</p>
      <p>[19] X. Zhou and R. Zafarani, “Fake News: a survey of research, Detection Methods, and Opportunities,” ACM Comput Surv, 2018.</p>
      <p>[20] S. Althabiti, M. Alsalka, and E. Atwell, “SCUoL at CheckThat! 2021: An AraBERT model for check-worthiness of Arabic tweets,” in CEUR Workshop Proceedings, 2021.</p>
      <p>[21] S. Althabiti, M. A. Alsalka, and E. Atwell, “SCUoL at CheckThat! 2022: fake news detection using transformer-based models,” in CEUR Workshop Proceedings, 2022, pp. 428–433.</p>
      <p>[22] G. Jardaneh, H. Abdelhaq, M. Buzz, and D. Johnson, “Classifying Arabic tweets based on credibility using content and user features,” in 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings, 2019, doi: 10.1109/JEEIT.2019.8717386.</p>
      <p>[23] E. A. Hassan and F. Meziane, “A survey on automatic fake news identification techniques for online and socially produced data,” in Proceedings of the International Conference on Computer, Control, Electrical, and Electronics Engineering 2019, ICCCEEE 2019, 2019, doi: 10.1109/ICCCEEE46830.2019.9070857.</p>
      <p>[24] X. Zhang, J. Cao, X. Li, Q. Sheng, L. Zhong, and K. Shu, “Mining dual emotion for fake news detection,” in Proceedings of the Web Conference 2021, 2021, pp. 3465–3476.</p>
      <p>[25] M. A. Alonso, D. Vilares, C. Gómez-Rodríguez, and J. Vilares, “Sentiment analysis for fake news detection,” Electronics (Switzerland), 2021, doi: 10.3390/electronics10111348.</p>
      <p>[26] B. Bhutani, N. Rastogi, P. Sehgal, and A. Purwar, “Fake News Detection Using Sentiment Analysis,” in 2019 12th International Conference on Contemporary Computing, IC3 2019, 2019, doi: 10.1109/IC3.2019.8844880.</p>
      <p>[27] S. S. Alanazi and M. B. Khan, “Arabic Fake News Detection in Social Media Using Readers' Comments: Text Mining Techniques in Action,” International Journal of Computer Science and Network Security, 2020.</p>
      <p>[28] S. Althabiti, M. A. Alsalka, and E. Atwell, “Detecting Arabic Fake News on Social Media using Sarcasm and Hate Speech in Comments”.</p>
      <p>[29] S. Mhatre and A. Masurkar, “A hybrid method for fake news detection using cosine similarity scores,” in 2021 International Conference on Communication information and Computing Technology (ICCICT), IEEE, 2021, pp. 1–6.</p>
      <p>[30] A. P. S. Bali, M. Fernandes, S. Choubey, and M. Goel, “Comparative performance of machine learning algorithms for fake news detection,” in International Conference on Advances in Computing and Data Sciences, Springer, 2019, pp. 420–430.</p>
      <p>[31] D. Dementieva and A. Panchenko, “Fake news detection using multilingual evidence,” in 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA), IEEE, 2020, pp. 775–776.</p>
      <p>[32] S. H. Long and M. P. Bin Hamzah, “Fake news detection,” in Computational Science and Technology, Springer, 2021, pp. 295–303.</p>
      <p>[33] S. D. Samantaray and G. Jodhani, “Fake news detection using text similarity approach,” International Journal of Science and Research (IJSR), vol. 8, no. 1, pp. 1126–1132, 2019.</p>
      <p>[34] B. Al Asaad and M. Erascu, “A tool for fake news detection,” in 2018 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), IEEE, 2018, pp. 379–386.</p>
      <p>[35] F. Haouari, M. Hasanain, R. Suwaileh, and T. Elsayed, “ArCOV19-rumors: Arabic COVID-19 twitter dataset for misinformation detection,” arXiv preprint arXiv:2010.08768, 2020.</p>
      <p>[36] M. T. R. Laskar, X. Huang, and E. Hoque, “Contextualized embeddings based transformer encoder for sentence similarity modeling in answer selection task,” in Proceedings of the Twelfth Language Resources and Evaluation Conference, 2020, pp. 5505–5514.</p>
      <p>[37] P. Sunilkumar and A. P. Shaji, “A survey on semantic similarity,” in 2019 International Conference on Advances in Computing, Communication and Control (ICAC3), IEEE, 2019, pp. 1–8.</p>
      <p>[38] V. Zhelezniak, A. Savkov, A. Shen, and N. Y. Hammerla, “Correlation coefficients and semantic textual similarity,” arXiv preprint arXiv:1905.07790, 2019.</p>
      <p>[39] N. Reimers and I. Gurevych, “Sentence-BERT: Sentence embeddings using Siamese BERT-networks,” arXiv preprint arXiv:1908.10084, 2019.</p>
      <p>[40] F. Feng, Y. Yang, D. Cer, N. Arivazhagan, and W. Wang, “Language-agnostic BERT sentence embedding,” arXiv preprint arXiv:2007.01852, 2020.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>H.</given-names>
            <surname>Allcott</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Gentzkow</surname>
          </string-name>
          , “
          <article-title>Social media and fake news in the 2016 election,” Journal of economic perspectives</article-title>
          , vol.
          <volume>31</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>211</fpage>
          -
          <lpage>236</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Zafarani</surname>
          </string-name>
          , “
          <article-title>Fake news: a survey of research, detection methods, and opportunities,”</article-title>
          <source>ACM Comput Surv</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>40</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>S.</given-names>
            <surname>Sommariva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vamos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mantzarlis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. U.-L.</given-names>
            <surname>Đào</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Martinez Tyson</surname>
          </string-name>
          , “
          <article-title>Spreading the (fake) news: exploring health messages on social media and the implications for health professionals using a case study,”</article-title>
          <source>Am J Health Educ</source>
          , vol.
          <volume>49</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>246</fpage>
          -
          <lpage>255</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>N.</given-names>
            <surname>Krishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tromble</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Abroms</surname>
          </string-name>
          , “
          <article-title>Research note: Examining how various social media platforms have responded to COVID-19 misinformation,”</article-title>
          <source>Harvard Kennedy School Misinformation Review</source>
          , vol.
          <volume>2</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>25</lpage>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>K.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sliva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          , “
          <article-title>Fake news detection on social media: A data mining perspective,”</article-title>
          <source>ACM SIGKDD Explorations Newsletter</source>
          , vol.
          <volume>19</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>22</fpage>
          -
          <lpage>36</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            <surname>Hermida</surname>
          </string-name>
          , “
          <article-title>Social media and journalism,”</article-title>
          <source>The SAGE Handbook of Social Media</source>
          , pp.
          <fpage>497</fpage>
          -
          <lpage>511</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>