<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Can LLMs identify the Disinformation they create? Exploring the Role of Large Language Models in Disinformation detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pavan Sanjay Nichani</string-name>
          <email>pavannichani@uni-koblenz.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ayaan Ahmad Siddiqui</string-name>
          <email>asiddiqui@uni-koblenz.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sakshi Tiwarii</string-name>
          <email>sakshi@uni-koblenz.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ark Ikhu</string-name>
          <email>aikhu@uni-koblenz.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marina Ernst</string-name>
          <email>marinaernst@uni-koblenz.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universität Koblenz</institution>
          ,
          <addr-line>Universitätsstr. 1, 56070 Koblenz</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The rapid advancements in Artificial Intelligence, especially the development of Large Language Models (LLMs), have introduced a domain rich with opportunities while presenting considerable challenges. Although LLMs demonstrate impressive abilities in understanding and generating human-like text, they have also contributed to a rise in disinformation. A striking example of this influence is the COVID-19 pandemic, when fake news and conspiracy theories impacted individuals and society. This paper aims to determine whether LLMs can effectively rephrase human-like text and utilize their knowledge base to detect rephrased disinformation, a subtle and increasingly prevalent form of manipulated content. Furthermore, this research investigates the criteria and reasoning LLMs use to classify text as real or fake, aiming to assess whether LLMs can be used to counter disinformation while examining the challenges associated with their use for this purpose.</p>
      </abstract>
      <kwd-group>
        <kwd>Disinformation</kwd>
        <kwd>Disinformation detection</kwd>
        <kwd>LLM as Detector</kwd>
        <kwd>Gemini</kwd>
        <kwd>Llama</kwd>
        <kwd>LIWC</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        LLMs are evolving rapidly, bringing transformation to a variety of fields. They have inspired a range of applications that not only help in understanding human behavior but also provide insights into LLMs themselves by drawing upon human societal contexts [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. A particularly intriguing aspect of LLMs is their paradoxical role in both generating and detecting fake news. On the one hand, LLMs can produce convincingly realistic yet false content, making it more challenging to identify disinformation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. On the other hand, these models can be used to build powerful tools for detecting and combating the very misinformation they may help create [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Detecting fake news became especially crucial during the COVID-19 pandemic, when misinformation posed significant risks to individuals and society.
      </p>
      <p>This paper aims to explore two primary research questions: (RQ1) to what extent can LLMs generate human-like rephrased text, and (RQ2) how effectively can LLMs detect LLM-generated disinformation. Section 4 focuses on the results related to RQ1, providing insights into the ability of LLMs to generate human-like text. Section 5 discusses the findings for RQ2, highlighting the effectiveness of LLMs in detecting fake text and common types of errors observed. Finally, Section 6 offers a conclusion and suggests directions for future research.</p>
      <p>Disinformation, Misinformation and Learning in the Age of Generative AI: Joint Proceedings of the 1st International Workshop on Disinformation and Misinformation in the Age of Generative AI (DISMISS-FAKE’25) and the 4th International Workshop on Investigating Learning during Web Search (IWILDS’25), co-located with the 18th International ACM WSDM Conference on Web Search and Data Mining (WSDM 2025).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        LLMs, being trained on vast and diverse data, possess advanced linguistic capabilities, making it difficult to distinguish between human-generated and LLM-generated content. A comprehensive survey on the challenges of differentiating between these two types of text was presented in a 2020 paper, following the introduction of GPT-2 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. As LLMs evolve, the effectiveness of prompt engineering has been demonstrated in guiding models to produce more accurate responses [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. However, studies suggest that the format and structure of input prompts significantly influence the performance and output quality of LLMs [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. To analyze LLM-generated text from a linguistic perspective, techniques like Linguistic Inquiry and Word Count (LIWC) have been used to assess the presence of specific linguistic features within generated texts [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Cosine similarity can be applied to compare these linguistic features between human-written and LLM-generated content, helping to assess how closely the LLM-generated text mirrors human writing. Additionally, methods like Levenshtein distance have been used to filter out highly similar LLM texts before passing them to attribution models such as BERT and Random Forest [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
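      <p>As an illustration of the Levenshtein-based filtering mentioned above, the following sketch computes edit distance and drops near-duplicate rephrasings; the threshold value and function names are our own, not those of [8].</p>

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def filter_near_duplicates(original: str, candidates: list[str], min_dist: int = 10) -> list[str]:
    """Keep only rephrasings that differ enough from the original text."""
    return [c for c in candidates if levenshtein(original, c) >= min_dist]
```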
      <p>
        In the domain of disinformation detection, fine-tuned models based on BERT have shown success in identifying LLM-generated text [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Other approaches, such as hybrid CNN-RNN deep learning frameworks, have been explored, though their effectiveness is highly dependent on the dataset and the training process [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The ability of LLMs to generate and detect disinformation has reached a point where, with the right prompts, models can be tasked with determining whether a piece of content was generated by AI or written by a human [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In particular, the rise of automated fake news generation has driven interest in developing dedicated detection models [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] or in prompting LLMs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] to detect such content.
      </p>
      <p>In light of the research questions posed in this study, our work aims to build on these findings by: (1) examining the extent to which LLMs can generate human-like text and its impact on detection, and (2) evaluating the effectiveness of these models in detecting disinformation, particularly in the context of a COVID-19-related dataset.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>Figure 1 illustrates our approach, which begins with a human-written dataset. Using prompting techniques, we generated a rephrased LLM-generated dataset. Rephrasing was selected as the method for generating LLM text because it reflects common strategies used by adversarial actors, who often modify existing misinformation to make it appear more factual and thereby evade detection. This method ensures that the original disinformation persists within the newly generated content.</p>
      <p>During the linguistic and syntactic analysis phase, we assess any potential similarities between
human-written and LLM-generated data (RQ1). Finally, we employ LLMs as a detection method, using
prompts to evaluate their ability to identify disinformation (RQ2).</p>
      <p>
        We used two different LLMs, Gemini-1.5-flash and LLaMA-3 8b-8192, to leverage their unique strengths and ensure an unbiased evaluation of disinformation detection capabilities. This approach also allows us to cross-validate findings across different architectures while including both commercial and open-source solutions. Gemini-1.5-flash uses a self-attention mechanism that enables it to process nuanced language and identify subtle indicators of disinformation by considering the full context of a passage. Its architecture is particularly suited for tasks requiring detailed contextual analysis [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. LLaMA-3 8b-8192 generates text by conditioning on previous tokens, allowing it to maintain logical coherence across extended passages. This makes it highly effective at both producing human-like text and identifying inconsistencies indicative of disinformation [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>3.1. Human-written Dataset</title>
        <p>
          Our dataset comprises 10,700 manually annotated social media posts and articles related to real (5,600) and fake (5,100) COVID-19 news [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. The dataset was pre-processed by removing hyperlinks, and posts with 12 words or fewer were excluded to avoid very short tweets. This filtering resulted in a final dataset of 8,880 posts, comprising 5,275 real and 3,605 fake entries.
        </p>
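        <p>The pre-processing described above (hyperlink removal, then dropping posts with 12 words or fewer) can be sketched as follows; the regex and function name are our own assumptions, not the authors’ exact code.</p>

```python
import re

def preprocess(posts):
    """Remove hyperlinks, then drop posts with 12 words or fewer."""
    cleaned = []
    for post in posts:
        no_links = re.sub(r"https?://\S+", "", post).strip()
        if len(no_links.split()) > 12:  # keep only posts with 13+ words
            cleaned.append(no_links)
    return cleaned
```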
      </sec>
      <sec id="sec-3-2">
        <title>3.2. LLM-Generated Dataset</title>
        <p>
          For LLM dataset generation, we employed Gemini-1.5-flash and LLaMA-3 8b-8192. To create diverse
outputs, we used two rephrasing prompting strategies: paraphrase generation and rewriting
generation [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. Tables 1 and 2 show examples of the generated output, with bold words indicating the
specific terms or phrases added by the LLMs during the rephrasing process.
        </p>
        <p>Prompt 1 (Paraphrase Generation): You are a journalist. Given a “passage”, please paraphrase it. Do not generate any other word. The “passage” is &lt;text&gt;.</p>
        <p>Original Text: The main mode of transmission of #Novel #Coronavirus is through droplets and it is possible that infected smokers may blow droplets carrying the virus when they exhale. Regardless of #COVID19 you should steer clear of second-hand smoke as it may cause various health problems.</p>
        <p>gemini-1.5-flash: The novel coronavirus spreads primarily through droplets, which infected smokers may exhale. Secondhand smoke is harmful and should be avoided.</p>
        <p>llama3-8b-8192: The primary way the new coronavirus spreads is through tiny droplets, and it’s conceivable that smokers who are infected may release virus-laden droplets when exhaling. It’s not limited to COVID-19; second-hand smoke can lead to various health issues, regardless of the virus.</p>
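        <p>The two rephrasing strategies can be expressed as simple prompt templates; the dictionary and helper below are illustrative scaffolding of our own, with the actual call to Gemini or LLaMA omitted.</p>

```python
# Prompt templates mirroring Prompts 1 and 2; the names PROMPTS and
# build_prompt are our own, not part of any model API.
PROMPTS = {
    "paraphrase": ('You are a journalist. Given a "passage", please paraphrase it. '
                   'Do not generate any other word. The "passage" is {text}.'),
    "rewrite": ('You are a journalist. Given a "passage", please rewrite it to make it '
                'more convincing. The content should be the same. The style should be '
                'serious, calm and informative. The "passage" is: {text}.'),
}

def build_prompt(strategy, passage):
    """Fill the chosen template with the passage to be rephrased."""
    return PROMPTS[strategy].format(text=passage)
```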
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Generating disinformation with LLMs</title>
      <p>
        To assess the similarity between human-written text and the text generated by LLMs, we conducted a linguistic analysis. This analysis aimed to determine whether our prompting methods can modify human-written content in a meaningful way and whether they may inadvertently introduce disinformation. We employed the Linguistic Inquiry and Word Count (LIWC) tool [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] for this purpose, leveraging its robust LIWC2015 Dictionary. As the focus of this research is on rephrased disinformation, identifying subtle changes in linguistic features such as word choice, tone, and syntax is important for evaluating the similarities between human-written and LLM-generated text. The LIWC2015 Dictionary contains approximately 6,400 words, word stems, and select emoticons, categorized under linguistic or psychological subdictionaries. Words may belong to multiple categories. For example, the word ’cried’ falls into five categories: sadness, negative emotion, overall affect, verbs, and past focus. If this word appears in a text, it increases the count for each of these categories accordingly [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
      <p>Prompt 2 (Rewriting Generation): You are a journalist. Given a “passage”, please rewrite it to make it more convincing. The content should be the same. The style should be serious, calm and informative. The “passage” is: &lt;text&gt;.</p>
      <p>Original Text: The main mode of transmission of #Novel #Coronavirus is through droplets and it is possible that infected smokers may blow droplets carrying the virus when they exhale. Regardless of #COVID19 you should steer clear of second-hand smoke as it may cause various health problems.</p>
      <p>gemini-1.5-flash: The primary route of transmission for the novel coronavirus is through respiratory droplets. Individuals infected with COVID-19 may release these droplets, potentially containing the virus, when they exhale, including when smoking. While the potential for transmission through secondhand smoke remains under investigation, it is important to note that secondhand smoke is known to cause a variety of health issues. As a precautionary measure, it is strongly advised to avoid exposure to secondhand smoke in all situations.</p>
      <p>llama3-8b-8192: The primary means of transmission for the novel coronavirus is via airborne droplets. Furthermore, research suggests that individuals infected with the virus may inadvertently spread it through the exhalation of droplets when they breathe. It is essential to note that this phenomenon is not exclusive to the current COVID-19 pandemic. Exposure to second-hand smoke, a common phenomenon in various settings, has been linked to a range of serious health issues. As a result, it is crucial to maintain a safe distance from second-hand smoke, regardless of the prevailing health situation, to minimize the risk of developing respiratory problems and other illnesses.</p>
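      <p>The category-counting mechanism of LIWC can be illustrated with a toy dictionary (the real LIWC2015 dictionary is proprietary and far larger); the ’cried’ entry below reproduces the five-category example from the text, while the other entry and all names are our own illustration.</p>

```python
from collections import Counter
import re

# Toy stand-in for the LIWC2015 dictionary: each word (or stem ending in '*')
# maps to one or more categories, mirroring the 'cried' example in the text.
TOY_DICT = {
    "cried": ["sad", "negemo", "affect", "verb", "focuspast"],
    "happ*": ["posemo", "affect"],
}

def liwc_counts(text):
    """Count category hits for every token; a word in several categories
    increments each of them, as described for 'cried'."""
    counts = Counter()
    for token in re.findall(r"[a-z']+", text.lower()):
        for entry, cats in TOY_DICT.items():
            if (entry.endswith("*") and token.startswith(entry[:-1])) or token == entry:
                counts.update(cats)
    return counts
```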
      <p>Figure panels: (a) Human-Written Text; (b) Text Rephrased by Llama3; (c) Text Rephrased by Gemini.</p>
      <p>We analyzed the LIWC distributions across human-written and LLM-generated text. While the LIWC features of the text generated by both LLMs with Prompt 1 are distributed very similarly to the human-written ones, the distributions for Prompt 2 differ significantly. Figures 2 and 3 illustrate these differences for "Real" and "Fake" tweets respectively. The box plots depict the distribution of LIWC features in the original human-written texts and their corresponding rephrased versions generated by the LLMs.</p>
      <p>For Prompt 2, which rewrites the text to make it more persuasive with a more serious and informative tone, a number of LIWC categories demonstrate significant changes, with notable shifts highlighted in red. Llama-generated texts generally differ more from human-written ones. While shifts in categories such as function words, prepositions and articles are easy to understand, others such as ’cognitive processes’ (cogproc) and ’social processes’ (social) require deeper analysis. It is worth noting that although Gemini sticks more closely to human-written text, when it comes to fake tweets, the ’cognitive processes’ category becomes more frequent.</p>
      <p>To further quantify the similarity between human-written and LLM-generated text, we calculated the cosine similarity of their LIWC feature values. Among the 8,880 tweets analyzed, 5,854 tweets (4,168 labeled "Real" and 1,686 "Fake") had cosine similarity scores exceeding 0.8 across both LLMs and prompts. A high cosine similarity score suggests that the LLM-generated texts maintained feature distributions similar to those of the human-written texts, despite potential differences in the absolute frequencies of individual features. In essence, while the LLMs may have altered the intensity or frequency of certain linguistic elements, the overall pattern and structure of the feature distributions remained consistent with the human-written content. Despite high cosine similarity, manual verification of some LLM-generated text revealed factual inaccuracies. This indicates that while LLMs can produce text with linguistic patterns that closely mimic human writing, their rephrased content may still introduce factual errors, raising concerns about the risk of disinformation.</p>
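      <p>The cosine-similarity comparison of LIWC feature vectors can be sketched as follows; the 0.8 threshold matches the one used above, while the function names and the toy vectors in the test are our own illustration.</p>

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two sparse LIWC feature-count vectors
    represented as {category: count} dicts."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in keys)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def is_linguistically_similar(human, llm, threshold=0.8):
    """Flag LLM output whose LIWC profile stays close to the human original."""
    return cosine_similarity(human, llm) >= threshold
```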
    </sec>
    <sec id="sec-5">
      <title>5. Detecting LLM-generated disinformation</title>
      <p>In this section, we assess how effectively the models distinguish between real and fake information and analyze the misclassifications to identify potential patterns. Initially, the LLM-generated outputs for both prompts were manually reviewed, and a balanced subset of 800 tweets (400 real and 400 fake) was selected. This manual review ensured that no factual inaccuracies were introduced during the paraphrasing process in the sampled data.</p>
      <p>To further utilize LLMs for disinformation detection, we conducted experiments with the same models
using identical parameters to those used during the text generation phase. The following prompt was
used for detection tasks:</p>
      <p>Prompt: Given a passage, determine whether it is a piece of disinformation. Only output ’YES’ or ’NO’.
The passage is: &lt;text&gt;</p>
      <p>Detection performance of both models is shown in Table 3. Gemini demonstrated better performance than Llama3 in identifying LLM-generated disinformation across both prompt types. As shown in Appendix A, Gemini initially exhibited balanced performance, achieving a True Positive Rate (TPR) of 65.5% and a True Negative Rate (TNR) of 90.6% for human-written texts. After rephrasing, Gemini improved in detecting real tweets, particularly with prompt GP1, where it reached a TPR of 81.8%, though its ability to detect fake tweets slightly declined.</p>
      <p>For Llama3, rephrasing improved the detection of real tweets, particularly with prompt GP2, achieving a TPR of 88.4%. However, its ability to detect fake tweets declined significantly, highlighting the model’s sensitivity to rephrasing.</p>
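      <p>For reference, TPR and TNR follow directly from confusion-matrix counts as reported in Appendix A; the counts in the example below are hypothetical, chosen only to illustrate the computation, and are not the paper’s actual matrix.</p>

```python
def rates(tp, fn, tn, fp):
    """True Positive Rate (real tweets labeled real) and True Negative Rate
    (fake tweets labeled fake) from confusion-matrix counts."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return tpr, tnr

# Hypothetical counts for a balanced 400/400 subset.
tpr, tnr = rates(tp=262, fn=138, tn=362, fp=38)
```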
      <sec id="sec-5-1">
        <title>5.1. Error Analysis</title>
        <p>To identify patterns in the model’s misclassifications, we conducted a detailed error analysis.</p>
        <p>Rephrasing Impact: The Llama3 model exhibited a slightly higher number of classification changes after rephrasing than the Gemini model (Table 14). This suggests that Llama3 may be more sensitive or responsive to rephrased text. Notably, Llama3 was more successful at correcting errors through rephrasing than the Gemini model, as indicated by a higher number of corrected misclassifications. However, Llama3 also displayed a greater tendency to amplify errors under certain prompts compared to Gemini. This variation in model performance highlights the importance of prompt engineering in determining how rephrasing influences both error correction and exacerbation.</p>
        <p>LIWC Analysis: We also explored whether linguistic features contribute to misclassification (see Figure 4). High counts in categories such as "Article," "Pronoun," and "Drives" in TPs suggest these are characteristic of genuine tweets that are correctly classified. In contrast, categories like "Relativ," "Prep," and "Verb" appeared less frequently in FPs than in TPs and TNs, suggesting that while present, these features are less prominent in genuine tweets misclassified as fake. Similarly, "Article," "Drives," and "Cogproc" (Cognitive Processes) were significant in FNs, similar to their presence in TPs. This points to the challenge of distinguishing fake content that closely mimics the structure of real tweets.</p>
        <p>LLM Analysis: We further refined our analysis by focusing on tweets where both Gemini and
Llama3, regardless of the prompt used, changed their classification from the gold label. This filtering
resulted in a subset of 29 tweets. To understand the reasoning behind these classification changes,
we employed the chain-of-thought prompting technique, instructing Gemini and Llama to explain
their decision-making processes. The models were prompted to consider factors such as tone, context,
wording, or implied meaning that influenced their updated classifications.</p>
        <p>Prompt: You are analyzing tweets to classify them based on their content. Below are the details of an
original tweet and its rephrased version. Please read carefully and answer the following:
Original Tweet: &lt;text&gt; Original Classification by you: &lt;text&gt;
Now, consider the rephrased version:
Rephrased Tweet: &lt;text&gt;
Rephrased Classification: &lt;text&gt;</p>
        <p>Question: Explain why you changed your classification for the rephrased tweet compared to the original
tweet. Discuss any diferences in tone, context, wording, or implied meaning that led to the new classification.</p>
        <p>A manual review of the models’ responses revealed that certain aspects of the rephrased tweets, such as the inclusion of words like ’joke’ or ’disturbing’, or mentions of news agencies, prompted the models to alter their classification. Similarly, provocative or emotionally charged language (e.g., exaggerated claims) often led to a change in labels. Conversely, rephrased tweets with a neutral, factual tone or references to a credible source also triggered changes in classification. For a detailed overview, refer to Figure 5 in Appendix B.</p>
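        <p>The selection of the 29-tweet subset, where both models changed their classification from the gold label, can be sketched as follows; the record format and our reading of "changed regardless of the prompt used" (every prediction differs from the gold label) are assumptions of ours.</p>

```python
def changed_for_both(records):
    """Select tweets where both Gemini and Llama3, under every prompt,
    assigned a label different from the gold label."""
    subset = []
    for r in records:
        preds = r["gemini_preds"] + r["llama_preds"]  # labels across all prompts
        if all(p != r["gold"] for p in preds):
            subset.append(r)
    return subset
```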
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In conclusion, LLMs such as Llama3 and Gemini can generate human-like text with high linguistic similarity to human-written content. While these models show potential for detecting disinformation, their susceptibility to rephrased content indicates the need for further development. Our experiments confirm LLMs’ ability to create human-like text, though their performance in detecting disinformation is more complex. While Gemini showed greater accuracy in identifying real content, both models displayed weaknesses, particularly when dealing with rephrased text. Factors such as prompt selection, text length, and model type can impact their effectiveness. The results of the linguistic analysis using LIWC suggest that while LLMs can closely replicate the linguistic patterns of human writing, they remain prone to errors, especially when subtle text modifications are introduced. This raises concerns about their reliability as standalone detectors, as adversarial actors could exploit these vulnerabilities by employing techniques like rephrasing.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT and DeepL Write in order to: check grammar and spelling, paraphrase and reword, and improve writing style. After using these tools/services, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the publication’s content.</p>
    </sec>
    <sec id="sec-8">
      <title>A. LLM as Detector: Performance Analysis</title>
      <p>This section presents the confusion matrices for the LLMs used as detectors. We provided the original human-written text to both Gemini and Llama, followed by the output generated from Prompt 1 by both models, and then the output from Prompt 2 by both models, effectively creating a cross-evaluation. The tables below illustrate the confusion matrix for each scenario.</p>
    </sec>
    <sec id="sec-9">
      <title>B. Error Analysis</title>
      <p>The table below illustrates the impact of text rephrasing on classification by the LLMs.</p>
      <p>Figure 5: Error Analysis: Change in classification</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Leng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <article-title>Do llm agents exhibit social behavior?</article-title>
          , arXiv,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2312.15198.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E.</given-names>
            <surname>Papageorgiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chronis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Varlamis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Himeur</surname>
          </string-name>
          ,
          <article-title>A survey on the use of large language models (llms) in fake news</article-title>
          ,
          <source>Future Internet</source>
          <volume>16</volume>
          (
          <year>2024</year>
          ). URL: https://www.mdpi.com/1999-5903/16/8/298. doi:10.3390/fi16080298.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Sallami</surname>
          </string-name>
          ,
          <article-title>From deception to detection: The dual roles of large language models in fake news</article-title>
          , arXiv,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2409.17416.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Jawahar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abdul-Mageed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. V. S.</given-names>
            <surname>Lakshmanan</surname>
          </string-name>
          ,
          <article-title>Automatic detection of machine generated text: A critical survey</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Scott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zong</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 28th International Conference on Computational Linguistics</source>
          ,
          <source>International Committee on Computational Linguistics</source>
          , Barcelona,
          <source>Spain (Online)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>2296</fpage>
          -
          <lpage>2309</lpage>
          . URL: https://aclanthology.org/2020.coling-main.208/. doi:10.18653/v1/2020.coling-main.208.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schuurmans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bosma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ichter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. H.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Chain-of-thought prompting elicits reasoning in large language models</article-title>
          ,
          <source>in: Proceedings of the 36th International Conference on Neural Information Processing Systems</source>
          , NIPS '22, Curran Associates Inc., Red Hook, NY, USA,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>White</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hays</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sandborn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Olea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gilbert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnashar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Spencer-Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <article-title>A prompt pattern catalog to enhance prompt engineering with chatgpt</article-title>
          ,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2302.11382. arXiv:2302.11382.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sandler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Choung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>David</surname>
          </string-name>
          ,
          <article-title>A linguistic comparison between human and ChatGPT-generated conversations</article-title>
          ,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2401.16587. arXiv:2401.16587.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Nurse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Are you Robert or RoBERTa? Deceiving online authorship attribution models using neural text generators</article-title>
          ,
          <source>Proceedings of the International AAAI Conference on Web and Social Media</source>
          <volume>16</volume>
          (
          <year>2022</year>
          )
          <fpage>429</fpage>
          -
          <lpage>440</lpage>
          . URL: https://ojs.aaai.org/index.php/ICWSM/article/view/19304. doi:10.1609/icwsm.v16i1.19304.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Szczepański</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pawlicki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kozik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Choraś</surname>
          </string-name>
          ,
          <article-title>New explainability method for BERT-based model in fake news detection</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>11</volume>
          (
          <year>2021</year>
          ). doi:10.1038/s41598-021-03100-6.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Nasir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. S.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Varlamis</surname>
          </string-name>
          ,
          <article-title>Fake news detection: A hybrid CNN-RNN based deep learning approach</article-title>
          ,
          <source>International Journal of Information Management Data Insights</source>
          <volume>1</volume>
          (
          <year>2021</year>
          )
          100007. URL: https://www.sciencedirect.com/science/article/pii/S2667096820300070. doi:10.1016/j.jjimei.2020.100007.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Fighting fire with fire: Can ChatGPT detect AI-generated text?</article-title>
          ,
          <source>SIGKDD Explor. Newsl</source>
          .
          <volume>25</volume>
          (
          <year>2024</year>
          )
          <fpage>14</fpage>
          -
          <lpage>21</lpage>
          . URL: https://doi.org/10.1145/3655103.3655106. doi:10.1145/3655103.3655106.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>I.</given-names>
            <surname>Vykopal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pikuliak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Srba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Moro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Macko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bielikova</surname>
          </string-name>
          ,
          <article-title>Disinformation capabilities of large language models</article-title>
          ,
          <year>2024</year>
          , pp.
          <fpage>14830</fpage>
          -
          <lpage>14847</lpage>
          . doi:10.18653/v1/2024.acl-long.793.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>B.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nirmal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Disinformation Detection: An Evolving Challenge in the Age of LLMs</article-title>
          ,
          <year>2024</year>
          , pp.
          <fpage>427</fpage>
          -
          <lpage>435</lpage>
          . doi:10.1137/1.9781611978032.50.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Baskaran</surname>
          </string-name>
          ,
          <article-title>A Comparison of Transformer and Autoregressive LLM Designs</article-title>
          ,
          <source>International Journal of Research Publication and Reviews</source>
          ,
          <year>2023</year>
          , vol.
          <volume>4</volume>
          , no.
          <issue>11</issue>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>26</lpage>
          . URL: https://ijrpr.com/uploads/V4ISSUE11/IJRPR19003.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Patwa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pykl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Guptha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kumari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ekbal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <source>Fighting an Infodemic: COVID-19 Fake News Dataset</source>
          , Springer International Publishing,
          <year>2021</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          . URL: http://dx.doi.org/10.1007/978-3-030-73696-5_3. doi:10.1007/978-3-030-73696-5_3.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <article-title>Can LLM-generated misinformation be detected?</article-title>
          ,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2309.13788. arXiv:2309.13788.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Koutsoumpis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Oostrom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Holtrop</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>van Breda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghassemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>de Vries</surname>
          </string-name>
          ,
          <article-title>The Kernel of Truth in Text-Based Personality Assessment: A Meta-Analysis of the Relations Between the Big Five and the Linguistic Inquiry and Word Count (LIWC)</article-title>
          ,
          <source>Psychological Bulletin</source>
          , American Psychological Association,
          <year>2022</year>
          ,
          <volume>148</volume>
          (
          <issue>11-12</issue>
          ),
          <fpage>843</fpage>
          -
          <lpage>868</lpage>
          . URL: https://psycnet.apa.org/fulltext/2023-55252-004.html. doi:10.1037/bul0000381.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Pennebaker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Boyd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jordan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Blackburn</surname>
          </string-name>
          ,
          <source>The development and psychometric properties of LIWC2015</source>
          ,
          <year>2015</year>
          . URL: https://repositories.lib.utexas.edu/server/api/core/bitstreams/b0d26dcf-2391-4701-88d0-3cf50ebee697/content.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>