<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Team lm-detector at PAN: Can NLI be an Appropriate Approach to Machine-Generated Text Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Guojun Wu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Qinghao Guan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Zurich</institution>
          ,
          <addr-line>Zurich, 8050</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>The ability to accurately detect machine-generated text is becoming increasingly important in various fields, including academia, journalism, and online security. In this study, we propose a novel method for detecting machine-generated text, predicated on the hypothesis that the probability of reasoning from human-generated text to machine-generated text is inherently higher than in the reverse direction. Our approach is inspired by the principles of Natural Language Inference (NLI), leveraging the differences in logical consistency and contextual coherence between human and machine-generated texts. However, our experimental results indicate that this method may not be as effective as anticipated. Despite the theoretical foundation, the practical application of our method revealed significant limitations, suggesting that it might not be a reliable solution for detecting machine-generated text. Further research and refinement are necessary to enhance the efficacy of detection techniques.</p>
      </abstract>
      <kwd-group>
        <kwd>Machine-Generated Text Detection</kwd>
        <kwd>Natural Language Inference</kwd>
        <kwd>Probability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The rapid advancement of artificial intelligence has led to the widespread use of machine-generated
text in various domains [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Recently developed large language models (LLMs), such as ChatGPT [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and
LLaMA2 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], can generate human-like texts for various downstream tasks. Their performance has been
reported to exceed that of humans on some specific tasks. From automated news articles to customer
service chatbots, these texts are becoming indistinguishable from those written by humans [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. While
this technological progress brings many benefits, it also poses significant challenges, particularly in the
realm of text authenticity and content verification.
      </p>
      <p>Detecting machine-generated text is crucial for maintaining the integrity of information. In academia,
it helps prevent plagiarism and ensures the originality of scholarly work. In journalism, it safeguards
against the dissemination of fake news and misinformation. In online platforms, it enhances security by
identifying automated accounts and reducing the spread of malicious content. Despite the growing need
for effective detection methods, current techniques often fall short. Traditional approaches typically
focus on stylistic and linguistic features, which can be easily manipulated by advanced language models.
As a result, there is a pressing need for more robust and reliable detection methods.</p>
      <p>In this study, we propose a novel approach inspired by Natural Language Inference (NLI). Our method
is based on the hypothesis that we can judge which text was written by a human by comparing the
probability of reasoning in each direction (see Section 4). By leveraging the differences in logical consistency
and contextual coherence between human and machine-generated texts, we aim to develop a more accurate detection
model.</p>
      <p>However, our experimental results suggest that this method may not be as effective as initially
anticipated. Despite its theoretical promise, practical application revealed significant limitations,
highlighting the complexity of the detection problem. This paper presents our findings and discusses
the implications for future research in this area.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        This work was developed for the PAN task on Generative AI Authorship Verification [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
in which we are given two texts, one authored by a human and the other by a machine, and our goal
is to pick out the one written by the human. The dataset was generated by the PAN organizers in connection
with another task, in which participants were asked to build models that produce texts as similar
as possible to human-written ones. The bootstrap dataset consists of multiple text genres, including news articles,
Wikipedia texts, and fiction.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Previous Work</title>
      <p>
        Several studies have approached the detection of AI-generated text as a binary classification problem
using neural network-based detectors [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. For instance, OpenAI has fine-tuned RoBERTa-based GPT-2
detector models to differentiate between human-generated and GPT-2-generated texts [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Other
researchers have explored zero-shot detection of AI-generated text; for example, the authors of [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] observed that
AI-generated passages tend to lie in regions of negative curvature of a model's log-probability function and proposed
DetectGPT, a zero-shot detection method that capitalizes on this observation.
      </p>
      <p>
        However, relying on neural networks for detection can expose these methods to adversarial and
poisoning attacks [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] [13]. To address this, some researchers have explored watermarking AI-generated
texts to facilitate detection [14] [15]. Watermarking involves embedding specific patterns in the text,
making detection easier. This method provides consistent detection across various contexts and model
updates, maintaining its effectiveness without the need for frequent re-training. Watermarking is
computationally efficient, requiring minimal additional resources during text generation and enabling
quick verification processes. Furthermore, it enhances security by complicating adversarial attempts
to alter the text undetected and supports traceability by linking the generated content back to the
specific model instance, aiding in accountability and auditing efforts. Overall, watermarking presents
a low-overhead, resilient, and scalable approach to managing the challenges of AI-generated text
detection.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. System Overview</title>
      <p>NLI is an NLP task that determines the relationship between two sentences: whether one sentence
(the hypothesis) can be inferred from another sentence (the premise). It has been shown that NLI can be
used for inconsistency detection in summarization, where the source document acts as the premise and
the generated summary acts as the hypothesis [16]. The NLI model evaluates whether the information
in the summary can logically be inferred from the source document.</p>
      <p>Inspired by the use of NLI in the summarization task, we detect machine-generated text by
examining the logical relationship between the premise and the hypothesis.</p>
      <p>The model checks for three possible relationships between the premise and hypothesis:</p>
      <list list-type="bullet">
        <list-item>
          <p>Entailment: The hypothesis (summary) logically follows from the premise (source text).</p>
        </list-item>
        <list-item>
          <p>Contradiction: The hypothesis contradicts the premise.</p>
        </list-item>
        <list-item>
          <p>Neutral: There is no clear logical relationship, meaning the hypothesis might add information not
present in the premise or omit critical information.</p>
        </list-item>
      </list>
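      <p>As a minimal illustration of how a three-way NLI classifier yields a probability for each relationship, the sketch below applies a softmax to raw class scores. The label order and the example logits are assumptions made for illustration only; a real checkpoint defines its own label order, which should be checked before use.</p>
      <preformat><![CDATA[
```python
import math

# Label order is an assumption for illustration; real checkpoints expose
# their own order, so it should be read from the model rather than guessed.
LABELS = ("entailment", "neutral", "contradiction")


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(logits):
    """Pair each (assumed) label with its softmax probability."""
    return dict(zip(LABELS, softmax(logits)))


if __name__ == "__main__":
    # Made-up logits for a single (premise, hypothesis) pair.
    probs = label_probabilities([2.0, 0.5, -1.0])
    print(max(probs, key=probs.get))  # label with the highest probability
```
]]></preformat>
      <p>The probability assigned to the entailment label is one plausible reading of the "probability of reasoning" used in the rest of this section.</p>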
      <p>Given two texts, text_a and text_b, one authored by a human and the other generated by a machine,
we calculated the probability of reasoning in each direction independently. Let
<italic>P</italic>(text_a → text_b) denote the probability of reasoning from text_a to text_b, and
<italic>P</italic>(text_b → text_a) the probability of reasoning from text_b to text_a. If
<italic>P</italic>(text_a → text_b) is larger than <italic>P</italic>(text_b → text_a), we
assume that text_a was written by a human. It is worth noting that we did not conduct any pre-processing
(such as segmentation), so that the model receives sufficient context for inference. Our hypothesis
is as follows.</p>
      <p>In a logical structure, the premise provides the basis for a conclusion, while the hypothesis is
a statement whose validity is supported by the premise. On the one hand, the
machine-generated text in our task was generated based on the human-written text, which means that
the human-written text provides the foundation and should therefore act as the premise.
On the other hand, text generated by AI may not match human authors in terms of semantic
coherence and logical depth [17]. Accordingly, we expect that the human-written text can rarely be derived
from the machine-generated one.</p>
      <p>In addition, if the absolute difference between <italic>P</italic>(text_a → text_b) and
<italic>P</italic>(text_b → text_a) is lower than 0.05, we treat the relation as neutral, meaning that there is
no clear logical relationship between the two texts.</p>
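      <p>The decision rule described in this section can be sketched as follows. This is an illustrative reading rather than the implementation used in the experiments: the function names, the string return values, and the toy scorer are hypothetical, and the NLI scorer is passed in by the caller so that the rule itself stays independent of any particular model.</p>
      <preformat><![CDATA[
```python
from typing import Callable

MARGIN = 0.05  # threshold from the text; below it the pair is left undecided


def decide(p_ab: float, p_ba: float, margin: float = MARGIN) -> str:
    """Return which text is taken to be human-written.

    p_ab is P(text_a -> text_b); p_ba is P(text_b -> text_a).
    """
    if abs(p_ab - p_ba) < margin:
        return "undecided"
    # Under the hypothesis, the human-written text is the better premise.
    return "text_a" if p_ab > p_ba else "text_b"


def detect(text_a: str, text_b: str,
           entail_prob: Callable[[str, str], float]) -> str:
    """Score both directions with a caller-supplied scorer, then decide."""
    return decide(entail_prob(text_a, text_b), entail_prob(text_b, text_a))


if __name__ == "__main__":
    # Toy scorer (hypothetical): share of hypothesis words found in the premise.
    def toy_scorer(premise: str, hypothesis: str) -> float:
        words = hypothesis.lower().split()
        seen = set(premise.lower().split())
        return sum(w in seen for w in words) / len(words) if words else 0.0

    print(detect("the cat sat on the mat today", "the cat sat", toy_scorer))
```
]]></preformat>
      <p>Keeping the scorer as a parameter makes it straightforward to swap the toy function for an actual NLI model without touching the comparison logic.</p>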
      <p>The language model for NLI was DeBERTa-v3-large-mnli-fever-anli-ling-wanli, a DeBERTa-v3
model [18] fine-tuned specifically for NLI tasks. We chose it because it was fine-tuned on several distinct datasets,
including FEVER (Fact Extraction and VERification), ANLI (Adversarial NLI), and WANLI
(Worker-and-AI Collaboration for NLI).</p>
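      <p>For readers who wish to experiment, the following sketch shows one way an entailment probability could be obtained with the Hugging Face transformers library. Several points are assumptions rather than details taken from this paper: the hub identifier in MODEL_ID, the truncation to 512 tokens, and the reading of "probability of reasoning" as the entailment probability. The third-party packages torch and transformers are imported lazily inside load_scorer.</p>
      <preformat><![CDATA[
```python
# Assumed hub identifier; the paper gives only the model name.
MODEL_ID = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"


def entailment_index(id2label):
    """Find the class index whose label name is 'entailment'."""
    for index, name in id2label.items():
        if str(name).lower() == "entailment":
            return int(index)
    raise ValueError("no entailment label found in id2label")


def load_scorer(model_id=MODEL_ID, max_length=512):
    """Build entail_prob(premise, hypothesis) -> float.

    Requires torch and transformers; they are imported here so that the
    helper above remains usable without them.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    idx = entailment_index(model.config.id2label)

    def entail_prob(premise, hypothesis):
        # Long inputs are truncated; this is an assumption of the sketch.
        inputs = tokenizer(premise, hypothesis, truncation=True,
                           max_length=max_length, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits[0]
        return torch.softmax(logits, dim=-1)[idx].item()

    return entail_prob
```
]]></preformat>
      <p>Reading the label order from the model configuration avoids hard-coding an index that may differ between checkpoints.</p>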
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>We compared our model, detector, with the baseline models. The performance metrics indicate that
the detector model substantially underperforms all baseline approaches. Specifically, the
detector (our model) achieved a ROC-AUC of 0.548, which is the lowest among all models, indicating
poor discriminative ability. Its Brier score (reported as the complement of the Brier loss, so higher is better)
is 0.622, suggesting less accurate probabilistic predictions,
while its C@1 score of 0.489 is the lowest, reflecting suboptimal performance. The detector’s F1 score
of 0.442 and F0.5u score of 0.461 are also the lowest, indicating poor balance and poor precision-focused
performance, respectively. In contrast, Baseline Binoculars exhibits the highest performance across
most metrics, with a ROC-AUC of 0.972, a Brier score of 0.957, and C@1, F1, and F0.5u scores all around
0.965. The overall mean score of Baseline Binoculars is 0.965, compared to the detector’s mean of
0.512. The Fast-DetectGPT (Mistral) baseline also performed well, with a ROC-AUC of 0.876 and a
mean score of 0.866. Quantile-based evaluations show the 95th quantile achieving the highest scores,
with a ROC-AUC of 0.994 and a mean score of 0.990, underscoring the strong performance of the top 5
percent of submissions.</p>
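      <p>To make the reported figures easier to interpret, the sketch below gives simple reference implementations of C@1 and of the Brier complement, and recomputes the detector's mean from the five scores stated above. It is not the official evaluator: the function names are hypothetical, and treating a score of exactly 0.5 as "unanswered" is an assumption about the scoring convention.</p>
      <preformat><![CDATA[
```python
def c_at_1(scores, labels, undecided=0.5):
    """C@1: accuracy that gives partial credit for unanswered cases.

    scores lie in [0, 1]; a score equal to `undecided` counts as unanswered.
    labels are 0 or 1.
    """
    n = len(scores)
    n_unanswered = sum(s == undecided for s in scores)
    n_correct = sum((s > undecided) == bool(y)
                    for s, y in zip(scores, labels) if s != undecided)
    return (n_correct + n_unanswered * n_correct / n) / n


def brier_complement(scores, labels):
    """One minus the mean squared error between scores and labels."""
    return 1.0 - sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Reported detector scores: ROC-AUC, Brier, C@1, F1, F0.5u.
    detector = [0.548, 0.622, 0.489, 0.442, 0.461]
    print(round(mean(detector), 3))  # 0.512
```
]]></preformat>
      <p>The recomputed mean agrees with the value of 0.512 reported for the detector.</p>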
      <p>Table 2 also shows the results, together with the official baselines provided by the PAN
organizers and summary statistics of all submissions to the task (i.e., the maximum, median, minimum,
and 95th, 75th, and 25th percentiles over all submissions to the task).</p>
      <p>We identified several likely reasons for this poor performance.</p>
      <p>Firstly, our method relies on a single feature (logical inference), which might be insufficient for a
comprehensive detection mechanism. Successful detection methods typically incorporate multiple
features, including linguistic, syntactic, and semantic analysis, to capture the multifaceted nature
of human versus machine-generated text. This suggests that we should establish more comprehensive
classification features.</p>
      <p>Secondly, modern AI models like GPT-3 and GPT-4 are designed to generate text that closely mimics
human writing, including coherence and detail. Consequently, the distinction between detailed
AI-generated text and detailed human text becomes blurred. Human writers can also produce highly detailed
and coherent text, especially in structured or formal contexts. This overlap reduces the effectiveness of
using coherence and detail as discriminative features.</p>
      <p>Human-generated text can also exhibit inferential relationships, especially in informative or
explanatory writing. For instance, when humans explain concepts or provide detailed descriptions, their
sentences can logically follow from one another. As mentioned, the dataset involves multiple genres. Our
method might frequently misclassify detailed and coherent human text (such as news articles) as
AI-generated, leading to a high rate of false positives.</p>
      <p>From the NLI model’s perspective, our method is zero-shot, which means that the model
has not been specifically trained or fine-tuned on a dataset of human vs. AI-generated texts. It
may therefore not be optimized to distinguish the subtle differences between the two types of text. Also,
DeBERTa’s strength in recognizing logical relationships might lead it to frequently detect coherent
inferences in both human and AI texts, making it difficult to distinguish between them based solely on
coherence.</p>
    </sec>
    <sec id="sec-5b">
      <title>6. Future Directions</title>
      <p>To enhance the performance of our AI-generated text detection method, it is crucial to fine-tune the
DeBERTa model specifically on a dataset tailored for distinguishing human and AI-generated text. This
specialized training will help the model learn the unique patterns and nuances of the task. Additionally,
incorporating a broader feature set, including stylistic markers, syntactic complexity, and lexical
diversity, can provide a more robust classification framework. Employing ensemble methods that
combine zero-shot NLI models with supervised models trained on the detection task can further
improve performance by leveraging the strengths of different approaches. Regular evaluation and
refinement using diverse and updated datasets will ensure the model adapts to new patterns in text
generation. Lastly, utilizing contextual embedding techniques can capture richer text representations,
enabling deeper contextual analysis beyond simple logical inference.</p>
    </sec>
    <sec id="sec-6">
      <title>7. Conclusion</title>
      <p>In this study, we explored the potential of using Natural Language Inference (NLI) to detect
machine-generated text by examining the logical relationship between premises and hypotheses. Our hypothesis
was that machine-generated text, being more detailed and coherent due to probabilistic generation,
would differ significantly from human text in inferential relationships. However, our experimental
results revealed significant limitations in this approach. Specifically, our zero-shot method using the
"DeBERTa-v3-large-mnli-fever-anli-ling-wanli" model underperformed across various metrics, including
ROC-AUC, Brier score, C@1, F1, and F0.5u scores, when compared to baseline models. The primary
reasons for this underperformance include the overlap in coherence and detail between human and
AI-generated texts, the limitations of a single-feature approach based solely on logical inference, and
the model’s lack of fine-tuning on a task-specific dataset. Our findings suggest that successful detection
of AI-generated text requires a multifaceted approach, incorporating diverse linguistic features and
specialized training. Future work should focus on fine-tuning models on relevant datasets and integrating
additional classification features to improve the robustness and accuracy of detection methods.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We appreciate the help from Simon Clematide and Andrianos Michail, who provided suggestions
to improve our work. We also extend our sincere gratitude to the anonymous reviewer, whose
insightful comments and suggestions significantly contributed to the improvement of this manuscript.</p>
      <ref-list>
        <ref>
          <mixed-citation>[13] V. S. Sadasivan, M. Soltanolkotabi, S. Feizi, CUDA: Convolution-based unlearnable datasets, arXiv preprint arXiv:2303.04278 (2023).</mixed-citation>
        </ref>
        <ref>
          <mixed-citation>[14] J. Kirchenbauer, J. Geiping, Y. Wen, M. Shu, K. Saifullah, K. Kong, K. Fernando, A. Saha, M. Goldblum, T. Goldstein, On the reliability of watermarks for large language models, arXiv preprint arXiv:2306.04634 (2023).</mixed-citation>
        </ref>
        <ref>
          <mixed-citation>[15] X. Zhao, Y.-X. Wang, L. Li, Protecting language generation models via invisible watermarking, arXiv preprint arXiv:2302.03162 (2023).</mixed-citation>
        </ref>
        <ref>
          <mixed-citation>[16] P. Laban, T. Schnabel, P. N. Bennett, M. A. Hearst, SummaC: Re-visiting NLI-based models for inconsistency detection in summarization, Transactions of the Association for Computational Linguistics 10 (2022) 163–177.</mixed-citation>
        </ref>
        <ref>
          <mixed-citation>[17] O. Marchenko, O. Radyvonenko, T. Ignatova, P. Titarchuk, D. Zhelezniakov, Improving text generation through introducing coherence metrics, Cybernetics and Systems Analysis 56 (2020) 13–21.</mixed-citation>
        </ref>
        <ref>
          <mixed-citation>[18] P. He, J. Gao, W. Chen, DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sharing, arXiv preprint arXiv:2111.09543 (2021).</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rombach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Blattmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lorenz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Esser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ommer</surname>
          </string-name>
          ,
          <article-title>High-resolution image synthesis with latent diffusion models</article-title> (
          <year>2022</year>
          )
          <fpage>10684</fpage>
          -
          <lpage>10695</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] OpenAI, <article-title>ChatGPT: Optimizing language models for dialogue</article-title> (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Touvron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Stone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Albert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Almahairi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Babaei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bashlykov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Batra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhargava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhosale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Bikel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Blecher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. C.</given-names>
            <surname>Ferrer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cucurull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Esiobu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Fuller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Goswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Hartshorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Inan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kardas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kerkez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Khabsa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. M.</given-names>
            <surname>Kloumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Korenev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Koura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-A.</given-names>
            <surname>Lachaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lavril</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liskovich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Martinet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mihaylov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mishra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Molybog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poulton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Reizenstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rungta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Saladi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schelten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Subramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Taylor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. X.</given-names>
            <surname>Kuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Zarov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kambadur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Narang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rodriguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Stojnic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Edunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Scialom</surname>
          </string-name>
          ,
          <article-title>Llama 2: Open foundation and fine-tuned chat models</article-title>
          ,
          <source>arXiv preprint arXiv:2307.09288</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Dugan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ippolito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kirubarajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Callison-Burch</surname>
          </string-name>
          ,
          <article-title>Real or fake text?: Investigating human ability to detect boundaries between human-written and machine-generated text</article-title>
          ,
          <source>AAAI</source>
          (
          <year>2022</year>
          ). arXiv:2212.12672.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</article-title>
          , in:
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mulhem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Quénot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schwab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M. D.</given-names>
            <surname>Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024)</source>
          , Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the Voight-Kampff Generative AI Authorship Verification Task at PAN 2024</article-title>
          , in:
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</source>
          , CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</article-title>
          , in:
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mulhem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Quénot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schwab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M. D.</given-names>
            <surname>Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024)</source>
          , Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kolyada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Grahm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elstner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Loebe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <article-title>Continuous Integration for Reproducible Shared Tasks with TIRA.io</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Kamps</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Crestani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maistro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Joho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gurrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kruschwitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Caputo</surname>
          </string-name>
          (Eds.),
          <source>Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023)</source>
          , Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2023</year>
          , pp.
          <fpage>236</fpage>
          -
          <lpage>241</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G.</given-names>
            <surname>Jawahar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abdul-Mageed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. V.</given-names>
            <surname>Lakshmanan</surname>
          </string-name>
          ,
          <article-title>Automatic detection of machine generated text: A critical survey</article-title>
          ,
          <source>arXiv preprint arXiv:2011.01314</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>RoBERTa: A robustly optimized BERT pretraining approach</article-title>
          ,
          <source>arXiv preprint arXiv:1907.11692</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Khazatsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Finn</surname>
          </string-name>
          ,
          <article-title>DetectGPT: Zero-shot machine-generated text detection using probability curvature</article-title>
          ,
          <source>arXiv preprint arXiv:2301.11305</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>I. J.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shlens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Szegedy</surname>
          </string-name>
          ,
          <article-title>Explaining and harnessing adversarial examples</article-title>
          , arXiv
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>