<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Team Deloitte at PAN: A Novel Approach for Generative AI Text Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Harika Abburi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nirmala Pudota</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Balaji Veeramani</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edward Bowen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sanmitra Bhattacharya</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Deloitte &amp; Touche Assurance &amp; Enterprise Risk Services India Private Limited</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Deloitte &amp; Touche LLP</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Large Language Models (LLMs) have demonstrated remarkable capabilities in generating text that closely resembles human writing across a wide range of styles and genres. However, such capabilities are prone to misuse, such as fake news generation, spam email creation, and cheating on academic assignments. Hence, it is essential to build automated approaches capable of distinguishing between artificially generated text and human-authored text. In this paper, we propose an architecture that combines three components: a transformer model, token-level features, and state-of-the-art embeddings. This approach achieves a mean score of 0.973 on the PAN dataset, demonstrating its effectiveness in identifying AI-generated text.</p>
      </abstract>
      <kwd-group>
        <kwd>AI-generated text</kwd>
        <kwd>Large language models</kwd>
        <kwd>Text classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The domain of Natural Language Generation (NLG) is witnessing a remarkable transformation with
the emergence of Large Language Models (LLMs) such as Generative Pre-trained Transformer (GPT-4)
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], Large Language Model Meta AI (LLaMA-3), and Mistral LLMs. LLMs, characterized by their large
parameter size, have shown state-of-the-art (SOTA) capabilities in generating text that closely mirrors
the verbosity and style of human language. They have been shown to outperform traditional Natural
Language Processing (NLP) approaches across applications ranging from question answering to code
completion [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. While LLMs’ ability to generate human-like text is impressive, it concurrently
poses a growing risk in various sectors, including the proliferation of misinformation, phishing email
generation, and the preservation of academic integrity [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
        ]. To address these challenges, it has
become increasingly crucial for both humans and automated systems to detect and distinguish
AI-generated text. This calls for ongoing research and the development of reliable detection methods to
promote the responsible and ethical use of LLMs [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ].
      </p>
      <p>
        Diverse modeling strategies, ranging from simple statistical techniques to cutting-edge
Transformer-based architectures [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ], have been investigated to help develop solutions capable of distinguishing
AI-generated text from those written by humans. For instance, Gehrmann et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] proposed
straightforward statistical methods which capitalize on the assumption that AI systems tend to rely on a limited
set of language patterns with high confidence scores. Liu et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] proposed a model which extracts
Robustly optimized Bidirectional Encoder Representations from Transformers (BERT) approach (RoBERTa)
embeddings and combines them with sentence-level graph representations. In contrast to individual
detection models, we recently proposed ensemble modeling approaches for detecting AI-generated text
where the probabilities from various constituent pre-trained language models are concatenated and
passed as a feature vector to machine learning classifiers [
        <xref ref-type="bibr" rid="ref10 ref13">10, 13</xref>
        ]. This approach resulted in improved
predictions compared to individual classifiers, highlighting the benefits of combining multiple models.
      </p>
      <p>Recently, there has been a notable increase in research focused on zero-shot detection techniques for
AI-generated text. These methods predominantly involve the analysis of outputs from LLMs, utilizing
features such as entropy, log-probability scores, and perplexity [14, 15, 16, 17] to help distinguish
between human-authored and AI-generated content. However, zero-shot detection methods tend to be
more effective when there is direct access to the internals of the LLM that generated the text, which
limits their robustness across different scenarios [18, 19].</p>
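      <p>For illustration only (this sketch is not part of our submission), the signals named above can be computed from hypothetical per-token probabilities assigned by an LM; low perplexity or high average log-probability suggests text the model finds predictable, one common indicator of AI generation.</p>

```python
import math

def entropy(dist):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def zero_shot_signals(token_probs):
    """Average log-probability and perplexity from the probabilities an LM
    assigned to each observed token (hypothetical toy values here)."""
    log_probs = [math.log(p) for p in token_probs]
    avg_log_prob = sum(log_probs) / len(log_probs)
    perplexity = math.exp(-avg_log_prob)
    return avg_log_prob, perplexity
```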
      <p>To advance this area of research further, the PAN 2024 workshop introduced the ‘generative AI
authorship verification’ shared task, which focuses on determining whether a given text is human-authored
or AI-generated. In response to this challenge, we propose an architecture that leverages a pre-trained
RoBERTa-base AI-text detector [20], a Bidirectional Long Short-Term Memory (BiLSTM) attention layer
for processing token-level perplexity and word-frequency features, and a state-of-the-art EmbEddings
from bidirEctional Encoder rEpresentations (E5) model [21]. Our experiments show that our proposed
approach outperforms several state-of-the-art approaches based on established metrics.</p>
    </sec>
    <sec id="sec-2">
      <title>2. PAN Dataset</title>
      <p>The PAN dataset, released by the PAN shared task organizers, contains both human-authored and
AI-generated text. It includes a total of 15,190 samples, consisting of 1,087 human-authored texts
and 14,103 AI-generated texts produced using thirteen different LLMs, namely: (i) alpaca-7b, (ii)
bigscience-bloomz-7b1, (iii) chavinlo-alpaca-13b, (iv) gemini-pro, (v) meta-llama-llama-2-70b-chat-hf,
(vi) meta-llama-llama-2-7b-chat-hf, (vii) mistralai-mistral-7b-instruct-v0.2, (viii)
mistralai-mixtral-8x7b-instruct-v0.1, (ix) qwen-qwen1.5-72b-chat-8bit, (x) text-bison-002, (xi) vicgalle-gpt2-open-instruct-v1,
(xii) gpt-3.5-turbo-0125, and (xiii) gpt-4-turbo-preview. More details about the dataset can be found in
the PAN overview paper [22].</p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Approach</title>
      <p>Our framework consists of three major components for feature representations of the input text:</p>
      <p>(i) The RoBERTa-base OpenAI detector [20] produces document-level representations that capture the
overall meaning of the content;</p>
      <p>(ii) Token-level features [23] are extracted from various GPT-2 variants (DistilGPT2, GPT-2, GPT-2
Medium, and GPT-2 Large) to analyze both the predictability of the word sequence and word frequency.
The token-level features include: the log-probability of the observed token, the log-probability of the most
likely token, the entropy of the token probability distribution at a given position, and word frequency. A
BiLSTM with an attention layer processes these token-level features to create a combined document-level
representation;</p>
      <p>(iii) Document-level feature representations are also extracted using the E5 model.</p>
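      <p>As an illustration of component (ii), the sketch below derives the four token-level features at one position from a hypothetical next-token probability distribution; the values are toy inputs, not real GPT-2 outputs, and the word frequency is a placeholder rather than a real corpus statistic. In our pipeline, a BiLSTM with attention then aggregates these per-token vectors.</p>

```python
import math

def token_features(dist, observed_id, word_freq):
    """Four token-level features at one position.

    dist        -- next-token probability distribution from an LM
                   (toy values here, not real GPT-2 output)
    observed_id -- vocabulary index of the token that actually occurred
    word_freq   -- relative corpus frequency of that word (placeholder)
    """
    log_p_observed = math.log(dist[observed_id])        # log-prob of observed token
    log_p_top = math.log(max(dist))                     # log-prob of most likely token
    ent = -sum(p * math.log(p) for p in dist if p > 0)  # entropy at this position
    return [log_p_observed, log_p_top, ent, word_freq]

# One feature vector per position would be stacked into a sequence
# and fed to the BiLSTM-attention aggregator.
features = token_features([0.7, 0.2, 0.1], observed_id=1, word_freq=0.004)
```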
      <p>Finally, the three document-level representations are concatenated into a single representation. The
combined representation is then fed into a fully connected layer to generate the final probabilities.</p>
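      <p>A minimal sketch of this fusion step, with illustrative dimensions and stand-in (untrained) parameters rather than our trained model: the three document-level vectors are concatenated and passed through one fully connected layer with a logistic output.</p>

```python
import math

def fuse_and_classify(roberta_vec, bilstm_vec, e5_vec, weights, bias):
    """Concatenate the three document-level representations and apply a
    single fully connected layer with a logistic output. `weights` and
    `bias` stand in for parameters learned during training."""
    x = roberta_vec + bilstm_vec + e5_vec                  # list concatenation
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-logit))                  # P(AI-generated)
```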
      <sec id="sec-3-1">
        <title>3.1. Results</title>
        <p>In this section, we present an evaluation of our AI-generated text detection experiments. In our
experiments, 20% of the training data was used for validation. For test run submissions, the validation
set was merged back into the training set. We report results using well-established metrics [22], and
compare our model’s performance with state-of-the-art models, as shown in Table 1, which reports
ROC-AUC, Brier, and C@1 scores for our approach and the baselines Binoculars [17], Fast-DetectGPT
(Mistral), and Fast-DetectGPT [16]. After predicting the label for each input, we produced the final
scores of each text pair as recommended by the organizers [22]. The results indicate that our method
surpasses the state-of-the-art models, achieving a modest improvement of around 1% over the mean
score of the Binoculars model.</p>
      </sec>
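      <p>For reference, two of the reported metrics can be sketched as follows; these are illustrative implementations, not the official shared-task evaluator (which also complements the Brier score so that higher is better).</p>

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 labels
    (lower is better; the shared task reports 1 - Brier)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def c_at_1(y_true, y_pred):
    """C@1: an accuracy variant that rewards leaving hard cases unanswered.
    Entries of y_pred are 0, 1, or None for a non-answer (e.g. a 0.5 score)."""
    n = len(y_true)
    n_correct = sum(1 for y, p in zip(y_true, y_pred) if p is not None and p == y)
    n_unanswered = sum(1 for p in y_pred if p is None)
    return (n_correct + n_unanswered * n_correct / n) / n
```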
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>In this paper, we described our submission to the PAN shared task for detecting generative AI
content. Our experiments demonstrated that our generative AI text detection approach performs well
compared to other state-of-the-art approaches in this domain. For future work, we aim to enhance the
generalizability of our model by testing it on diverse datasets to evaluate its robustness in real-world
applications.</p>
      <p>[14] J. Su, T. Y. Zhuo, D. Wang, P. Nakov, Detectllm: Leveraging log rank information for zero-shot
detection of machine-generated text, arXiv preprint arXiv:2306.05540 (2023).
[15] E. Mitchell, Y. Lee, A. Khazatsky, C. D. Manning, C. Finn, Detectgpt: Zero-shot machine-generated
text detection using probability curvature, arXiv preprint arXiv:2301.11305 (2023).
[16] G. Bao, Y. Zhao, Z. Teng, L. Yang, Y. Zhang, Fast-detectgpt: Efficient zero-shot detection of
machine-generated text via conditional probability curvature, arXiv preprint arXiv:2310.05130 (2023).
[17] A. Hans, A. Schwarzschild, V. Cherepanova, H. Kazemi, A. Saha, M. Goldblum, J. Geiping,
T. Goldstein, Spotting llms with binoculars: Zero-shot detection of machine-generated text, arXiv
preprint arXiv:2401.12070 (2024).
[18] Y.-F. Zhang, Z. Zhang, L. Wang, R. Jin, Assaying on the robustness of zero-shot machine-generated
text detectors, arXiv preprint arXiv:2312.12918 (2023).
[19] X. Yang, L. Pan, X. Zhao, H. Chen, L. Petzold, W. Y. Wang, W. Cheng, A survey on detection of
llms-generated content, arXiv preprint arXiv:2310.15654 (2023).
[20] I. Solaiman, M. Brundage, J. Clark, A. Askell, A. Herbert-Voss, J. Wu, A. Radford, G. Krueger,
J. W. Kim, S. Kreps, et al., Release strategies and the social impacts of language models, arXiv preprint
arXiv:1908.09203 (2019).
[21] L. Wang, N. Yang, X. Huang, B. Jiao, L. Yang, D. Jiang, R. Majumder, F. Wei, Text embeddings by
weakly-supervised contrastive pre-training, arXiv preprint arXiv:2212.03533 (2022).
[22] J. Bevendorff, M. Wiegmann, J. Karlgren, L. Dürlich, E. Gogoulou, A. Talman, E. Stamatatos,
M. Potthast, B. Stein, Overview of the “Voight-Kampff” Generative AI Authorship Verification
Task at PAN and ELOQUENT 2024, in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S. de Herrera
(Eds.), Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR
Workshop Proceedings, CEUR-WS.org, 2024.
[23] P. Przybyła, N. Duran-Silva, S. Egea-Gómez, I’ve seen things you machines wouldn’t believe:
Measuring content predictability to identify automatically-generated text, in: Proceedings of the
Iberian Languages Evaluation Forum (IberLEF 2023), CEUR Workshop Proceedings, CEUR-WS,
Jaén, Spain, 2023.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] OpenAI, GPT-4
          <source>technical report</source>
          ,
          <year>2023</year>
          . arXiv:2303.08774.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ge</surname>
          </string-name>
          , S. Liu,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <article-title>Domain adaptive code completion via language models and decoupled domain databases</article-title>
          ,
          <source>arXiv preprint arXiv:2308.09313</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T. H.</given-names>
            <surname>Kung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cheatham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Medenilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sillos</surname>
          </string-name>
          , L. De Leon,
          <string-name>
            <given-names>C.</given-names>
            <surname>Elepaño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Madriaga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Aggabao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Diaz-Candido</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Maningo</surname>
          </string-name>
          , et al.,
          <article-title>Performance of chatgpt on usmle: Potential for ai-assisted medical education using large language models</article-title>
          ,
          <source>PLoS digital health 2</source>
          (
          <year>2023</year>
          )
e0000198
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Uchendu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ma</surname>
          </string-name>
          , T. Le,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Turingbench: A benchmark environment for turing test in the age of neural text generation</article-title>
          ,
          <source>arXiv preprint arXiv:2109.13296</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <article-title>Deepfake bot submissions to federal public comment websites cannot be distinguished from human submissions</article-title>
          ,
          <source>Technology Science</source>
          <volume>2019121801</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Weidinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mellor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rauh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Griffin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uesato</surname>
          </string-name>
          , P.-S. Huang, M. Cheng, M. Glaese,
          <string-name>
            <given-names>B.</given-names>
            <surname>Balle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kasirzadeh</surname>
          </string-name>
          , et al.,
          <article-title>Ethical and social risks of harm from language models</article-title>
          ,
          <source>arXiv preprint arXiv:2112.04359</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. S.</given-names>
            <surname>Chao</surname>
          </string-name>
          ,
          <article-title>A survey on llm-generated text detection: Necessity, methods, and future directions</article-title>
          ,
          <source>arXiv preprint arXiv:2310.14724</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mansurov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ivanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shelmanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tsvigun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. M.</given-names>
            <surname>Afzal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mahmoud</surname>
          </string-name>
          , G. Puccetti,
          <string-name>
            <given-names>T.</given-names>
            <surname>Arnold</surname>
          </string-name>
          , et al.,
          <article-title>M4gt-bench: Evaluation benchmark for black-box machine-generated text detection</article-title>
          ,
          <source>arXiv preprint arXiv:2402.11175</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>V.</given-names>
            <surname>Verma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Fleisig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tomlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Klein</surname>
          </string-name>
          ,
          <article-title>Ghostbuster: Detecting text ghostwritten by large language models</article-title>
          ,
          <source>arXiv preprint arXiv:2305.15047</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Abburi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Suesserman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Pudota</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Veeramani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Bowen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <article-title>Generative ai text classification using ensemble llm approaches</article-title>
          , in: IberLEF@ SEPLN,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gehrmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Strobelt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Rush</surname>
          </string-name>
          ,
          <article-title>GLTR: Statistical detection and visualization of generated text</article-title>
          ,
          <source>arXiv preprint arXiv:1906.04043</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <article-title>Coco: Coherence-enhanced machine-generated text detection under data limitation with contrastive learning</article-title>
          ,
          <source>arXiv preprint arXiv:2212.10341</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>H.</given-names>
            <surname>Abburi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Suesserman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Pudota</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Veeramani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Bowen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <article-title>A simple yet efficient ensemble approach for AI-generated text detection</article-title>
          ,
          <source>in: Proceedings of the Third Workshop on Natural Language Generation, Evaluation, and Metrics (GEM)</source>
          , Association for Computational Linguistics, Singapore,
          <year>2023</year>
          , pp.
          <fpage>413</fpage>
          -
          <lpage>421</lpage>
          . URL: https://aclanthology.org/2023.gem-1.32.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>