<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>RoBERTa and Bi-LSTM for Human vs. AI-Generated Text Detection: Notebook for PAN at CLEF 2024</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Panagiotis Petropoulos</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vasilis Petropoulos</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>We are living in a new era of rapidly evolving AI, in which increasingly powerful versions of existing models, and entirely new models, are constantly being released. Society and industry are investing heavily in these advancements. In many real-life situations these models are used by humans either to assist themselves or to deceive others: students and scholars who no longer delve deeply into a subject, and the production of fake news, are two frequent examples. Hence, there is a need for a classifier capable of detecting and distinguishing AI-generated text from human-authored text. Several strong approaches already exist, but they must continue to evolve as LLMs evolve. This year's shared task of PAN at CLEF [1][2] sheds light on the aforementioned need. In this work, an architecture combining RoBERTa [3] with a Bi-LSTM on top is proposed to solve the task.</p>
      </abstract>
      <kwd-group>
        <kwd>RoBERTa</kwd>
        <kwd>Bi-LSTM</kwd>
        <kwd>NLP</kwd>
        <kwd>AI-Generated Text Detection</kwd>
        <kwd>Authorship Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In the new era of LLMs, several approaches address the problem of AI-generated text detection. Some of them use LLMs themselves to distinguish AI-generated text from human-written text. The recent DetectGPT [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] approach uses a pretrained T5 encoder-decoder to produce alterations, or variations (perturbations), of a given input and then compares the log probability of the produced samples with that of the original sample, in order to determine whether the original text is AI-generated. Another prior work, based on the RoBERTa transformer model, approaches the problem by modeling the task as a partial Positive-Unlabeled (PU) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] problem and formulating a Multiscale Positive-Unlabeled (MPU) training framework, in order to overcome wrong predictions when the input is a short text [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In addition, other methodologies propose a zero-shot setting for human vs. AI text detection, computing the log perplexity of a given text with an “observer” LLM and its cross-perplexity with a “performer” LLM [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The performer tries to predict the next token of a sequence, and the observer evaluates the prediction; the ratio of these two metrics is called the Binoculars score. Fast-DetectGPT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is also an efficient method for detecting machine-generated texts: it calculates the conditional probability curvature between text passages and language models in a zero-shot setting.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Experimental setup</title>
      <p>
        To solve the task, a RoBERTa [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] base model is used as a backbone. As in the previous year's AV task [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a model architecture that stacks a BERT-like transformer with a Bi-LSTM on top [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] made the solution of the task feasible. On top of the RoBERTa model, a Bi-LSTM is used to capture information from both directions of the RoBERTa output embeddings. Only the last 4 encoder layers of the RoBERTa model were unfrozen during a 3-epoch training on a GPU with 12 GB of VRAM. The input embedding sequence for the Bi-LSTM is computed as the sum of the outputs of the last 4 encoder layers of RoBERTa. After the Bi-LSTM, a dropout layer with probability 0.3 of zeroing an element (neural network unit) is used, followed by a fully connected layer, as classification head, to classify the input text as human- or machine-generated. The dropout and fully connected layers take as input the concatenation of the last hidden states of both directions of the Bi-LSTM output. The learning rate was 5e-5 with the AdamW optimizer, and the loss function was categorical cross-entropy. The batch size was 32. The architecture of the proposed model is illustrated in Figure 1.
      </p>
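      <p>The stacked architecture described above can be sketched as follows. This is a minimal PyTorch sketch of the classification head only: the RoBERTa backbone is omitted (the module takes the tuple of per-layer hidden states as returned by transformers with output_hidden_states=True), and the Bi-LSTM hidden size of 256 and the two output classes are illustrative assumptions, not values stated in the text.</p>

```python
import torch
import torch.nn as nn

class RobertaBiLSTMClassifier(nn.Module):
    """Head described above: sum of the last 4 RoBERTa encoder layers
    -> Bi-LSTM -> dropout(0.3) -> fully connected classification layer."""

    def __init__(self, hidden_size=768, lstm_hidden=256, num_classes=2):
        super().__init__()
        self.bilstm = nn.LSTM(hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(p=0.3)
        # Input: concatenation of the last hidden state of both directions.
        self.fc = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, hidden_states):
        # Sum the outputs of the last 4 encoder layers.
        summed = torch.stack(hidden_states[-4:], dim=0).sum(dim=0)
        _, (h_n, _) = self.bilstm(summed)
        # h_n: (num_directions, batch, lstm_hidden) -> concat both directions.
        feats = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.fc(self.dropout(feats))

# Smoke test with random "hidden states" (13 layers, batch=2, seq_len=8).
fake_layers = tuple(torch.randn(2, 8, 768) for _ in range(13))
logits = RobertaBiLSTMClassifier()(fake_layers)
print(logits.shape)  # torch.Size([2, 2])
```

In training, the optimizer would be AdamW with learning rate 5e-5 and cross-entropy loss, as stated above.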
    </sec>
    <sec id="sec-3">
      <title>2.1. Data preparation and processing</title>
      <p>
        The Voight-Kampff Generative AI Authorship Verification 2024 PAN Challenge [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] provided a train dataset that consists of multiple JSONL files, requested via Zenodo (https://zenodo.org/records/10718757). One file contains news articles written by humans, and 13 JSONL files contain texts generated by known LLMs such as GPT-4, Gemini Pro, Llama, or Mistral. The human-written and machine-generated texts in the train dataset cover the same topics. In order to obtain a ready-to-use dataset and train the proposed model, 4 basic steps are followed:
1. Parse texts from humans and machines.
2. Shuffle them, keeping the original sample id.
3. Perform a basic processing procedure, which includes only the replacement of digits (numbers) with the constant value '1', keeping the original format.
4. Chunk the texts due to the input sequence length limitation of the RoBERTa base model (max seq. length 512 tokens).
      </p>
      <p>Because the RoBERTa base model limits the length of its input in tokens, a chunking procedure is applied with a maximum sequence length of 512 (including special tokens). Between chunks of the same text, an overlapping approach is applied to avoid extremely long padding sequences.</p>
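      <p>Steps 3 and 4 above can be sketched as follows. This is a minimal Python sketch; the overlap size and the number of token positions reserved for special tokens are illustrative assumptions, since the text does not state the exact overlap.</p>

```python
def preprocess(text):
    """Step 3: replace every digit with the constant '1',
    keeping the original format of the text."""
    return "".join("1" if ch.isdigit() else ch for ch in text)

def chunk_tokens(token_ids, max_len=512, overlap=128, n_special=2):
    """Step 4: split a tokenized text into chunks that fit RoBERTa's
    512-token limit (including special tokens), with overlapping
    windows between consecutive chunks of the same text.
    `overlap` and `n_special` are illustrative values."""
    body = max_len - n_special          # room left after <s> and </s>
    stride = body - overlap             # step between chunk starts
    chunks = []
    for start in range(0, len(token_ids), stride):
        chunks.append(token_ids[start:start + body])
        if start + body >= len(token_ids):
            break
    return chunks

print(preprocess("GPT-4 in 2024"))  # GPT-1 in 1111
```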
    </sec>
    <sec id="sec-4">
      <title>2.1.1. Evaluation</title>
      <p>For the experiments, 2 basic approaches are followed, after a random shuffling of human- and machine-generated texts. The LLMs are treated as “authors”:</p>
      <p>• open-set setup: the validation and test datasets do not contain the same texts and LLMs as the train dataset. The same separation is also applied between the validation set and the test set.
• close-set setup: the train, validation and test datasets contain the same machines but different texts.</p>
      <p>The train-validation-test splitting ratio was 70% of the chunks for the train set, 20% for the validation set and 10% for the holdout test set. To determine whether a text is AI-generated, the proposed approach uses all the chunks of an input text of the test set, averaging the posterior probabilities over the chunks. The decision is the class with the maximum average probability of the binary output. As for the evaluation metrics, we use the metrics provided in [12]:
• AUC: the conventional area-under-the-curve of the precision-recall curve
• F1-score: the harmonic mean of precision and recall [13]
• c@1: a variant of the conventional F1-score, which rewards systems that leave difficult problems unanswered (i.e. scores of exactly 0.5) [14]
• overall: the simple average of all previous metrics</p>
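      <p>The chunk-level decision rule described above can be sketched as follows; the class names and the two-class ordering are illustrative.</p>

```python
def classify_text(chunk_probs):
    """Average the posterior probabilities over all chunks of a text,
    then pick the class with the maximum average probability.
    `chunk_probs` is a list of [p_human, p_machine] rows, one per chunk."""
    n = len(chunk_probs)
    avg = [sum(row[c] for row in chunk_probs) / n for c in range(2)]
    return ("human", "machine")[avg.index(max(avg))], avg

# A text split into 3 chunks: two chunks lean human, one leans machine.
label, avg = classify_text([[0.9, 0.1], [0.4, 0.6], [0.8, 0.2]])
print(label)  # human
```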
    </sec>
    <sec id="sec-5">
      <title>3. Results</title>
      <p>[Table 1: Evaluation scores of the proposed RoBERTa + Bi-LSTM model in the open-set and close-set setups on the TIRA platform (https://www.tira.io/) [11]; the reported scores include a Brier score of 0.909 and an overall score of 0.906.]</p>
      <p>Table 1 shows that the proposed model performs very well on the dataset provided on TIRA and makes the task feasible.</p>
      <p>From the results it can be seen that for both evaluation setups the scores are almost the same and very high. The close-set setup achieves slightly better scores because the model was trained and evaluated with samples from all LLMs, whereas in the open-set setup the model was trained on texts from LLMs different from those used for evaluation.</p>
    </sec>
    <sec id="sec-6">
      <title>4. Conclusion</title>
      <p>
        Based on the results, it is observed that the proposed architecture and methodology are able to distinguish human-written from AI-generated texts with high accuracy, achieving above 90% in accuracy and F1-score. The high AUC and c@1 scores indicate that the model can reliably distinguish between human- and AI-generated texts. In the future, it would be preferable to train a Siamese model architecture within a contrastive learning framework, either with a simple contrastive loss [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] or with a triplet loss with online and fast hard-negative mining [15], in order to train a model to produce different embedding vectors for human and LLM texts in a vector space. This could help improve the model’s ability to generalize to AI-generated texts from unseen and future text generators. Also, additional features, such as POS tags, can be combined with contextualized word embeddings, producing vectors that can be treated as features of a classifier. Overall, the proposed approach shows promising results on this task, but continual research is needed to keep up with the rapid advances in large language models.
      </p>
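      <p>As a sketch of the future-work direction above, a triplet loss with online hard-negative mining can be written as follows. This is a minimal PyTorch sketch under assumed embedding shapes and an illustrative margin, not the authors' implementation.</p>

```python
import torch
import torch.nn.functional as F

def triplet_loss_hard_negatives(anchors, positives, candidates, margin=0.5):
    """For each anchor embedding, mine the hardest negative online (the
    candidate closest to the anchor), then push the anchor nearer to its
    positive than to that negative by `margin` (value is illustrative)."""
    # Pairwise distances between each anchor and all negative candidates.
    d_neg = torch.cdist(anchors, candidates)          # (B, N)
    hard_neg = d_neg.min(dim=1).values                # hardest negative per anchor
    d_pos = F.pairwise_distance(anchors, positives)   # (B,)
    # Standard triplet hinge, averaged over the batch.
    return F.relu(d_pos - hard_neg + margin).mean()
```

Here a human-text embedding could serve as the anchor, another human-text embedding as the positive, and LLM-text embeddings as negative candidates, pulling the two classes apart in the vector space.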
    </sec>
    <sec id="sec-7">
      <title>5. References</title>
      <p>[11] Fröbe, Maik, et al. "Continuous integration for reproducible shared tasks with TIRA.io." European Conference on Information Retrieval. Cham: Springer Nature Switzerland, 2023.</p>
      <p>[12] Kestemont, Mike, et al. "Overview of the cross-domain authorship verification task at PAN 2020." Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum, 22-25 September, Thessaloniki, Greece. 2020.</p>
      <p>[13] Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." The Journal of Machine Learning Research 12 (2011): 2825-2830.</p>
      <p>[14] Peñas, A., and A. Rodrigo. "A simple measure to assess non-response." 2011.</p>
      <p>[15] Gajić, Bojana, Ariel Amato, and Carlo Gatta. "Fast hard negative mining for deep metric learning." Pattern Recognition 112 (2021): 107795.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          , E. Zangerle,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</article-title>
          , in: L.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Mulhem</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Quénot</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Schwab</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. M. D. Nunzio</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. G. S. de Herrera</surname>
          </string-name>
          , G. Faggioli, N. Ferro(Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF</source>
          <year>2024</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dürlich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gogoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Talman</surname>
          </string-name>
          , E. Stamatatos,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the “Voight-Kampff” Generative AI Authorship Verification Task at PAN and ELOQUENT 2024</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          , A. G. S. de Herrera (Eds.),Working Notes of CLEF 2024 -
          <article-title>Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, CEUR-WS</article-title>
          .org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <surname>Yinhan</surname>
          </string-name>
          , et al.
          <article-title>"Roberta: A robustly optimized bert pretraining approach." arXiv preprint arXiv:</article-title>
          <year>1907</year>
          .
          <volume>11692</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] Mitchell,
          <string-name>
            <surname>Eric</surname>
          </string-name>
          , et al.
          <article-title>"Detectgpt: Zero-shot machine-generated text detection using probability curvature."</article-title>
          <source>International Conference on Machine Learning. PMLR</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Bekker</surname>
            , Jessa, and
            <given-names>Jesse</given-names>
          </string-name>
          <string-name>
            <surname>Davis</surname>
          </string-name>
          .
          <article-title>"Learning from positive and unlabeled data: A survey."</article-title>
          <source>Machine Learning 109.4</source>
          (
          <year>2020</year>
          ):
          <fpage>719</fpage>
          -
          <lpage>760</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <surname>Yuchuan</surname>
          </string-name>
          , et al.
          <article-title>"Multiscale positive-unlabeled detection of ai-generated texts</article-title>
          .
          <source>" arXiv preprint arXiv:2305.18149</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Hans</surname>
          </string-name>
          ,
          <string-name>
            <surname>Abhimanyu</surname>
          </string-name>
          , et al.
          <article-title>"Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text."</article-title>
          <source>arXiv preprint arXiv:2401.12070</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <surname>Guangsheng</surname>
          </string-name>
          , et al.
          <article-title>"Fast-detectgpt: Efficient zero-shot detection of machine-generated text via conditional probability curvature</article-title>
          .
          <source>" arXiv preprint arXiv:2310.05130</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Efstathios</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          , Krzysztof Kredens, Piotr Pezik, Annina Heini, Janek Bevendorff,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Potthast</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Benno</given-names>
            <surname>Stein</surname>
          </string-name>
          .
          <article-title>Overview of the Authorship Verification Task at PAN 2023</article-title>
          .
          <article-title>CLEF 2023 Labs and Workshops</article-title>
          , Notebook Papers,
          <year>September 2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Petropoulos</surname>
            ,
            <given-names>Panagiotis.</given-names>
          </string-name>
          <article-title>"Contrastive learning for authorship verification using BERT and bi-LSTM in a Siamese architecture</article-title>
          .
          <source>" Working Notes of CLEF</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>