AI Authorship Verification Based on the Deberta Model
Notebook for the PAN Lab at CLEF 2024

Ye Zhu, Leilei Kong†
Foshan University, Foshan, Guangdong, China

CLEF 2024: Conference and Labs of the Evaluation Forum, September 09–12, 2024, Grenoble, France
† Corresponding author
Email: kwojmjmqa1744@gmail.com (Y. Zhu); kongleilei@fosu.edu.cn (L. Kong)
ORCID: 0009-0001-2658-9445 (Y. Zhu); 0000-0002-4636-3507 (L. Kong)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract
Generative AI Authorship Verification is the task of distinguishing between human-authored and machine-generated texts. This paper explores the application of the pre-trained language model Deberta to this problem. Our approach fine-tunes the Deberta model on a curated dataset comprising both human-authored and machine-generated texts. To manage the imbalance in this dataset, we employ random sampling to ensure a balanced representation of both types of text during training. Preliminary experiments show that while our method performs comparably with existing approaches, there is significant potential for further optimization and improvement in identifying human-authored texts. Future work will explore advanced techniques and larger datasets to enhance the model.

Keywords
Authorship Verification, Machine-generated Texts, Deberta

1. Introduction

With Large Language Models (LLMs) improving at breakneck speed and seeing ever more widespread adoption, it is becoming increasingly hard to discern whether a given text was authored by a human or by a machine. These models, such as GPT-3 [1], GPT-4 [2], and others, generate text that is often indistinguishable from human writing, posing significant challenges for applications including academic integrity, content verification, and online misinformation.

Many classification approaches have been devised to help humans distinguish between human- and machine-authored text. Traditional methods rely on surface-level features such as word frequency, syntactic patterns, and stylistic elements, but these features can be easily mimicked by advanced LLMs [3, 4]. The task of authorship verification in the context of human vs. machine text therefore remains a critical and challenging problem.

Recently, PAN 2024 [5] posed the following task: given two texts, one written by a human and the other by a machine, identify the human-authored text [6]. We approach this as a binary classification task, which simplifies the challenge and allows us to focus on identifying the most distinctive features of human-authored texts. To address the task, we use the Deberta [7] model as the pre-trained model, an improved variant of Bert [8] known for its effective text feature encoding; Deberta's architecture enhances the attention mechanism, making it more adept at capturing intricate patterns in text. However, the data available in practice is not always as clean or plentiful as one would like. In situations where datasets are limited and imbalanced, we therefore adopt a random sampling method that trains the model efficiently and economically by selectively sampling portions of the text data: for each training epoch, a subset of the machine-generated samples is drawn at random, while all human-authored samples are included. By retaining the most relevant features and reducing the computational load, this approach streamlines the training process. Previous studies have shown that suitable sampling techniques can significantly improve model performance in imbalanced data scenarios [9, 10, 11].
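To make this balancing scheme concrete, the following is a minimal sketch of the per-epoch sampling described above and detailed in Section 2. The function name, label encoding, and list-based data representation are illustrative assumptions, not our exact preprocessing code.

```python
import random

def build_epoch_data(human_texts, machine_texts, n_machine, seed=42):
    """Build one epoch's training pool: all human-authored samples plus a
    random subset of the machine-generated samples (e.g. 1200 in epoch 1,
    3000 in epoch 2)."""
    rng = random.Random(seed)
    sampled_machine = rng.sample(machine_texts, k=min(n_machine, len(machine_texts)))
    # Label 0 = human-authored, label 1 = machine-generated (labeling is illustrative).
    data = [(t, 0) for t in human_texts] + [(t, 1) for t in sampled_machine]
    rng.shuffle(data)
    return data

# Example: a different machine-text sample size per epoch, all human texts kept.
# epoch1 = build_epoch_data(human_texts, machine_texts, n_machine=1200, seed=1)
# epoch2 = build_epoch_data(human_texts, machine_texts, n_machine=3000, seed=2)
```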
2. Data Analysis

PAN 2024 provides a guided dataset covering both real and fake news articles drawn from multiple 2021 US news headlines. Each file contains a list of articles, written either by (any number of) human authors or by a single machine; the machine texts are generated by large language models such as Gemini Pro [12].

The dataset comprises human-authored and machine-generated text, with a significant imbalance between the two categories (a ratio of about 1:13). To address this challenge, we adopted a random sampling approach during model training. Due to the limited data availability and the need to balance computational resources, we trained the classification model for two epochs. In the first epoch, we randomly selected 1200 samples from the combined machine-generated texts to ensure representation from different sources and topics. For the second epoch, we increased the sample size to 3000 to further enrich the training data. All human-authored samples were included in both epochs to maintain a balanced representation of human and machine texts. In each epoch, we then combined the two sets of data into a format in which each label corresponds to one type of text, so that the two types of texts are classified separately. We split the data into training, validation, and test sets in a ratio of 0.95, 0.05, and 0.05, respectively. This partitioning strategy ensured that the model was trained on a diverse range of samples while maintaining sufficient data for evaluation and testing. Finally, we used the Deberta-large model architecture for AI authorship verification on the combined dataset.

3. Experiments and Results

3.1. Experiment setup

We used the Deberta-large model, which is characterized by a vocabulary size of 50,000, a hidden size of 1024, 24 layers, and a total of 3.03 billion parameters. This model was selected for its disentangled attention mechanism and enhanced masked decoder. The classification model was built using PyTorch, with training conducted using a batch size of 2. We did not set a maximum encoder length, fully leveraging the model's capacity to handle long texts. The AdamW [13] optimizer, with a learning rate of 1e-6, was employed to update the model weights, while cross-entropy [14] was used as the loss function to measure prediction error. The network was trained over 2 epochs to ensure thorough learning without overfitting.

In this study, we used only the [CLS] token for classification, which is standard practice for BERT and its derivative models. The [CLS] token is positioned at the start of the sequence and aggregates information from the entire input, which is crucial for classification tasks. We employed the [CLS] token with the default settings from the model's pre-training, without any modifications to its functionality.

All experiments were conducted on an NVIDIA A800 GPU with 80 GB of memory, providing the computational power needed to handle the large model and extensive dataset. Additionally, data augmentation techniques such as random sampling were applied to increase the diversity of the training data, thereby improving the model's generalization ability. Performance metrics, including accuracy, precision, recall, and F1-score, were used to evaluate the model's effectiveness on the test dataset.
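A minimal sketch of this fine-tuning setup is shown below, using the Hugging Face transformers sequence-classification head for Deberta (which pools the [CLS] representation by default). The checkpoint name, collate function, and loop structure are illustrative assumptions rather than our exact training script.

```python
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "microsoft/deberta-large"  # assumed checkpoint identifier
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# The sequence-classification head pools the [CLS] token representation by default.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2).to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6)
loss_fn = torch.nn.CrossEntropyLoss()

def collate(batch):
    texts, labels = zip(*batch)
    # No maximum encoder length is imposed, mirroring the setup described above.
    enc = tokenizer(list(texts), padding=True, truncation=False, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

def train_one_epoch(epoch_data):
    """epoch_data: list of (text, label) pairs built as sketched in Section 2."""
    loader = DataLoader(epoch_data, batch_size=2, shuffle=True, collate_fn=collate)
    model.train()
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        labels = batch.pop("labels")
        logits = model(**batch).logits
        loss = loss_fn(logits, labels)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```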
3.2. Results

To process each text pair, we compare the confidence scores of Text 1 and Text 2. If Text 1's confidence score is higher, the final confidence score is 1 minus Text 1's confidence score; if Text 2's confidence score is higher, the final confidence score is Text 2's confidence score.

Table 1 summarizes the performance on the validation and test sets in this experiment, highlighting high accuracy, precision, recall, and F1 scores. Table 2 shows the summarized results averaged (arithmetic mean) over 10 variants of the test dataset; each variant uses a different technique to test the robustness of authorship verification approaches, such as switching the text encoding, translating the text, changing the domain, and manual obfuscation. Table 3 shows these results, initially pre-filled with the official baselines provided by the PAN organizers and with summary statistics of all submissions to the task (i.e., the maximum, median, minimum, and the 95-th, 75-th, and 25-th percentiles over all submissions). The evaluations for Table 2 and Table 3 were conducted on the PAN 2024 Generative AI Authorship Verification task training dataset using the TIRA [15] platform. Our method, referred to as "beige-limit" in the tables, is compared against various baselines.

Table 1
Performance of the Deberta model on the validation and test sets, highlighting its robust accuracy, precision, recall, and F1 score across both datasets.

Metric      Validation Set   Test Set
Accuracy    0.9829           0.9853
Precision   0.9853           0.9898
Recall      0.9692           0.9752
F1          0.9769           0.9821

Table 2
Overview of the accuracy in detecting if a text is written by a human on the PAN 2024 "Voight-Kampff" Generative AI Authorship Verification task. We report ROC-AUC, Brier, C@1, F1, F0.5u, and their mean.

Approach                            ROC-AUC   Brier   C@1     F1      F0.5u   Mean
beige-limit                         0.627     0.660   0.590   0.442   0.433   0.555
Baseline Binoculars                 0.972     0.957   0.966   0.964   0.965   0.965
Baseline Fast-DetectGPT (Mistral)   0.876     0.8     0.886   0.883   0.883   0.866
Baseline PPMd                       0.795     0.798   0.754   0.753   0.749   0.77
Baseline Unmasking                  0.697     0.774   0.691   0.658   0.666   0.697
Baseline Fast-DetectGPT             0.668     0.776   0.695   0.69    0.691   0.704
95-th quantile                      0.994     0.987   0.989   0.989   0.989   0.990
75-th quantile                      0.969     0.925   0.950   0.933   0.939   0.941
Median                              0.909     0.890   0.887   0.871   0.867   0.889
25-th quantile                      0.701     0.768   0.683   0.657   0.670   0.689
Min                                 0.131     0.265   0.005   0.006   0.007   0.224

Table 3
Overview of the mean accuracy over 9 variants of the test set. We report the minimum, the 25-th quantile, the median, the 75-th quantile, and the maximum of the mean per the 9 datasets.

Approach                            Minimum   25-th Quantile   Median   75-th Quantile   Max
beige-limit                         0.307     0.759            0.845    0.864            0.896
Baseline Binoculars                 0.342     0.818            0.844    0.965            0.996
Baseline Fast-DetectGPT (Mistral)   0.095     0.793            0.842    0.931            0.958
Baseline PPMd                       0.270     0.546            0.750    0.770            0.863
Baseline Unmasking                  0.250     0.662            0.696    0.697            0.762
Baseline Fast-DetectGPT             0.159     0.579            0.704    0.719            0.982
95-th quantile                      0.863     0.971            0.978    0.990            1.000
75-th quantile                      0.758     0.865            0.933    0.959            0.991
Median                              0.605     0.645            0.875    0.889            0.936
25-th quantile                      0.353     0.496            0.658    0.675            0.711
Min                                 0.015     0.038            0.231    0.244            0.252
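To make the pairing rule from the beginning of this subsection explicit, the following minimal sketch combines the two per-text confidences into a single score for a pair. Reading the classifier confidence as the probability that a text is machine-generated is our assumption here, as are the names used.

```python
def pair_score(conf_text1: float, conf_text2: float) -> float:
    """Combine per-text classifier confidences for a (Text 1, Text 2) pair.

    conf_text1 / conf_text2 are assumed to be the classifier's confidence that
    the respective text is machine-generated. The returned value follows the
    rule stated in Section 3.2 and can be read as the confidence that Text 1
    is the human-authored text of the pair.
    """
    if conf_text1 > conf_text2:
        # Text 1 looks more machine-like: low confidence that it is human.
        return 1.0 - conf_text1
    # Text 2 looks more machine-like: Text 1 is likely the human text.
    return conf_text2

# Example: pair_score(0.10, 0.92) == 0.92  (Text 1 very likely human)
```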
4. Conclusion

This paper addresses Generative AI authorship verification using the Deberta model, with the goal of distinguishing between human- and machine-authored texts. By employing a pre-trained Deberta-large model and random sampling to manage data imbalance, we conducted a series of experiments to evaluate the model's performance. We ranked 23rd in this task with this method. Our study indicates that advanced pre-trained language models like Deberta have potential for authorship verification tasks. Future research could explore more efficient training strategies and extend this approach to other domains and languages.

Acknowledgments

This research was supported by the Natural Science Platforms and Projects of Guangdong Province Ordinary Universities (Key Field Special Projects) (No. 2023ZDZX1023).

References

[1] L. Floridi, M. Chiriatti, GPT-3: Its nature, scope, limits, and consequences, Minds and Machines 30 (2020) 681–694.
[2] J. Achiam, S. Adler, S. Agarwal, et al., GPT-4 technical report, arXiv preprint arXiv:2303.08774 (2023).
[3] D. Ippolito, D. Duckworth, C. Callison-Burch, et al., Automatic detection of generated text is easiest when humans are fooled, arXiv preprint arXiv:1911.00650 (2019).
[4] R. Zellers, A. Holtzman, H. Rashkin, et al., Defending against neural fake news, Advances in Neural Information Processing Systems 32 (2019).
[5] A. A. Ayele, N. Babakov, J. Bevendorff, X. B. Casals, B. Chulvi, D. Dementieva, A. Elnagar, D. Freitag, M. Fröbe, D. Korenčić, M. Mayerl, D. Moskovskiy, A. Mukherjee, A. Panchenko, M. Potthast, F. Rangel, N. Rizwan, P. Rosso, F. Schneider, A. Smirnova, E. Stamatatos, E. Stakovskii, B. Stein, M. Taulé, D. Ustalov, X. Wang, M. Wiegmann, S. M. Yimam, E. Zangerle, Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification, in: L. Goeuriot, P. Mulhem, G. Quénot, D. Schwab, L. Soulier, G. M. D. Nunzio, P. Galuščáková, A. G. S. de Herrera, G. Faggioli, N. Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2024.
[6] J. Bevendorff, M. Wiegmann, J. Karlgren, L. Dürlich, E. Gogoulou, A. Talman, E. Stamatatos, M. Potthast, B. Stein, Overview of the "Voight-Kampff" Generative AI Authorship Verification Task at PAN and ELOQUENT 2024, in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S. de Herrera (Eds.), Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, CEUR-WS.org, 2024.
[7] P. He, X. Liu, J. Gao, et al., DeBERTa: Decoding-enhanced BERT with disentangled attention, arXiv preprint arXiv:2006.03654 (2020).
[8] J. Devlin, M. W. Chang, K. Lee, et al., BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).
[9] N. V. Chawla, K. W. Bowyer, L. O. Hall, et al., SMOTE: Synthetic minority over-sampling technique, Journal of Artificial Intelligence Research 16 (2002) 321–357.
[10] N. Japkowicz, S. Stephen, The class imbalance problem: A systematic study, Intelligent Data Analysis 6 (2002) 429–449.
[11] M. Buda, A. Maki, M. A. Mazurowski, A systematic study of the class imbalance problem in convolutional neural networks, Neural Networks 106 (2018) 249–259.
[12] Gemini Team, R. Anil, S. Borgeaud, et al., Gemini: A family of highly capable multimodal models, arXiv preprint arXiv:2312.11805 (2023).
[13] I. Loshchilov, F. Hutter, Decoupled weight decay regularization, arXiv preprint arXiv:1711.05101 (2017).
[14] R. Rubinstein, The cross-entropy method for combinatorial and continuous optimization, Methodology and Computing in Applied Probability 1 (1999) 127–190.
[15] M. Fröbe, M. Wiegmann, N. Kolyada, B. Grahm, T. Elstner, F. Loebe, M. Hagen, B. Stein, M. Potthast, Continuous Integration for Reproducible Shared Tasks with TIRA.io, in: J. Kamps, L. Goeuriot, F. Crestani, M. Maistro, H. Joho, B. Davis, C. Gurrin, U. Kruschwitz, A. Caputo (Eds.), Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2023, pp. 236–241. doi:10.1007/978-3-031-28241-6_20.