A Verifying Generative Text Authorship Model With Regularized Dropout

Zijie Lin

Zhongyuan Han

hanzhongyuan@gmail.com 1

Leilei Kong

kongleilei@fosu.edu.cn 1

Miaoling Chen

Shuyi Zhang

shuyipro@foxmail.com 1

Jiangao Peng

Kaiyin Sun

sunkaiyin123@163.com 0

0 Foshan Huaying School, Foshan, China

1 Foshan University, Foshan, China

2024

Generative AI authorship verification aims to identify the text authored by a human within a given pair of texts. This paper presents our method for the PAN 2024 Generative AI Authorship Authentication Task. We framed this task as a binary classification problem for individual texts. First, we used data augmentation to balance the originally imbalanced dataset and trained the model on single texts. We additionally employed the Regularized Dropout method to further optimize model training. For a given pair of texts, the model processes each text individually at inference time. Finally, a fully connected layer is used for classification, and the text with the higher human-authorship score is selected as the answer. Our method achieved a mean score of 0.99 on the official test set.

Keywords: PAN 2024, Generative AI Authorship Verification, Data Augmentation, Regularized Dropout

1. Introduction

2. Related Work

Due to the rapid development of large language models (LLMs), their text generation capabilities have reached a level comparable to human writing [7]. Developing effective methods to verify the authorship of generated texts is crucial for mitigating the misuse of LLMs and reducing the harmful impact of their content. In recent years, numerous studies have focused on machine text detection. For instance, Hans et al. [8] proposed a method called Binoculars, which compares the scores of two related language models to determine whether a text is human-generated or machine-generated. Bao et al. [9] introduced Fast-DetectGPT, a zero-shot detection method for machine-generated text that leverages conditional probability curvature. Although these methods do not require training data and rely solely on analyzing specific textual features for detection, they may be ineffective when the characteristics distinguishing human and machine-generated texts are not prominent. Therefore, we adopted the R-Drop method to ensure consistency in the distribution of samples across different categories. The core idea of R-Drop is to regularize the consistency between the outputs of two different sub-models generated through dropout, thereby enhancing the model's generalization ability and robustness: the results of two forward passes obtained by applying dropout to the same input are constrained to remain consistent.

3. Method

This section explains how we incorporate R-Drop to optimize our model during training. We use the pre-trained language model BERT [10]. We treat the task as a binary classification problem for single text samples, so the binary cross-entropy loss serves as the foundation. On top of this, we add the R-Drop regularization term to construct the final loss function used to train the model:

$$\mathcal{L} = \left(\mathcal{L}_{BCE}(P_1, y) + \mathcal{L}_{BCE}(P_2, y)\right) + \alpha \left(\mathcal{L}_{KL}(P_1 \,\|\, P_2) + \mathcal{L}_{KL}(P_2 \,\|\, P_1)\right) \tag{1}$$

where $P_1$ and $P_2$ are the outputs of two forward passes on the same input, $y$ is the label, and $\alpha$ is a hyperparameter that controls the contribution of the KL divergence to the total loss. In this way, we account for the model's prediction accuracy while also enforcing consistency between the results of different forward passes, thereby improving the model's stability and robustness. The specific steps for creating the loss function are as follows:

First, we pass the data through the network with dropout applied to obtain two different forward propagation results, $P_1$ and $P_2$. Then, we calculate the binary cross-entropy loss $\mathcal{L}_{BCE}$ for each of these two results:

$$\mathcal{L}_{BCE}(p, y) = -\sum_{i} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right] \tag{2}$$

where p is the predicted probability distribution of the model, and y is the actual label distribution. Binary cross-entropy loss measures the inconsistency between the actual labels and the predicted distribution and is a common loss function for binary classification problems.

Next, we calculate the Kullback-Leibler (KL) divergence between the two results $P_1$ and $P_2$:

$$\mathcal{L}_{KL}(P_1 \,\|\, P_2) = \sum P_1 \log \frac{P_1}{P_2} \tag{3}$$

Finally, the KL divergence is added as a regularization term to the loss function, so the final loss is the weighted sum of the binary cross-entropy loss and the KL divergence loss. The application of R-Drop in the training process is shown in Figure 1.
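As a numerical illustration of the loss described above, the following is a minimal sketch in plain Python rather than the PyTorch code used in our experiments. It assumes that each forward pass yields a single probability of the human class (label 1), that the two cross-entropy terms and the two KL terms are summed without further normalization, and that the KL weight defaults to 4. The function names `bce`, `kl`, and `r_drop_loss` are hypothetical.

```python
import math
from typing import Sequence

EPS = 1e-12  # small constant guarding against log(0); an assumption of this sketch


def bce(p_human: float, y: int) -> float:
    """Binary cross-entropy for one sample, where p_human is P(label = 1)."""
    p = min(max(p_human, EPS), 1.0 - EPS)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def kl(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + EPS) / (qi + EPS)) for pi, qi in zip(p, q))


def r_drop_loss(p1_human: float, p2_human: float, y: int, alpha: float = 4.0) -> float:
    """Loss for one sample given two dropout-perturbed predictions of the same input."""
    p1 = [1.0 - p1_human, p1_human]  # distribution from the first forward pass
    p2 = [1.0 - p2_human, p2_human]  # distribution from the second forward pass
    cross_entropy = bce(p1_human, y) + bce(p2_human, y)
    consistency = kl(p1, p2) + kl(p2, p1)  # symmetric KL regularizer
    return cross_entropy + alpha * consistency
```

When the two passes agree, the regularizer vanishes and only the cross-entropy terms remain; the more the two predictions diverge, the larger the penalty becomes.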

We selected BERT as the baseline model, trained it on the data described in Section 4.1, and optimized it using R-Drop. During the inference phase, we first split the input text pair into two separate texts. Each text is then individually fed into the BERT model for classification. Finally, we select the text with the higher probability of being human-generated as the answer.
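The pairwise inference step can be sketched as follows. This is an illustrative outline only: `score_human` stands in for the trained classifier and is a hypothetical callable that returns the probability that a text is human-written, and breaking ties in favor of the first text is an assumption of this sketch.

```python
from typing import Callable, Tuple


def pick_human_text(
    text_a: str,
    text_b: str,
    score_human: Callable[[str], float],
) -> Tuple[int, float, float]:
    """Score both texts independently and return (chosen_index, score_a, score_b).

    chosen_index is 0 if text_a is judged more likely human-written, else 1.
    """
    score_a = score_human(text_a)  # each text is scored on its own
    score_b = score_human(text_b)
    chosen = 0 if score_a >= score_b else 1  # tie goes to the first text (assumption)
    return chosen, score_a, score_b
```

Because each text is scored independently, the same classifier can be trained on single texts and applied to pairs without any change to the model.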

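[Figure 1: Application of R-Drop during training. The diagram shows the input text passing twice through the Self-Attention and Feed-Forward layers with different dropped units; each pass produces a softmax output ($P_1$, $P_2$) and a loss against the label, and the two outputs are tied by the KL divergence $D_{KL}(P_1 \| P_2)$.]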

4. Experiment

4.1. Data Preprocessing

In this task, we utilized two datasets. The first is the guiding dataset provided by the organizers for the Generative AI Authorship Verification task, known as pan24-generative-authorship-news. The second, DAIGT-V4-TRAIN-DATASET (hereinafter DAIGT-V4), is sourced from the Kaggle platform and is available at https://www.kaggle.com/datasets/thedrcat/daigt-v4-train-dataset. The guiding dataset encompasses various genuine and fabricated news articles from American headlines in 2021. It comprises 14 JSONL files, one containing text generated by human authors and the remaining 13 containing text generated by different machine authors. DAIGT-V4 comprises a collection of CSV files containing text generated by one human author and 11 machine authors, covering topics such as mobile phones and automobiles, with 27370 texts generated by humans and 46203 by machines. The minimum, maximum, and average text lengths in both pan24-generative-authorship-news and DAIGT-V4 are presented in Table 1.

In the dataset provided by the organizers, pan24-generative-authorship-news, the ratio of human authors to machine authors is 1:13. To expand the data volume and balance this ratio, we used the DAIGT-V4 dataset to augment the original data. During preprocessing, we extracted 1000 texts generated by human authors from the pan24-generative-authorship-news dataset while retaining their respective topics. For the machine-generated texts, we randomly selected machine authors for the same topics. Subsequently, we extracted 20000 texts generated by human and machine authors from the DAIGT-V4 dataset in a 1:1 ratio. We then combined these two sets of data and divided them into training and test sets at a ratio of 9:1. A label of 1 denotes texts generated by human authors, whereas a label of 0 denotes texts generated by machine authors. All texts are truncated to the maximum input length of the model.
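The balancing and splitting procedure can be outlined as below. This is a simplified sketch, not our preprocessing script: it samples an equal number of human and machine texts from two in-memory lists, labels them 1 and 0, shuffles, and splits 9:1. The function name `balance_and_split`, the fixed seed, and the omission of topic matching and truncation are all assumptions made for brevity.

```python
import random
from typing import List, Tuple

Example = Tuple[str, int]  # (text, label), where 1 = human and 0 = machine


def balance_and_split(
    human_texts: List[str],
    machine_texts: List[str],
    per_class: int,
    train_ratio: float = 0.9,
    seed: int = 0,
) -> Tuple[List[Example], List[Example]]:
    """Sample per_class texts from each source, label them, shuffle, and split."""
    rng = random.Random(seed)  # local generator keeps the split reproducible
    humans = rng.sample(human_texts, per_class)
    machines = rng.sample(machine_texts, per_class)
    examples = [(t, 1) for t in humans] + [(t, 0) for t in machines]
    rng.shuffle(examples)
    cut = int(len(examples) * train_ratio)
    return examples[:cut], examples[cut:]
```

Sampling the same number of texts from each class gives the 1:1 ratio described above; the 9:1 split is then applied to the combined, shuffled pool.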

4.2. Experimental setting

We conducted the entire experiment using the PyTorch framework with the Adam optimizer. During training, the loss function was a weighted sum of the binary cross-entropy loss and the KL divergence, with a weight of 4 for the KL divergence. Dropout was set to 0.3, the maximum text length was 512, the batch size was 32, the learning rate was 3e-5, and the number of epochs was 10. The composition of the dataset used in the experiment is shown in Table 2.

After dividing the dataset, we fed single texts to the model for training. We evaluated our model with the same metrics as the official PAN 2024 evaluation and used their mean value as the criterion for model selection. The best model was obtained in the second epoch.

4.3. Other method

We also employed an ensemble learning approach for this task. In addition to the previously mentioned datasets, we expanded our data using the SemEval subtask A dataset [11]. We utilized three pre-trained language models: Bert-base-uncased, Roberta-base-uncased [12], and Deberta-base-uncased [13]. The training process was largely the same as the method described above. During the inference phase, we split each text pair into two separate texts and input them into the three models. Each model scores each text separately; we take the average of the models' scores as the final score for a single text and select the text with the higher score as the human-generated one.
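The score averaging used at inference time can be sketched as follows. The sketch assumes that each model is exposed as a callable returning a human-authorship score for one text; the names `ensemble_score` and `pick_human_text_ensemble` are hypothetical, and ties are resolved in favor of the first text as an assumption.

```python
from statistics import mean
from typing import Callable, Sequence

Scorer = Callable[[str], float]  # maps a text to a human-authorship score


def ensemble_score(text: str, scorers: Sequence[Scorer]) -> float:
    """Average the scores that all models assign to a single text."""
    return mean(scorer(text) for scorer in scorers)


def pick_human_text_ensemble(text_a: str, text_b: str, scorers: Sequence[Scorer]) -> int:
    """Return 0 if text_a has the higher averaged score, else 1."""
    score_a = ensemble_score(text_a, scorers)
    score_b = ensemble_score(text_b, scorers)
    return 0 if score_a >= score_b else 1  # tie goes to the first text (assumption)
```

Averaging lets a confident model outweigh a hesitant one, which differs from a majority vote over the models' individual decisions.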

4.4. Results

This subsection presents the experimental results. Our team, Team lam in Table 3, submitted two systems: the primary system described in Section 3 and the ensemble system briefly introduced in Section 4.3. Table 3 gives an overview of the accuracy of our methods and the baseline methods in detecting human-written text in PAN 2024 Task 4 (Voight-Kampff Generative AI Authorship Verification). Compared to the baseline methods, our methods demonstrate significant improvements across most metrics. For instance, the primary system achieves a ROC-AUC of 0.989, markedly higher than 0.972, the highest value attained by any baseline method. Additionally, the primary system achieves scores of 0.989 or higher in several metrics, including C@1, F1, and F0.5u, as well as in the mean value, indicating exceptionally high classification performance.

Although the ensemble system slightly lags behind the primary system, it maintains all metrics around 0.865, still surpassing most baseline methods. Notably, it performs comparably to one of the baseline methods in the F1 and F0.5u metrics (both 0.883).

5. Conclusion

To solve the generative AI authorship verification task proposed by PAN 2024, we propose two methods in this paper. One uses data augmentation and R-Drop to train the BERT model. The other uses an ensemble learning voting method for authorship verification.

Combining data augmentation with R-Drop yielded promising results. Although the overall performance of the ensemble model may be inferior to that of the former, it demonstrated superior effectiveness on certain test data subsets.

Acknowledgments

This work is supported by the Social Science Foundation of Guangdong Province, China (No. GD24CZY02).

References

[1] Achiam, Adler, Agarwal, Ahmad, Akkaya, F. L. Aleman, Almeida, Altenschmidt, Altman, Anadkat, et al., Gpt-4 technical report, arXiv preprint arXiv:2303.08774 (2023).

[2] Mitchell, Lee, Khazatsky, C. D. Manning, Finn, Detectgpt: Zero-shot machine-generated text detection using probability curvature, in: International Conference on Machine Learning, PMLR, 2023, pp. 24950-24962.

[3] A. A. Ayele, Babakov, Bevendorff, X. B. Casals, Chulvi, Dementieva, Elnagar, Freitag, Fröbe, Korenčić, Mayerl, Moskovskiy, Mukherjee, Panchenko, Potthast, Rangel, Rizwan, Rosso, Schneider, Smirnova, Stamatatos, Stakovskii, Stein, Taulé, Ustalov, Wang, Wiegmann, S. M. Yimam, E. Zangerle, Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification, in: L. Goeuriot, P. Mulhem, G. Quénot, D. Schwab, L. Soulier, G. M. D. Nunzio, P. Galuščáková, A. G. S. de Herrera, G. Faggioli, N. Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2024.

[4] Bevendorff, Wiegmann, Karlgren, Dürlich, Gogoulou, Talman, E. Stamatatos, Potthast, Stein, Overview of the "Voight-Kampff" Generative AI Authorship Verification Task at PAN and ELOQUENT 2024, in: G. Faggioli, Ferro, Galuščáková, A. G. S. de Herrera (Eds.), Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, CEUR-WS.org, 2024.

[5] Fröbe, Wiegmann, Kolyada, Grahm, Elstner, Loebe, Hagen, Stein, Potthast, Continuous Integration for Reproducible Shared Tasks with TIRA.io, in: J. Kamps, L. Goeuriot, F. Crestani, M. Maistro, H. Joho, B. Davis, C. Gurrin, U. Kruschwitz, A. Caputo (Eds.), Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2023, pp. 236-241. doi:10.1007/978-3-031-28241-6_20.

[6] Wu, Li, Wang, Meng, Qin, Chen, Zhang, T.-Y. Liu, et al., R-drop: Regularized dropout for neural networks, Advances in Neural Information Processing Systems 34 (2021) 10890-10905.

[7] Wu, Yang, Zhan, Yuan, Wong, Chao, A survey on llm-gernerated text detection: Necessity, methods, and future directions (2023).

[8] Hans, Schwarzschild, Cherepanova, Kazemi, Saha, Goldblum, Geiping, T. Goldstein, Spotting llms with binoculars: Zero-shot detection of machine-generated text, 2024. arXiv:2401.12070.

[9] Bao, Zhao, Teng, Yang, Zhang, Fast-detectgpt: Efficient zero-shot detection of machine-generated text via conditional probability curvature, in: The Twelfth International Conference on Learning Representations, 2023.

[10] Devlin, M.-W. Chang, Lee, Toutanova, Bert: Pre-training of deep bidirectional transformers for language understanding, 2019. arXiv:1810.04805.

[11] Y. Wang, J. Mansurov, P. Ivanov, J. Su, A. Shelmanov, A. Tsvigun, O. M. Afzal, T. Mahmoud, G. Puccetti, T. Arnold, et al., Semeval-2024 task 8: Multidomain, multimodel and multilingual machine-generated text detection, arXiv preprint arXiv:2404.14183 (2024).

[12] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, Roberta: A robustly optimized bert pretraining approach, 2019. arXiv:1907.11692.

[13] P. He, X. Liu, J. Gao, W. Chen, Deberta: Decoding-enhanced bert with disentangled attention, 2021. arXiv:2006.03654.