<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Conference and Labs of the Evaluation Forum</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Meta-Contrastive Learning for Generative AI Authorship Verification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jiajun Lv</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yong Han</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leilei Kong</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Foshan University</institution>
          ,
          <addr-line>Foshan</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>0</volume>
      <fpage>9</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>This paper proposes a method that combines meta-learning and contrastive learning to address the task of Generative AI Authorship Verification. Our motivation is to leverage supervised contrastive learning to enhance the model's discriminative ability by optimizing the relationships between samples. Additionally, we employ the meta-learning algorithm Reptile to improve the generalization ability on out-of-domain data. Finally, we select the model weights that achieve the best performance on the validation set. We obtained an average score of 0.949 on the test set.</p>
      </abstract>
      <kwd-group>
        <kwd>Authorship Verification</kwd>
        <kwd>Contrastive Learning</kwd>
        <kwd>Meta-learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        Since 2011, PAN has continuously organized authorship verification tasks [7].
Unlike previous editions, which focused on cross-discourse-type authorship verification, the PAN 2024 Authorship
Verification task [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] asks whether generative AI authorship verification can be solved [8]. The
task requires participants to design classification methods that distinguish between human-written and
machine-written texts.
      </p>
      <p>
        In recent work on generative AI detectors, fine-tuned language models and zero-shot methods
are predominant [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Zero-shot detectors require no additional training with supervised
signals; major methods include perplexity (PPL) [9], probability curvature [10], and likelihood ratio
ranking (LRR) [11]. Supervised fine-tuning of pre-trained language models is currently very powerful
in natural language understanding [12], and recent works [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ][12][13] further confirm that fine-tuning
pre-trained language models from the BERT family can outperform zero-shot methods in-domain.
      </p>
      <p>To further improve the detection of texts from unknown models, contrastive learning has also been
applied to LLM-generated text detection. ConDA [13] proposed a contrastive domain adaptation framework
that combines domain adaptation with contrastive representation learning, enhancing the detector’s
performance on out-of-domain data. In last year’s authorship verification task, both the first-place
team (Ibrahim et al. [14]) and the second-place team (Guo et al. [15]) adopted feature encoding
and contrastive learning. From these methods, it is evident that contrastive learning may be
key to the authorship verification task.</p>
      <p>Inspired by [13][16][17], we propose a method that combines contrastive learning and Reptile
meta-learning [18]. By learning the relative distances between samples, contrastive learning avoids mapping
texts to a single label. Unlike conventional fine-tuning, we use Reptile meta-learning to help
the model learn better feature representations and thus enhance its generalization ability.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Method</title>
      <p>The goal of our model is to learn the relative distance between samples on the same
topic written by different authors. Feeding a text x into the model yields a soft label ŷ that scores the text:
the smaller the label value, the more likely the text is judged human-authored; conversely, the larger the value,
the more likely it is judged AI-generated.</p>
      <sec id="sec-3-1">
        <title>3.1. Contrastive Learning</title>
        <p>Our method revolves around constructing a training task T, where T is a collection of
texts on the same topic written by different authors, denoted as {x0+, x1−, x2−, . . . , xn−}. In this collection,
x0+ is the only positive example, written by a human author, while x1−, . . . , xn− are negative examples,
generated by AI authors.</p>
        <p>Each text x is input to the encoder, and the [CLS] marker of the encoder’s last-layer output
is taken as the representation h of the text. We feed the vector h to the
tanh activation function and a linear layer to obtain the soft label ŷ of the input text x:</p>
        <p>h = Encoder(x) (1)</p>
        <p>ŷ = w_h^T σ(W h + b) (2)</p>
        <p>where W ∈ R^{h_dim × h_dim}, w_h ∈ R^{h_dim × 1}, h_dim is the dimension of the hidden layer of the encoder, and b is
the bias of the fully connected layer. The σ(·) is the nonlinear activation function tanh. We then compute
the MarginRankingLoss between the soft labels:</p>
        <p>L = max(0, margin − (ŷ+ − ŷ−)) (3)</p>
        <p>where ŷ+ is the soft label of the positive example, ŷ− is the soft label of a negative example, and
margin is the spacing boundary, which specifies the minimum gap between the two scores: the larger its value,
the further apart ŷ+ and ŷ− are expected to be.</p>
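        <p>A minimal NumPy sketch of the scoring head and the margin ranking loss above (dimensions and tanh follow the text; the random weights are stand-ins for the trained parameters):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
h_dim = 8
W = rng.normal(size=(h_dim, h_dim))   # W in R^{h_dim x h_dim}
w_h = rng.normal(size=(h_dim, 1))     # w_h in R^{h_dim x 1}
b = rng.normal(size=h_dim)            # bias of the fully connected layer

def soft_label(h):
    """y_hat = w_h^T tanh(W h + b): scalar score for one [CLS] vector."""
    return float(w_h.T @ np.tanh(W @ h + b))

def margin_ranking_loss(y_pos, y_neg, margin=0.5):
    """L = max(0, margin - (y_pos - y_neg))."""
    return max(0.0, margin - (y_pos - y_neg))

h_pos = rng.normal(size=h_dim)        # [CLS] vector of the human text
h_neg = rng.normal(size=h_dim)        # [CLS] vector of an AI text
loss = margin_ranking_loss(soft_label(h_pos), soft_label(h_neg))
```

<p>In training one would use the equivalent built-in loss of a deep-learning framework so that gradients flow into the encoder; the sketch only shows the arithmetic of equations (1)–(3).</p>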
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Reptile Meta-Learning</title>
        <p>We use the batched version of the algorithm. Define the slow weights as φ and first copy the model parameters φ as
fast weights, denoted θ. Using the fast weights, we sample n groups of training tasks from the training set and
train the model on them to obtain updated parameters θ̂. The difference between θ̂ and
φ is then taken as the gradient direction for updating φ, and repeating these iterations yields the updated slow weights.
During training, we adjust the parameter weights of DeBERTa and the linear classification
layer. The Reptile training procedure is given in Algorithm 1.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-1">
        <title>4.1. Dataset statistics</title>
        <p>We computed sequence-length statistics for each author’s data in the training dataset, as shown in Figure 1.</p>
        <p>Figure 1: Box plots of sequence length for each author in the training data (human, alpaca-7b, bloomz-7b1, alpaca-13b, gemini-pro, gpt-3.5-turbo, gpt-4-turbo, llama-2-7b, llama-2-70b, mistral-7b, mixtral-8x7b, qwen1.5-72b, vicgalle-gpt2, text-bison); sequence lengths range from 0 to about 1200.</p>
        <p>From the chart, it can be seen that the typical sequence length in the training dataset is around 500. Among
the authors, the sequence lengths of the alpaca-7b, chavinlo-alpaca-13b, and bigscience-bloomz-7b subsets
are significantly below the average.</p>
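        <p>The per-author length summary behind Figure 1 can be reproduced with a simple pass over the data (whitespace tokenization and the field layout here are assumptions for illustration; the paper does not specify its exact counting method):</p>

```python
from statistics import mean, median

def length_stats(dataset):
    """dataset: list of (author, text); returns per-author length summary."""
    lengths = {}
    for author, text in dataset:
        lengths.setdefault(author, []).append(len(text.split()))
    return {a: {"mean": mean(v), "median": median(v), "n": len(v)}
            for a, v in lengths.items()}

data = [("human", "one two three four five"),
        ("human", "one two three"),
        ("alpaca-7b", "short text")]
stats = length_stats(data)
```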
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Experimental setup</title>
        <p>In this study, we chose the DeBERTa-base [19] model as our pre-trained base model. We set the
hyperparameters as follows: the batch size is 16, the maximum sequence length is 512 (with
longer sequences truncated), and the margin is 0.5. The initial learning rate is
2e-5, and we train for 3 epochs, using AdamW for optimization during each training session.
During the training phase, we use the officially provided labeled dataset to train the model. To evaluate
the model’s performance across different domains, we use the HC3 dataset [20] during the validation
phase. The results of our model on the validation set are shown in Table 1.</p>
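        <p>For concreteness, the setup above can be collected into a configuration like the following (a sketch only; the key names and the model identifier are illustrative, and the truncation helper stands in for a tokenizer’s usual max-length handling):</p>

```python
config = {
    "base_model": "deberta-base",  # pre-trained encoder (DeBERTa-base)
    "batch_size": 16,
    "max_seq_len": 512,            # longer sequences are truncated
    "margin": 0.5,                 # margin of the ranking loss
    "lr": 2e-5,                    # initial learning rate for AdamW
    "epochs": 3,
}

def truncate_ids(token_ids, max_len=config["max_seq_len"]):
    """Keep at most max_len token ids, mirroring the setup above."""
    return token_ids[:max_len]

ids = list(range(600))
clipped = truncate_ids(ids)
```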
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Result</title>
        <p>We selected the model with the best performance on validation, tested it on TIRA [7], and scored all test
tasks separately. The combined results for the test dataset are presented in Table 2 and
Table 3.</p>
      </sec>
    </sec>
    <sec id="sec-conclusions">
      <title>5. Conclusions</title>
      <p>In this paper, we propose a method combining contrastive learning and meta-learning to address the
task set by PAN: Voight-Kampff Generative AI Authorship Verification. Our proposed method achieved
scores of ROC-AUC: 0.98, Brier: 0.945, c@1: 0.954, F1: 0.93, F0.5u: 0.935, and Mean: 0.949 on the leaderboard.
These results validate the effectiveness of our proposed method in the task of Generative AI Authorship
Verification.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This research was supported by the Natural Science Platforms and Projects of Guangdong Province
Ordinary Universities (Key-Field Special Projects) (No. 2023ZDZX1023).</p>
      <p>[7] M. Fröbe, M. Wiegmann, N. Kolyada, B. Grahm, T. Elstner, F. Loebe, M. Hagen, B. Stein, M. Potthast,
Continuous Integration for Reproducible Shared Tasks with TIRA.io, in: J. Kamps, L. Goeuriot,
F. Crestani, M. Maistro, H. Joho, B. Davis, C. Gurrin, U. Kruschwitz, A. Caputo (Eds.), Advances
in Information Retrieval. 45th European Conference on IR Research (ECIR 2023), Lecture Notes
in Computer Science, Springer, Berlin Heidelberg New York, 2023, pp. 236–241. doi:10.1007/978-3-031-28241-6_20.</p>
      <p>[8] A. A. Ayele, N. Babakov, J. Bevendorf, X. B. Casals, B. Chulvi, D. Dementieva, A. Elnagar, D. Freitag,
M. Fröbe, D. Korenčić, M. Mayerl, D. Moskovskiy, A. Mukherjee, A. Panchenko, M. Potthast,
F. Rangel, N. Rizwan, P. Rosso, F. Schneider, A. Smirnova, E. Stamatatos, E. Stakovskii, B. Stein,
M. Taulé, D. Ustalov, X. Wang, M. Wiegmann, S. M. Yimam, E. Zangerle, Overview of PAN 2024:
Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking
Analysis, and Generative AI Authorship Verification, in: L. Goeuriot, P. Mulhem, G. Quénot,
D. Schwab, L. Soulier, G. M. D. Nunzio, P. Galuščáková, A. G. S. de Herrera, G. Faggioli, N. Ferro
(Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of
the Fifteenth International Conference of the CLEF Association (CLEF 2024), Lecture Notes in
Computer Science, Springer, Berlin Heidelberg New York, 2024.</p>
      <p>[9] Y. Arase, M. Zhou, Machine translation detection from monolingual web-text, in: Proceedings
of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long
Papers), 2013, pp. 1597–1607.</p>
      <p>[10] E. Mitchell, Y. Lee, A. Khazatsky, C. D. Manning, C. Finn, DetectGPT: Zero-shot machine-generated
text detection using probability curvature, in: International Conference on Machine Learning,
PMLR, 2023, pp. 24950–24962.</p>
      <p>[11] J. Su, T. Y. Zhuo, D. Wang, P. Nakov, DetectLLM: Leveraging log rank information for zero-shot
detection of machine-generated text, arXiv preprint arXiv:2306.05540 (2023).</p>
      <p>[12] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al., Language models are
unsupervised multitask learners, OpenAI blog 1 (2019) 9.</p>
      <p>[13] A. Bhattacharjee, T. Kumarage, R. Morafah, H. Liu, ConDA: Contrastive domain adaptation for
AI-generated text detection, arXiv preprint arXiv:2309.03992 (2023).</p>
      <p>[14] M. Ibrahim, A. Akram, M. Radwan, R. Ayman, M. Abd-El-Hameed, N. El-Makky, M. Torki,
Enhancing Authorship Verification using Sentence-Transformers, in: M. Aliannejadi, G. Faggioli,
N. Ferro, M. Vlachos (Eds.), Working Notes of CLEF 2023 - Conference and Labs of the Evaluation
Forum, CEUR-WS.org, 2023, pp. 2640–2651. URL: https://ceur-ws.org/Vol-3497/paper-216.pdf.</p>
      <p>[15] M. Guo, Z. Han, H. Chen, H. Qi, A contrastive learning of sample pairs for authorship verification,
Working Notes of CLEF (2023).</p>
      <p>[16] M. Boudiaf, J. Rony, I. M. Ziko, E. Granger, M. Pedersoli, P. Piantanida, I. B. Ayed, A unifying
mutual information view of metric learning: cross-entropy vs. pairwise losses, in: European
Conference on Computer Vision, Springer, 2020, pp. 548–564.</p>
      <p>[17] T. Chen, S. Kornblith, M. Norouzi, G. Hinton, A simple framework for contrastive learning of
visual representations, 2020. arXiv:2002.05709.</p>
      <p>[18] A. Nichol, J. Achiam, J. Schulman, On first-order meta-learning algorithms, arXiv preprint
arXiv:1803.02999 (2018).</p>
      <p>[19] P. He, X. Liu, J. Gao, W. Chen, DeBERTa: Decoding-enhanced BERT with disentangled attention,
arXiv preprint arXiv:2006.03654 (2020).</p>
      <p>[20] B. Guo, X. Zhang, Z. Wang, M. Jiang, J. Nie, Y. Ding, J. Yue, Y. Wu, How close is ChatGPT to human
experts? Comparison corpus, evaluation, and detection, arXiv preprint arXiv:2301.07597 (2023).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Extance</surname>
          </string-name>
          ,
          <article-title>Chatgpt has entered the classroom: how llms could transform education</article-title>
          ,
          <source>Nature</source>
          <volume>623</volume>
          (
          <year>2023</year>
          )
          <fpage>474</fpage>
          -
          <lpage>477</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Weidinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mellor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rauh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Grifin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uesato</surname>
          </string-name>
          , P.-S. Huang, M. Cheng, M. Glaese,
          <string-name>
            <given-names>B.</given-names>
            <surname>Balle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kasirzadeh</surname>
          </string-name>
          , et al.,
          <article-title>Ethical and social risks of harm from language models</article-title>
          ,
          <source>arXiv preprint arXiv:2112.04359</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. S.</given-names>
            <surname>Chao</surname>
          </string-name>
          ,
          <article-title>A survey on llm-gernerated text detection: Necessity, methods, and future directions</article-title>
          ,
          <source>arXiv preprint arXiv:2310.14724</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          , E. Zangerle,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</article-title>
          , in: L.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Mulhem</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Quénot</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Schwab</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. M. D. Nunzio</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. G. S. de Herrera</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF</source>
          <year>2024</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hospedales</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Antoniou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Micaelli</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Storkey,
          <article-title>Meta-learning in neural networks: A survey</article-title>
          ,
          <source>IEEE transactions on pattern analysis and machine intelligence</source>
          <volume>44</volume>
          (
          <year>2021</year>
          )
          <fpage>5149</fpage>
          -
          <lpage>5169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          , E. Stamatatos,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the Voight-Kampff Generative AI Authorship Verification Task at PAN 2024</article-title>
          , in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S. de Herrera (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</source>
          , CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>