<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Team NYCU-NLP at PAN 2024: Integrating Transformers with Similarity Adjustments for Multi-Author Writing Style Analysis Notebook for the PAN Lab at CLEF 2024</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Tzu-Mi</forename><surname>Lin</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute of Artificial Intelligence Innovation</orgName>
								<orgName type="institution">National Yang Ming Chiao Tung University</orgName>
								<address>
									<country key="TW">Taiwan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yu-Hsin</forename><surname>Wu</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute of Artificial Intelligence Innovation</orgName>
								<orgName type="institution">National Yang Ming Chiao Tung University</orgName>
								<address>
									<country key="TW">Taiwan</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Lung-Hao</forename><surname>Lee</surname></persName>
							<email>lhlee@nycu.edu.tw</email>
							<affiliation key="aff0">
								<orgName type="department">Institute of Artificial Intelligence Innovation</orgName>
								<orgName type="institution">National Yang Ming Chiao Tung University</orgName>
								<address>
									<country key="TW">Taiwan</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Team NYCU-NLP at PAN 2024: Integrating Transformers with Similarity Adjustments for Multi-Author Writing Style Analysis Notebook for the PAN Lab at CLEF 2024</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">AEE29FFC802BF02FCCFBA2C05DF9E71F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:56+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Pre-trained Language Models</term>
					<term>Embedding Similarity</term>
					<term>Authorship Analysis</term>
					<term>Plagiarism Detection</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper describes our NYCU-NLP system design for the multi-author writing style analysis task of the PAN Lab at CLEF 2024. We propose a unified architecture integrating transformer-based models with similarity adjustments to identify author switches within a given multi-author document. We first fine-tune the RoBERTa, DeBERTa and ERNIE transformers to detect differences in writing style between two given paragraphs. The output prediction is then determined by an ensemble mechanism. We also use similarity adjustments to further enhance multi-author analysis performance. The experimental data contains three difficulty levels that reflect simultaneous changes of authorship and topic. Our submission achieved macro F1-scores of 0.964, 0.857 and 0.863 respectively at the easy, medium and hard levels, ranking first at the hard level (out of 16 participating teams) and second at the medium level (out of 17 teams).</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The PAN Lab hosts a series of shared tasks for digital text forensics <ref type="bibr" target="#b0">[1]</ref>. Following the achievements of past Style Change Detection (SCD) tasks at the PAN Lab <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>, the goal of this multi-author writing style analysis task is to identify all positions of writing style change at the paragraph level within a multi-authored document. Given a single document composed of separate comments by different Reddit users, the developed system should determine at which positions the author changes, at three levels of difficulty: 1) Easy: the document contains multiple paragraphs on multiple topics; 2) Medium: the paragraphs in the document cover fewer topics; and 3) Hard: the document consists of multiple paragraphs on a single topic. All documents may contain an arbitrary number of style changes, which only occur between paragraphs.</p><p>This paper describes our developed NYCU-NLP (National Yang Ming Chiao Tung University, Natural Language Processing Lab) system. Our solution explores the use of three pre-trained transformers: RoBERTa, DeBERTa and ERNIE, which are fine-tuned on the downstream classification task of detecting changes in writing style. The final system output is then determined by a majority voting-based ensemble mechanism. We also take advantage of the property that sentences belonging to the same topic show greater similarity in the semantic vector space, using embedding similarity adjustments to enhance prediction performance at the easy and medium levels, which include paragraphs on different topics. Our final submission received macro F1-scores of 0.964, 0.857 and 0.863 respectively at the easy, medium and hard levels. 
These results ranked our method first at the hard level (out of 16 participating teams) and second at the medium level (out of 17 teams).</p><p>The rest of this paper is organized as follows. Section 2 reviews related studies. Section 3 describes our proposed NYCU-NLP system. Section 4 presents the evaluation results and performance comparisons. Finally, Section 5 concludes this paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>The BERT transformer was used as the paragraph representation to train a random forest classifier for the SCD task <ref type="bibr" target="#b3">[4]</ref>. Siamese neural networks were used to measure paragraph similarity and identify authorship changes <ref type="bibr" target="#b4">[5]</ref>. Individual transformers were trained independently and then ensembled for the final authorship change prediction <ref type="bibr" target="#b5">[6]</ref>. The SCD task was regarded as a natural language inference task and solved using the DeBERTaV3 transformer <ref type="bibr" target="#b6">[7]</ref>. A prompt-based approach was used to train a transformer model for the SCD task <ref type="bibr" target="#b7">[8]</ref>. RoBERTa, BERT, and ELECTRA transformers were combined with a binary classification layer to solve the SCD task <ref type="bibr" target="#b8">[9]</ref>. The SCD task was also regarded as an authorship verification problem based on the term-document matrix <ref type="bibr" target="#b9">[10]</ref>. The mT0-xl model was used as the teacher to train a smaller student model through a knowledge distillation mechanism <ref type="bibr" target="#b10">[11]</ref>. A contrastive learning method was presented to train the DeBERTa transformer so that paragraphs written by the same author are close in the semantic space <ref type="bibr" target="#b11">[12]</ref>.</p><p>In summary, transformer-based models have usually obtained promising results in previous SCD tasks, which motivates us to explore how to use transformers more effectively to solve the multi-author writing style analysis task at PAN 2024.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">The NYCU-NLP System</head><p>Figure <ref type="figure" target="#fig_0">1</ref> shows our system architecture integrating transformers for multi-author writing style analysis, comprising three main parts: 1) pre-trained transformers; 2) an ensemble mechanism; and 3) similarity adjustments. We first select the following transformers for multi-author writing style analysis:</p><p>• a Robustly optimized BERT pre-training approach (RoBERTa) <ref type="bibr" target="#b12">[13]</ref> RoBERTa enhances BERT <ref type="bibr" target="#b13">[14]</ref> by removing the next sentence prediction objective, which simplifies the training process, and by using a dynamic masking strategy that improves model robustness. Furthermore, RoBERTa benefits from training with significantly larger batch sizes, enhancing the stability and effectiveness of the training process. These modifications result in a more robust pre-trained language model that achieves superior performance on various natural language processing tasks. • Decoding-enhanced BERT with disentangled attention (DeBERTa) <ref type="bibr" target="#b14">[15]</ref> DeBERTa improves BERT <ref type="bibr" target="#b13">[14]</ref> with a disentangled attention mechanism and an enhanced mask decoder. Each word is represented using content and position vectors, and disentangled matrices are then used to compute attention weights. In the enhanced mask decoder architecture, absolute positions are used to predict the masked tokens during model pre-training. • Enhanced Representation through Knowledge Integration (ERNIE) <ref type="bibr" target="#b15">[16]</ref> Inspired by the masking strategy of BERT <ref type="bibr" target="#b13">[14]</ref>, ERNIE is designed to learn language representations through entity-level masking and phrase-level masking. 
ERNIE 2.0 is an advanced version of ERNIE <ref type="bibr" target="#b16">[17]</ref>, which uses continual multi-task learning and a variety of pre-training tasks to enhance language comprehension. A continual learning methodology is used to progressively integrate multiple tasks, which allows the model to learn new tasks without forgetting what it has learned previously. In addition, ERNIE 2.0 proposes several new pre-training tasks, including word-aware, structure-aware, and semantic-aware tasks to respectively capture lexical, syntactic, and semantic information.</p><p>We fine-tune the language model of each pre-trained transformer and connect a Multi-Layer Perceptron (MLP) as a classifier. Each pair of consecutive paragraphs is used for fine-tuning, along with its labeled class (where '1' denotes an authorship change and '0' otherwise). We then use a voting-based ensemble mechanism <ref type="bibr" target="#b17">[18]</ref>, in which each transformer model makes an independent classification (i.e., a vote of 0 or 1) for each testing instance. The final system output is determined by a majority of votes.</p><p>We expect that two paragraphs on a similar topic should obtain a higher embedding similarity. Therefore, a multilingual LaBSE <ref type="bibr" target="#b18">[19]</ref> embedding is used to represent each paragraph as a semantic vector, and we measure the cosine similarity between the two paragraph embedding vectors. If the similarity exceeds a predefined threshold, the two paragraphs are considered to share a similar topic. In that case, we modify the ensemble prediction from 1 (change) to 0, based on the assumption that paragraphs with similar topics usually reflect no change of author. In addition, since the paragraphs of a document at the easy and medium levels may cover a variety of topics, we only adopt this similarity adjustment mechanism at these two levels.</p></div>
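The voting and similarity-adjustment steps above can be sketched in a few lines of Python. This is a minimal illustration under our own naming; the actual system wraps fine-tuned transformer classifiers and LaBSE embeddings, which are omitted here:

```python
def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)

def majority_vote(model_votes):
    """model_votes: one 0/1 prediction list per model; returns the
    majority label for each paragraph-pair position."""
    n_models = len(model_votes)
    return [int(sum(col) * 2 > n_models) for col in zip(*model_votes)]

def similarity_adjust(preds, pair_similarities, threshold=0.8):
    """Flip a predicted change (1) to no-change (0) when the embedding
    similarity of the paragraph pair exceeds the threshold."""
    return [0 if p == 1 and s > threshold else p
            for p, s in zip(preds, pair_similarities)]

# Three models vote on two paragraph-pair boundaries:
preds = majority_vote([[1, 1], [1, 0], [0, 1]])   # majority -> [1, 1]
# First pair looks same-topic (similarity 0.9 > 0.8), so its 1 is flipped:
adjusted = similarity_adjust(preds, [0.9, 0.3])   # -> [0, 1]
```

With three voters, a label wins whenever at least two models agree, which is why each function operates position-wise over the paragraph-pair boundaries.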
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Evaluation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Data</head><p>The experimental datasets were mainly provided by the task organizers <ref type="bibr" target="#b19">[20]</ref>. Each level has 4,200 documents for model training and 900 documents for system validation. We also used an additional 4,200 documents per level from the SCD-2023 task <ref type="bibr" target="#b2">[3]</ref> to fine-tune the transformers for the medium and hard levels.</p></div>
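Each document yields one training instance per consecutive paragraph pair, labeled with the corresponding change value from the ground truth. A minimal sketch of this pairing (our own illustration; the organizers' file parsing is omitted):

```python
def make_pair_instances(paragraphs, changes):
    """Turn a document's paragraph list and its per-boundary change labels
    (1 = author change at that boundary, 0 = same author) into
    (left_paragraph, right_paragraph, label) training instances."""
    if len(changes) != len(paragraphs) - 1:
        raise ValueError("expected one change label per paragraph boundary")
    return [(paragraphs[i], paragraphs[i + 1], changes[i])
            for i in range(len(changes))]

doc = ["First paragraph.", "Second paragraph.", "Third paragraph."]
instances = make_pair_instances(doc, [1, 0])  # two boundaries -> two instances
```

A document with n paragraphs thus contributes n - 1 labeled pairs, which matches the paragraph-level prediction format described in Section 3.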
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Settings</head><p>The pre-trained RoBERTa<ref type="foot" target="#foot_0">1</ref>, DeBERTa<ref type="foot" target="#foot_1">2</ref>, and ERNIE 2.0<ref type="foot" target="#foot_2">3</ref> models were downloaded from HuggingFace <ref type="bibr" target="#b20">[21]</ref>. All models were fine-tuned on a server with an Nvidia Titan RTX GPU (24 GB memory). The hyper-parameter values were optimized as follows: a maximum sequence length of 256, a learning rate of 5e-5, a dropout rate of 0.25, 10 epochs, and a batch size of 60. The LaBSE model<ref type="foot" target="#foot_3">4</ref> was downloaded from TensorFlow Hub, and the similarity adjustment threshold was set to 0.8. The system was deployed on the TIRA platform <ref type="bibr" target="#b21">[22]</ref> to evaluate performance on the various difficulty levels using the macro-averaged F1-score.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Results</head><p>Table <ref type="table" target="#tab_0">1</ref> shows the validation set results. Among the individual transformer models, DeBERTa outperformed the other models at the easy and hard levels, while ERNIE 2.0 outperformed RoBERTa and DeBERTa at the medium level. Our NYCU-NLP system, which combines the ensemble mechanism with similarity adjustments, obtained the best detection performance.</p><p>Table <ref type="table" target="#tab_1">2</ref> shows the test set results. Our NYCU-NLP system significantly outperformed both baselines, which always predict 1 or 0. We achieved a macro-averaged F1-score of 0.964 (ranking ninth of 17 systems) at the easy level, while the F1-scores of 0.857 and 0.863 at the medium and hard levels ranked second of 17 and first of 16 participating systems, respectively. </p></div>
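The macro-averaged F1-score reported above is the unweighted mean of the per-class F1 over the change (1) and no-change (0) classes. A minimal sketch of the metric (our own illustration, not the official evaluator):

```python
def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in classes:
        # Per-class counts: treat class c as the positive label.
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return sum(scores) / len(scores)

# One false positive for the change class:
score = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])  # class 0: F1=2/3, class 1: F1=0.8
```

Because both classes are weighted equally regardless of frequency, a degenerate system that always predicts one label scores poorly, as the baseline rows in Table 2 show.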
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions</head><p>This study describes the design, implementation and evaluation of our NYCU-NLP system for the multi-author writing style analysis task at PAN 2024. We selected pre-trained transformer models as starting points and fine-tuned them on the corresponding downstream classification task. Our unified architecture used a voting-based ensemble mechanism to determine the final system output. We also applied embedding similarity adjustments to the system output at the easy and medium levels. Our submitted system ranked first of 16 participating systems at the hard level and second of 17 systems at the medium level.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Our proposed NYCU-NLP system architecture</figDesc><graphic coords="2,97.97,411.45,396.85,221.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Results of transformer models on the validation set.</figDesc><table><row><cell cols="4">Approach Easy level Medium level Hard level</cell></row><row><cell>RoBERTa</cell><cell>0.9435</cell><cell>0.8436</cell><cell>0.8423</cell></row><row><cell>DeBERTa</cell><cell>0.9584</cell><cell>0.8408</cell><cell>0.8567</cell></row><row><cell>ERNIE 2.0</cell><cell>0.955</cell><cell>0.8496</cell><cell>0.849</cell></row><row><cell>NYCU-NLP</cell><cell>0.9716</cell><cell>0.8626</cell><cell>0.8658</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Submission results on the test set.</figDesc><table><row><cell>Approach</cell><cell cols="3">Easy level Medium level Hard level</cell></row><row><cell>NYCU-NLP</cell><cell>0.964</cell><cell>0.857</cell><cell>0.863</cell></row><row><cell>Baseline Predict 1</cell><cell>0.466</cell><cell>0.343</cell><cell>0.320</cell></row><row><cell>Baseline Predict 0</cell><cell>0.112</cell><cell>0.323</cell><cell>0.346</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://huggingface.co/roberta-base</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://huggingface.co/microsoft/deberta-base</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://huggingface.co/nghuyong/ernie-2.0-base-en</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://tfhub.dev/google/LaBSE</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This study is partially supported by the National Science and Technology Council, Taiwan, under the grant NSTC 111-2628-E-A49-029-MY3. This work was also financially supported by the Co-creation Platform of the Industry Academia Innovation School, NYCU.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bevendorff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><forename type="middle">B</forename><surname>Casals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Elnagar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Freitag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fröbe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Korenčić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mayerl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mukherjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Smirnova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Stamatatos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Taulé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ustalov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Zangerle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF 2024)</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<meeting><address><addrLine>Berlin Heidelberg New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of the Style Change Detection Task at PAN</title>
		<author>
			<persName><forename type="first">E</forename><surname>Zangerle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mayerl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3180/paper-186.pdf" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2022 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Hanbury</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2022">2022. 2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Overview of the Multi-Author Writing Style Analysis Task at PAN</title>
		<author>
			<persName><forename type="first">E</forename><surname>Zangerle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mayerl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3497/paper-201.pdf" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2023 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Aliannejadi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Vlachos</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023. 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Style Change Detection Using BERT-Notebook for PAN at CLEF</title>
		<author>
			<persName><forename type="first">A</forename><surname>Iyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vosoughi</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/Vol-2696/" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2020 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">L</forename><surname>Cappellato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Eickhoff</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Névéol</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2020">2020. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Style change detection using Siamese neural networks-Notebook for PAN at CLEF</title>
		<author>
			<persName><forename type="first">S</forename><surname>Nath</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/Vol-2936/paper-183.pdf" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2021 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Joly</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Maistro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Piroi</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2021">2021. 2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Ensemble Pre-trained Transformer Models for Writing Style Change Detection</title>
		<author>
			<persName><forename type="first">T.-M</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C.-Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-W</forename><surname>Tzeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L.-H</forename><surname>Lee</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3180/paper-210.pdf" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2022 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Hanbury</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">ARC-NLP at PAN 23: Transition-Focused Natural Language Inference for Writing Style Detection</title>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">E</forename><surname>Kucukkaya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Sahin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Toraman</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3497/paper-218.pdf" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2023 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Aliannejadi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Vlachos</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Style Change Detection based on Prompt</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kong</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3180/paper-197.pdf" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2022 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Hanbury</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Enhancing Writing Style Change Detection using Transformer-based Models and Data Augmentation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Hashemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Shi</surname></persName>
		</author>
		<idno>CEUR-WS.org</idno>
		<ptr target="https://ceur-ws.org/Vol-3497/paper-212.pdf" />
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2023 -Conference and Labs of the Evaluation Forum</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Aliannejadi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Vlachos</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="2613" to="2621" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Authorship verification machine learning methods for Style Change Detection in texts</title>
		<author>
			<persName><forename type="first">G</forename><surname>Jacobo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Dehesa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Rojas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Gómez-Adorno</surname></persName>
		</author>
		<idno>CEUR-WS.org</idno>
		<ptr target="https://ceur-ws.org/Vol-3497/paper-217.pdf" />
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2023 -Conference and Labs of the Evaluation Forum</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Aliannejadi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Vlachos</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="2652" to="2658" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Encoded Classifier Using Knowledge Distillation for Multi-Author Writing Style Analysis</title>
		<author>
			<persName><forename type="first">M</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kong</surname></persName>
		</author>
		<idno>CEUR-WS.org</idno>
		<ptr target="https://ceur-ws.org/Vol-3497/paper-214.pdf" />
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Aliannejadi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Vlachos</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="2629" to="2634" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A Writing Style Embedding Based on Contrastive Learning for Multi-Author Writing Style Analysis</title>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Han</surname></persName>
		</author>
		<idno>CEUR-WS.org</idno>
		<ptr target="https://ceur-ws.org/Vol-3497/paper-206.pdf" />
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Aliannejadi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Vlachos</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="2562" to="2567" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">RoBERTa: A robustly optimized BERT pretraining approach</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1907.11692</idno>
		<ptr target="https://doi.org/10.48550/arXiv.1907.11692" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1810.04805</idno>
		<ptr target="https://doi.org/10.48550/arXiv.1810.04805" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of NAACL-HLT 2019</title>
				<meeting>NAACL-HLT 2019</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">DeBERTa: Decoding-enhanced BERT with disentangled attention</title>
		<author>
			<persName><forename type="first">P</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2006.03654</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2006.03654" />
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1907.12412</idno>
		<title level="m">ERNIE 2.0: A continual pre-training framework for language understanding</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">ERNIE: Enhanced language representation with informative entities</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Liu</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/P19-1139</idno>
		<ptr target="https://doi.org/10.18653/v1/P19-1139" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of ACL</title>
				<meeting>ACL</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">2019</biblScope>
			<biblScope unit="page" from="1441" to="1451" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Ensemble multi-channel neural networks for scientific language editing evaluation</title>
		<author>
			<persName><forename type="first">L.-H</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C.-Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L.-C</forename><surname>Yu</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2021.3130042</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="158540" to="158547" />
		</imprint>
		<respStmt>
			<orgName>IEEE Access</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Language-agnostic BERT sentence embedding</title>
		<author>
			<persName><forename type="first">F</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Arivazhagan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.acl-long.62</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.acl-long.62" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of ACL</title>
				<meeting>ACL</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">2022</biblScope>
			<biblScope unit="page" from="878" to="891" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Overview of the Multi-Author Writing Style Analysis Task at PAN 2024</title>
		<author>
			<persName><forename type="first">E</forename><surname>Zangerle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mayerl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</title>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Galuščáková</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">G S</forename><surname>Herrera</surname></persName>
		</editor>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">HuggingFace&apos;s Transformers: State-of-the-art natural language processing</title>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Debut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Sanh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chaumond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Delangue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Moi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cistac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Rault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Louf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Funtowicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Davison</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shleifer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Von Platen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jernite</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Plu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">L</forename><surname>Scao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gugger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Drame</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Lhoest</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Rush</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1910.03771</idno>
		<ptr target="https://doi.org/10.48550/arXiv.1910.03771" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Continuous Integration for Reproducible Shared Tasks with TIRA</title>
		<author>
			<persName><forename type="first">M</forename><surname>Fröbe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kolyada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Grahm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Elstner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Loebe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hagen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-28241-6_20</idno>
		<ptr target="https://doi.org/10.1007/978-3-031-28241-6_20" />
	</analytic>
	<monogr>
		<title level="m">Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023)</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">J</forename><surname>Kamps</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Goeuriot</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Crestani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Maistro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Joho</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Davis</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Gurrin</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">U</forename><surname>Kruschwitz</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Caputo</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="236" to="241" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
