<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>MarSan at PAN: BinocularsLLM, Fusing Binoculars' Insight with the Proficiency of Large Language Models for Machine-Generated Text Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ehsan Tavan</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maryam Najafi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Information Systems, University of Limerick</institution>
          ,
          <addr-line>Castletroy, V94 T9PX Limerick</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>NLP Department, Part AI Research Center</institution>
          ,
          <addr-line>Tehran</addr-line>
          ,
          <country country="IR">Iran</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>Large Language Models have revolutionized natural language processing, exhibiting remarkable fluency and quality in generating human-like text. However, this advancement also brings challenges, particularly in distinguishing between human and machine-generated content. In this study, we propose an ensemble framework called BinocularsLLM for the PAN 2024 'Voight-Kampf' Generative AI Authorship Verification task. BinocularsLLM integrates supervised fine-tuning of LLMs with a classification head and the Binoculars framework, demonstrating promising results in detecting machine-generated text. Through extensive experimentation and evaluation, we showcase the effectiveness of our approach in addressing this critical task, achieving a ROC-AUC score of 96.1%, a Brier score of 92.8%, a C@1 score of 91.2%, an F1 score of 88.4%, and an F0.5u score of 93.2% across all test datasets. BinocularsLLM outperforms all participants and baseline approaches, indicating its superior ability to generalize effectively and distinguish between human and machine-generated content. Our framework achieves the first rank among 30 teams participating in this competition.</p>
      </abstract>
      <kwd-group>
        <kwd>PAN 2024</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Machine-Generated Text Detection</kwd>
        <kwd>Instruction Fine-Tuning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In recent years, Large Language Models (LLMs) have made remarkable advancements, generating
text that closely mimics human language with high fluency and quality. Models such as ChatGPT
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], GPT-3 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], LLaMa [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and Mistral [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] demonstrate impressive performance in a variety of tasks
including question-answering, writing stories, and analyzing program code. These technologies offer
significant potential to enhance efficiency and scalability across various domains, driving innovation
and productivity [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ].
      </p>
      <p>
        Machine-generated text is now used in a wide range of applications, from powerful chatbots [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
and real-time language translation [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] to analyzing and generating program code [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. However, the
sophistication of these models also introduces new challenges in distinguishing between
human-generated and machine-generated content.
      </p>
      <p>The ability to reliably detect machine-generated text is crucial. With the rapid expansion of
information on the internet, there is an increased risk of misinformation spreading unchecked. The misuse of
LLMs for generating fake news, fake product reviews, and propaganda poses substantial threats to the
integrity of online communication. Furthermore, malicious activities such as spamming and fraud are
intensified by the advanced capabilities of these models. Effective detection mechanisms are essential to
protect against these risks, ensuring that digital content remains trustworthy and authentic. Developing
tools and strategies to automatically detect machine-generated texts is essential to mitigate the threats
posed by the misuse of LLMs.</p>
      <p>
        In the PAN’24 "Voight-Kampf" Generative AI Authorship Verification task [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ], participants are
faced with an innovative challenge. Their task involves examining two texts: one authored by a human
and the other by a machine. The goal is to identify the text authored by a human. This task highlights
the ongoing need for robust methods to differentiate between human and machine-generated content,
underscoring the importance of continued research and development in this area [
        <xref ref-type="bibr" rid="ref10">12, 13, 10</xref>
        ].
      </p>
      <p>In this study, we explore innovative approaches to machine-generated text detection by investigating
several key hypotheses. First, we examine whether leveraging LLMs with instruction fine-tuning can
enhance the effectiveness of detecting machine-generated content. Second, we test the feasibility of
training LLMs with a classification head that utilizes softmax to produce accurate output labels. Lastly,
we investigate whether combining zero-shot techniques, which utilize metrics like perplexity and
entropy, with fine-tuned models can significantly improve the accuracy of machine-generated text
detection. These hypotheses aim to push the boundaries of our current detection capabilities, potentially
leading to breakthroughs in ensuring the authenticity of digital content.</p>
      <p>
        In Section 3, we introduce BinocularsLLM, our proposed ensemble framework, which integrates
fine-tuned LLaMA2 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and Mistral models with a classification head, while also incorporating the
Binoculars [14] model. This framework undergoes evaluation on both the main and nine additional test
datasets, demonstrating notably promising results.
      </p>
      <p>In this paper, we conduct a comprehensive evaluation of Voight-Kampf Generative AI Authorship
Verification tasks, comparing our proposed framework against both baseline models and state-of-the-art
approaches. We have made our code and data publicly available on our GitHub repository
(https://github.com/MarSanTeam/BinocularsLLM), and our fine-tuned models are available on Hugging Face:
Generative-AV-Mistral-v0.1-7b (https://huggingface.co/Ehsan-Tavan/Generative-AV-Mistral-v0.1-7b) and
Generative-AV-LLaMA-2-7b (https://huggingface.co/Ehsan-Tavan/Generative-AV-LLaMA-2-7b). Our
contributions are organized as follows: Section 2 reviews the relevant background
literature. Section 3 introduces BinocularsLLM. Section 4 details the evaluation metrics and presents
the experimental results.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>The detection of machine-generated text has become a critical area of research, driven by the rapid
advancement and widespread use of large language models (LLMs) such as GPT-4 [15], PaLM [16],
and ChatGPT. This task is typically formulated as a classification problem. This section reviews
existing methodologies categorized into supervised learning approaches, zero-shot detection models,
and watermarking techniques.</p>
      <p>Supervised Learning Approaches: Supervised learning methods train classifiers on labeled datasets
[17, 18, 19]. Models like GPT2 Detector [20] and ChatGPT Detector [21] fine-tune pre-trained models
such as RoBERTa [22] on the output of GPT2 [23] and the HC3 [21] dataset. While these models
demonstrate high accuracy within their training domains, they often struggle with generalization to
out-of-domain texts [24, 25]. Techniques such as adversarial training [26] and abstention [27] have been
explored to enhance robustness, but challenges remain, particularly in maintaining low false positive
rates across diverse text distributions [28].</p>
      <p>Zero-Shot Detection Models: Another approach to identifying machine-generated text involves
zero-shot detection models, which leverage statistical features in texts without requiring explicit
training on labeled datasets. These models, such as DetectGPT [29] and others [30, 31], analyze
universal features inherent in machine-generated texts. They exploit concepts like entropy, perplexity,
and n-gram frequencies to distinguish between human and machine-generated text. These models offer
robustness across different types of text and languages, circumventing the domain-specific limitations
of supervised classifiers [30]. However, the computational demands remain a significant challenge,
particularly in methods relying on probability curvature and extensive perturbations [29, 31].</p>
      <sec id="sec-2-1">
        <title>1https://github.com/MarSanTeam/BinocularsLLM 2https://huggingface.co/Ehsan-Tavan/Generative-AV-Mistral-v0.1-7b 3https://huggingface.co/Ehsan-Tavan/Generative-AV-LLaMA-2-7b</title>
        <p>Watermarking Techniques: Watermarking involves embedding detectable patterns into the
generated text that are imperceptible to humans but identifiable by algorithms. Grinbaum and Adomaitis
[32] and Abdelnabi and Fritz [33] utilized syntax tree manipulation to embed watermarks, while
Kirchenbauer et al. [34] required access to the LLM’s logits to modify token probabilities. Although
effective, these methods necessitate control over the text generation process, limiting their applicability
to scenarios where such control is feasible.</p>
    </sec>
    <sec id="sec-3">
      <title>3. System Overview</title>
      <p>In this section, we present BinocularsLLM, our ensemble framework to address the PAN’24
"VoightKampf" Generative AI Authorship Verification task, with a focus on detecting machine-generated
text. Our goals are twofold: to compare the efectiveness of classification-head fine-tuning with
instruction fine-tuning and to integrate the power of the Binoculars technique with fine-tuned LLMs.
Both approaches utilize QLoRA, ensuring that only the QLoRA and the classification head weights are
trained, not all the parameters of the LLM.</p>
      <p>The Binoculars model employs observer and performer models to evaluate perplexity and entropy,
critical metrics for identifying machine-generated text. By integrating these evaluations with the
advanced capabilities of supervised fine-tuning, our ensemble is designed to be capable of distinguishing
between human and machine-generated text.</p>
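      <p>To make these signals concrete, the following minimal sketch illustrates a Binoculars-style score: the log-perplexity of the input under one model divided by the cross-perplexity between the two models' next-token distributions. The model names, the observer/performer pairing, and the decision threshold here are illustrative placeholders, not the exact configuration of the Binoculars release.</p>
      <preformat>
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative stand-ins for the observer/performer pair; the actual
# Binoculars release pairs two closely related causal LLMs.
observer = AutoModelForCausalLM.from_pretrained("gpt2")
performer = AutoModelForCausalLM.from_pretrained("gpt2-medium")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

@torch.no_grad()
def binoculars_score(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    obs_logits = observer(ids).logits[:, :-1]
    perf_logits = performer(ids).logits[:, :-1]
    targets = ids[:, 1:]

    # Log-perplexity of the text under the performer model.
    log_ppl = torch.nn.functional.cross_entropy(
        perf_logits.transpose(1, 2), targets
    )

    # Cross-perplexity: the performer's negative log-probabilities
    # averaged under the observer's next-token distribution.
    obs_probs = obs_logits.softmax(dim=-1)
    perf_logprobs = perf_logits.log_softmax(dim=-1)
    x_ppl = -(obs_probs * perf_logprobs).sum(dim=-1).mean()

    # In the Binoculars paper, lower ratios tend to indicate
    # machine-generated text.
    return (log_ppl / x_ppl).item()
      </preformat>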
      <p>Based on our experiments, we observed that LLM models employing a classification head performed
more effectively in detecting machine-generated texts than instruction fine-tuning.
Consequently, BinocularsLLM integrates two fine-tuned LLMs, LLaMA2 and Mistral (selected based on
the results in Table 1), alongside the Binoculars approach. This comprehensive approach leverages
the capabilities of statistical metrics and LLM fine-tuning, ensuring robust and accurate detection of
machine-generated text.</p>
      <sec id="sec-3-1">
        <title>3.1. Instruction Fine-Tuning for Machine-Generated Text Detection</title>
        <p>Instruction Fine-Tuning (IT) involves further training LLMs with specific input-output pairs and
accompanying instructions in a supervised manner. This approach has proven effective in enhancing an
LLM’s ability to generalize to new, unseen tasks [35] and is considered a viable strategy for improving
LLM alignment [36, 37].</p>
        <p>In our study on Voight-Kampf Generative AI Authorship Verification, we examine the efficacy of the
IT method. Specifically, we evaluate various LLMs’ performance when fine-tuned with a specific set
of instructions. This process involves creating an instruction dataset D_I, comprising instruction pairs
t = (INSTRUCTION, OUTPUT). Each instruction t is generated using a fixed template and samples x
from the training dataset D. These samples are labeled y based on their corresponding labels in dataset
D. Figure 1 illustrates our instruction fine-tuning process.</p>
        <sec id="sec-3-1-1">
          <title>Here’s an illustration of the instruction format:</title>
          <p>Instruction: I provide two texts and ask you to determine which one is authored by humans
and which one is authored by machines. Your output is simply a 0 or 1; do not generate any
additional text. 0 indicates Text1 is authored by the machine, and 1 indicates Text2 is authored
by the machine.</p>
          <p>Text1: [TEXT_1]
Text2: [TEXT_2]
Response: [LABEL]</p>
          <p>The resulting instruction text detection dataset D_I consists of instruction pairs along with their
source labels. A label of 0 indicates the first text is machine-generated, while a label of 1 indicates the
second text is machine-generated. Thus, the instruction text detection dataset D_I includes pairs along
with their corresponding source labels, formally represented as D_I = {(instruction, x, y) | x ∈ D}.</p>
          <p>Given an LLM with parameters θ as the initial model for instruction tuning, training the model on
the constructed instruction dataset D_I results in adapting the LLM’s parameters from θ to θ_I, referred to
as the LLM-Detector. Specifically, θ_I is obtained by maximizing the probability of predicting the next
tokens in the OUTPUT component of each instruction sample t, conditioned on the INSTRUCTION.
This process is formulated as follows:</p>
          <p>θ_I = arg max_θ ∑_{t ∈ D_I} log P(OUTPUT_t | INSTRUCTION_t; θ)    (1)</p>
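          <p>As a concrete illustration of Equation (1), the sketch below builds a single instruction sample from a text pair and masks the INSTRUCTION tokens so that only the OUTPUT token contributes to the training loss. The tokenizer choice and the helper name are our own illustrative assumptions; the masking value of -100 is the standard ignore index in PyTorch cross-entropy.</p>
          <preformat>
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

INSTRUCTION = (
    "I provide two texts and ask you to determine which one is authored "
    "by humans and which one is authored by machines. Your output is "
    "simply a 0 or 1; do not generate any additional text. 0 indicates "
    "Text1 is authored by the machine, and 1 indicates Text2 is authored "
    "by the machine."
)

def build_instruction_sample(text1: str, text2: str, label: int) -> dict:
    prompt = f"{INSTRUCTION}\nText1: {text1}\nText2: {text2}\nResponse: "
    prompt_ids = tokenizer(prompt).input_ids
    output_ids = tokenizer(str(label), add_special_tokens=False).input_ids

    input_ids = prompt_ids + output_ids
    # Masking the prompt with -100 means training maximizes
    # log P(OUTPUT | INSTRUCTION; theta), as in Equation (1).
    labels = [-100] * len(prompt_ids) + output_ids
    return {"input_ids": input_ids, "labels": labels}
          </preformat>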
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Supervised Fine-Tuning LLMs</title>
        <p>Fine-tuning LLMs involves adjusting model weights using a labeled dataset to enhance performance on
specific tasks. This process can be computationally intensive, requiring significant memory resources,
particularly when dealing with full LLM fine-tuning due to its substantial memory demands. To address
these challenges, Parameter-Efficient Fine-Tuning (PEFT) [38] techniques such as LoRA [39] and QLoRA
[40] are employed.</p>
        <p>LoRA fine-tunes only two smaller matrices that approximate the larger weight matrix, reducing
memory requirements and preserving the original LLM weights. Taking a step further, QLoRA enhances
memory efficiency by quantizing these smaller matrices to a lower precision, such as 4-bit, without
compromising effectiveness. Employing these fine-tuning techniques for both the classification head
fine-tuning and instruction fine-tuning augments the LLM’s capacity to accurately distinguish between
machine-generated and human-generated text.</p>
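        <p>A minimal sketch of this setup with the Hugging Face peft and bitsandbytes integrations is shown below, using the rank, alpha, and 4-bit settings reported in Section 4.1. The base-model identifier and the choice of target modules are illustrative assumptions, not a record of our exact configuration.</p>
        <preformat>
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

# QLoRA: the frozen base weights are quantized to 4-bit precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    num_labels=2,  # human- vs. machine-generated
    quantization_config=bnb_config,
)

# Only the low-rank adapters (and the classification head) are trained.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # illustrative choice
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
        </preformat>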
        <p>The Mistral and LLaMA2 models are fine-tuned exclusively using the provided bootstrap dataset and
the QLoRA technique. Each input example consists of a text string ⟨TEXT⟩ and a corresponding label
⟨LABEL⟩ that indicates the source of the text. The input format can be represented as:
⟨TEXT⟩ : ⟨LABEL⟩,
where LABEL = 1 for human-generated text and 0 for machine-generated text.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Inference Time</title>
        <p>During the inference phase, the process initiates by receiving two texts as input. Each text is processed
separately via the fine-tuned LLaMA2 and Mistral models to predict the probability of being
human-written. If the probability assigned to the first text surpasses that of the second text, the score for the
input sample is calculated by subtracting the score of the first text from that of the second. Conversely,
if the probability of the second text is greater, the input text is labeled as 0.</p>
        <p>Additionally, the input is also processed with the Binoculars model, which generates a score for each
text using its specialized algorithm. If the Binoculars score of the first text exceeds that of the second
text, the input score is assigned as 0; otherwise, it is assigned as 1. Figure 2 illustrates BinocularsLLM.</p>
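        <p>Under our reading of the procedure above, the decision logic can be summarized by the sketch below. The aggregation of the three component scores into a single output is deliberately simplified to a plain average; the exact fusion in BinocularsLLM may differ.</p>
        <preformat>
def pairwise_score(p_human_1: float, p_human_2: float) -> float:
    """Map two per-text human probabilities from a fine-tuned LLM to a
    pairwise score that exceeds 0.5 when the first text looks more human."""
    return 0.5 + (p_human_1 - p_human_2) / 2.0

def binoculars_vote(b1: float, b2: float) -> float:
    # 0 if the first text's Binoculars score is higher, otherwise 1.
    return 0.0 if b1 > b2 else 1.0

def ensemble(llama_pair, mistral_pair, binoculars_pair) -> float:
    votes = [
        pairwise_score(*llama_pair),
        pairwise_score(*mistral_pair),
        binoculars_vote(*binoculars_pair),
    ]
    return sum(votes) / len(votes)  # simplified fusion (assumption)
        </preformat>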
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>In this section, we present the implementation details, evaluation metrics, and provide a comprehensive
analysis of the results. We utilize the TIRA [41] platform to evaluate our framework using test datasets.</p>
      <sec id="sec-4-1">
        <title>4.1. Implementation Details</title>
        <p>In this research, the framework was implemented in PyTorch and executed on Nvidia V100 GPUs. The
training process was conducted for 5 epochs, utilizing the AdamW optimizer with a learning rate of 2e-5.
The training batch size was set to 2, with gradient accumulation set to 8. For QLoRA, we configured
LoRA’s rank to 64 and its alpha to 16, employing 4-bit quantization. To evaluate fine-tuned models, we
used 20% of the given dataset as a development dataset.</p>
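        <p>With the hyperparameters listed above, a training run with the Hugging Face Trainer might be configured as in the following sketch; the model and dataset variables are assumed to come from Section 3.2 and the 80/20 split described here.</p>
        <preformat>
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="binoculars-llm",
    num_train_epochs=5,
    learning_rate=2e-5,             # the Trainer uses AdamW by default
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size of 16
    eval_strategy="epoch",
)

trainer = Trainer(
    model=model,                  # PEFT-wrapped model from Section 3.2
    args=training_args,
    train_dataset=train_dataset,  # 80% of the provided dataset
    eval_dataset=dev_dataset,     # 20% held out for development
)
trainer.train()
        </preformat>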
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Evaluation Metrics</title>
        <p>To evaluate the performance of our proposed model, we used the evaluation metrics provided by PAN,
which include the following metrics:
• ROC-AUC: The conventional area under the curve score.
• C@1: Rewards systems that leave complicated problems unanswered.
• F0.5u: Focuses on deciding same-author cases correctly.
• F1-score: A harmonic way of combining the precision and recall of the model.
• Brier: Evaluates the accuracy of probabilistic predictions.</p>
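        <p>PAN provides an official evaluator for these metrics. As a rough, non-authoritative illustration, the ROC-AUC and Brier components can be approximated with scikit-learn as follows; reporting the Brier score as its complement, so that higher is better, is our assumption based on the score ranges above.</p>
        <preformat>
from sklearn.metrics import brier_score_loss, roc_auc_score

def pan_style_scores(y_true, y_scores):
    """y_true: binary pair labels; y_scores: model outputs in [0, 1]."""
    return {
        "roc_auc": roc_auc_score(y_true, y_scores),
        # Complement of the Brier loss (assumption: higher is better).
        "brier": 1.0 - brier_score_loss(y_true, y_scores),
    }
        </preformat>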
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Result Analysis on Development Dataset</title>
        <p>As mentioned earlier, we compare two fine-tuning approaches for detecting machine-generated text:
instruction fine-tuning and classification-head fine-tuning. The performance of various LLMs under
these methodologies is illustrated in Table 1 using the development dataset. Based on the results from
Table 1, we select the two top-performing LLMs to integrate into our ensemble framework.</p>
        <p>In analyzing the results presented in Table 1, it becomes evident that both the LLaMA2-7B and
Mistral-7B models, fine-tuned with a classification head, demonstrate promising performance across various
evaluation metrics on our development dataset. LLaMA2-7B demonstrates exceptional scores across all
metrics using the classification head fine-tuning approach, showcasing its robustness in distinguishing
between human and machine-generated text. Meanwhile, Mistral-7B also shows notable performance,
indicating its efficacy in authorship verification tasks. These findings show the effectiveness of
employing classification head fine-tuning for both LLaMA2-7B and Mistral-7B within the BinocularsLLM
framework.</p>
        <p>Comparing classification head fine-tuning with instruction tuning, we observe that classification head
fine-tuning yields superior performance. These findings indicate that classification head fine-tuning
is more effective than instruction tuning for enhancing the performance of LLMs in distinguishing
between human and machine-generated text.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Results on Blinded Test Dataset</title>
        <p>As Table 2 shows, BinocularsLLM achieved outstanding performance across multiple evaluation metrics
on the PAN 2024 Task 4 (Voight-Kampf Generative AI Authorship Verification) main test dataset,
demonstrating its effectiveness in detecting machine-generated text. With a perfect ROC-AUC score of
1.0 and a Brier score close to 1.0, BinocularsLLM exhibits high discriminative ability and excellent
calibration. Additionally, BinocularsLLM outperforms all baseline approaches in terms of C@1, F1,
and F0.5u scores. The mean evaluation score further underscores the robustness and reliability of the
BinocularsLLM framework in distinguishing between human and machine-generated text.</p>
        <p>Table 3 presents the analysis of BinocularsLLM across nine variants of the test set. The mean accuracy
over these variants provides insights into the generalization capability of different approaches across
diverse datasets. Among the approaches evaluated, the BinocularsLLM framework achieved the highest
mean accuracy, with a median score of 0.990, indicating strong performance across various test variants.
When compared to baseline approaches, BinocularsLLM consistently outperforms them,
showcasing its superior ability to generalize effectively. The performance of baseline approaches varies
significantly across different datasets, as evidenced by the wide range between the minimum and
maximum scores. This suggests that while some approaches exhibit consistent performance across
diverse datasets, others may struggle to generalize effectively. Further analysis of the quantile values
elucidates the distribution of performance scores, highlighting the variability and potential challenges
in achieving consistent accuracy across different test variants.</p>
        <p>PPMd and Unmasking display moderate performance, with median accuracies of 0.750 and 0.696,
respectively. However, their lower quantiles, particularly the minimum and 25th quantile, indicate
significant variability and potential instability in their performance.</p>
        <p>Fast-DetectGPT shows the most variability among the baselines, with a minimum accuracy of 0.159
and a maximum of 0.982. This wide range suggests inconsistency and unreliability in different test
scenarios.</p>
        <p>The comparative analysis presented in Figure 4 illustrates the discernible impact of training on the
Mistral and LLaMA2 models. Before training, both models exhibited limited discriminatory capability
on our development dataset between AI-generated and human-written text, as evidenced by the
overlapping distribution of data points in their respective scatter plots. However, post-training, a
noticeable refinement emerges, with the models demonstrating enhanced proficiency in distinguishing
between the two text categories. The scatter plots after training reveal a clearer separation between
AI-generated and human-written text samples, indicating an improvement in the models’ ability to
capture distinguishing features inherent to each text type.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Leaderboard on Test Datasets</title>
        <p>Our team, MarSan, achieves the top position in the task leaderboard among 30 teams with our
BinocularsLLM framework and demonstrates strong performance across various metrics. Table 4 outlines the
performance metrics of the top 10 teams in the competition.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In conclusion, the BinocularsLLM framework for the PAN 2024 "Voight-Kampf" Generative AI
Authorship Verification task demonstrates significant advancements in detecting machine-generated text.
Through the integration of supervised fine-tuning of LLMs with a classification head and the Binoculars
model, we have achieved outstanding performance, as evidenced by a perfect ROC-AUC score of 1.0 and
a Brier score close to 1.0 on the main test dataset. The BinocularsLLM framework outperforms all baseline
approaches in crucial evaluation metrics, highlighting its robustness and effectiveness in distinguishing
between human and machine-generated content. Looking ahead, the success of our approach opens
up exciting avenues for future research, including exploring more sophisticated ensemble techniques,
investigating the impact of different fine-tuning strategies, and addressing challenges related to
scalability and computational efficiency. By continuing to innovate in this critical area, we can further
advance the field of machine-generated text detection and contribute to enhancing the trustworthiness
and authenticity of digital content.</p>
    </sec>
    <sec id="sec-6">
      <title>References</title>
      <p>[14] A. Hans, A. Schwarzschild, V. Cherepanova, H. Kazemi, A. Saha, M. Goldblum, J. Geiping,
T. Goldstein, Spotting llms with binoculars: Zero-shot detection of machine-generated text, 2024.
arXiv:2401.12070.
[15] OpenAI, Gpt-4 technical report, 2023. arXiv:2303.08774.
[16] A. Chowdhery, S. Narang, J. Devlin, M. Bosma, G. Mishra, A. Roberts, P. Barham, H. W. Chung,
C. Sutton, S. Gehrmann, P. Schuh, K. Shi, S. Tsvyashchenko, J. Maynez, A. Rao, P. Barnes, Y. Tay,
N. Shazeer, V. Prabhakaran, E. Reif, N. Du, B. Hutchinson, R. Pope, J. Bradbury, J. Austin, M. Isard,
G. Gur-Ari, P. Yin, T. Duke, A. Levskaya, S. Ghemawat, S. Dev, H. Michalewski, X. Garcia, V. Misra,
K. Robinson, L. Fedus, D. Zhou, D. Ippolito, D. Luan, H. Lim, B. Zoph, A. Spiridonov, R. Sepassi,
D. Dohan, S. Agrawal, M. Omernick, A. M. Dai, T. S. Pillai, M. Pellat, A. Lewkowycz, E. Moreira,
R. Child, O. Polozov, K. Lee, Z. Zhou, X. Wang, B. Saeta, M. Diaz, O. Firat, M. Catasta, J. Wei,
K. Meier-Hellstern, D. Eck, J. Dean, S. Petrov, N. Fiedel, Palm: Scaling language modeling with
pathways, 2022. arXiv:2204.02311.
[17] M. Najafi, S. Sadidpur, Paa: Persian author attribution using dense and recursive connection (2024).
[18] E. Tavan, M. Najafi, R. Moradi, Identifying ironic content spreaders on twitter using psychometrics,
contextual and ironic features with gradient boosting classifier., in: CLEF (Working Notes), 2022,
pp. 2687–2697.
[19] M. Najafi, E. Tavan, Text-to-text transformer in authorship verification via stylistic and semantical
analysis., 2022.
[20] I. Solaiman, M. Brundage, J. Clark, A. Askell, A. Herbert-Voss, J. Wu, A. Radford, G. Krueger,
J. W. Kim, S. Kreps, et al., Release strategies and the social impacts of language models, arXiv preprint
arXiv:1908.09203 (2019).
[21] B. Guo, X. Zhang, Z. Wang, M. Jiang, J. Nie, Y. Ding, J. Yue, Y. Wu, How close is chatgpt to human
experts? comparison corpus, evaluation, and detection, arXiv preprint arXiv:2301.07597 (2023).
[22] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov,
Roberta: A robustly optimized bert pretraining approach, arXiv preprint arXiv:1907.11692 (2019).
[23] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al., Language models are
unsupervised multitask learners, OpenAI blog 1 (2019) 9.
[24] A. Bakhtin, S. Gross, M. Ott, Y. Deng, M. Ranzato, A. Szlam, Real or fake? learning to discriminate
machine from human generated text, 2019. arXiv:1906.03351.
[25] A. Uchendu, T. Le, K. Shu, D. Lee, Authorship attribution for neural text generation, in: B.
Webber, T. Cohn, Y. He, Y. Liu (Eds.), Proceedings of the 2020 Conference on Empirical Methods
in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online,
2020, pp. 8384–8395. URL: https://aclanthology.org/2020.emnlp-main.673. doi:10.18653/v1/
2020.emnlp-main.673.
[26] X. Hu, P.-Y. Chen, T.-Y. Ho, Radar: Robust ai-text detection via adversarial learning, 2023.
arXiv:2307.03838.
[27] Y. Tian, H. Chen, X. Wang, Z. Bai, Q. Zhang, R. Li, C. Xu, Y. Wang, Multiscale positive-unlabeled
detection of ai-generated texts, arXiv preprint arXiv:2305.18149 (2023).
[28] W. Liang, M. Yuksekgonul, Y. Mao, E. Wu, J. Zou, Gpt detectors are biased against non-native
english writers, 2023. arXiv:2304.02819.
[29] E. Mitchell, Y. Lee, A. Khazatsky, C. D. Manning, C. Finn, Detectgpt: Zero-shot machine-generated
text detection using probability curvature, 2023. arXiv:2301.11305.
[30] S. Gehrmann, H. Strobelt, A. Rush, GLTR: Statistical detection and visualization of generated
text, in: M. R. Costa-jussà, E. Alfonseca (Eds.), Proceedings of the 57th Annual Meeting of the
Association for Computational Linguistics: System Demonstrations, Association for Computational
Linguistics, Florence, Italy, 2019, pp. 111–116. URL: https://aclanthology.org/P19-3019. doi:10.
18653/v1/P19-3019.
[31] J. Su, T. Y. Zhuo, D. Wang, P. Nakov, Detectllm: Leveraging log rank information for zero-shot
detection of machine-generated text, arXiv preprint arXiv:2306.05540 (2023).
[32] A. Grinbaum, L. Adomaitis, The ethical need for watermarks in machine-generated language, 2022.
arXiv:2209.03118.
[33] S. Abdelnabi, M. Fritz, Adversarial watermarking transformer: Towards tracing text provenance
with data hiding, 2021. arXiv:2009.03015.
[34] J. Kirchenbauer, J. Geiping, Y. Wen, J. Katz, I. Miers, T. Goldstein, A watermark for large language
models, 2024. arXiv:2301.10226.
[35] S. Longpre, L. Hou, T. Vu, A. Webson, H. W. Chung, Y. Tay, D. Zhou, Q. V. Le, B. Zoph, J. Wei, et al.,
The flan collection: Designing data and methods for effective instruction tuning, in: International
Conference on Machine Learning, PMLR, 2023, pp. 22631–22648.
[36] R. Taori, I. Gulrajani, T. Zhang, Y. Dubois, X. Li, C. Guestrin, P. Liang, T. B. Hashimoto, Stanford
alpaca: An instruction-following llama model, https://github.com/tatsu-lab/stanford_alpaca, 2023.
[37] C. Zhou, P. Liu, P. Xu, S. Iyer, J. Sun, Y. Mao, X. Ma, A. Efrat, P. Yu, L. Yu, S. Zhang, G. Ghosh,
M. Lewis, L. Zettlemoyer, O. Levy, Lima: Less is more for alignment, 2023. arXiv:2305.11206.
[38] S. Mangrulkar, S. Gugger, L. Debut, Y. Belkada, S. Paul, B. Bossan, Peft: State-of-the-art
parameter-efficient fine-tuning methods, https://github.com/huggingface/peft, 2022.
[39] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, Lora: Low-rank
adaptation of large language models, 2021. arXiv:2106.09685.
[40] T. Dettmers, A. Pagnoni, A. Holtzman, L. Zettlemoyer, Qlora: Efficient finetuning of quantized
llms, 2023. arXiv:2305.14314.
[41] M. Fröbe, M. Wiegmann, N. Kolyada, B. Grahm, T. Elstner, F. Loebe, M. Hagen, B. Stein, M. Potthast,
Continuous Integration for Reproducible Shared Tasks with TIRA.io, in: J. Kamps, L. Goeuriot,
F. Crestani, M. Maistro, H. Joho, B. Davis, C. Gurrin, U. Kruschwitz, A. Caputo (Eds.), Advances
in Information Retrieval. 45th European Conference on IR Research (ECIR 2023), Lecture Notes
in Computer Science, Springer, Berlin Heidelberg New York, 2023, pp. 236–241. doi:10.1007/
978-3-031-28241-6_20.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Almeida</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wainwright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mishkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , S. Agarwal,
          <string-name>
            <given-names>K.</given-names>
            <surname>Slama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ray</surname>
          </string-name>
          , et al.,
          <article-title>Training language models to follow instructions with human feedback</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>35</volume>
          (
          <year>2022</year>
          )
          <fpage>27730</fpage>
          -
          <lpage>27744</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>T. B. Brown</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Mann</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Ryder</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Subbiah</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Kaplan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Dhariwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Neelakantan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Shyam</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Sastry</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Askell</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Herbert-Voss</surname>
            , G. Krueger,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Henighan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Child</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Ramesh</surname>
            ,
            <given-names>D. M.</given-names>
          </string-name>
          <string-name>
            <surname>Ziegler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Winter</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Hesse</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Chen</surname>
            , E. Sigler,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Litwin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Chess</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Berner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>McCandlish</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Language models are few-shot learners</article-title>
          ,
          <year>2020</year>
          . arXiv:2005.14165.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Touvron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Stone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Albert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Almahairi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Babaei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bashlykov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Batra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhargava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhosale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bikel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Blecher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. C.</given-names>
            <surname>Ferrer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cucurull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Esiobu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Fuller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Goswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hartshorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Inan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kardas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kerkez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Khabsa</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Kloumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Korenev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Koura</surname>
          </string-name>
          , M.-
          <string-name>
            <surname>A. Lachaux</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Lavril</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Liskovich</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Mao</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          <string-name>
            <surname>Martinet</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Mihaylov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Mishra</surname>
            , I. Molybog,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Nie</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Poulton</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Reizenstein</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Rungta</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Saladi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Schelten</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>E. M.</given-names>
          </string-name>
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Subramanian</surname>
            ,
            <given-names>X. E.</given-names>
          </string-name>
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Taylor</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>J. X.</given-names>
          </string-name>
          <string-name>
            <surname>Kuan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          <string-name>
            <surname>Yan</surname>
            , I. Zarov,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Fan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Kambadur</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Narang</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Rodriguez</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Stojnic</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Edunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Scialom</surname>
          </string-name>
          ,
          <article-title>Llama 2: Open foundation and fine-tuned chat models</article-title>
          ,
          <year>2023</year>
          . arXiv:2307.09288.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sablayrolles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mensch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bamford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Chaplot</surname>
          </string-name>
          , D. de las Casas,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bressand</surname>
          </string-name>
          , G. Lengyel,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Saulnier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Lavaud</surname>
          </string-name>
          , M.-
          <string-name>
            <surname>A. Lachaux</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Stock</surname>
            ,
            <given-names>T. L.</given-names>
          </string-name>
          <string-name>
            <surname>Scao</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Lavril</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Lacroix</surname>
            ,
            <given-names>W. E.</given-names>
          </string-name>
          <string-name>
            <surname>Sayed</surname>
          </string-name>
          , Mistral 7b,
          <year>2023</year>
          . arXiv:2310.06825.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Jawahar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abdul-Mageed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. V.</given-names>
            <surname>Lakshmanan</surname>
          </string-name>
          ,
          <article-title>Automatic detection of machine generated text: A critical survey</article-title>
          , arXiv preprint arXiv:2011.01314 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N.</given-names>
            <surname>Lu</surname>
          </string-name>
          , S. Liu,
          <string-name>
            <given-names>R.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-S.</given-names>
            <surname>Ong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <article-title>Large language models can be guided to evade ai-generated text detection</article-title>
          ,
          <source>arXiv preprint arXiv:2305.10847</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bill</surname>
          </string-name>
          , T. Eriksson,
          <article-title>Fine-tuning a llm using reinforcement learning from human feedback for a therapy chatbot application</article-title>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Moslem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Haque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Kelleher</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. Way,</surname>
          </string-name>
          <article-title>Adaptive machine translation with large language models</article-title>
          ,
          <year>2023</year>
          . arXiv:2301.13294.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nejjar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zacharias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Stiehle</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Weber</surname>
          </string-name>
          ,
          <article-title>Llms for science: Usage for code generation and data analysis</article-title>
          ,
          <source>arXiv preprint arXiv:2311.16733</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          , E. Zangerle,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification, in: Experimental IR Meets Multilinguality, Multimodality, and Interaction</article-title>
          .
          <source>Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF</source>
          <year>2024</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          , E. Zangerle,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</article-title>
          , in: L.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Mulhem</surname>
          </string-name>
          , G. Quénot,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>