<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>RoBERT-IA: Human-AI Collaborative Text Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Deyson Gómez Sánchez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jeison D. Jimenez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>María Paz Ramírez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jairo E. Serrano</string-name>
          <email>jserrano@utb.edu.co</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan C. Martinez-Santos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edwin Puertas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidad Tecnológica de Bolívar, School of Engineering, Architecture, and Design</institution>
          ,
          <addr-line>Cartagena de Indias, 130013</addr-line>
          <country country="CO">Colombia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Within the framework of the Generative AI Detection 2025 SubTask 2: Human-AI Collaborative Text Classification challenge, this study addresses the classification of texts co-authored by humans and large language models (LLMs), aiming to identify the degree of contribution of each author across six specific categories. Given the increasing accessibility and use of models such as GPT-4o, Claude 3.5, and Gemini 1.5-pro, the proliferation of AI-generated or AI-assisted content presents significant challenges in areas including misinformation, academic integrity, and content authenticity. To tackle this challenge, a fine-tuning process was applied to the RoBERTa-base model, employing strategies to mitigate class imbalance such as undersampling and loss weighting. The dataset was split into 80% for training and 20% for evaluation, considering key metrics such as accuracy, F1-score, and macro recall, the latter used as the official classification metric. Preliminary results indicate that loss weighting for minority classes is a more suitable strategy than synthetic data generation, as it preserves the naturalness of the texts. Evaluation on the test set demonstrated a balanced improvement in key metrics, achieving a macro recall of 47.15% on the evaluation dataset, underscoring the effectiveness of the approach in discriminating among the various forms of human-AI collaboration in text creation. Furthermore, post-competition evaluation showed that increasing the number of training epochs surpasses the baseline metrics.</p>
      </abstract>
      <kwd-group>
        <kwd>Generative AI Detection</kwd>
        <kwd>Text Classification</kwd>
        <kwd>Large Language Models (LLMs)</kwd>
        <kwd>AI-generated content</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The rapid emergence of large language models (LLMs) such as GPT-4o, Claude 3.5, and Gemini
1.5-pro has transformed textual content generation across diverse domains, including digital platforms,
education, media, and academia, as noted by Jakesch et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. These models produce text with high syntactic
and semantic quality, enabling efficient human-machine collaboration. However, their widespread
adoption raises significant concerns regarding misinformation, academic integrity, content authenticity,
and authorship transparency, as discussed by Shah et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Addressing these challenges requires robust detection
mechanisms capable of discerning the degree of human and automated involvement in text creation. This
study is situated within Subtask 2 of the Voight-Kampf 2025 challenge, titled Human-AI Collaborative
Text Classification, which focuses on categorizing documents co-authored by humans and generative
models into six distinct classes based on authorship composition, as described by Bevendorf et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. We propose a
supervised learning architecture based on the RoBERTa model, fine-tuned specifically for this task. To
overcome the severe class imbalance in the dataset, which adversely affects model generalization, we
employ a combination of oversampling, undersampling, and SMOTE techniques, following Zeng et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Furthermore, we critically examine back-translation as a data augmentation method, highlighting its
limitations for tasks where preserving stylistic authorship is essential.
      </p>
      <p>
        Our work not only advances methodological approaches to multi-class classification in human-AI
collaborative writing contexts but also provides fundamental insights into the complex nature of textual
collaboration between humans and generative models, with implications for improving authorship
verification and combating misinformation, as noted by Ragab et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. State of the Art</title>
      <p>
        The detection of texts generated by artificial intelligence (AI) has emerged as a critical research domain
in response to the rapid and widespread adoption of large language models (LLMs). Initial efforts
predominantly addressed the binary classification task of distinguishing human-authored from
AI-generated texts, laying the groundwork for more sophisticated detection methodologies. Notably,
Uchendu et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] extended this paradigm by differentiating not only between human and AI authorship
but also among distinct generative models (e.g., GPT-2 and GROVER) through stylometric and syntactic
feature analysis.
      </p>
      <p>
        Despite these advances, human capacity to reliably detect AI-generated content remains fundamentally
limited. Jakesch et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] demonstrate that human detection accuracy approximates random chance
(around 50%), even when incentivized or trained, due to the exploitation of flawed cognitive heuristics by AI
systems. This finding underscores the necessity for robust automated detection systems. Bevendorf et al.
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] contribute a significant benchmark with the “Voight-Kampf” challenge, an adversarial competition
assessing detection systems’ resilience across 70 test variants, including obfuscation techniques and
cross-lingual scenarios. Although some systems surpassed baseline performance, none achieved perfect
classification, highlighting persistent challenges, especially as language models evolve.
The growing prevalence of hybrid texts, produced via human-AI collaboration, further complicates
detection efforts. Zeng et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] propose a two-stage segmentation and classification pipeline, revealing
that frequent human editing and shifting authorship markedly degrade detector performance.
Complementing this, Richburg et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] compare authorship embedding models (LUAR), n-grams, and
Transformer-based approaches, demonstrating a trade-off between the binary detection superiority of
embeddings and the robustness of n-gram models for fine-grained authorship verification in collaborative
texts.
      </p>
      <p>
        Recent technical innovations have propelled efficiency and accuracy gains. Bao et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] introduce
a zero-shot detection method leveraging conditional probability curvature, significantly accelerating
detection while maintaining high precision. Stylistic and linguistic feature-based methods continue to
show promise: Rujeedawa et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] achieve 82.6% accuracy with Random Forest classifiers using metrics
such as text length and lexical richness; Shah et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] enhance this by incorporating explainable AI to
reach 93% accuracy, identifying that AI-generated texts manifest greater lexical richness but reduced
diversity, corroborating Uchendu et al.’s findings.
      </p>
      <p>
        In deep learning, Ragab et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] present a hybrid CNN-GRU architecture optimized by a metaheuristic
algorithm, attaining over 99% accuracy by combining local and long-term dependency features. Oghaz
et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] similarly affirm the robustness of Transformer models, notably RoBERTa, which achieves an
F1-score of 0.992 even on short text excerpts. Building on these foundations, recent research broadens
the scope and depth of detection capabilities. Boran et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] highlight the criticality of authorship
identification in limited-sample contexts, crucial for plagiarism detection in heterogeneous digital
environments.
      </p>
      <p>
        Mizumoto et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] apply linguistic fingerprinting and random forest classification to distinguish
ChatGPT-generated essays from student work, underscoring the imperative for automated detection
integration in educational settings. Fiedler and Döpke [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] reveal the considerable difficulty experts
face in reliably identifying AI-generated academic texts, showing parity between human and machine
performance limitations, especially for high-quality AI-generated content. This accentuates the need
for advanced computational solutions tailored to complex detection scenarios.
      </p>
      <p>
        From a stylometric perspective, Berriche and Larabi-Marie-Sainte [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] introduce methods exploiting
intrinsic writing style features to detect ChatGPT-based plagiarism with exceptional precision (up
to 100% ), including classification of mixed human-AI texts, directly addressing challenges posed by
co-authored and paraphrased documents. Desaire et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] develop classifiers capable of distinguishing
human academic authorship from ChatGPT-generated content with over 99% accuracy, leveraging
discourse-specific linguistic markers vital for formal plagiarism detection.
      </p>
      <p>
        Lastly, Lau and Zubiaga [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] investigate how human paraphrasing affects LLM-generated text detection,
demonstrating that paraphrases markedly degrade detector performance and necessitate novel strategies
for resilient detection in real-world edited texts. Beyond technical performance, this body of work
reflects broader societal implications. Reliable AI-generated text detection is essential for preserving
academic integrity, combating misinformation, and fostering trust in digital communication. However,
ethical considerations—such as privacy, transparency, and the potential for misclassification—must guide
future developments. Research directions should encompass explainable AI frameworks, multimodal
detection methods, and cross-linguistic generalization, ensuring robust, fair, and interpretable detection
systems.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Data</title>
      <p>
        For the development of this work, we employed the dataset provided by the organizers of the PAN-CLEF
2025 competition, specifically for Subtask 2: Human-AI Collaborative Text Classification [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ][
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. This
dataset is designed to address the identification and categorization of documents co-authored by humans
and large language models (LLMs), such as GPT-4o, Claude 3.5, and Gemini 1.5-pro.
The dataset includes texts in English, Spanish, and German, spanning multiple domains such as academia,
journalism, social media, and education. This diversity reflects the widespread proliferation and
applicability of AI-generated or AI-assisted content across various contexts and thematic areas.
To capture the different modes of collaboration between humans and machines in text generation, the
documents are annotated with labels that distinguish six specific categories: texts entirely written
by humans; texts initiated by humans and continued by machines; texts written by humans and
subsequently polished by machines; texts written by machines and later humanized (obfuscated);
texts generated by machines and later edited by humans; and deeply mixed texts with interwoven
contributions from both authors. The distribution of each class is presented in Table 1.
      </p>
      <sec id="sec-3-1">
        <title>Table 1: Label Categories</title>
        <p>Machine-written, then machine-humanized; Human-written, then machine-polished; Fully human-written; Human-initiated, then machine-continued; Deeply-mixed text (human and machine parts); Machine-written, then human-edited.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>In this section, we detail the methodology outlined in Figure 1, which was used to classify documents
co-authored by humans and LLMs based on the level of contribution from each party.</p>
      <p>[Figure 1: Methodology pipeline: input; read and load data; class balancing (undersampling and oversampling); data augmentation; class-weight calculation; tokenization and data splitting (train, validation, test); model training; model evaluation; output.]</p>
      <sec id="sec-4-1">
        <title>4.1. Class Balancing</title>
        <p>At this stage, we aimed to reduce the imbalance among the categories in our dataset. To this end, classes
1 and 2 were downsampled via undersampling to 80,000 samples each, while class 5 was oversampled
to reach 10,000 samples.</p>
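        <p>This balancing step can be sketched as follows, assuming the data is held in a pandas DataFrame; the "label" column name is an assumption, while the sample targets follow the figures above:</p>
        <preformat>
```python
import pandas as pd

def balance_classes(df, down_to=80_000, up_to=10_000, seed=42):
    """Downsample majority classes 1 and 2 and oversample minority
    class 5 (sampling with replacement), per Section 4.1."""
    parts = []
    for label, group in df.groupby("label"):
        if label in (1, 2) and len(group) > down_to:
            group = group.sample(n=down_to, random_state=seed)
        elif label == 5 and up_to > len(group):
            group = group.sample(n=up_to, replace=True, random_state=seed)
        parts.append(group)
    return pd.concat(parts, ignore_index=True)
```
        </preformat>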
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Data Augmentation</title>
        <p>Given that class 5 remained significantly underrepresented compared to the other classes, a 50%
probability was applied to perform data augmentation on each instance in this class. Augmentation
techniques included synonym replacement and word deletion.</p>
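        <p>The augmentation step can be sketched as follows; the 50% application probability comes from the text, while the per-word deletion rate and the toy synonym table are illustrative placeholders, since the paper does not specify the lexical resource used:</p>
        <preformat>
```python
import random

# Toy synonym table; a stand-in for whatever lexical resource is used.
SYNONYMS = {"good": ["fine", "solid"], "fast": ["quick", "rapid"]}

def augment(text, p_apply=0.5, p_delete=0.1, rng=None):
    """With probability p_apply, apply random word deletion and
    synonym replacement to a minority-class instance."""
    rng = rng or random.Random()
    if rng.random() >= p_apply:          # leave the instance untouched
        return text
    words = []
    for w in text.split():
        if rng.random() >= p_delete:     # keep the word (deletion step)
            if w in SYNONYMS and rng.random() >= 0.5:
                w = rng.choice(SYNONYMS[w])   # replacement step
            words.append(w)
    return " ".join(words) if words else text
```
        </preformat>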
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Loss Weighting</title>
        <p>To avoid excessive modification of the dataset through oversampling or undersampling, we opted for a
loss weighting strategy, preserving the integrity and original nature of the data—an essential factor for
meaningful analysis.</p>
        <p>At this stage, class weights were computed based on the current distribution of the data. This strategy
aims to mitigate the impact of class imbalance during model training by penalizing classification errors
in minority classes more heavily.</p>
        <p>The class_weight parameter was employed during the training phase to assign higher penalties to errors
in underrepresented classes. This allows the model to better generalize and reduces its bias toward
majority classes.</p>
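        <p>A sketch of the weight computation, using the standard inverse-frequency ("balanced") heuristic; the exact formula is not stated in the paper, so this is an assumption. The resulting per-class weights can then be supplied to a weighted loss function during training:</p>
        <preformat>
```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: w_c = n_samples / (n_classes * n_c).
    Rarer classes receive larger weights, so misclassifying them
    is penalized more heavily during training."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * n_c) for c, n_c in sorted(counts.items())}
```
        </preformat>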
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Feature Extraction</title>
        <p>
          To extract relevant textual features under the adjustments described above, we implemented a
finetuning process of the RoBERTa Transformer model (roberta-base) specifically for the text classification
task [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. This approach leverages RoBERTa’s semantic embeddings, which have shown to significantly
enhance classification performance by capturing relationships between words within the text.
        </p>
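        <p>The model setup can be sketched with the Hugging Face transformers library as follows; the training loop, optimizer, and batch size are omitted because the paper does not report them, and the helper names here are illustrative:</p>
        <preformat>
```python
NUM_LABELS = 6              # the six collaboration categories
MODEL_NAME = "roberta-base"
MAX_LENGTH = 512            # RoBERTa's maximum sequence length

def build_model_and_tokenizer():
    # Imported inside the function so the constants above remain
    # usable even without the transformers package installed.
    from transformers import (AutoTokenizer,
                              AutoModelForSequenceClassification)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=NUM_LABELS)
    return model, tokenizer

def encode(tokenizer, texts):
    """Tokenize a batch of texts for fine-tuning."""
    return tokenizer(texts, truncation=True, max_length=MAX_LENGTH,
                     padding="max_length", return_tensors="pt")
```
        </preformat>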
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Evaluation</title>
      <p>To measure the model’s performance, the metrics of accuracy, F1-score, precision, and recall were
analyzed both at the global dataset level and for each individual class. Additionally, the loss function’s
progression during the training and evaluation phases was continuously monitored to prevent overfitting
or poor generalization of the model. Once the dataset was preprocessed as specified in the Data section,
the resulting loss weights are shown in Table 2.</p>
      <sec id="sec-5-1">
        <title>Loss Weights per Class</title>
        <p>[Table 2: loss weights for Classes 0 through 5.]</p>
        <p>Based on the above data, it is evident that the model assigns greater weight to misclassifications in
classes 3, 4, and 5, as these are the classes with fewer samples.</p>
        <p>To evaluate the performance of the fine-tuned RoBERTa model, the original dataset was split into 80%
for training and 20% for testing. This split enabled rigorous analysis during both the training process
and the final validation on previously unseen data, prior to the subsequent evaluation on the additional
unseen data provided in the competition’s evaluation phase.</p>
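        <p>The split and the reported metrics can be sketched with scikit-learn as follows; stratifying the split by label is an assumption, since the paper states only the 80/20 proportion:</p>
        <preformat>
```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, recall_score

def split_80_20(texts, labels, seed=42):
    """Stratified 80/20 train/test split preserving class proportions."""
    return train_test_split(texts, labels, test_size=0.2,
                            stratify=labels, random_state=seed)

def report(y_true, y_pred):
    """Macro recall is the official ranking metric of the subtask."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "macro_recall": recall_score(y_true, y_pred, average="macro"),
        "macro_f1": f1_score(y_true, y_pred, average="macro"),
    }
```
        </preformat>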
        <sec id="sec-5-1-1">
          <title>5.1. Model testing</title>
          <p>We evaluated the model’s effectiveness based on the four key metrics previously mentioned, along with
the training and validation losses over a total of 13,797 steps within a single full epoch, as presented in
Table 3, which shows representative stages of the training process.
The results shown in the table demonstrate solid and consistent model performance throughout
training. A progressive reduction in loss is observed in both training and validation, accompanied by
continuous improvements in classification metrics. Notably, the weighted F1 score reached values above
0.96 in the final stages, indicating an efficient balance between precision and recall.</p>
          <p>Additionally, the class-wise analysis shows that the model achieves F1 scores ranging from 0.908 to
0.999 across classes, demonstrating its ability to adequately handle the diversity and complexity of
the six evaluated categories. These results confirm the robustness and effectiveness of the fine-tuning
process, ensuring high performance in the human-AI collaborative classification task.
Upon completion of training, the trained model was evaluated using the test set, corresponding to 20%
of the initially reserved data, yielding the metrics shown in Table 4. As observed, the weighted metrics
for F1, precision, and recall continue to reflect balanced and consistent performance. Furthermore, the
evaluation was completed in 280.66 seconds, with a processing throughput close to 98 samples per
second, indicating efficiency in model deployment.</p>
        </sec>
        <sec id="sec-5-1-2">
          <title>5.2. Competition Evaluation</title>
          <p>The results revealed that the strategy implemented by our team, identified as VerbaNex and registered in
the challenge under the name "gsdeyson", ranked 15th among the participating teams in the PAN-CLEF
2025 Subtask 2 challenge. The detailed performance metrics provided by the competition organizers are
presented in Table 5.
Based on the table, we note that our model placed slightly below the established baseline and the team
in 14th position. With a Macro Recall and Macro F1 of 47.15% and an overall accuracy of 56.24%, the
model’s performance highlights significant areas for improvement.</p>
          <p>These results demonstrate that, although the model is capable of addressing the complexity of the
task, its ability to generalize and discriminate among the different categories has not yet reached the
expected level compared to the competition. Factors such as class balancing, preprocessing quality, or
hyperparameter optimization could be revisited to enhance performance.</p>
        </sec>
        <sec id="sec-5-1-3">
          <title>5.3. Post-competition evaluation</title>
          <p>Analyzing the training data, since the validation loss continued to decrease and did not exceed the
training loss, the model was deemed amenable to further training. Therefore, training was extended
to 3 epochs, and predictions were made on the evaluation dataset. Unfortunately, this submission was
made after the deadline; however, the PAN-CLEF 2025 organizers kindly provided the prediction results
from this last submission, as shown in Table 6.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>This study presents the methodology and results obtained in the Voight-Kampf Generative AI Detection
2025 competition. Our system incorporated class balancing techniques through oversampling and
undersampling to address data distribution imbalance, as well as data augmentation strategies based
on random word deletion and random synonym insertion to increase the diversity of minority classes.
During the training process, class-specific weight calculation was implemented as a complementary
measure to mitigate persistent imbalance. This methodology was applied to fine-tune the RoBERTa-base
model, resulting in a 15th place ranking in the competition, highlighting the need for further refinements
in our approach.</p>
      <p>The main areas for improvement identified include: first, more effectively addressing the issue
of imbalanced data distribution by selecting more suitable feature engineering techniques, such as
incorporating additional semantic and linguistic features into RoBERTa’s base representations; second,
implementing more robust multilingual models that allow better generalization across texts in
different languages; and finally, conducting a rigorous study and selection of the hyperparameters
associated with the model training process.</p>
      <p>Our research contributes to the effort to understand and detect texts generated by large language
models, as well as texts written by humans and outputs resulting from human-AI collaboration. We
remain committed to the continuous advancement of our methodology and the refinement of our model
to accurately identify authentic and original literary productions, distinguishing them from synthetic
outputs generated by language models.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The authors would like to acknowledge the support provided by the master’s degree scholarship program
in engineering at the Universidad Tecnológica de Bolívar (UTB) in Cartagena, Colombia.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used GPT-4 for grammar, spelling, and translation
assistance. After using this tool, the author(s) reviewed and edited the content as needed and take full
responsibility for the publication’s content.
The GitHub repository containing the implementation and resources of this work is available via GitHub.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zeng</surname>
          </string-name>
          , S. Liu,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gašević</surname>
          </string-name>
          , G. Chen,
          <article-title>Detecting ai-generated sentences in human-ai collaborative hybrid texts: Challenges, strategies, and insights</article-title>
          ,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2403.03506. arXiv:
          <volume>2403</volume>
          .
          <fpage>03506</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Uchendu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Authorship attribution for neural text generation</article-title>
          , in: B.
          <string-name>
            <surname>Webber</surname>
            , T. Cohn,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>He</surname>
          </string-name>
          , Y. Liu (Eds.),
          <source>Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          ,
          <article-title>Association for Computational Linguistics</article-title>
          , ????, pp.
          <fpage>8384</fpage>
          -
          <lpage>8395</lpage>
          . URL: https://aclanthology.org/
          <year>2020</year>
          .emnlp-main.
          <volume>673</volume>
          /. doi:
          <volume>10</volume>
          .18653/v1/
          <year>2020</year>
          . emnlp-main.
          <volume>673</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Jakesch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Hancock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Naaman</surname>
          </string-name>
          ,
          <article-title>Human heuristics for AI-generated language are flawed 120 (????) e2208839120</article-title>
          . URL: https://www.pnas.org/doi/10.1073/pnas.2208839120. doi:
          <volume>10</volume>
          .1073/ pnas.2208839120, publisher:
          <source>Proceedings of the National Academy of Sciences.</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dürlich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gogoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Talman</surname>
          </string-name>
          , E. Stamatatos,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the “voight-kampf” generative AI authorship verification task at PAN and ELOQUENT~</article-title>
          <year>2024</year>
          (????).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Richburg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Carpuat</surname>
          </string-name>
          ,
          <article-title>Automatic authorship analysis in human-AI collaborative writing</article-title>
          , in: N.
          <string-name>
            <surname>Calzolari</surname>
            , M.-
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Kan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Hoste</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Lenci</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Sakti</surname>
          </string-name>
          , N. Xue (Eds.),
          <source>Proceedings of the 2024 Joint International Conference on Computational Linguistics</source>
          ,
          <article-title>Language Resources and Evaluation (LREC-COLING 2024), ELRA</article-title>
          and
          <string-name>
            <given-names>ICCL</given-names>
            ,
            <surname>Torino</surname>
          </string-name>
          , Italia,
          <year>2024</year>
          , pp.
          <fpage>1845</fpage>
          -
          <lpage>1855</lpage>
          . URL: https://aclanthology.org/
          <year>2024</year>
          .lrec-main.
          <volume>165</volume>
          /.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Teng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Fast-DetectGPT: Efficient zero-shot detection of machine-generated text via conditional probability curvature</article-title>
          ,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2310.05130. arXiv:2310.05130.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M. I. H.</given-names>
            <surname>Rujeedawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pudaruth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Malele</surname>
          </string-name>
          ,
          <article-title>Unmasking AI-generated texts using linguistic and stylistic features</article-title>
          ,
          <source>International Journal of Advanced Computer Science and Applications</source>
          <volume>16</volume>
          (
          <year>2025</year>
          ). URL: http://dx.doi.org/10.14569/IJACSA.2025.0160321. doi:10.14569/IJACSA.2025.0160321.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ranka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Dedhia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Prasad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Muni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bhowmick</surname>
          </string-name>
          ,
          <article-title>Detecting and unmasking AI-generated texts through explainable artificial intelligence using stylistic features</article-title>
          ,
          <source>International Journal of Advanced Computer Science and Applications</source>
          <volume>14</volume>
          (
          <year>2023</year>
          ). URL: http://dx.doi.org/10.14569/IJACSA.2023.01410110. doi:10.14569/IJACSA.2023.01410110.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ragab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. B.</given-names>
            <surname>Ashary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Kateb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hakeem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mosli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Albogami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nooh</surname>
          </string-name>
          ,
          <article-title>Classification of human-written and AI-generated sentences using a hybrid CNN-GRU model optimized by the spotted hyena algorithm</article-title>
          ,
          <source>Alexandria Engineering Journal</source>
          <volume>126</volume>
          (
          <year>2025</year>
          )
          <fpage>116</fpage>
          -
          <lpage>130</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S1110016825005666. doi:10.1016/j.aej.2025.04.071.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Maktabdar Oghaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. Babu</given-names>
            <surname>Saheer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Dhame</surname>
          </string-name>
          , G. Singaram,
          <article-title>Detection and classification of ChatGPT-generated content using deep transformer models</article-title>
          ,
          <source>Frontiers in Artificial Intelligence</source>
          <volume>8</volume>
          (
          <year>2025</year>
          ). URL: https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1458707. doi:10.3389/frai.2025.1458707.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Boran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martinaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Hossain</surname>
          </string-name>
          ,
          <article-title>Authorship identification on limited samplings</article-title>
          ,
          <source>Computers &amp; Security</source>
          <volume>97</volume>
          (
          <year>2020</year>
          )
          <fpage>101943</fpage>
          . URL: https://linkinghub.elsevier.com/retrieve/pii/S0167404820302194. doi:10.1016/j.cose.2020.101943.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mizumoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yasuda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tamura</surname>
          </string-name>
          ,
          <article-title>Identifying ChatGPT-generated texts in EFL students' writing: Through comparative analysis of linguistic fingerprints</article-title>
          ,
          <source>Applied Corpus Linguistics</source>
          <volume>4</volume>
          (
          <year>2024</year>
          )
          <fpage>100106</fpage>
          . URL: https://linkinghub.elsevier.com/retrieve/pii/S2666799124000236. doi:10.1016/j.acorp.2024.100106.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fiedler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Döpke</surname>
          </string-name>
          ,
          <article-title>Do humans identify AI-generated text better than machines? Evidence based on excerpts from German theses</article-title>
          ,
          <source>International Review of Economics Education</source>
          <volume>49</volume>
          (
          <year>2025</year>
          )
          <fpage>100321</fpage>
          . URL: https://linkinghub.elsevier.com/retrieve/pii/S1477388025000131. doi:10.1016/j.iree.2025.100321.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>L.</given-names>
            <surname>Berriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Larabi-Marie-Sainte</surname>
          </string-name>
          ,
          <article-title>Unveiling ChatGPT text using writing style</article-title>
          ,
          <source>Heliyon</source>
          <volume>10</volume>
          (
          <year>2024</year>
          )
          <fpage>e32976</fpage>
          . URL: https://linkinghub.elsevier.com/retrieve/pii/S2405844024090078. doi:10.1016/j.heliyon.2024.e32976.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>H.</given-names>
            <surname>Desaire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Chua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Isom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jarosova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hua</surname>
          </string-name>
          ,
          <article-title>Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools</article-title>
          ,
          <source>Cell Reports Physical Science</source>
          <volume>4</volume>
          (
          <year>2023</year>
          )
          <fpage>101426</fpage>
          . URL: https://linkinghub.elsevier.com/retrieve/pii/S266638642300200X. doi:10.1016/j.xcrp.2023.101426.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H. T.</given-names>
            <surname>Lau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zubiaga</surname>
          </string-name>
          ,
          <article-title>Understanding the effects of human-written paraphrases in LLM-generated text detection</article-title>
          ,
          <source>Natural Language Processing Journal</source>
          <volume>11</volume>
          (
          <year>2025</year>
          )
          <fpage>100151</fpage>
          . URL: https://linkinghub.elsevier.com/retrieve/pii/S2949719125000275. doi:10.1016/j.nlp.2025.100151.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gipp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Greiner-Petter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shelmanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <article-title>Overview of PAN 2025: Voight-Kampff Generative AI Detection, Multilingual Text Detoxification, Multi-Author Writing Style Analysis, and Generative Plagiarism Detection</article-title>
          , in:
          <string-name>
            <given-names>J. C.</given-names>
            <surname>de Albornoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mothe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piroi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Sixteenth International Conference of the CLEF Association (CLEF 2025)</source>
          , Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tsivgun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abassy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mansurov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. N.</given-names>
            <surname>Ta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Elozeiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. V.</given-names>
            <surname>Tomar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Geng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Artemova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shelmanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Habash</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the “Voight-Kampff” Generative AI Authorship Verification Task at PAN and ELOQUENT 2025</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          , D. Spina (Eds.),
          <source>Working Notes of CLEF 2025 - Conference and Labs of the Evaluation Forum</source>
          , CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>RoBERTa: A robustly optimized BERT pretraining approach</article-title>
          , CoRR abs/1907.11692 (
          <year>2019</year>
          ). URL: http://arxiv.org/abs/1907.11692. arXiv:1907.11692.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>