<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Team HHU - An Ensemble-Based Approach to Multi-Author Writing Style Analysis Combining Experts for Different Difficulty Levels</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Philipp Meier</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katarina Boland</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Kallmeyer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefan Dietze</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Heinrich-Heine-University</institution>
          ,
          <addr-line>Universitätsstr. 1, Düsseldorf</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>In the Multi-Author Writing Style Analysis task, models must identify author changes between two subsequent sentences in a multi-author document. To solve the task, we deploy an ensemble of three transformer-based language models acting as experts for different difficulty levels, together with a weighting mechanism that weighs the model predictions. Furthermore, we use data augmentation to obtain a more balanced dataset and further enhance the performance of our approach. Our ensemble method outperforms a model trained on the complete dataset containing data of all difficulty levels, which shows that our model succeeds in learning and selecting more useful features for the different datasets. However, when examining the datasets individually, the best performance is achieved by the respective expert, which shows that there is room for improvement in the weighting mechanism. This work demonstrates how language models can be combined to tackle the Multi-Author Writing Style Analysis task for data that is heterogeneous in terms of difficulty and domain.</p>
      </abstract>
      <kwd-group>
        <kwd>PAN 2025</kwd>
        <kwd>Multi-Author Writing Style</kwd>
        <kwd>Ensemble Methods</kwd>
        <kwd>Style Change Detection</kwd>
        <kwd>Writing Style Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The goal of the PAN Multi-Author Writing Style Analysis [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] shared task at CLEF is to identify
text positions in a multi-author document at which the author changes. A model is given a sentence
pair and must classify whether an author change occurs in between or not. This task is important for
applications like plagiarism detection [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] or machine-generated text detection [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
While previous editions of this task focused on author changes at the paragraph-level within Reddit
comments [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] [
        <xref ref-type="bibr" rid="ref9">9</xref>
          ], this year’s challenge targets authorship changes between sentences from Reddit comments, providing substantially less context for distinguishing authorship. Data is provided across three different difficulty levels: easy, medium and hard. These difficulty levels vary with respect to the extent to which topics or syntax differ (or do not differ) between texts from different authors. Topic information is a promising feature for detecting author changes in the easy split. Since topic variation decreases with increasing difficulty, models have to rely more on stylistic cues to detect an author change in the medium and hard splits.
      </p>
      <p>In a realistic scenario, a differentiation of the input data into difficulty levels is not given. Therefore, we propose a model that assumes no prior knowledge about the difficulty of an input sentence pair during inference but still achieves high performance on every difficulty level by incorporating and selecting appropriate experts. The model consists of three language models, each trained on one specific difficulty level (easy, medium, hard). Their predictions (logits) are combined and weighted through a multilayer perceptron, which weights the predictions according to the predicted difficulty level. During training, the multilayer perceptron receives the difficulty level taken from the input data as a one-hot encoding for training feedback. Thus, the multilayer perceptron should learn which expert's prediction to prioritize when no information about the difficulty level is available during inference. These weights prioritize the prediction of the language model that was fine-tuned on the difficulty level matching that of the input pair. Through this generalization capability, our proposed model should be able to handle real-world scenarios where the difficulty level of input pairs is unknown.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Writing Style</title>
        <p>
          Besides this shared task, writing style analysis has been approached from various angles. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] incorporate lexical and syntactic features to analyze writing style. Such features are often tailored towards a particular dataset and lack generalizability to other datasets [
          <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
          ]. More general approaches use language models like BERT [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], which are pre-trained on large text corpora and fine-tuned on writing style tasks [15]. Another approach is contrastive learning, which aims to pull instances of the same class close to each other while pushing instances from different classes away from each other in the embedding space, as in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] and [16].
        </p>
        <sec id="sec-2-1-1">
          <title>2.1.1. Multi-Author Writing Style Analysis</title>
          <p>In previous iterations of the shared task, paragraphs from Reddit were used as training data. [17] deployed an ensemble of language models, combining their predictions through majority voting. For easy and medium cases, LaBSE embeddings [18] were used to measure the similarity between the input sentences. The language models were fine-tuned on all difficulty partitions. Their model yields an F1-Macro score of 0.96 for easy, 0.85 for medium and 0.86 for hard instances. However, this approach requires knowledge about the difficulty of input pairs during inference. [16] used contrastive learning and data augmentation in 2023, yielding an F1 score of 0.91 for easy, 0.82 for medium and 0.68 for hard.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Ensemble Methods</title>
        <p>Ensemble mechanisms have multiple facets in related work: [19] used an ensemble mechanism for the learning-with-disagreement task, training a supervised classifier on the hidden-state [CLS] representations of three fine-tuned language models. Mnassri et al. [20] tried different ensemble mechanisms like soft voting, maximum value and stacking for hate speech detection. [21] combined BERT logits with linguistic features through stacking to identify propaganda.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Approach</title>
      <p>To address the varying difficulty of author change prediction, we propose an ensemble model that combines the strengths of specialized language models through a weighting mechanism which prioritizes the prediction of the expert model matching the difficulty. Using an ensemble model was inspired by [17]. The architecture is shown in Figure 2. To increase robustness, the ensemble model includes learned weights, which capture cues in the input pair that signal which expert's prediction to prioritize. Unlike [17], our model does not rely on a majority vote, since this could outvote the expert's opinion. Since our model relies on fine-tuned language models, extracting stylistic features is not necessary. Additionally, we used data augmentation similar to [16] in order to mitigate the unbalanced nature of the data (there are many more instances of no author change than of author change among the sentence pairs). Initial experiments indicated that language models like ERNIE [22], RoBERTa [23], BERT or DeBERTa [24] are superior to Logistic Regression or Support Vector Machines using linguistic features; the linguistic features covered reading-ease scores, capitalization ratio and spelling errors. Experiments with ERNIE, RoBERTa, DeBERTa, Electra and BERT for difficulty-level-specific predictions led to the choice of ERNIE as expert for easy and hard instances and RoBERTa for medium instances. ERNIE is based on the BERT architecture but is more aware of named entities. ERNIE trained on the easy split yields an F1-Macro of 0.96, RoBERTa trained on the medium split an F1-Macro of 0.82, and ERNIE trained on the hard split an F1-Macro of 0.82. Further results on the validation set are shown in Table 2 in Section 4.</p>
      <p>When combining experts, a majority vote could outvote correct predictions. Therefore, we design the ensemble mechanism as a weighting mechanism. This way, the model does not require any knowledge about the difficulty level during inference, mirroring real-world applications. Through the weighting mechanism, the ensemble model is expected to learn to prioritize the most relevant expert prediction based on the input itself.</p>
      <sec id="sec-3-1">
        <title>3.1. Data augmentation</title>
        <p>Due to the imbalance between negative (no author change) and positive (author change) pairs, the training dataset was augmented similar to [16]. As illustrated in Figure 1, the document was grouped based on the labels indicating an author change for each sentence pair. As augmentation data, the whole provided training dataset of PAN 2025 [25] was used. Augmentation was performed under specific constraints: we add author-change instances by combining sentences belonging to two subsequent groups. The first sentence of such a new pair must not be the first of a group, while the second sentence of each new pair must be the first of its group. This ensures that augmented pairs do not span authorship boundaries in a way that might introduce artificial topic shifts, which could mislead the model. Newly added author-change pairs in Figure 1 are s2 to s4 and s5, s6, s7 to s9, respectively. Table 1 gives the size of the original and the augmented data. The presented augmentation method was not able to create an exactly balanced dataset. As difficulty level for the augmented data, the original label of the partition was used. For example, if the augmentation is done for the easy partition, the augmented data is also labeled as 'easy'. However, in a post-hoc analysis, we found that this assumption does not always hold: augmented instances for the hard partition can resemble 'easy' or 'medium' instances. Such instances demonstrate a clear topic shift between the sentences of a pair and not necessarily a difference in writing style. For example, consider the following labels: [0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0]. In total, these groups contain 13 sentences. Through the augmentation method, one receives 9 no-author-change instances and 5 author-change instances. This is caused by the last group, which contributes more no-author-change pairs than author-change pairs when pairing its first sentence with group 3. During data augmentation, we also filtered out duplicates, resulting in a smaller amount of no-author-change data compared to the original dataset. This affected moderator messages, for example, which were often automated comments with the same content and mainly contained in the easy split.</p>
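        <p>To make the grouping and pairing constraints concrete, the following sketch shows one possible reading of the augmentation step. It is illustrative only: function and variable names are ours, not from the submitted code, and the duplicate filtering mentioned above is omitted.</p>
        <preformat>
```python
def augment_author_change_pairs(sentences, change_labels):
    """Illustrative sketch of the augmentation constraints described above.

    change_labels[i] is 1 if an author change occurs between sentences[i]
    and sentences[i + 1], 0 otherwise (one label per adjacent pair).
    New author-change pairs combine a sentence that does NOT open its
    group with the sentence that opens the following group.
    """
    # Group the document into runs of sentences by the same author.
    groups, current = [], [sentences[0]]
    for sentence, change in zip(sentences[1:], change_labels):
        if change:
            groups.append(current)
            current = [sentence]
        else:
            current.append(sentence)
    groups.append(current)

    # Pair non-initial sentences of a group with the first sentence of
    # the next group; label the new pair as an author change (1).
    augmented = []
    for group, next_group in zip(groups, groups[1:]):
        for sentence in group[1:]:
            augmented.append((sentence, next_group[0], 1))
    return augmented
```
        </preformat>
        <p>Applied to five sentences with labels [0, 0, 1, 0], this yields the new pairs (s2, s4) and (s3, s4); the latter duplicates the original boundary pair and would be removed by the duplicate filtering.</p>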
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Architecture</title>
        <p>First, the experts were fine-tuned on the specific difficulty partition of the original dataset using the provided data splits of PAN 2025, which contain 4200 documents per difficulty. The validation set consists of 900 documents per difficulty. Data splits are 70% of the whole dataset for training, 15% for validation and 15% for testing. The test dataset is only available during inference on TIRA [26]. Fine-tuning on the original dataset provided better results than using the augmented data. However, the ensemble model yields better F1-Macro scores using the augmented data. The learning objective of the difficulty weighting mechanism was to prioritize the prediction of the expert suited to the difficulty. ERNIE and RoBERTa were trained using the Adam optimizer with a learning rate of 0.0005, a weight decay of 0.05 and a batch size of 4 for 10 epochs. The validation metric was F1-Macro; the best checkpoint was used for the final model.</p>
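        <p>The reported fine-tuning hyperparameters can be set up as in the following minimal PyTorch sketch; the placeholder module stands in for the actual ERNIE or RoBERTa classifier, which is loaded elsewhere.</p>
        <preformat>
```python
import torch

# Placeholder module; in the actual setup this would be a fine-tuned
# ERNIE or RoBERTa sequence classifier.
model = torch.nn.Linear(768, 2)

# Hyperparameters as reported in the text.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=0.0005,
    weight_decay=0.05,
)
BATCH_SIZE = 4
EPOCHS = 10
# Training iterates for EPOCHS over batches of BATCH_SIZE and keeps
# the checkpoint with the best validation F1-Macro.
```
        </preformat>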
        <p>The ensemble model is a feed-forward model that wraps the three fine-tuned experts and learns the difficulty weights during ensemble training. Its output is the prediction of the expert matching the difficulty. During training, the three experts in the ensemble receive the input pair and output their logits. These logits are weighted by a weighting mechanism, a feed-forward network taking the last hidden state of the [CLS] token of each model as input. The inputs are stacked, resulting in an input of dimension 768 × 3.</p>
        <p>Our experiments covered using a feed-forward network (FFN) as classifier, which used either the concatenated [CLS] representations or the predicted logits as input to classify whether an author change happens or not. The latter architecture seemed too complex, yielding results that suggested overfitting. Additionally, we tried different aggregation methods for the [CLS] representations, such as stacking, concatenating, mean pooling and weighting. The best result was achieved by using difficulty weighting and stacking the hidden representations of the [CLS] token of each expert as input to the difficulty weighting mechanism.</p>
        <p>
          For the weighting mechanism, a two-layer neural network with ReLU activation and a hidden size of 128 is used. The FFN takes a matrix of shape [batch_size, 3 × 768] as input and outputs a three-dimensional vector of scores, one per difficulty level: w = [w_easy, w_medium, w_hard]. The computation is shown in Equations (1)–(4). As training labels for the FFN, the difficulty level of an input pair was one-hot encoded (e.g. [0, 0, 1] for hard instances). Finally, the weights are multiplied with the combined logits z = [z_easy, z_medium, z_hard] of the experts as shown in Equation (5). Through this, the prediction (logits) of the expert suiting the difficulty is reinforced by receiving the most weight. Afterwards, the log-softmax of the weighted prediction is calculated for the negative log-likelihood loss, also shown in Equation (5). Experiments showed that the highest F1 score was reached by training on the augmented training set.
        </p>
        <p>h₁ = W₁x + b₁, where W₁ ∈ ℝ^(2304 × 128) (1)
a₁ = ReLU(h₁) (2)
h₂ = W₂a₁ + b₂, where W₂ ∈ ℝ^(128 × 3) (3)
w = softmax(h₂) (4)
ŷ = log softmax(Σᵢ wᵢ · zᵢ) (5)</p>
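        <p>Under the assumption that each expert emits a 768-dimensional [CLS] state and two logits, the weighting mechanism described above can be sketched in PyTorch as follows (module and variable names are ours, not from the submitted code):</p>
        <preformat>
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifficultyWeighting(nn.Module):
    """Sketch of the two-layer weighting network described above.

    Input: the stacked [CLS] states of the three experts, flattened to
    3 * 768 = 2304 features. Output: one weight per difficulty level,
    used to weigh the experts' logits.
    """

    def __init__(self, hidden_size=128):
        super().__init__()
        self.layer1 = nn.Linear(3 * 768, hidden_size)
        self.layer2 = nn.Linear(hidden_size, 3)

    def forward(self, cls_states, expert_logits):
        # cls_states: [batch, 3, 768]; expert_logits: [batch, 3, 2]
        scores = self.layer2(F.relu(self.layer1(cls_states.flatten(1))))
        weights = F.softmax(scores, dim=-1)           # [batch, 3]
        # Weigh each expert's logits and sum over the three experts.
        combined = (weights.unsqueeze(-1) * expert_logits).sum(dim=1)
        return F.log_softmax(combined, dim=-1), weights

weighter = DifficultyWeighting()
log_probs, weights = weighter(torch.randn(4, 3, 768), torch.randn(4, 3, 2))
```
        </preformat>
        <p>The log-probabilities feed the negative log-likelihood classification loss, while the weights are supervised against the one-hot difficulty labels.</p>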
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>Every provided model is able to outperform the random and majority baselines. The overall loss is a weighted combination of the classification loss and the difficulty weighting loss. Negative log-likelihood (NLL) loss is used as classification loss and cross-entropy (CE) loss as difficulty weighting loss. To calculate the final loss, a weighting factor λ was used to weigh the difficulty weighting loss. Experiments with different values of λ (10, 5, 0.1, 0.01 and 0.001) demonstrated that a value of 0.001 yields the best overall validation performance. Equation (6) shows the loss computation, where ŷ stands for the predicted label, y for the true label, w for the weight values and d for the one-hot encoded difficulty labels.</p>
      <p>ℒ = NLL(ŷ, y) + λ · CE(w, d) (6)</p>
      <p>The random baseline predicts a label randomly while the majority baseline predicts the most common label. The ensemble model was not able to reach the same performance as the single expert models, which indicates that the training signal of the difficulty mechanism needs improvement. However, clear benefits can be observed from the robust ensemble approach: regarding real-world scenarios where the difficulty is unknown, the ensemble model is able to outperform almost every expert on data that is outside of its domain (e.g. the easy expert on the hard split). Comparing the performance of the expert model specialized on hard data on the easy vs. the hard data split demonstrates that different features are learned for classification. The ensemble model also yields a higher average over the three difficulties than the single experts for easy and hard.</p>
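      <p>For illustration, the combined loss with λ = 0.001 can be computed as follows; the tensors are random stand-ins for the ensemble's log-probabilities and the weighting network's difficulty scores.</p>
      <preformat>
```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
log_probs = F.log_softmax(torch.randn(4, 2), dim=-1)  # ensemble output
labels = torch.tensor([0, 1, 1, 0])                   # author change or not
difficulty_scores = torch.randn(4, 3)                 # weighting net scores
difficulty = torch.tensor([0, 1, 2, 0])               # easy/medium/hard

lam = 0.001  # best-performing weighting factor reported above
# cross_entropy takes class indices, which is equivalent to supervising
# against the one-hot encoded difficulty labels.
loss = F.nll_loss(log_probs, labels) + lam * F.cross_entropy(
    difficulty_scores, difficulty
)
```
      </preformat>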
      <p>The final results on the test set are shown in Table 3 below.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>We described the motivation, architecture and results of our difficulty-agnostic approach for the Multi-Author Writing Style Analysis task at PAN 2025. Our architecture combines three fine-tuned transformer models specialized on different difficulty levels. By learning difficulty weights that dynamically prioritize the most relevant expert based on inferred difficulty, our model is well-suited for real-world applications where data varies in complexity. Additionally, we implemented data augmentation and performed a comparison to AdaBoost using linguistic features as well as a short human evaluation. Further work could include a refinement of the weighting mechanism and further analyses of the ensemble model under different types of writing styles and domains. The code will be available here: https://github.com/PhMeier/author_writing_25_submission.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We would like to thank the anonymous reviewers for their useful feedback. This work was carried out in the project NewOrder – Understanding the erosion of the traditional knowledge order in scientific online discourse and its impact in times of crisis (project number: K490/2022) funded by the Leibniz Association.</p>
      <p>[15] M. Fabien, E. Villatoro-Tello, P. Motlicek, S. Parida, BertAA: BERT Fine-tuning for Authorship Attribution, in: P. Bhattacharyya, D. M. Sharma, R. Sangal (Eds.), Proceedings of the 17th International Conference on Natural Language Processing (ICON), NLP Association of India (NLPAI), Indian Institute of Technology Patna, Patna, India, 2020, pp. 127–137. URL: https://aclanthology.org/2020.icon-main.16/.
[16] H. Chen, Z. Han, Z. Li, Y. Han, A Writing Style Embedding Based on Contrastive Learning for Multi-Author Writing Style Analysis, in: CLEF (Working Notes), 2023, pp. 2562–2567.
[17] T. Lin, Y. Wu, L. Lee, Team NYCU-NLP at PAN 2024: Integrating Transformers With Similarity Adjustments For Multi-Author Writing Style Analysis, Working Notes of CLEF (2024).
[18] F. Feng, Y. Yang, D. Cer, N. Arivazhagan, W. Wang, Language-agnostic BERT Sentence Embedding, ArXiv Preprint arXiv:2007.01852 (2020).
[19] A. K. Ojha, A. S. Doğruöz, G. Da San Martino, H. T. Madabushi, R. Kumar, E. Sartori (Eds.), Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), 2023.
[20] K. Mnassri, P. Rajapaksha, R. Farahbakhsh, N. Crespi, BERT-based Ensemble Approaches For Hate Speech Detection, in: GLOBECOM 2022 - 2022 IEEE Global Communications Conference, IEEE, 2022, pp. 4649–4654.
[21] A. Kaas, V. T. Thomsen, B. Plank, Team DiSaster at SemEval-2020 Task 11: Combining BERT and Hand-crafted Features For Identifying Propaganda Techniques In News, in: SemEval 2020, Association for Computational Linguistics, 2020.
[22] Z. Zhang, X. Han, Z. Liu, X. Jiang, M. Sun, Q. Liu, ERNIE: Enhanced Language Representation With Informative Entities, ArXiv Preprint arXiv:1905.07129 (2019).
[23] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019. URL: https://arxiv.org/abs/1907.11692. arXiv:1907.11692.
[24] P. He, X. Liu, J. Gao, W. Chen, DeBERTa: Decoding-Enhanced BERT With Disentangled Attention, ArXiv Preprint arXiv:2006.03654 (2020).
[25] J. Bevendorff, X. B. Casals, B. Chulvi, D. Dementieva, A. Elnagar, D. Freitag, M. Fröbe, D. Korenčić, M. Mayerl, A. Mukherjee, A. Panchenko, M. Potthast, F. Rangel, P. Rosso, A. Smirnova, E. Stamatatos, B. Stein, M. Taulé, D. Ustalov, M. Wiegmann, E. Zangerle, Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification, in: Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF 2024), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2024.
[26] M. Fröbe, M. Wiegmann, N. Kolyada, B. Grahm, T. Elstner, F. Loebe, M. Hagen, B. Stein, M. Potthast, Continuous Integration for Reproducible Shared Tasks with TIRA.io, in: J. Kamps, L. Goeuriot, F. Crestani, M. Maistro, H. Joho, B. Davis, C. Gurrin, U. Kruschwitz, A. Caputo (Eds.), Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2023, pp. 236–241. URL: https://link.springer.com/chapter/10.1007/978-3-031-28241-6_20. doi:10.1007/978-3-031-28241-6_20.
[27] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research 12 (2011) 2825–2830.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the Multi-Author Writing Style Analysis Task at PAN 2025</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          , D. Spina (Eds.),
          <source>Working Notes of CLEF 2025 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gipp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Greiner-Petter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shelmanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          , E. Zangerle, Overview of PAN 2025:
          <article-title>Voight-Kampff Generative AI Detection, Multilingual Text Detoxification, Multi-Author Writing Style Analysis, and Generative Plagiarism Detection</article-title>
          , in: J.
          <string-name>
            <surname>C. de Albornoz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. G. S. de Herrera</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Mothe</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Piroi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Spina</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Sixteenth International Conference of the CLEF Association (CLEF</source>
          <year>2025</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Sri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Thakur</surname>
          </string-name>
          ,
          <article-title>Intrinsic Plagiarism Detection System Using Stylometric Features and DBSCAN</article-title>
          , in: 2021
          <source>International Conference on Computing, Communication, and Intelligent Systems (ICCCIS)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Vysotska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Burov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Lytvyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Demchuk</surname>
          </string-name>
          ,
          <article-title>Defining Author's Style for Plagiarism Detection in Academic Environment</article-title>
          , in: 2018
          <source>IEEE Second International Conference on Data Stream Mining &amp; Processing (DSMP)</source>
          , IEEE,
          <year>2018</year>
          , pp.
          <fpage>128</fpage>
          -
          <lpage>133</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ranka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Dedhia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Prasad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Muni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bhowmick</surname>
          </string-name>
          ,
          <source>Detecting and Unmasking AI-Generated Texts through Explainable Artificial Intelligence Using Stylistic Features</source>
          ,
          <source>International Journal of Advanced Computer Science and Applications</source>
          <volume>14</volume>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Corizzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Leal-Arenas</surname>
          </string-name>
          ,
          <article-title>A Deep Fusion Model for Human vs. Machine-generated Essay Classification</article-title>
          , in: 2023
          <source>International Joint Conference on Neural Networks (IJCNN)</source>
          , IEEE,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Mindner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schlippe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Schaaf</surname>
          </string-name>
          ,
          <article-title>Classification of Human- and AI-Generated Texts: Investigating Features for ChatGPT</article-title>
          , Springer Nature Singapore,
          <year>2023</year>
          , pp.
          <fpage>152</fpage>
          -
          <lpage>170</lpage>
          . URL: http://dx.doi.org/10.1007/978-981-99-7947-9_12. doi:10.1007/978-981-99-7947-9_12.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the Multi-Author Writing Style Analysis Task at PAN 2023</article-title>
          , in: CLEF (Working Notes),
          <year>2023</year>
          , pp.
          <fpage>2513</fpage>
          -
          <lpage>2522</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korencic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <article-title>Overview of PAN 2024: Multi-author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification Condensed Lab Overview</article-title>
          , in:
          <source>CLEF (2)</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>231</fpage>
          -
          <lpage>259</lpage>
          . URL: https://doi.org/10.1007/978-3-031-71908-0_11.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>F.</given-names>
            <surname>Jafariakinabad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Hua</surname>
          </string-name>
          ,
          <article-title>Style-aware Neural Model with Application in Authorship Attribution</article-title>
          , in: 2019
          <source>18th IEEE International Conference On Machine Learning And Applications (ICMLA)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>325</fpage>
          -
          <lpage>328</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Verma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. V.</given-names>
            <surname>Srinivasan</surname>
          </string-name>
          ,
          <article-title>A Lexical, Syntactic, and Semantic Perspective for Understanding Style in Text</article-title>
          ,
          <source>ArXiv Preprint arXiv:1909.08349</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stevenson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          ,
          <article-title>Topic or Style? Exploring the Most Useful Features for Authorship Attribution</article-title>
          , in:
          <source>Proceedings of the 27th International Conference on Computational Linguistics</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>343</fpage>
          -
          <lpage>353</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <article-title>Whodunit? Learning to Contrast for Authorship Attribution</article-title>
          ,
          <source>ArXiv Preprint arXiv:2209.11887</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Burstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Doran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Solorio</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers), Association for Computational Linguistics
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . URL: https://aclanthology.org/N19-1423/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>