<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Team cornell-1 at PAN: Ensembling Fine-Tuned Transformer Models for Writing Style Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Deniz Bölöni-Turgut</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dhriti Verma</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claire Cardie</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Cornell University</institution>
          ,
          <addr-line>Ithaca, NY 14853</addr-line>
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>This paper describes our system for the Multi-Author Writing Style Analysis shared task of the PAN Lab at CLEF 2025. We design and train an ensemble model from multiple fine-tuned transformer models. Each model in the ensemble follows our custom BertStyleNN architecture, a PyTorch neural network consisting of a fine-tuned encoder model and a feed-forward neural network classification head. We train each BertStyleNN model end-to-end on a combined-difficulty (easy, medium, and hard) training dataset, using five different pre-trained feature extractors. We then conduct an exhaustive search over three ensembling methods and model combinations for each difficulty level. Our final system achieves a macro F1 of 0.8 averaged over the three difficulty levels, significantly outperforming the baseline.</p>
      </abstract>
      <kwd-group>
        <kwd>PAN 2025</kwd>
        <kwd>multi-author style analysis</kwd>
        <kwd>sentence embeddings</kwd>
        <kwd>ensemble models</kwd>
        <kwd>transformers</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Early techniques for style analysis employed manual feature engineering of lexical or syntactic
features [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. More recent work uses embeddings from pre-trained language models. Since many sentence
embedding models are trained with semantic similarity objectives, fine-tuning the pre-trained models
on data labeled for style change is common and often necessary.
      </p>
      <p>
        The goal of the PAN 2024 Multi-Author Style Analysis task was to identify style changes between
paragraphs, as opposed to between sentences as in the 2025 task [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Of the top two submissions to the 2024
version of this task, one fine-tuned the open-source large language model Llama-3-8b [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] with low-rank
adaptation [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and the other ensembled three pre-trained transformer models with additional semantic
similarity checks applied for the easy and medium difficulty levels [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        Document-level authorship attribution approaches include using static embeddings as input to
Siamese networks trained with contrastive loss to perform classification [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset Exploration</title>
      <p>
        The most notable observation from our data exploration is the class imbalance. Only 19.9% and 20.4%
of sentence pairs in the combined-difficulty training and validation sets, respectively, are instances of
a style change. To investigate the significance of this class imbalance, we constructed a 50/50 class-
balanced training set. This balanced training set was augmented with problems randomly chosen from
the PAN 2024 Multi-Author Style Analysis task dataset [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. For both the balanced and the original
imbalanced training sets, we extract sentence embeddings with the pre-trained all-MiniLM-L12-v2
model and train a feed-forward neural network (FFNN) as a binary classifier. We do not fine-tune the
embedding model at all, only the FFNN. The validation set metrics for both training runs are shown in
Table 1; we evaluate each run on the original imbalanced validation set.
      </p>
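      <p>As a rough sketch of this frozen-encoder probe (the layer sizes, learning rate, and helper names here are illustrative assumptions, not our tuned configuration), the setup looks like the following:</p>
      <preformat>
# Sketch: frozen sentence embeddings feeding an FFNN binary classifier.
# Layer sizes and the learning rate are illustrative assumptions.
import torch
import torch.nn as nn
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L12-v2")

def embed_pairs(sents1, sents2):
    # The encoder stays frozen; we only pre-compute 384-d embeddings.
    e1 = encoder.encode(sents1, convert_to_tensor=True)
    e2 = encoder.encode(sents2, convert_to_tensor=True)
    return torch.cat([e1, e2], dim=1)  # (N, 768) pair features

ffnn = nn.Sequential(
    nn.Linear(768, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 1),  # single logit: style change vs. no change
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(ffnn.parameters(), lr=1e-3)
      </preformat>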
    </sec>
    <sec id="sec-4">
      <title>4. The BertStyleNN</title>
      <p>We introduce the BertStyleNN, our custom neural-network-based model, which contains a binary
sequence classification head and is implemented in PyTorch. The code and links to download our
trained models from HuggingFace can be found at https://github.com/denizbt/pan-styleAnalysis25.</p>
      <p>In this section, we describe the architecture and training process for BertStyleNN models.</p>
      <sec id="sec-4-1">
        <title>4.1. Model Architecture</title>
        <p>A BertStyleNN has two parts: a transformer encoder for feature extraction and a FFNN for binary
classification. BertStyleNN supports a variety of pre-trained SentenceTransformers models and
general feature extractors as its encoder. No architectural changes are made to any pre-trained encoder;
it is only fine-tuned.</p>
        <p>The architecture of the FFNN is relatively straightforward and is the same for every encoder model.
It consists of 4 hidden layers with ReLU activation functions, a 1D BatchNorm layer, and a Dropout
layer with p = 0.4. The details of the architecture were determined from experimentation with the
all-MiniLM-L12-v2 sentence embedding model; each sentence pair in the PAN dataset (all difficulties
combined) was embedded using the all-MiniLM-L12-v2 model out-of-the-box and then used to train
the FFNN. The architecture that resulted in the highest validation macro F1 was chosen.</p>
        <p>The forward pass of a BertStyleNN proceeds as follows. The pair of sentences to check for a style
change is passed in as input. Then, BertStyleNN extracts embeddings independently for each
sentence using its encoder, concatenates the embeddings, and finally applies the FFNN to get a
one-dimensional output for the binary classification. The complete architecture for BertStyleNN is shown
in Figure 1.</p>
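        <p>A minimal PyTorch sketch of this forward pass follows; the hidden size, the generic AutoModel wrapper, and the reduced layer count are illustrative assumptions (the full implementation is in our repository):</p>
        <preformat>
# Sketch of the BertStyleNN forward pass: encode each sentence, mean-pool,
# concatenate, classify. Dimensions here are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertStyleNNSketch(nn.Module):
    def __init__(self, encoder_name, hidden=256, p_drop=0.4):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(encoder_name)
        self.encoder = AutoModel.from_pretrained(encoder_name)  # fine-tuned end-to-end
        dim = self.encoder.config.hidden_size
        self.ffnn = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(),
            nn.BatchNorm1d(hidden), nn.Dropout(p=p_drop),
            nn.Linear(hidden, 1),  # output projection to a single logit
        )

    def embed(self, sentences):
        batch = self.tokenizer(sentences, padding=True, truncation=True,
                               return_tensors="pt")
        out = self.encoder(**batch).last_hidden_state  # (N, T, dim)
        mask = batch["attention_mask"].unsqueeze(-1)   # (N, T, 1)
        return (out * mask).sum(1) / mask.sum(1)       # masked mean pooling

    def forward(self, sents1, sents2):
        pair = torch.cat([self.embed(sents1), self.embed(sents2)], dim=1)
        return self.ffnn(pair).squeeze(-1)             # one logit per pair
        </preformat>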
        <sec id="sec-4-1-1">
          <title>Sentence 1</title>
        </sec>
        <sec id="sec-4-1-2">
          <title>Sentence 2</title>
        </sec>
        <sec id="sec-4-1-3">
          <title>Encoder</title>
        </sec>
        <sec id="sec-4-1-4">
          <title>Mean Pooling</title>
        </sec>
        <sec id="sec-4-1-5">
          <title>Embedding 2</title>
        </sec>
        <sec id="sec-4-1-6">
          <title>Mean Pooling</title>
        </sec>
        <sec id="sec-4-1-7">
          <title>Embedding 1</title>
        </sec>
        <sec id="sec-4-1-8">
          <title>FFNN</title>
        </sec>
        <sec id="sec-4-1-9">
          <title>Concatenation</title>
        </sec>
        <sec id="sec-4-1-10">
          <title>Linear</title>
        </sec>
        <sec id="sec-4-1-11">
          <title>ReLU</title>
        </sec>
        <sec id="sec-4-1-12">
          <title>BatchNorm Dropout (p=0.4)</title>
        </sec>
        <sec id="sec-4-1-13">
          <title>Output Projection (logits)</title>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Training</title>
        <p>Training a BertStyleNN involves simultaneously fine-tuning a pre-trained encoder model and training
a FFNN for classification (i.e. end-to-end training).</p>
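        <p>Concretely, "end-to-end" means a single optimizer updates the encoder and the FFNN head together. A sketch under assumed names (the BertStyleNNSketch above, a hypothetical train_loader, and a placeholder learning rate; Appendix A lists the actual hyperparameters):</p>
        <preformat>
# End-to-end training sketch: one optimizer covers encoder and FFNN parameters.
model = BertStyleNNSketch("sentence-transformers/all-MiniLM-L12-v2")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # placeholder lr
loss_fn = torch.nn.BCEWithLogitsLoss()

for sents1, sents2, labels in train_loader:  # labels: 1 = style change
    optimizer.zero_grad()
    logits = model(sents1, sents2)
    loss = loss_fn(logits, labels.float())
    loss.backward()  # gradients flow through the FFNN and the encoder alike
    optimizer.step()
        </preformat>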
        <p>
          We select and fine-tune five different pre-trained encoder/sentence embedding models as the encoders
for the BertStyleNN, listed below. All models are downloaded from HuggingFace.
• roberta-base [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] improves upon BERT [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] by training it on more data, using dynamic masking,
and removing the next sentence prediction task. It was chosen due to its popularity and high
performance as a general feature extractor.
• microsoft/deberta-base [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] achieves higher performance compared to BERT and RoBERTa
by using disentangled attention, which uses two separate vectors for position and content, and by
improving the decoding for the masked LM task. This model was also chosen for its popularity
and high performance on natural language understanding tasks.
• sentence-transformers/all-MiniLM-L12-v2 [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] was fine-tuned with a contrastive similarity
objective from the pre-trained microsoft/MiniLM-L12-H384-uncased model [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. As of the
writing of this paper, it is the fourth highest performing model for sentence embeddings in the
SentenceTransformers library [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
• sentence-transformers/all-mpnet-base-v2 [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] was fine-tuned with a self-supervised
contrastive learning objective from the microsoft/mpnet-base model [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. It is currently the highest
performing model for sentence embeddings in the SentenceTransformers library [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
• sentence-transformers/sentence-t5-base [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] is a PyTorch version of the encoder of a T5-base
model [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. It was chosen to add to the diversity of our set of models.
        </p>
        <p>Our training and validation sets are a combination of the data from all three difficulty levels. We
make no other alterations or augmentations to the data. We choose to use a combined training set since
each difficulty-level subset is too small on its own.</p>
        <p>We holistically select different hyperparameters and learning schedules for every encoder model
(see Appendix A for the choices). We also conduct a linear search for the best probability prediction
threshold to apply to the output and choose the best epoch for each model based on the macro F1.
It is important to note that while the training hyperparameters differ, the architecture of the FFNN
(including hidden layer dimensions) remains the same for all encoder models. Table 2 displays
the validation performance for each fine-tuned model.</p>
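        <p>The threshold search itself is a plain linear sweep over candidate cut-offs, keeping the macro-F1 maximizer; a sketch (the grid granularity here is our assumption):</p>
        <preformat>
# Sketch: linear search for the prediction threshold maximizing macro F1.
import numpy as np
from sklearn.metrics import f1_score

def best_threshold(probs, labels, grid=np.linspace(0.1, 0.9, 81)):
    # grid granularity is an assumption, not fixed by the method
    scores = [f1_score(labels, np.greater_equal(probs, t).astype(int),
                       average="macro") for t in grid]
    best = int(np.argmax(scores))
    return grid[best], scores[best]
        </preformat>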
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Ensembling</title>
        <p>At this point, we have trained several BertStyleNN models on the combined-difficulty dataset. We
now turn our attention to finding the best ensemble model for each difficulty level.</p>
        <p>We experiment with three ensembling methods: majority voting, unweighted average of output
probabilities, and unweighted average of output logits. For each difficulty level, we test all three methods
on the validation set for every subset of trained models of size three or more. We report the metrics for
the highest performing subset and method for each difficulty level in Table 3. Figure 2 illustrates our
complete system pipeline, including ensembling: the Ensemble-BertStyleNN.</p>
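        <p>A sketch of the three ensembling methods and the exhaustive subset search (tensor shapes and function names are illustrative assumptions):</p>
        <preformat>
# logits: tensor of shape (k, N), stacked outputs of k BertStyleNN models.
import itertools
import torch

def ensemble_predict(logits, method, thr=0.5):
    probs = torch.sigmoid(logits)                     # (k, N)
    if method == "majority":
        votes = torch.gt(probs, thr).float().mean(0)  # fraction of 1-votes
        return torch.gt(votes, 0.5).long()
    if method == "avg-probs":
        return torch.gt(probs.mean(0), thr).long()
    if method == "avg-logits":
        return torch.gt(torch.sigmoid(logits.mean(0)), thr).long()
    raise ValueError(method)

# Exhaustive search: every model subset of size 3 or more, times 3 methods.
def all_candidates(model_ids):
    for r in range(3, len(model_ids) + 1):
        for subset in itertools.combinations(model_ids, r):
            for method in ("majority", "avg-probs", "avg-logits"):
                yield subset, method
        </preformat>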
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>For the final system submission, we use the ensemble models along with the prediction thresholds that
performed best on the validation set; details for the ensemble used for each difficulty level are given in
Table 3. The results of our Ensembled-BertStyleNN approach on the hidden test set are in Table 4.
Our system significantly outperforms the naive baseline of predicting the majority class (0).</p>
      <p>[Figure 2: The Ensemble-BertStyleNN pipeline. Multiple BertStyleNNs (each encoding Sentence 1 and Sentence 2 via the encoder, mean pooling, concatenation, and the FFNN of Linear, ReLU, BatchNorm, Dropout (p=0.4), and an output projection to logits) feed an ensemble model (avg-probs or avg-logits), followed by the validation prediction threshold.]</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This paper describes an ensemble model system for the Multi-Author Style Analysis task. We fine-tune
and ensemble new BertStyleNN models with five distinct pre-trained encoder models and a FFNN for
binary classification. Our final system, Ensembled-BertStyleNN, achieves 0.8 macro F1 averaged over
the three difficulty levels, indicating promise for ensemble transformer model approaches to the style
analysis task.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-8">
      <title>A. Training Details</title>
      <p>In this section, we provide more details about our training parameters. For all training runs, we use
nn.BCEWithLogitsLoss with the pos_weight parameter set to 0.8/0.2, i.e. the approximate imbalance
between positive (1) and negative (0) labels in the train and validation sets. The pos_weight parameter
penalizes false negatives (predicting 0 on true label 1) more harshly than false positives (predicting 1
on true label 0), encouraging the model to predict more 1s. This mitigates some of the negative effects
of the imbalanced training data.</p>
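      <p>In PyTorch terms, this weighting corresponds to a one-line sketch (the tensor value is simply the 0.8/0.2 ratio described above):</p>
      <preformat>
import torch

# pos_weight upweights the positive (style change) class by roughly 4x,
# matching the approximate 80/20 negative/positive label imbalance.
loss_fn = torch.nn.BCEWithLogitsLoss(pos_weight=torch.tensor([0.8 / 0.2]))
      </preformat>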
      <p>Table 5 shows the complete list of hyperparameters used in training all models. Additionally, we
used the AdamW optimizer, a consistent batch size of 16, and mean pooling of the encoder output for
all models.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] J. Bevendorff, D. Dementieva, M. Fröbe, B. Gipp, A. Greiner-Petter, J. Karlgren, M. Mayerl, P. Nakov, A. Panchenko, M. Potthast, A. Shelmanov, E. Stamatatos, B. Stein, Y. Wang, M. Wiegmann, E. Zangerle, Overview of PAN 2025: Voight-Kampff Generative AI Detection, Multilingual Text Detoxification, Multi-Author Writing Style Analysis, and Generative Plagiarism Detection, in: J. C. de Albornoz, J. Gonzalo, L. Plaza, A. G. S. de Herrera, J. Mothe, F. Piroi, P. Rosso, D. Spina, G. Faggioli, N. Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Sixteenth International Conference of the CLEF Association (CLEF 2025), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2025.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] E. Zangerle, M. Mayerl, M. Potthast, B. Stein, Overview of the Multi-Author Writing Style Analysis Task at PAN 2025, in: G. Faggioli, N. Ferro, P. Rosso, D. Spina (Eds.), Working Notes of CLEF 2025 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, CEUR-WS.org, 2025.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] M. Fröbe, M. Wiegmann, N. Kolyada, B. Grahm, T. Elstner, F. Loebe, M. Hagen, B. Stein, M. Potthast, Continuous Integration for Reproducible Shared Tasks with TIRA.io, in: Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2023, pp. 236-241.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] A. Dubey, Capturing Style Through Large Language Models - An Authorship Perspective, 2024. URL: https://hammer.purdue.edu/articles/thesis/Capturing_Style_Through_Large_Language_Models_-_An_Authorship_Perspective/27947904. doi:10.25394/PGS.27947904.v1.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] E. Zangerle, M. Mayerl, M. Potthast, B. Stein, Overview of the Multi-Author Writing Style Analysis Task at PAN 2024, in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S. de Herrera (Eds.), Working Notes Papers of the CLEF 2024 Evaluation Labs, CEUR-WS.org, 2024, pp. 2513-2522. URL: http://ceur-ws.org/Vol-3740/paper-222.pdf.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] J. Lv, Y. Yi, H. Qi, Team Fosu-stu at PAN: Supervised fine-tuning of large language models for Multi-Author Writing Style Analysis, in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S. de Herrera (Eds.), Working Notes Papers of the CLEF 2024 Evaluation Labs, CEUR-WS.org, 2024, pp. 2781-2786. URL: http://ceur-ws.org/Vol-3740/paper-265.pdf.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, LoRA: Low-rank adaptation of large language models, 2021. URL: https://arxiv.org/abs/2106.09685. arXiv:2106.09685.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] T. Lin, Y. Wu, L. Lee, Team NYCU-NLP at PAN 2024: Integrating Transformers with Similarity Adjustments for Multi-Author Writing Style Analysis, in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S. de Herrera (Eds.), Working Notes Papers of the CLEF 2024 Evaluation Labs, CEUR-WS.org, 2024, pp. 2716-2721. URL: http://ceur-ws.org/Vol-3740/paper-255.pdf.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] E. Zangerle, M. Mayerl, M. Potthast, B. Stein, PAN24 multi-author writing style analysis, 2024. URL: https://doi.org/10.5281/zenodo.10677876. doi:10.5281/zenodo.10677876.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, 2019. URL: https://arxiv.org/abs/1907.11692. arXiv:1907.11692.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, 2019. URL: https://arxiv.org/abs/1810.04805. arXiv:1810.04805.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] P. He, X. Liu, J. Gao, W. Chen, DeBERTa: Decoding-enhanced BERT with disentangled attention, 2021. URL: https://arxiv.org/abs/2006.03654. arXiv:2006.03654.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] Sentence-Transformers, all-MiniLM-L12-v2, https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2, 2024. Accessed: 2024-05-30.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] W. Wang, F. Wei, L. Dong, H. Bao, N. Yang, M. Zhou, MiniLM: Deep self-attention distillation for task-agnostic compression of pre-trained transformers, 2020. arXiv:2002.10957.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] Sentence-Transformers, Pretrained models documentation, http://sbert.net/docs/sentence_transformer/pretrained_models.html, 2024. Accessed: 2024-05-30.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] Sentence-Transformers, all-mpnet-base-v2, https://huggingface.co/sentence-transformers/all-mpnet-base-v2, 2024. Accessed: 2024-05-30.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] Microsoft, microsoft/mpnet-base, https://huggingface.co/microsoft/mpnet-base, 2024. Accessed: 2024-05-30.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] J. Ni, G. H. Ábrego, N. Constant, J. Ma, K. B. Hall, D. Cer, Y. Yang, Sentence-T5: Scalable sentence encoders from pre-trained text-to-text models, 2021. URL: https://arxiv.org/abs/2108.08877. arXiv:2108.08877.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, 2023. URL: https://arxiv.org/abs/1910.10683. arXiv:1910.10683.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>