<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Team Gladiators at PAN: Improving Author Identification: A Comparative Analysis of Pre-Trained Transformers for Multi-Author Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Areeb Adnan Khan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohit Rai</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Khuzaima Ali Khan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Syed Jahania Shah</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Faisal Alvi</string-name>
          <email>faisal.alvi@sse.habib.edu.pk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abdul Samad</string-name>
          <email>abdul.samad@sse.habib.edu.pk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dhanani School of Science and Engineering, Habib University</institution>
          ,
          <addr-line>Karachi</addr-line>
          ,
          <country country="PK">Pakistan</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
<p>This paper presents our participation in the Multi-Author Writing Style Analysis Task for PAN at CLEF 2024. The primary goal of this task is to detect style changes in multi-author documents at the paragraph level. The task consists of three sub-tasks: Easy, Medium, and Hard, each varying in difficulty, mainly depending on the range of topics covered in the paragraphs. We discuss the significance of style change detection in various applications such as plagiarism detection, authorship verification, and writing support. Our approach leverages pre-trained transformer models such as ELECTRA, DeBERTa, SqueezeBERT, and RoBERTa. Hence, we test different models to find the one most suitable for our use case, based on F1-scores.</p>
      </abstract>
      <kwd-group>
<kwd>Multi-Author Detector</kwd>
        <kwd>Plagiarism Checker</kwd>
        <kwd>Authorship Verification</kwd>
        <kwd>Large Language Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The PAN challenge [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] aims to identify, in a multi-authored document, the text positions where the author changes, based upon the
author’s writing style. This challenge, along with others in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] seeks an
efficient approach that could lead to the formulation of new means of detecting plagiarism, especially
in cases where no comparison text is given, as is the case with traditional plagiarism detectors.
Moreover, it could be further utilized in looking for gift authorship, validating claims of authorship,
and developing up-to-date writing support technologies. The dataset itself is divided into three levels of
difficulty (Easy, Medium, and Hard), and each dataset contains the
training, validation, and testing data [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The easier task can exploit topic information to detect the
style changes, whereas the medium and hard tasks vary less in terms of topic diversity and instead
require closer attention to the author’s writing style; in the hard task, all paragraphs cover the same topic.
Finally, the F1-score metric is used to compare the submitted results.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        PAN is an acronym for Uncovering Plagiarism, Authorship, and Social Software Misuse. The
Multi-Author Analysis task evolved from Author Clustering/Diarization and Author Masking/Evaluation,
detailed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which aimed to cluster documents by author and detect text modification. In PAN 2020,
the approach commenced with two distinct datasets, narrow and wide, each accompanied by truth files
delineating the labels for two tasks. Initially, documents underwent paragraph segmentation to facilitate
focused analysis.
      </p>
      <p>
        Subsequently, sentences were split, employing a nuanced approach to punctuation for accuracy.
Utilizing BERT tokenization, embeddings were generated at the sentence level, with model selection
guided by task specifics and performance metrics. Embedding was then amalgamated using tailored
methods to suit the requirements of each task. At the document level, sentence vectors were averaged
to encapsulate the document’s essence. Conversely, embedding was averaged between consecutive
paragraphs at the paragraph level to discern stylistic changes. Hence, this resulted in the approach of
Iyer and Vosoughi [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], performed the best in all tasks of PAN 2020.
      </p>
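      <p>As an illustration of this family of embedding-based methods (a hedged sketch of the idea, not the cited authors’ code), the snippet below mean-pools BERT token embeddings into sentence vectors, averages them into paragraph vectors, and scores the stylistic change between consecutive paragraphs by cosine distance. The checkpoint name and pooling choices are our assumptions.</p>
      <preformat>
# Illustrative sketch only: sentence-level BERT embeddings averaged into
# paragraph vectors, then compared across consecutive paragraphs.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_sentences(sentences):
    # Mean-pool token embeddings (ignoring padding) into one vector per sentence.
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state   # (n, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)    # (n, seq_len, 1)
    return (hidden * mask).sum(1) / mask.sum(1)     # (n, dim)

def paragraph_vector(sentences):
    # Average the sentence vectors to represent the whole paragraph.
    return embed_sentences(sentences).mean(0)

p1 = paragraph_vector(["First sentence.", "Another sentence by the same author."])
p2 = paragraph_vector(["A noticeably different voice."])
change_score = 1 - torch.cosine_similarity(p1, p2, dim=0)  # higher = more change
      </preformat>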
      <p>
        Moreover, there were two different top-performing approaches for PAN 2021: Zhang et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] for Task 2 and
Task 3, and Strøm [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for Task 1. The former employed ELECTRA-Base and ELECTRA-Large to solve
all three tasks, using ELECTRA-Large with a maximum sequence length of 128 and a batch size of 64 to
achieve a validation accuracy of 0.78410 for Task 2 and 0.7073 for Task 3. They pre-processed the data
into paragraphs for all the models, and experimented with several batch sizes and maximum sequence
lengths on ELECTRA-Large before finalizing the parameters for Task 1 and Task 3. For Task 1, Strøm’s
approach involves classifying
documents as single- or multi-authored using feature extraction methods like BERT embedding and
textual features. These features are processed at both the document and paragraph levels. A stacking
ensemble classifier combines classifiers trained on different feature vectors. The document-level features
are used for binary classification, achieving a macro F1-score of 0.7828 on the validation set and 0.7954
on the test set.
      </p>
      <p>
        Furthermore, in PAN 2022, the top-performing [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] approach involves a unified architecture of
ensemble neural networks. Lin et al. proceed as follows: for Task 1, which aims to
identify a single style change at the paragraph level, BERT, RoBERTa, and ALBERT transformers are
individually fine-tuned on labeled data, with downstream classification adjusted to binary classification
for detecting style changes. Task 2, focused on assigning paragraphs to specific authors in multi-author
texts, involves a similar process, with each paragraph compared to the preceding ones to determine
authorship. Task 3, targeting writing style changes at the sentence level, employs the same transformer
models, but fine-tuning is done using sentence pairs instead of paragraphs. The ensemble mechanism
combines individual model predictions using a majority voting approach, enhancing overall detection
performance across all three tasks.
      </p>
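      <p>A minimal sketch of the majority-voting mechanism described above (the per-model predictions here are placeholders, not Lin et al.’s outputs):</p>
      <preformat>
# Majority voting over binary predictions from several fine-tuned models.
import numpy as np

def majority_vote(predictions):
    """predictions: (n_models, n_samples) array-like of 0/1 labels."""
    votes = np.asarray(predictions).sum(axis=0)
    return (votes > len(predictions) / 2).astype(int)

bert_preds    = [1, 0, 1, 1]
roberta_preds = [1, 0, 0, 1]
albert_preds  = [0, 0, 1, 1]
print(majority_vote([bert_preds, roberta_preds, albert_preds]))  # [1 0 1 1]
      </preformat>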
      <p>
        Finally, in PAN 2023 the approach by Hashemi et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] begins by pre-processing the data, pairing
consecutive paragraphs to transform it into a multi-author evaluation task. In addition to that they
combined the datasets from 2020 onwards to improve their model scores. They then use task-specific
datasets to fine-tune transformer models such as BERT, RoBERTa, and ELECTRA; RoBERTa consistently
performs best. By merging predictions from various models, ensemble modeling improves performance
even more. In the competition evaluation, this method produces the highest F1 scores in two subtasks
and second place in the third subtask, demonstrating the potency of transformer models and data
augmentation in style change detection. Their hyper-parameters were set to a learning rate of 0.00001,
a batch size of 16, and 10 epochs.
      </p>
      <p>
        In addition, [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] integrates supervised contrastive learning, R-Drop, and P-tuning to improve
Multi-Author Writing Style Analysis performance. It first approaches the task as binary classification,
encoding the text with the DeBERTa-v3 model. Then, by integrating label information, supervised contrastive
learning is used to enhance the feature representation. Furthermore, the loss is computed for both positive
and negative sample pairs using the R-Drop method. A soft-hard template is built using P-tuning,
which improves the model’s word embedding representation. Because it leverages pre-trained models
improved by P-tuning and can capture fine-grained textual changes via supervised contrastive learning,
this method works particularly well on the hard dataset.
      </p>
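      <p>For concreteness, a hedged sketch of the R-Drop idea referenced above: the same input is passed through the model twice so that dropout yields two different predictions, and a symmetric KL-divergence term pulls the two distributions together alongside the usual cross-entropy. This is our own illustration of the published technique, not the cited authors’ code.</p>
      <preformat>
# R-Drop-style loss: two stochastic forward passes + symmetric KL regularizer.
import torch
import torch.nn.functional as F

def rdrop_loss(model, inputs, labels, alpha=1.0):
    logits1 = model(**inputs).logits   # dropout makes each pass differ
    logits2 = model(**inputs).logits
    ce = 0.5 * (F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels))
    kl = 0.5 * (
        F.kl_div(F.log_softmax(logits1, dim=-1), F.softmax(logits2, dim=-1),
                 reduction="batchmean")
        + F.kl_div(F.log_softmax(logits2, dim=-1), F.softmax(logits1, dim=-1),
                   reduction="batchmean")
    )
    return ce + alpha * kl
      </preformat>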
    </sec>
    <sec id="sec-3">
      <title>3. Our Approach</title>
      <p>The dataset is neatly divided into training, validation, and test sets. The training set contains 70
percent of the dataset, including the text files, ground truths, and JSON files, and is used to develop
our models. The validation set includes 15 percent of the whole dataset, and the test set
comprises the remaining 15 percent. As the dataset is already split into the relevant sections, we can carry
on and consider pairs of successive paragraphs as input samples to our model, concatenating
them in the process; each pair is then assigned a label that indicates whether the author has changed or not.</p>
      <p>Furthermore, this means that for n paragraphs in a text file, we have n-1 paragraph pairs and
labels; hence we have converted the task of identifying the authors into binary classification, where
0 means there is no change and 1 means there is a change of author between the paragraphs. Two
such datasets are created in our case, one for training the model and the other for
validating it, as sketched below.</p>
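      <p>A minimal sketch of this pairing step, assuming each document arrives as a list of paragraph strings together with its ground-truth change list (the names and structure here are our assumptions):</p>
      <preformat>
# n paragraphs yield n-1 (paragraph pair, label) samples for binary classification.
def build_pairs(paragraphs, changes):
    """changes[i] is 1 if the author changes between paragraphs i and i+1."""
    return [((paragraphs[i], paragraphs[i + 1]), changes[i])
            for i in range(len(paragraphs) - 1)]
      </preformat>
      <p>Passing each pair as two text arguments to a HuggingFace tokenizer then concatenates them with the model’s separator token.</p>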
      <p>In Table 1, it is evident that the easy dataset exhibits class imbalance, with a significantly higher
occurrence of class 1 (left column) compared to class 0 (right column). To address this imbalance, we
implemented a weighted random sampler. We first ensure that the ’changes’ column
in the training data frame is appropriately formatted, then compute the class distribution by counting
the occurrences of each class label. By inversely weighting the class frequencies, we calculate
a set of weights such that minority-class samples are given a higher probability during sampling. These
weights are then used to create a weighted random sampler, which ensures that each class is represented
proportionally in the training process, thereby mitigating the effects of class imbalance and potentially
improving model performance on minority classes, as in the sketch that follows.</p>
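      <p>A sketch of this weighting step using PyTorch’s WeightedRandomSampler; the data-frame name train_df is our assumption:</p>
      <preformat>
# Inverse-frequency weights feed a WeightedRandomSampler so that
# minority-class pairs are drawn more often during training.
import torch
from torch.utils.data import WeightedRandomSampler

labels = torch.tensor(train_df["changes"].astype(int).values)
class_counts = torch.bincount(labels)        # occurrences of class 0 and 1
class_weights = 1.0 / class_counts.float()   # inverse class frequency
sample_weights = class_weights[labels]       # one weight per training sample
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(sample_weights),
                                replacement=True)
# The sampler is then passed to the training DataLoader.
      </preformat>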
      <sec id="sec-3-1">
        <title>3.1. Pre-Trained Transformer Models</title>
        <p>Pre-trained transformer models have been trained on enormous volumes of text data. They gain an
understanding of language’s structure and patterns, which helps them produce text of the highest caliber
and carry out various natural language processing (NLP) functions. By fine-tuning the pre-trained
models, we can take advantage of their language comprehension abilities and apply the knowledge
they have learned from their thorough pre-training to our particular task. Hence, for this project, we
employed the popular pre-trained transformer models RoBERTa, ELECTRA, DeBERTa, and SqueezeBERT.</p>
        <p>RoBERTa uses dynamic masking and omits next-sentence prediction to improve BERT’s pre-training,
resulting in stronger word representations. ELECTRA focuses on efficiency, using synthetic data with
replaced words to teach the model to distinguish them from the originals, achieving good results with less
training data. DeBERTa combines ideas from ELECTRA and BERT, separating word and positional
information for better context understanding and using an ELECTRA-like task for effective training
and high performance. SqueezeBERT reduces model size by compressing a large pre-trained model
like BERT into a faster, smaller version while preserving most of its functionality through knowledge
distillation.</p>
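        <p>For reference, the HuggingFace base checkpoints one would load for these four models (the exact checkpoint names are our assumption, not stated elsewhere in this paper):</p>
        <preformat>
# Loading the four base models as binary sequence classifiers.
from transformers import AutoModelForSequenceClassification

checkpoints = {
    "RoBERTa":     "roberta-base",
    "ELECTRA":     "google/electra-base-discriminator",
    "DeBERTa":     "microsoft/deberta-base",
    "SqueezeBERT": "squeezebert/squeezebert-uncased",
}
models = {
    name: AutoModelForSequenceClassification.from_pretrained(ckpt, num_labels=2)
    for name, ckpt in checkpoints.items()
}
        </preformat>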
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Preliminary Results</title>
        <p>We used the base versions of several pre-trained models available on HuggingFace. These models
were implemented in Kaggle notebooks and then refined further. We used the following
hyperparameter values to improve the models’ performance: a maximum sequence length of 256, a
learning rate of 0.00001, a batch size of 16, and 12 epochs.</p>
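        <p>A sketch of this fine-tuning configuration using the HuggingFace Trainer; the dataset objects and output directory are placeholders, and the maximum sequence length of 256 is applied when tokenizing the paragraph pairs:</p>
        <preformat>
# Fine-tuning setup with the hyperparameters stated above.
from transformers import TrainingArguments, Trainer

args = TrainingArguments(
    output_dir="style-change",       # placeholder path
    learning_rate=1e-5,              # i.e. 0.00001
    per_device_train_batch_size=16,
    num_train_epochs=12,
    evaluation_strategy="epoch",
)
trainer = Trainer(model=model, args=args,
                  train_dataset=train_dataset, eval_dataset=val_dataset,
                  compute_metrics=compute_metrics)
trainer.train()
        </preformat>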
        <p>We calculated the F1-score on the given evaluation set to assess the models’ effectiveness for every
subtask. This F1-score was determined by comparing the models’ predictions on each sub-task’s evaluation
set against the ground truth, i.e., their ability to identify style changes between consecutive paragraph pairs.</p>
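        <p>The metric itself is the standard F1 from scikit-learn, computed over the per-pair predictions (this helper also serves as the compute_metrics callback in the sketch above):</p>
        <preformat>
# F1 over paragraph-pair predictions.
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)   # predicted 0/1 label per pair
    return {"f1": f1_score(labels, preds)}
        </preformat>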
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Analysis</title>
        <p>From the results shown in Table 2, we observe a distinct trend in model performance. On the easy
dataset, the RoBERTa model achieves the highest F1-score of 0.940, followed closely by ELECTRA with
an F1-score of 0.939. DeBERTa secures the third position with an F1-score of 0.756, while SqueezeBERT
ranks last. A similar performance pattern is evident in the medium dataset, with RoBERTa leading and
SqueezeBERT trailing.</p>
        <p>However, in the hard dataset, the performance dynamics shift. Although RoBERTa maintains its
leading position, DeBERTa surpasses ELECTRA, achieving an F1-score of 0.756 compared to ELECTRA’s
0.713. SqueezeBERT continues to underperform in this dataset as well.</p>
        <p>The observed trend in model performance across the Easy, Medium, and Hard datasets can be
attributed to several factors related to the models’ inherent architecture and training strategies.</p>
        <p>Thanks to a dynamic masking method, intensive training on large-scale datasets, and a resilient
architecture, RoBERTa regularly outperforms competing models. These elements enable RoBERTa to
more successfully identify complex patterns in the data, as seen by its high F1 scores in every dataset.
ELECTRA, on the other hand, is comparable to RoBERTa. Still, its distinct pre-training method—which
uses substitute token detection—might not be able to handle the subtle complexity seen in more difficult
datasets. This explains the modest decline in ELECTRA’s performance compared to RoBERTa, especially
in the Hard dataset.</p>
        <p>DeBERTa, which uses relative position embeddings and disentangled attention mechanisms, shows
different performance across the datasets. While it does well overall, its architecture may not fully take
advantage of the simpler structures in the Easy and Medium datasets, which could explain why it ranks
third in these situations. SqueezeBERT’s lower performance across all datasets can be attributed
to its design, which prioritizes model efficiency and reduced computational resources over capturing
complex patterns.</p>
        <p>Based on these results, it was best to work with RoBERTa and fine-tune the model further to achieve
better scores.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Revised Strategy</title>
        <p>To further improve the performance metrics of the RoBERTa model, we initially considered utilizing
the larger variant available on HuggingFace. However, given the widespread use of large language
models (LLMs) in the current research community and their significant environmental impact, we opted
for alternative optimization methods. These approaches are environmentally sustainable and capable of
achieving comparable performance enhancements.</p>
        <p>One way we thought of improving the performance was by increasing the size of the dataset, a
method also called data augmentation. To realize this we had a unique approach, since we were giving
the models two paragraphs and a label. Consider paragraphs A and B with a label of 0, indicating
no change. We pass this data to our models, but if we reverse the order of the paragraphs—placing
paragraph B before paragraph A—the label remains unchanged. This method effectively doubles the
size of our dataset, as sketched below.</p>
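        <p>A minimal sketch of this swap augmentation, reusing the (pair, label) samples built earlier: every (A, B, label) sample also yields (B, A, label), doubling the training set without any new annotation.</p>
        <preformat>
# Swap augmentation: reversed paragraph order keeps the same label.
def augment_pairs(samples):
    augmented = list(samples)
    for (a, b), label in samples:
        augmented.append(((b, a), label))
    return augmented
        </preformat>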
        <p>As demonstrated in Table 4, employing the new strategy has improved scores across all datasets.
This suggests that augmenting the dataset size positively impacts performance metrics. Specifically, the
Easy dataset achieved an F1-score of 0.958, the Medium dataset reached an F1-score of 0.816, and the
Hard dataset attained an F1-score of 0.787. Based on these results, we conclude that the RoBERTa base
models, trained on the Easy, Medium, and Hard datasets, should be submitted for evaluation on the
TIRA platform.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>The approaches titled "presto-door" and "null-directory" are the same model, i.e., both scores were
obtained through the same RoBERTa-base model. The only difference between the approaches
is that one was submitted manually through Docker, whereas the other was submitted through the
GitHub Actions tool. Moreover, the test set results for the approaches titled "presto-door" and "null-directory"
displayed in Table 5 reiterate our revised strategy of expanding the dataset and then training the RoBERTa
model on it. The scores for the test sets are much better and closer to those on the provided validation set. In
conclusion, this validates, first, that RoBERTa is the best-performing model amongst the models
tested, and second, that the data augmentation strategy yields better F1-scores than
models trained on just the initial datasets provided.</p>
      <p>As students, academics, and part of a community aimed at protecting the environment, building upon
the United Nations sustainability goals, and gearing towards lower greenhouse emissions, we chose to
work with models that have minimal impact on the environment, as we efficiently tried to
utilize the resources available to us to complete the task of multi-author classification at PAN 2024.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Our solution builds upon binary classification: it treats each pair of paragraphs as one instance
of input text and outputs a label that tells whether the author changes or not. While our
approach achieved high accuracy on the task of multi-author classification using the best models
out of the many fine-tuned transformers, it’s important to acknowledge some limitations. Running
multiple models can be computationally expensive, and fine-tuning often requires significant labeled
data. Additionally, the complex nature of transformer models can make it difficult to understand how
they arrive at their predictions. Despite these limitations, our approach achieved strong performance by
leveraging the best-performing models based on F1 scores on both the training and test sets. However,
it’s important to note that this binary classification approach treats the task as a simple presence or
absence of a style change, neglecting the potential for nuanced stylistic variations within a single
author’s work. Future work could explore multi-class classification to capture a wider range of stylistic
distinctions or delve deeper into interpretability techniques to understand the reasoning behind the
models’ predictions.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The authors would like to acknowledge the support provided by the Office of Research (OoR) at Habib
University, Karachi, Pakistan for funding this project through the internal research grant IRG-2235.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the Multi-Author Writing Style Analysis Task at PAN 2024</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          , A. G. S. de Herrera (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</source>
          , CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
<surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          , E. Zangerle,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</article-title>
          , in: L. Goeuriot, P. Mulhem, G. Quénot, D. Schwab, L. Soulier, G. M. D. Nunzio, P. Galuščáková, A. G. S. de Herrera, G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF</source>
          <year>2024</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          , E. Stamatatos,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tschuggnall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of PAN 2016 - New Challenges for Authorship Analysis: Cross-genre Profiling, Clustering, Diarization, and Obfuscation</article-title>
          , in: N. Fuhr, P. Quaresma, B. Larsen, T. Gonçalves, K. Balog, C. Macdonald, L. Cappellato, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. 7th International Conference of the CLEF Initiative (CLEF</source>
          <year>2016</year>
          ), volume
          <volume>9822</volume>
          of Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2016</year>
          , pp.
          <fpage>518</fpage>
          -
          <lpage>538</lpage>
          . doi:10.1007/978-3-319-44564-9_28.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Iyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vosoughi</surname>
          </string-name>
          ,
          <article-title>Style change detection using BERT: Notebook for PAN at CLEF</article-title>
          <year>2020</year>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , M. Huang,
          <article-title>Style change detection: Method based on pre-trained model and similarity recognition</article-title>
          ,
          <source>in: Notebook Papers of PAN at CLEF</source>
          <year>2022</year>
          , Foshan, China,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Strøm</surname>
          </string-name>
          ,
          <article-title>Multi-label style change detection by solving a binary classification problem</article-title>
          ,
          <source>in: Notebook Papers of PAN at CLEF 2021</source>
          , Høgskoleringen 1, 7491 Trondheim, Norway,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.-M.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-W.</given-names>
            <surname>Tzeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Ensemble pre-trained transformer models for writing style change detection</article-title>
          ,
          <source>in: Notebook Papers of PAN at CLEF</source>
          <year>2022</year>
          , National Central University, Taiwan,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hashemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <article-title>Enhancing writing style change detection using transformer-based models and data augmentation</article-title>
          ,
          <source>in: Notebook Papers of PAN at CLEF</source>
          <year>2023</year>
          , Ottawa, Canada,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Qi</surname>
          </string-name>
          , Y. Han,
          <article-title>Supervised contrastive learning for multi-author writing style analysis</article-title>
          , Department of Electrical Engineering, Foshan University, China (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>