<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Conference and Labs of the Evaluation Forum, September</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Team foshan-university-of-guangdong at PAN: Adaptive Entropy-Based Stability-Plasticity for Multi-Author Writing Style Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xurong Liu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hui Chen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiajun Lv</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Foshan University</institution>
          ,
<addr-line>Foshan, Guangdong</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Shenzhen University</institution>
          ,
<addr-line>Shenzhen, Guangdong</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>0</volume>
      <fpage>9</fpage>
      <lpage>12</lpage>
      <abstract>
<p>In this paper, we address the Multi-Author Writing Style Analysis task for PAN 2024, which involves detecting style changes at the paragraph level within multi-author documents. To tackle this problem, we adopt the Entropy-based Stability-Plasticity (ESP) method, which dynamically adjusts the learning rates of different neural network layers based on their entropy values. This approach effectively balances stability and plasticity, allowing the model to retain essential knowledge from previous tasks while efficiently learning new information, thereby mitigating catastrophic forgetting. Our experiments, conducted on datasets of varying difficulty levels (Easy, Medium, Hard), demonstrate that ESP significantly outperforms traditional methods in detecting writing style changes. The results highlight the effectiveness of ESP in leveraging prior knowledge and reducing interference between tasks, making it a robust framework for continuous learning in text analysis applications.</p>
      </abstract>
      <kwd-group>
<kwd>style change detection</kwd>
        <kwd>ESP</kwd>
        <kwd>lifelong learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The style change detection task aims to identify positions within a given multi-author document at
which the author switches [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. When no comparison texts are provided and a text has been written by multiple authors
together, style change detection is the only way to find evidence for this fact and to
detect plagiarism in a document. Likewise, style change detection can help to uncover gift authorships,
to verify a claimed authorship, or to develop new technology for writing support [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Style change
detection is a branch of authorship verification that focuses on examining a document for
different authorial styles [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The application areas of writing style change detection range from plagiarism
detection over cyber security and forensics to, currently, fake news detection [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5, 6</xref>
        ]. Endeavors to
detect changes in writing style have been made under author diarization or clustering and style
change detection [7, 8, 9].
      </p>
      <p>The ability to continuously learn remains elusive for deep learning models, which cannot accumulate
knowledge in their weights when learning new tasks. For the Multi-Author Writing Style Analysis 2024
task of PAN, which provides three datasets of different difficulty (Easy, Medium, and Hard), we adopted
the Entropy-based Stability-Plasticity (ESP) method [10] for lifelong learning, which dynamically decides
how much each model layer should be modified via a plasticity factor. The results
show that the approach is a robust framework for leveraging prior knowledge by reducing interference,
offering slight improvements over the baseline in maintaining performance across sequential tasks.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>For the task of Multi-Author Writing Style Analysis, some studies employ combinations of features
such as lexical, syntactic, and character features to analyze the variance in the writing styles of
different authors [11, 12, 13, 14]. Some of these methods rely on the analysis of different stylometric
features to detect stylistic changes in a document [15], while others adapted the outlier detection
methods used in plagiarism detection problems. In addition, some studies investigated the use of
artificial neural networks to solve this problem [16, 15].</p>
      <p>
        The Multi-Author Writing Style Analysis 2024 task of PAN is defined as, for a given text, finding all
positions of writing style change on the paragraph level (i.e., for each pair of consecutive paragraphs,
assessing whether there was a style change) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The simultaneous shift in authorship and topic is
carefully controlled; Task 1, Task 2, and Task 3 correspond to datasets of three different difficulty levels:
Easy, Medium, and Hard. The corresponding dataset descriptions are as follows:
• Easy: The paragraphs of a document cover a variety of topics, allowing approaches to use
topic information to detect authorship changes.
• Medium: The topical variety in a document is small (though still present), forcing the approaches
to focus more on style to effectively solve the detection task.
      </p>
      <p>• Hard: All paragraphs in a document are on the same topic.</p>
      <p>Artificial neural networks learn in a bounded environment, where the input distribution is assumed
to be fixed. When the input distribution changes, the model must adapt its weights to perform correctly
on the new task. Through those modifications, the model overwrites previously learned patterns, creating
interference between old and new tasks and causing a problem known as catastrophic forgetting [17, 18].
Lifelong learning methods to alleviate catastrophic forgetting can be categorized into three classes,
based on how task-specific information is stored and used throughout the sequential learning
process: replay methods, regularization-based methods, and parameter-isolation methods
[19]. Araujo et al. propose the ESP method, which relies on an entropy-based criterion to decide
how much a model has to modify the weights in each of its layers; it performs well compared to its
Stability variant [10] and compares favorably with baselines such as Replay in their experiments.</p>
    </sec>
    <sec id="sec-3">
      <title>3. System Overview</title>
      <p>The Entropy-Based Stability-Plasticity (ESP) [10] method for lifelong learning utilizes an entropy-based
criterion to manage the trade-off between stability and plasticity. The plasticity factor is a crucial
component in ESP, determining how much each layer of the neural network should be updated. This
factor is computed using entropy to assess the importance of the parameters in each layer. The plasticity
factor f_l for the l-th layer can be expressed as:
f_l = 1 − H_l / Σ_{k=1}^{L} H_k
(1)
where H_l is the entropy of the l-th layer and L is the total number of layers, so that higher-entropy
layers receive smaller factors.</p>
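<p>As a minimal sketch, the plasticity factors can be computed directly from a list of per-layer entropies. The function name and the inverse-normalized form below are our assumptions for illustration (not the authors' released code), chosen to be consistent with the stated requirement that high-entropy layers receive smaller updates:</p>

```python
def plasticity_factors(entropies):
    """Per-layer plasticity factors (illustrative sketch).

    Factors shrink as a layer's entropy grows, so high-entropy
    (important) layers receive smaller updates. The inverse-normalized
    form is an assumption, not the authors' exact formulation.
    """
    total = sum(entropies)
    return [1.0 - h / total for h in entropies]
```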
      <p>The entropy H_l for the l-th layer is calculated based on the activations of the neurons in that layer.
The entropy helps in identifying how much information is being processed by the layer. The entropy
H_l for the l-th layer can be defined as:
H_l = − Σ_{i=1}^{N_l} p_i log p_i
(2)
where N_l is the number of neurons in the l-th layer and p_i is the probability associated with the
i-th neuron's activation.</p>
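<p>A small sketch of the entropy computation. We assume the neuron activations are softmax-normalized into a probability distribution before the Shannon entropy is taken; the softmax step is our assumption, as the text only specifies the Shannon form:</p>

```python
import math

def layer_entropy(activations):
    """Shannon entropy of one layer's activations (illustrative sketch).

    Activations are softmax-normalized so that p_i is the probability
    mass assigned to the i-th neuron; the softmax choice is an assumption.
    """
    m = max(activations)
    exps = [math.exp(a - m) for a in activations]  # numerically stable softmax
    z = sum(exps)
    return -sum((e / z) * math.log(e / z) for e in exps)
```

<p>A layer with uniform activations attains the maximal entropy log N for N neurons, while a sharply peaked layer approaches zero entropy.</p>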
      <p>ESP uses the plasticity factor to scale the gradients during backpropagation. This ensures that layers
with higher entropy (more important) receive smaller updates, preserving their stability, while layers
with lower entropy (less important) are updated more, enhancing plasticity. The gradient update for
the l-th layer is scaled by its plasticity factor f_l:
Δθ_l = f_l · ∇_{θ_l} ℒ
(3)
where ∇_{θ_l} ℒ is the gradient of the loss with respect to the parameters θ_l of the l-th layer.
The ESP training process involves the following steps:
1. Forward Pass: Compute the output and activations of each layer.
2. Entropy Calculation: Calculate the entropy for each layer.
3. Plasticity Factor Calculation: Determine the plasticity factors based on the entropies.
4. Backward Pass: Scale the gradients using the plasticity factors and update the model parameters.</p>
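<p>The four steps above can be sketched end to end as a single parameter update. Everything here (function names, the softmax normalization of activations, the inverse-normalized plasticity factor) is an illustrative assumption rather than the authors' released implementation:</p>

```python
import math

def esp_step(params, grads, activations, lr=0.1):
    """One ESP-style update step (illustrative sketch).

    Each layer's gradient is scaled by a plasticity factor derived from
    the entropy of its activations, so high-entropy layers change least.
    The normalization choices here are assumptions for illustration.
    """
    def entropy(acts):
        m = max(acts)
        exps = [math.exp(a - m) for a in acts]  # stable softmax
        z = sum(exps)
        return -sum((e / z) * math.log(e / z) for e in exps)

    ents = [entropy(a) for a in activations]          # step 2: entropies
    total = sum(ents)
    factors = [1.0 - h / total for h in ents]         # step 3: plasticity factors
    return [w - lr * f * g                            # step 4: scaled update
            for w, f, g in zip(params, factors, grads)]
```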
      <p>ESP effectively balances stability and plasticity by dynamically adjusting the learning rate for different
layers based on their entropy. This method ensures that important layers (with high entropy) are
preserved while less important layers (with low entropy) remain more flexible to learn new tasks. These
formulas and the underlying methodology provide a robust framework for lifelong learning, addressing
the challenge of catastrophic forgetting while allowing the model to adapt to new information.</p>
      <p>We use BERT [20] as the encoder. For the decoder, following the original BERT model, we use the
first token (the special token [CLS]) of the sequence and a classifier to predict the class. Additionally, we
use the default BERT vocabulary in our experiments. We utilize the Adam optimizer with a learning
rate of 3 × 10⁻⁵ and a training batch size of 32. To enhance the training for Task 2 and Task 3, we
adjusted the training sequence of the data to hard → medium → easy.</p>
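<p>For reproducibility, the hyperparameters above can be summarized in a configuration sketch. The key names and the checkpoint identifier bert-base-uncased are our assumptions for illustration; the paper does not name the exact checkpoint or expose its code:</p>

```python
# Hypothetical experiment configuration mirroring the setup described above;
# key names and the checkpoint identifier are illustrative assumptions.
config = {
    "encoder": "bert-base-uncased",            # assumed BERT checkpoint
    "classifier_input": "[CLS]",               # first-token representation
    "vocabulary": "default BERT vocabulary",
    "optimizer": "Adam",
    "learning_rate": 3e-5,
    "train_batch_size": 32,
    "task_order": ["hard", "medium", "easy"],  # to strengthen Tasks 2 and 3
}
```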
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>Following the above experimental design, the results are shown in Table 1.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this work, we addressed the Multi-Author Writing Style Analysis task for PAN 2024, which involves
detecting style changes at the paragraph level in multi-author documents. The approach, Entropy-based
Stability-Plasticity (ESP), effectively manages the trade-off between stability and plasticity in lifelong
learning scenarios by dynamically adjusting the learning rates of different network layers based on
their entropy values. This ensures that layers with high entropy, which are deemed more important,
receive smaller updates to preserve stability, while layers with low entropy, considered less critical, are
updated more to enhance plasticity. Our experiments utilized BERT as the encoder and demonstrated
the effectiveness of ESP across datasets of different difficulty levels (Easy, Medium, Hard). The results
showed that ESP outperforms traditional methods by effectively leveraging prior knowledge and
reducing interference between tasks. Specifically, ESP’s adaptive gradient scaling mechanism allows
the model to retain essential information from previous tasks while efficiently learning new tasks, thus
mitigating the issue of catastrophic forgetting.</p>
      <p>In conclusion, the ESP method provides a robust framework for continuous learning in multi-author
writing style analysis, offering slight improvements over the baseline in maintaining performance
across sequential tasks. Future work may explore the integration of ESP with other neural architectures
and its application to additional domains beyond text analysis.</p>
      <p>[6] C. Zuo, Y. Zhao, R. Banerjee, Style change detection with feed-forward neural networks, CLEF
(Working Notes) 93 (2019).
[7] S. Nath, Style change detection using siamese neural networks, in: CLEF (Working Notes), 2021,
pp. 2073–2082.
[8] P. Rosso, F. Rangel, M. Potthast, E. Stamatatos, M. Tschuggnall, B. Stein, Overview of PAN'16: new
challenges for authorship analysis: cross-genre profiling, clustering, diarization, and obfuscation,
in: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 7th International
Conference of the CLEF Association, CLEF 2016, Évora, Portugal, September 5-8, 2016, Proceedings
7, Springer, 2016, pp. 332–350.
[9] E. Zangerle, M. Mayerl, M. Potthast, B. Stein, Overview of the style change detection task at PAN
2020, CLEF (Working Notes) 93 (2020).
[10] V. Araujo, J. Hurtado, A. Soto, M.-F. Moens, Entropy-based stability-plasticity for lifelong learning,
in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022,
pp. 3721–3728.
[11] E. Zangerle, M. Mayerl, M. Potthast, B. Stein, Overview of the Multi-Author Writing Style Analysis
Task at PAN 2024, in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S. de Herrera (Eds.), Working
Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR-WS.org, 2024.
[12] M. L. Brocardo, I. Traore, S. Saad, I. Woungang, Authorship verification for short messages using
stylometry, in: 2013 International Conference on Computer, Information and Telecommunication
Systems (CITS), IEEE, 2013, pp. 1–6.
[13] M. P. Kuznetsov, A. Motrenko, R. Kuznetsova, V. V. Strijov, Methods for intrinsic plagiarism
detection and author diarization, in: CLEF (Working Notes), 2016, pp. 912–919.
[14] K. Safin, R. Kuznetsova, Style breach detection with neural sentence embeddings, in: CLEF
(Working Notes), 2017.
[15] A. Rexha, M. Kröll, H. Ziak, R. Kern, Authorship identification of documents with high content
similarity, Scientometrics 115 (2018) 223–237.
[16] M. Kestemont, M. Tschuggnall, E. Stamatatos, W. Daelemans, G. Specht, B. Stein, M. Potthast,
Overview of the author identification task at PAN-2018: cross-domain authorship attribution and
style change detection, in: Working Notes Papers of the CLEF 2018 Evaluation Labs, Avignon,
France, September 10-14, 2018, pp. 1–25.
[17] M. McCloskey, N. J. Cohen, Catastrophic interference in connectionist networks: The sequential
learning problem, in: Psychology of Learning and Motivation, volume 24, Elsevier, 1989, pp.
109–165.
[18] R. Ratcliff, Connectionist models of recognition memory: constraints imposed by learning and
forgetting functions, Psychological Review 97 (1990) 285.
[19] M. De Lange, R. Aljundi, M. Masana, S. Parisot, X. Jia, A. Leonardis, G. Slabaugh, T. Tuytelaars, A
continual learning survey: Defying forgetting in classification tasks, IEEE Transactions on Pattern
Analysis and Machine Intelligence 44 (2021) 3366–3385.
[20] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers
for language understanding, arXiv preprint arXiv:1810.04805 (2018).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kolyada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Grahm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elstner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Loebe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <article-title>Continuous Integration for Reproducible Shared Tasks with TIRA.io</article-title>
          , in: J.
          <string-name>
            <surname>Kamps</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Crestani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Maistro</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Joho</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Gurrin</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          <string-name>
            <surname>Kruschwitz</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Caputo (Eds.),
          <source>Advances in Information Retrieval. 45th European Conference on IR Research (ECIR</source>
          <year>2023</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2023</year>
          , pp.
          <fpage>236</fpage>
          -
          <lpage>241</lpage>
          . doi:10.1007/978-3-031-28241-6_20.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
<surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          , E. Zangerle,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</article-title>
          , in: L.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Mulhem</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Quénot</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Schwab</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. M. D. Nunzio</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. G. S. de Herrera</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF</source>
          <year>2024</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>V. A.</given-names>
            <surname>Oloo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Otieno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Wanzare</surname>
          </string-name>
          ,
          <article-title>A literature survey on writing style change detection based on machine learning: State-of-the-art-review</article-title>
          ,
          <source>Int. J. Comput. Trends Technol</source>
          .
          <volume>70</volume>
          (
          <year>2022</year>
          )
          <fpage>15</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Castro-Castro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Rodríguez-Lozada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Muñoz</surname>
          </string-name>
          ,
          <article-title>Mixed style feature representation and b-maximal clustering for style change detection</article-title>
          .,
          <source>in: CLEF (Working Notes)</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Iyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vosoughi</surname>
          </string-name>
          ,
          <article-title>Style change detection using bert</article-title>
          .,
          <source>CLEF (Working Notes)</source>
          <volume>93</volume>
          (
          <year>2020</year>
          )
          <fpage>106</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>