<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Style Change Detection on Real-World Data using an LSTM-powered Attribution Algorithm</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Robert Deibel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Denise Löfflad</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Eberhard Karls University Tübingen</institution>
          ,
          <addr-line>Geschwister-Scholl-Platz, 72074 Tübingen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <abstract>
        <p>The task of Style Change Detection (SCD) aims at detecting author switches within one document based on the authors’ individual writing styles. In this notebook, the task is divided into three sub-tasks: detecting multi-authored documents, finding style change positions, and attributing each paragraph to a unique author [1, 2]. We chose different machine learning approaches for the first task of multi-author detection and the second task of style change detection. Our approach to the third task of SCD-based authorship attribution is a hybrid method that builds upon the predictions of the style change detection model, extended by an attribution algorithm. The data was provided by the PAN'21 challenge and was collected from an English-language Q&amp;A forum [1, 2]. While the approach to task three proved to be very computationally expensive, we found good results for tasks one and two, with F1-scores of 86% and 78%, respectively, on the validation set.</p>
      </abstract>
      <kwd-group>
        <kwd>Style Change Detection</kwd>
        <kwd>Stylometry</kwd>
        <kwd>Word Embeddings</kwd>
        <kwd>NLP</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>LSTM</kwd>
        <kwd>MLP</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Due to the ever-increasing interconnection and collaboration resulting in multi-authored texts,
identifying the authors within a text becomes a worthwhile effort. The steadily increasing amount
of readily available data gathered from online text forums or short message boards allows for the
construction of methods and models for text analysis based on machine learning approaches.
Possible applications for such models include, e.g., plagiarism detection and style analysis.
Style analysis opens the possibility for authors to adjust their writing styles to those of other
collaborators and consequently to write more coherent texts.</p>
      <p>
        The task of Style Change Detection (SCD) aims at analysing texts by detecting whether or
not a document is multi-authored, and if so, where a Style Change occurs in order to find author
changes and predict authorship in the context of one document [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This year’s PAN Shared Task
builds on competitions from previous years [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The PAN’21 SCD task [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] is threefold:
1. Determining whether a document is single or multi-authored
2. Determining whether a Style Change occurs between paragraphs
3. Assigning each paragraph to one author
      </p>
      <p>
        The last task in particular, a type of authorship attribution task, adds complexity in
comparison to previous years. Traditionally, authorship attribution assumes a fixed
and known set of authors. The data set for this year’s PAN Shared Task represents real-world
data [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] and therefore we decided to treat the exact number of authors of each text as unknown.
Additionally, there is no closed set of authors, so we have to assume that the sets of authors
of any two texts are pairwise disjoint. Therefore, we can only use a single text to identify its
authors, and cannot use a global author profile, that is, one spanning all texts.
      </p>
      <p>The following notebook focuses on our efforts to construct machine learning based methods to
solve the three given tasks. After this introduction we will give a brief overview of related works
in Section 2, describe our methods in Section 3, state our results and their evaluation in Section 4,
and lastly give a short discussion and conclusion in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>As mentioned before, the goal of the PAN’21 Shared Task is to detect the exact positions of
authorship changes. In order to lead up to this task, it is split into three sub-tasks: deciding whether
a document is multi-author, finding style changes between paragraphs, and lastly attributing the
paragraphs uniquely to one of the authors.</p>
      <p>
        Regarding the first task of detecting multi-authored documents, several different solutions have
been proposed over the last decades [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
        ]. Zuo et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] used a multi-layer perceptron (MLP)
with a single layer for the binary classification task and represented each document by a
TF-IDF-weighted word vector. Iyer and Vosoughi [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] generated sentence vectors using embeddings and
compared Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, and
Naive Bayes classifiers. They reported the best results with a Random Forest classifier.
      </p>
      <p>
        Similar to the first task, the second task has received attention from researchers over the last years.
Especially in the PAN Shared Task series, the detection of style changes between paragraphs
has been addressed repeatedly and various approaches have been proposed. Iyer and Vosoughi
(2020) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] yielded the best results in 2020 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], using Google’s BERT language model and word
embeddings [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Another approach computed nearly 200 textual features and used
b0-maximal clustering [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]; however, this approach yielded less promising results in the PAN Task.
Moreover, Hosseinia et al. (2018) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] implemented a complex Parallel Hierarchical Attention
Network using LSTM layers and achieved promising results.
      </p>
      <p>
        Finally, the third task, namely assigning paragraphs to specific and unknown authors, is a more
recent problem and has consequently been addressed less frequently. It is similar to the problem
of authorship identification of multi-author documents (AIMD) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Sarwar et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] tackled this
problem by using character n-grams and a multilingual feature space to identify co-authors from
a set of known authors.
      </p>
      <p>
        In all three tasks, individual writing style plays a role and can be addressed more or less
explicitly. Castro-Castro et al. (2020) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], for example, computed nearly 200 textual features for
their pipeline. Stylometry is defined as the statistical analysis of written texts [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and assumes that
textual features are quantifiable and reliably represent an author’s individual and distinct writing style.
Stylometry has been applied to and shown to work effectively for Authorship Attribution
[
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13">11, 12, 13, 10</xref>
        ], and for authorship attribution of single-author documents [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>In the next section, we will focus on our contribution to the research in Style Change Detection
and Author Attribution, while taking into account past research and results.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <p>
        For the three tasks tackled in this notebook, we chose different machine learning approaches
for the first task of multi-author detection and the second task of style change detection. We
attempted to solve the first task using per-document embeddings and an MLP, while we utilized
per-paragraph embeddings and textual features as input for an LSTM for the second task. Our
approach for the third task of SCD-based authorship attribution is a hybrid method that builds upon
the predictions of the style change detection model, extended by an attribution algorithm. The
models were trained on an Intel® Core™ i7-7500U CPU at 2.70 GHz without hardware
acceleration. Additionally, a server environment on TIRA [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] was provided by the PAN’21
team.
      </p>
      <p>In the following we will describe our approaches in more detail and state other important
considerations for our application.</p>
      <sec id="sec-3-1">
        <title>3.1. Data Set</title>
        <p>
          The data set provided by the PAN’21 SCD challenge consisted of one training and one validation
set with 11200 and 2400 problems, respectively, as well as solutions for these problems. The
problems consist of exclusively English posts scraped from the StackExchange network that were
combined into documents of one to four authors with a total length between 1000 and 3000 characters.
In addition to the training and validation sets, a test set was generated by the PAN’21 team but
not provided during development. The test set is similar to the validation set in terms of size and
content [
          <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
          ]. Overall, the data set is similar to the one used by Zangerle et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Since the data
sets provided additional metadata, e.g. the total number of authors in a text, it would have been
possible to construct a model that uses this additional data for its prediction. To simulate the
real-world application of SCD, we chose not to consider the maximum number of authors in our
application but rather to keep this value variable.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Textual Features</title>
        <p>
          We manually extracted a small set of traditional features that are often used in Authorship
Attribution tasks [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], and one additional feature. We computed these measures for every
paragraph. The measures are described in more detail as follows:
• Corrected Type-Token Ratio (CTTR): The number of unique words (types) divided by
the square root of twice the number of words (tokens) in a paragraph. In contrast to the
traditional TTR, which is negatively influenced by texts longer than 100 words, the CTTR
corrects for the varying lengths of the paragraphs and is therefore not affected by text length.
• Mean Sentence Length in Words: The average number of words in a sentence. Considering
the creation of the data set - an online Q&amp;A, where presumably each author is less likely to
stick to a writing standard - we expected this measure to be adequate for this dataset and
task.
• Mean Word Length: The average word length in syllables. It has been shown that the use
of shorter compared to longer words is a good indicator of an author’s style [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
• Function Word Frequency: It is unlikely that the frequency of function word use is
consciously controlled [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. We decided to add this feature, as it is not expected that their
frequency would vary much with the topic of the text.
• Linsear Write Formula: A readability formula scoring the difficulty of English text. The
standard Linsear Write metric runs on a 100-word sample. We calculated it as follows: for
each simple word in the first 100 words of the text, defined as a word with two syllables or
fewer, we added one point to a running total; for each complex word, defined as a word with
three syllables or more, we added three points. We then divided the total by the number of
sentences in the 100-word sample to obtain a provisional result r, and adjusted it:
LW = r/2 if r &gt; 20, and LW = r/2 − 1 otherwise.
        </p>
        <p>We chose to manually implement these rather basic features as they have been shown to be good
predictors of style. However, we did not expect the model trained on this set of five features
to outperform the model trained on word embeddings, simply because of the seemingly small
amount of information that such a set would capture.</p>
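        <p>The features above can be sketched in code. The following is an illustrative reconstruction, not our exact implementation: the vowel-group syllable counter is a crude heuristic, and the tokenization regexes are assumptions.

```python
import re
from math import sqrt

def count_syllables(word):
    # Crude heuristic: one syllable per run of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def cttr(words):
    # Corrected Type-Token Ratio: types / sqrt(2 * tokens).
    return len(set(words)) / sqrt(2 * len(words))

def mean_sentence_length(text):
    # Average number of words per sentence.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return len(words) / len(sentences)

def linsear_write(text):
    # One point per simple word (two syllables or fewer), three points
    # per complex word, over the first 100 words; divide by the number
    # of sentences to get a provisional result r, then adjust.
    words = re.findall(r"[A-Za-z']+", text)[:100]
    points = sum(3 if count_syllables(w) >= 3 else 1 for w in words)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    r = points / sentences
    return r / 2 if r > 20 else r / 2 - 1
```

For example, `linsear_write("The cat sat. The dog ran.")` scores six simple words over two sentences, giving r = 3 and an adjusted score of 0.5.</p>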
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Embeddings</title>
        <p>
          The word embeddings used for our model are the pretrained fastText word vectors provided by
Facebook’s AI Research lab [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. They were trained on Common Crawl and Wikipedia data,
using continuous bag of words with position-weights, in dimension 300, with character n-grams
of length five, a window of size five, and ten negatives. Unlike Word2Vec, fastText uses character
n-grams to create an inherent association between words that share the same stem; thus it
encodes not only semantic and syntactic information, but morphological information as well.
        </p>
        <p>For the task of multi-author prediction we chose to calculate the embeddings on a per-document
basis. We anticipated that multi-author documents would be separate from single-author
documents in embedding space.</p>
        <p>
          For the task of style-change detection we use a per-paragraph embedding. Since each paragraph
is guaranteed to have been written by a single author, we computed the average of the word
embeddings for each paragraph, thus generating a paragraph embedding. As pointed out by
Kenter et al. [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], simply averaging word embeddings of all words in a text has proven to be a
surprisingly successful and efficient way of obtaining features across a multitude of tasks. The
final vectors of both tasks were then padded and fed into the model.
        </p>
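        <p>The averaging step can be sketched as follows; the three-dimensional toy vectors stand in for the 300-dimensional fastText vectors, and the plain lookup table cannot, unlike fastText itself, produce vectors for out-of-vocabulary words.

```python
def paragraph_embedding(paragraph, word_vectors, dim):
    # Average the vectors of all known words in the paragraph.
    vectors = [word_vectors[w] for w in paragraph.lower().split()
               if w in word_vectors]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Toy 3-d vectors; the real model uses 300-d fastText embeddings.
toy_vectors = {
    "style": [1.0, 0.0, 0.0],
    "change": [0.0, 1.0, 0.0],
    "detection": [0.0, 0.0, 1.0],
}
emb = paragraph_embedding("style change detection", toy_vectors, dim=3)
```
</p>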
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Model Single vs. Multiple</title>
        <p>
          Due to the promising results reported by Hosseinia et al. (2018) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], we decided to opt for an
MLP to tackle the first task of the challenge. Our machine learning models across all tasks are
built using the Keras API [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] with the TensorFlow [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] backend. For this approach specifically
we used a standard MLP with three hidden, fully connected feed-forward layers. We
assumed that the set of multi-authored documents and the set of single authored documents could
be well separated in the space of per-document embeddings. As a per-document approach makes
more sense for this task, we decided not to use textual features. Computing textual features on a
per-document basis would distort the feature values and lead to poorer results.
        </p>
        <p>The number of neurons and the learning rate were determined using a grid search approach in
the range of 32 to 512 with increments of 32 for the number of neurons, and the options of 10<sup>−2</sup>,
10<sup>−3</sup>, and 10<sup>−4</sup> for the learning rate. For the hidden layers this resulted in 480, 480, and 288 neurons,
respectively. The optimal learning rate was found to be 10<sup>−3</sup>.</p>
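        <p>The sweep amounts to an exhaustive grid search. A minimal sketch, simplified to a single layer width, where the hypothetical `train_and_score` callback stands in for actual model training and evaluation:

```python
from itertools import product

def grid_search(train_and_score):
    # Try every (neurons, learning_rate) combination and keep the best.
    neuron_options = range(32, 513, 32)   # 32, 64, ..., 512
    lr_options = [1e-2, 1e-3, 1e-4]
    return max(product(neuron_options, lr_options),
               key=lambda cfg: train_and_score(*cfg))

# A toy scorer peaking at the optimum reported in the text.
score = lambda n, lr: -abs(n - 480) - 1000 * abs(lr - 1e-3)
best = grid_search(score)
```
</p>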
        <p>
          The activation of the hidden layers was set to the ReLU function, as it has proven successful
in MLP applications [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. The output layer activation was set to the sigmoid function, as we want
to predict a binary decision. As for all of our models, the Adam optimizer [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] and binary
cross-entropy was used as the loss function.
        </p>
        <p>As input data we used only the per-document embeddings described in Section 3.3, as we
wanted to analyze the discriminative abilities of MLPs at the per-document level. We believe that
calculating per-document complexity measures would not yield a satisfying separation since the
documents are constructed to be similar in their global structure.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Model Style Change Basic</title>
        <p>We trained a two-layer Bidirectional LSTM model with 128 hidden units per layer, adding a
Masking layer and a Time-Distributed output layer with a sigmoid activation function.
We used binary cross-entropy as the loss function and early stopping to prevent overfitting.
A threshold of 0.5 was established to decide whether a style change is present, and we
applied normalization. We found that a batch size of one achieved the best results, compared
to batch sizes of five and ten, which yielded results worse by 60 percentage points.</p>
        <p>Our anticipation for using an LSTM was that the model could learn similarities in writing style
on a per-paragraph basis by using the paragraphs as time steps in its input.</p>
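        <p>Using paragraphs as time steps requires padding every document to a common number of steps, which the Masking layer then skips. A minimal sketch, assuming a mask value of 0.0:

```python
def pad_document(paragraph_embeddings, max_paragraphs, dim, mask_value=0.0):
    # Append mask-valued rows until the document has max_paragraphs
    # time steps; a Masking layer with the same mask_value ignores them.
    padded = list(paragraph_embeddings)
    for _ in range(max_paragraphs - len(padded)):
        padded.append([mask_value] * dim)
    return padded

doc = [[0.2, 0.4], [0.1, 0.3]]   # two paragraphs, 2-d toy embeddings
padded = pad_document(doc, max_paragraphs=4, dim=2)
```
</p>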
      </sec>
      <sec id="sec-3-6">
        <title>3.6. Model Style Change Real-World</title>
        <p>We implemented different approaches for this task, which turned out to be more difficult than
the previous two tasks. Our first approach was to train a simple LSTM model similar to that of
task two, but because there are no consistent classes across the data, the model achieved poor
results. The second approach was a two-fold pipeline containing a k-means clustering algorithm
and a classification model. As the data for the Shared Task is meant to represent real-world data, the
number of authors of every text is unknown and may differ from text to text. Therefore, it was
not straightforward to build a clustering system that would predict the number of authors.</p>
        <p>The third approach proved to yield the best results. For this approach, we used an
LSTM-powered Attribution Algorithm, visualized in Figure 1 and represented as pseudo-code in
Algorithm 1. The algorithm functions as follows: the authorship attribution decision is made
for every paragraph in turn, utilizing the style change detection predictions generated by our
LSTM model for the whole document. The first paragraph is always attributed to author A1, and we
assume that the following paragraphs are written by the same author, A1, as long as no style change occurs.
Once a style change is detected in the predictions, we construct a new prediction problem for
our LSTM to solve. This problem consists of the current paragraph, up for authorship attribution,
preceded by a previously attributed paragraph p. p is iterated until either no style change
is detected or p is equal to the current paragraph. The first case implies that the author of p
also wrote the current paragraph, while the second case implies that this author was not seen
before and a new author should be selected.</p>
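        <p>The attribution loop can be sketched as follows; the `detects_change(a, b)` callback is the only assumed interface, standing in for querying the LSTM on the constructed two-paragraph problem.

```python
def attribute_authors(paragraphs, changes, detects_change):
    # changes[i] is the style-change prediction between paragraphs i
    # and i+1 of the whole document; detects_change(a, b) re-queries
    # the model on a constructed two-paragraph problem.
    authors = [1]                        # first paragraph: author A1
    next_author = 2
    for i in range(1, len(paragraphs)):
        if not changes[i - 1]:
            authors.append(authors[-1])  # no change: same author
            continue
        # A change was predicted: compare the current paragraph against
        # previously attributed paragraphs until one matches.
        assigned = None
        for p in range(i):
            if not detects_change(paragraphs[p], paragraphs[i]):
                assigned = authors[p]    # author of p also wrote i
                break
        if assigned is None:             # no earlier author matched
            assigned = next_author
            next_author += 1
        authors.append(assigned)
    return authors

# Toy stand-in: a "change" occurs when the author tags differ.
detect = lambda a, b: a[:2] != b[:2]
result = attribute_authors(["A1x", "A2x", "A1y"], [True, True], detect)
```

In the toy run, the third paragraph is matched back to the first author, yielding the attribution [1, 2, 1].</p>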
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Evaluation</title>
      <p>
        In this section, we present the results we achieved by testing the models on the validation set that
was made available by the PAN’21 committee [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ], followed by the results achieved on the test
set. For the test set, only the F1-scores were reported by the PAN team.
      </p>
      <sec id="sec-4-1">
        <title>4.1. Single vs. Multiple</title>
        <p>In this section we present our results for the task of detecting multi-authored documents, shown
in Table 1. It can be seen that the model yields good results across all measures. The small
drop in recall and accuracy suggests that the model classifies some multi-authored documents as
single-authored. When evaluated on the test set, the F1-score drops noticeably, which
may indicate that the model does not generalize well.</p>
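        <p>To make the interplay of the reported measures concrete: F1 is the harmonic mean of precision and recall, so a drop in recall alone already pulls the F1-score down. The numbers below are illustrative, not our actual scores.

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

balanced = f1_score(0.90, 0.90)       # high recall
recall_drop = f1_score(0.90, 0.70)    # same precision, lower recall
```
</p>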
        <p>
          Nevertheless, the F1-score achieved by our model outperforms the winning team’s model from
last year’s PAN Task [
          <xref ref-type="bibr" rid="ref3 ref5">3, 5</xref>
          ] and therefore suggests that this method is a good approach to the task
of detecting multi-authored documents. Of course, the data set differs somewhat from that of Iyer
and Vosoughi (2020) [
          <xref ref-type="bibr" rid="ref3 ref5">3, 5</xref>
          ], but we still believe that the results are comparable: the generation of
the data was similar, and both data sets were retrieved from the same English Q&amp;A forum.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Style Change Detection Basic</title>
        <p>In this section we present our results for the task of detecting style change positions. As can be
seen in Table 2, the three configurations yielded similar results. It is interesting that the model
trained on the textual features alone achieved high accuracy. Nevertheless, its comparatively
low recall score of 66.27% suggests that this model tends to predict no style change, the
high accuracy being due to the imbalance in the data. The other two models, trained on the
embeddings and on the combination of embeddings and features, achieved recall scores of 74%
and 72.5%, respectively. This suggests that these models are more robust to imbalance in the data
and therefore more suitable for real-world data.</p>
        <p>
          Compared to the self-reported results from the winning team of last year’s PAN Shared Task,
our model does not outperform Iyer and Vosoughi’s model [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Similar to Task 1, a drop in the
F1-score is observed when evaluating on the test set.
        </p>
        <p>Nevertheless, all three models yield good results and despite the drop in recall, the feature
model achieves satisfying results.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Style Change Real-World</title>
        <p>The implementation of the algorithm proved to be very computationally expensive and time-consuming. Due to our limited access to computational power, we were not able to run this code
on the validation set in order to obtain results. Nevertheless, when running it on a smaller corpus we
obtained promising outcomes. However, considering the small amount of data we used, we decided
to refrain from reviewing those results here.</p>
        <p>The PAN team reported an F1-score of 26.25%. Considering this result, we assume that a
sequence of two paragraphs might not provide enough information for the model to make a reliable
prediction.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Outlook</title>
      <p>
        In this notebook, we presented our approaches to solve the PAN’21 Shared Task of Style Change
Detection [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. For each sub-task, we implemented different algorithms and built a hybrid
algorithm for the task of assigning paragraphs to authors. This task proved to be the most
complex of the three.
      </p>
      <p>With regard to the stylometric aspect of the problems, we implemented only five different and
rather traditional features. Surprisingly, it was possible to achieve good results with only those
features, which shows that for this data set, the individual style of the authors could be accurately
represented by mean word length, mean sentence length, function word frequency, CTTR, and
the LWF.</p>
      <p>
        Our approaches to the first and second task yielded good results on the validation set, which
shows that simple machine learning models can be good solutions for the tasks. Future work could
try to implement attention based models [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], convolutional layers combining and compressing
paragraphs, or an autoencoder approach [
        <xref ref-type="bibr" rid="ref25">25</xref>
          ]. Especially autoencoders are often used in style
transformation [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ] and could be a potential candidate for style change tasks.
      </p>
      <p>The third task is a newer problem and more complicated to solve, especially on real-world
data with an unknown number of unknown authors. For future projects, we plan to attempt
to implement a clustering/classification model for this problem. Furthermore, it is possible to
build upon our model by increasing the number of paragraphs used to make predictions and by
parallelizing the loop of the attribution algorithm to increase computation speed. We therefore
consider our model a promising approach despite its low results. It is also important
to note that we challenged ourselves with a more real-world-like problem by not taking into account
the maximum number of authors, as using this information would certainly have lowered the
complexity of the task.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L. D. L. P.</given-names>
            <surname>Sarracén</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Manjavacas</surname>
          </string-name>
          , I. Markov,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wolska</surname>
          </string-name>
          , E. Zangerle, Overview of PAN 2021:
          <article-title>Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection</article-title>
          ,
          <source>in: 12th International Conference of the CLEF Association (CLEF</source>
          <year>2021</year>
          ), Springer,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E.</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the Style Change Detection Task at PAN 2021, in: CLEF 2021 Labs and Workshops, Notebook Papers, CEUR-WS</article-title>
          .org,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          , G. Specht,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the style change detection task at pan 2020</article-title>
          , CLEF,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hosseinia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <article-title>A parallel hierarchical attention network for style change detection: Notebook for pan at clef 2018</article-title>
          ., in: CLEF (Working Notes),
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Iyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vosoughi</surname>
          </string-name>
          ,
          <article-title>Style change detection using BERT</article-title>
          , in: CLEF,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Zuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <article-title>Style change detection with feed-forward neural networks</article-title>
          ,
          <source>in: CLEF (Working Notes)</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Castro-Castro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Rodríguez-Losada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Muñoz</surname>
          </string-name>
          ,
          <article-title>Mixed style feature representation and b0-maximal clustering for style change detection</article-title>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sarwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Urailertprasert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Vannaboot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rakthanmanon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Chuangsuwanich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nutanong</surname>
          </string-name>
          ,
          <article-title>Stylometric authorship attribution of multi-author documents using a co-authorship graph</article-title>
          ,
          <source>IEEE Access</source>
          <volume>8</volume>
          (
          <year>2020</year>
          )
          <fpage>18374</fpage>
          -
          <lpage>18393</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ramnial</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Panchoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pudaruth</surname>
          </string-name>
          ,
          <article-title>Authorship attribution using stylometry and machine learning techniques</article-title>
          ,
          <source>in: Intelligent Systems Technologies and Applications</source>
          , Springer,
          <year>2016</year>
          , pp.
          <fpage>113</fpage>
          -
          <lpage>125</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhargava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehndiratta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Asawa</surname>
          </string-name>
          ,
          <article-title>Stylometric analysis for authorship attribution on Twitter</article-title>
          ,
          <source>in: International Conference on Big Data Analytics</source>
          , Springer,
          <year>2013</year>
          , pp.
          <fpage>37</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D. I.</given-names>
            <surname>Holmes</surname>
          </string-name>
          ,
          <article-title>Authorship attribution</article-title>
          ,
          <source>Computers and the Humanities</source>
          <volume>28</volume>
          (
          <year>1994</year>
          )
          <fpage>87</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>T.</given-names>
            <surname>Neal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sundararajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fatima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Woodard</surname>
          </string-name>
          ,
          <article-title>Surveying stylometry techniques and applications</article-title>
          ,
          <source>ACM Computing Surveys (CSUR)</source>
          <volume>50</volume>
          (
          <year>2017</year>
          )
          <fpage>1</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gollub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>TIRA Integrated Research Architecture</article-title>
          , in:
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Peters</surname>
          </string-name>
          (Eds.),
          <source>Information Retrieval Evaluation in a Changing World, The Information Retrieval Series</source>
          , Springer, Berlin Heidelberg New York,
          <year>2019</year>
          . doi:10.1007/978-3-030-22948-1_5.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Muller</surname>
          </string-name>
          ,
          <article-title>Style change detection</article-title>
          ,
          <source>ETH Zurich</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Koppel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Argamon</surname>
          </string-name>
          ,
          <article-title>Computational methods in authorship attribution</article-title>
          ,
          <source>Journal of the American Society for Information Science and Technology</source>
          <volume>60</volume>
          (
          <year>2009</year>
          )
          <fpage>9</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Pennebaker</surname>
          </string-name>
          ,
          <article-title>The psychological functions of function words</article-title>
          ,
          <source>Social communication</source>
          <volume>1</volume>
          (
          <year>2007</year>
          )
          <fpage>343</fpage>
          -
          <lpage>359</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <article-title>Learning word vectors for 157 languages</article-title>
          ,
          <source>arXiv preprint arXiv:1802.06893</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kenter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Borisov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>De Rijke</surname>
          </string-name>
          ,
          <article-title>Siamese CBOW: Optimizing word embeddings for sentence representations</article-title>
          ,
          <source>arXiv preprint arXiv:1606.04640</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>F.</given-names>
            <surname>Chollet</surname>
          </string-name>
          , et al.,
          <source>Keras</source>
          , https://keras.io,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>M.</given-names>
            <surname>Abadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Brevdo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Citro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Devin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghemawat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Irving</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Isard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jozefowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kudlur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Levenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mané</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Monga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Murray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Olah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shlens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Steiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Talwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanhoucke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vasudevan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Viégas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vinyals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Warden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wattenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wicke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <source>TensorFlow: Large-scale machine learning on heterogeneous systems</source>
          ,
          <year>2015</year>
          . URL: https://www.tensorflow.org/, software available from tensorflow.org.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>I.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Courville</surname>
          </string-name>
          ,
          <source>Deep Learning</source>
          , MIT Press,
          <year>2016</year>
          . URL: http://www.deeplearningbook.org.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <article-title>Adam: A method for stochastic optimization</article-title>
          ,
          <year>2017</year>
          . arXiv:1412.6980.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Rush</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chopra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Weston</surname>
          </string-name>
          ,
          <article-title>A neural attention model for abstractive sentence summarization</article-title>
          ,
          <year>2015</year>
          . arXiv:1509.00685.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Kusner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Paige</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Hernández-Lobato</surname>
          </string-name>
          ,
          <article-title>Grammar variational autoencoder</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>1945</fpage>
          -
          <lpage>1954</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Y.-J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.-H.</given-names>
            <surname>Ling</surname>
          </string-name>
          ,
          <article-title>Learning latent representations for style control and transfer in end-to-end speech synthesis</article-title>
          ,
          <source>in: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>6945</fpage>
          -
          <lpage>6949</lpage>
          . doi:10.1109/ICASSP.2019.8683623.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ramani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Karmakar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tangri</surname>
          </string-name>
          ,
          <article-title>Autoencoder based architecture for fast &amp; real time audio style transfer</article-title>
          ,
          <year>2018</year>
          . arXiv:1812.07159.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>