<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the Authorship Verification Task at PAN 2022</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Efstathios Stamatatos</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mike Kestemont</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Krzysztof Kredens</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Piotr Pezik</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Annina Heini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Janek Bevendorff</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benno Stein</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Potthast</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Aston University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Bauhaus-Universität Weimar</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Leipzig University</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Antwerp</institution>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of the Aegean</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <fpage>5</fpage>
      <lpage>8</lpage>
      <abstract>
        <p>The authorship verification task at PAN 2022 follows the experimental setup of similar shared tasks in the recent past. However, it focuses on a different and very challenging scenario: given two texts belonging to different discourse types, the task is to determine whether they are written by the same author. Based on a new corpus in English, we provide pairs of texts using four discourse types: essays, emails, text messages, and business memos. The differences in communicative purpose, intended audience, and level of formality render the cross-discourse-type authorship verification task very hard. We received 7 submissions and evaluated them using the TIRA integrated research architecture, along with two baseline approaches. This paper reviews the submissions and presents a detailed discussion of the evaluation results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Author identification (or authorship attribution) aims to reveal information about the
individual(s) who wrote a text [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. There are several relevant tasks that emulate real-world
conditions, mainly closed-set authorship attribution (where there is a finite list of candidate
authors) and open-set authorship attribution (where there is a set of candidate authors but
this does not necessarily include the true author(s)) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The former scenario suits cases where
only a short list of persons could eventually be the authors of disputed texts while the latter
can be applied to cases where such lists of candidates are not available (or reliable enough). A
special case of open-set attribution is authorship verification where there is only one candidate
author [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Among author identification tasks, authorship verification plays a key role since any
given case can be decomposed into a series of authorship verification instances.
      </p>
      <p>
        In authorship verification, texts of known authorship by one author are presented to a
system, which is then tasked to verify whether another text has also been written by that same
author [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. In its simplest form, only one text of known authorship is given [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In that case,
for a pair of texts (typically one of known authorship and another of unknown authorship), we
are asked to determine whether they are written by the same author.
      </p>
      <p>
        During the last decade, an extensive list of authorship verification methods have been
proposed [
        <xref ref-type="bibr" rid="ref4 ref6 ref8 ref9">4, 6, 8, 9</xref>
        ]. In addition, several previous PAN editions included a relevant shared
task [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref14">10, 11, 12, 13, 14</xref>
        ]. The effectiveness of authorship verification approaches depends on
several factors. Naturally, text length is a crucial factor, and the effectiveness of systems
usually deteriorates when only short or very short texts are given. Another very challenging form of the
task considers cases where texts of known and unknown authorship belong to different domains.
In cross-domain authorship verification, texts of known and unknown authorship may differ
in topic (politics vs. sports), genre (review vs. essay), or even language (English vs. German).
In PAN 2015, both cross-topic and cross-genre authorship verification were considered, and
relatively low accuracy was obtained, especially for a cross-genre dataset
of essays and reviews in Dutch [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In the last two editions of PAN [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ], fanfiction texts
(i.e., non-professional fiction published online by fan authors) belonging to different fandoms
(i.e., fanfiction inspired by certain highly popular works) were used. A large training dataset of
more than 350,000 verification instances was compiled for this task, which enabled the application
of powerful deep learning models [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Perhaps surprisingly, the best results obtained were
rather high, suggesting that most fanfiction authors may retain their stylistic choices across
different fandoms, although other factors that may have artificially boosted the results could not
be ruled out.
      </p>
      <p>The current edition of PAN focuses on cross-discourse type authorship verification, where
texts of known and unknown authorship belong to different discourse types. In particular,
these discourse types have significant differences concerning communicative purpose, intended
audience, or level of formality. For example, the discourse types of argumentative essays and
text messages sent to family members have important stylistic differences imposed by the norms
of these discourse types. It is therefore very challenging to distinguish authorial characteristics that
remain intact across discourse types. In addition, discourse type strongly correlates with text
length (e.g., essays are much longer than text messages), so cross-discourse type authorship
verification can also be used to study the effect of text length on the effectiveness of authorship
verification approaches.</p>
      <p>In this paper, we first present the new datasets and the evaluation framework for the
cross-discourse type authorship verification shared task at PAN 2022. Next, we survey the
received submissions and evaluate their effectiveness in detail. Finally, we discuss the main
conclusions and possible directions for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The PAN Cross-Discourse Type Authorship Verification Corpus 2022</title>
      <p>A novel dataset was created from a subset of the recent Aston 100 Idiolects Corpus in English
(Kredens, Heini and Pezik 2021), including a rich set of discourse types authored by 112
individuals. We used the following discourse types of written language: emails, essays, text messages,
and business memos. All individuals represented in the corpus are of similar age (18–22) and
are native speakers of English. The topic of text samples is not restricted, while the level of
formality can vary within a certain discourse type (e.g., text messages may be addressed to
family members or other acquaintances). Table 1 gives an overview of the data and the parts of
it used for training and testing different aspects of cross-discourse type authorship verification.</p>
      <p>This corpus has been anonymized in that named entities such as mentions of locations,
person names, addresses, etc. were manually replaced with generic placeholder tags. This is
very useful for evaluating authorship verification methods in cross-discourse type scenarios
since the presence of author-specific and topic-specific information is reduced.</p>
      <p>In order to compile the required training and test datasets for the shared task at hand, the
corpus needed further preprocessing. First, we split the available individuals into two equal and
non-overlapping sets, one to be used for the training dataset and the other for the test dataset.
That way, it is ensured that any particularities of the training authors will not
affect the effectiveness on the test dataset. In addition, we took advantage of the demographic
metadata available and ensured a stable gender distribution of individuals in both the training
and test dataset. More specifically, the training and test datasets represent writings by 56 authors
each (10 male, 45 female and 1 of unidentified gender).</p>
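      <p>For illustration, such an author-level split with a stable gender distribution can be obtained with a standard stratified split. The snippet below is a minimal sketch using stand-in author identifiers and metadata labels that match the overall counts reported above; it is not the actual preprocessing code.</p>
      <preformat>
from sklearn.model_selection import train_test_split

# Stand-in author identifiers and gender metadata (20 male, 90 female,
# and 2 of unidentified gender in total, i.e., 10/45/1 per half).
author_ids = [f"author_{i:03d}" for i in range(112)]
genders = ["male"] * 20 + ["female"] * 90 + ["unknown"] * 2

# Split the 112 individuals into two equal, non-overlapping author sets
# while keeping the gender distribution stable in both halves.
train_authors, test_authors = train_test_split(
    author_ids, test_size=0.5, stratify=genders, random_state=0)
      </preformat>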
      <p>The dataset comprises a set of text pairs, and in each pair the two texts belong to two different
discourse types. All six combinations of the four available discourse types are taken into
account. However, the distribution of text pairs over the combinations of discourse types is
not homogeneous, since it depends on the texts available for each discourse type.
For example, the corpus comprises only one business memo and multiple email messages
per individual. Nevertheless, the distribution of verification instances per discourse type
combination is similar in both training and test datasets as can be seen in Table 1. Similarly,
both training and test datasets have a balanced distribution of positive/negative verification
cases. This also holds for each combination of discourse types (e.g., half of the pairs belonging
to the combination essay–email are positive and the other half are negative).</p>
      <p>Since the length of texts belonging to certain discourse types can be limited, we concatenated
multiple texts of the same discourse type to produce longer text samples. In more detail, email
messages were concatenated so that a text sample of at least 2,000 characters was obtained. The
date of email messages was taken into account so that consecutive messages were concatenated.
In the case of text messages, we concatenated messages sent either to friends or to family, so
that text samples of at least 500 characters were obtained. We inserted the special tag &lt;new&gt;
in the concatenated messages to indicate the original message boundaries. The text lengths in
Table 1 for email and text messages refer to text samples created in this manner.</p>
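      <p>The following is a minimal sketch of this concatenation step (the function and parameter names are illustrative; the actual preprocessing code of the corpus may differ):</p>
      <preformat>
def concatenate_messages(messages, min_chars, boundary_tag="&lt;new&gt;"):
    """Join date-ordered messages of one discourse type into samples of at
    least min_chars characters, marking the original message boundaries."""
    samples, current = [], []
    for text in messages:              # messages assumed sorted by date
        current.append(text)
        joined = f" {boundary_tag} ".join(current)
        if len(joined) >= min_chars:   # e.g., 2,000 for emails, 500 for text messages
            samples.append(joined)
            current = []
    return samples
      </preformat>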
    </sec>
    <sec id="sec-3">
      <title>3. Evaluating Cross-Discourse Type Authorship Verification</title>
      <p>In authorship verification, one has to approximate the target function v : (D, t) → {true, false},
where D is a set of texts of known authorship and t is a text of unknown or disputed
authorship. In the current edition of the task, we consider D to be a singleton. Thus, the task is
to approximate the target function v : (t1, t2) → {true, false} for a pair of texts. If v(t1, t2) = true,
then the author of t1 is also the author of t2 (positive instance), and if v(t1, t2) = false, then the
author of t1 is not the same as the author of t2 (negative instance). The main novelty of the
current edition is that t1 and t2 belong to different discourse types.</p>
      <p>
        The evaluation framework is similar to the one used in recent shared tasks at PAN [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. For
each authorship verification instance (a pair of texts) of the test dataset, participants have to
produce a scalar score in the [0, 1] range indicating the probability that the pair was written
by the same author. It is possible for participants to leave text pairs unanswered by submitting
a score of exactly 0.5.
      </p>
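      <p>Conceptually, each submitted verifier thus implements a function that maps a pair of texts to a score in the [0, 1] range, with the value 0.5 reserved for non-answers. The toy sketch below only illustrates this interface and convention; it is not one of the actual systems.</p>
      <preformat>
def verify(text1: str, text2: str, radius: float = 0.02) -> float:
    """Toy verifier: return a same-author probability in [0, 1];
    a value of exactly 0.5 encodes a deliberately unanswered case."""
    # Stand-in for a real model: Jaccard overlap of the word sets.
    w1, w2 = set(text1.lower().split()), set(text2.lower().split())
    score = len(w1 &amp; w2) / max(len(w1 | w2), 1)
    # Scores too close to 0.5 are mapped to exactly 0.5 (non-answer).
    return score if abs(score - 0.5) >= radius else 0.5
      </preformat>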
      <sec id="sec-3-1">
        <title>3.1. Evaluation Measures</title>
        <p>
          Similar to recent editions of the authorship verification task [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], we adopt a diverse set
of effectiveness measures to highlight different aspects of the capabilities of an authorship
verification model. We reused the four measures from the 2020 edition, but also included the
Brier score [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] as an additional fifth measure (following discussions with participants and the
audience at the 2020 workshop). In total, the following effectiveness measures were used:
• AUROC: the area under the ROC curve,
• c@1: a variant of the conventional accuracy measure, which rewards systems that leave
difficult problems unanswered [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ],
• F1: the well-known F1 effectiveness measure (not taking into account non-answers),
• F0.5: a newly proposed F0.5-based measure that emphasizes correctly answered
same-author cases and rewards non-answers [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ],
• Brier: the complement of the Brier loss function [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] focusing on the accuracy of
probabilistic predictions (as implemented in sklearn [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]). This measure rewards verifiers that
make “bold” but correct predictions (i.e., scores close to 0.0 or 1.0) and indirectly penalizes
less confident ones, including non-answers (scores of 0.5). In line with the other measures,
we take its complement so that higher scores correspond to better effectiveness.
• The average of the above measures is used as the final score to rank the submitted systems.
We also report runtime on TIRA to give an indication of relative efficiency. A minimal sketch of how most of these measures can be computed is given below.
        </p>
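        <p>As a rough illustration (not the official evaluator, whose implementation was released to the participants), most of these measures can be computed as follows; the F0.5-based measure is omitted here for brevity:</p>
        <preformat>
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score, brier_score_loss

def c_at_1(y_true, scores):
    """c@1 [17]: accuracy variant that rewards unanswered cases (score == 0.5)."""
    y_true, scores = np.asarray(y_true), np.asarray(scores, dtype=float)
    n = len(y_true)
    answered = scores != 0.5
    correct = np.sum((scores[answered] > 0.5).astype(int) == y_true[answered])
    unanswered = n - answered.sum()
    return (correct + unanswered * correct / n) / n

def evaluate(y_true, scores):
    """Compute AUROC, c@1, F1 (on answered cases), and the Brier complement."""
    y_true, scores = np.asarray(y_true), np.asarray(scores, dtype=float)
    answered = scores != 0.5
    preds = (scores[answered] > 0.5).astype(int)
    return {
        "AUROC": roc_auc_score(y_true, scores),
        "c@1": c_at_1(y_true, scores),
        "F1": f1_score(y_true[answered], preds),
        "Brier": 1.0 - brier_score_loss(y_true, scores),  # complement: higher is better
    }
        </preformat>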
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Baselines</title>
        <p>
          In order to facilitate the comparison of the submitted methods with established approaches from
the literature in the field, we provide two baseline methods that are based on character n-grams
or character sequences. The source code of the following two methods was made available to
the participants at the start of the campaign (together with an official implementation of the
evaluation measures):
• Compression-based model. Given a pair of texts t1 and t2, the cross-entropy of t2 based
on the Prediction by Partial Matching (PPM) model of t1 is computed, and vice versa [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
Then, a logistic regression classifier is trained using the mean and the absolute difference of
the two cross-entropies. In addition, using a small radius, verification scores around 0.5
are set to exactly 0.5.
• Distance-based character n-gram model [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. The most frequent character 4-grams
are extracted from the training texts and used to represent each text. Then, given a pair
of texts, the cosine similarity between them is calculated. During training, two threshold
values θ1 and θ2 are optimized to scale the verification scores. All verification scores
lower than θ1 correspond to negative answers, all scores greater than θ2 are scaled to
positive answers, and the remaining scores are set to 0.5, implying that these are hard
instances that are deliberately left unanswered (a simplified sketch is given below).
        </p>
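        <p>To make the second baseline concrete, the snippet below gives a simplified sketch of a distance-based character 4-gram verifier; the released baseline differs in details such as feature weighting, threshold optimization, and the rescaling of scores between the two thresholds:</p>
        <preformat>
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

class CngDistSketch:
    """Simplified character 4-gram / cosine-similarity verifier."""

    def __init__(self, n_features=3000, theta1=0.35, theta2=0.65):
        # theta1/theta2 are placeholder values; in the baseline they are
        # optimized on the training data.
        self.vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(4, 4),
                                          max_features=n_features)
        self.theta1, self.theta2 = theta1, theta2

    def fit(self, training_texts):
        self.vectorizer.fit(training_texts)
        return self

    def verify(self, text1, text2):
        vecs = self.vectorizer.transform([text1, text2])
        sim = cosine_similarity(vecs[0], vecs[1])[0, 0]
        if sim > self.theta2:
            return 1.0    # confident positive answer
        if sim >= self.theta1:
            return 0.5    # hard case, left unanswered
        return 0.0        # confident negative answer
        </preformat>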
        <p>The baselines are not tailored to particular discourse types, e.g., by tuning hyperparameters.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Survey of Submissions</title>
      <p>
        We received seven submissions and evaluated their effectiveness and efficiency using the
TIRA integrated research architecture [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. All participants also submitted a notebook paper
describing their approach. The main characteristics of each approach are provided in Table 2.
      </p>
      <p>
        Most participants followed the recent trend in natural language processing and used
pre-trained language models like BERT, T5, or MPNet to obtain text embeddings. Konstantinou
et al. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] report that several such models were compared and the most effective one was selected.
Approaches not using pre-trained language models exploit graph-based text representations [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ],
spectral analysis [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], or representations based on traditional feature engineering including
features like frequencies of part-of-speech (POS) tags and word unigrams (najafi22).
      </p>
      <p>
        Regarding the classification model, most participants rely on fully-connected layers that
combine the information from the text representation step. It is also reported that several
traditional machine learning algorithms, such as support vector machines and random forests,
were examined, but their effectiveness was found to be comparatively low [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. Other deep
learning methods used are convolutional and Siamese neural networks. Since the use of deep
learning technology usually requires a considerable amount of training data and some extra validation
data, some participants attempted to augment the provided dataset by generating new authorship
verification instances with the help of the available metadata.
      </p>
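      <p>As a generic illustration of this common setup (pre-trained embeddings combined by fully-connected layers), the sketch below shows one way such a pair classifier can be wired up; it does not reproduce any particular participant’s architecture:</p>
      <preformat>
import torch
import torch.nn as nn

class PairClassifier(nn.Module):
    """MLP over two text embeddings (e.g., pooled BERT/MPNet outputs)."""

    def __init__(self, emb_dim=768, hidden=256):
        super().__init__()
        # Combine the pair as [e1; e2; |e1 - e2|; e1 * e2] before the MLP.
        self.mlp = nn.Sequential(
            nn.Linear(4 * emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, e1, e2):
        feats = torch.cat([e1, e2, torch.abs(e1 - e2), e1 * e2], dim=-1)
        return torch.sigmoid(self.mlp(feats)).squeeze(-1)  # same-author probability
      </preformat>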
      <p>Surprisingly, no participant studied discourse-type-specific approaches for the given
combinations despite their substantial differences.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Evaluation Results</title>
      <p>This section presents an in-depth analysis of the effectiveness and efficiency of the submitted
approaches: overall, by discourse type, with respect to bias and runtime, and
in comparison to the previous year’s participants.</p>
      <sec id="sec-5-1">
        <title>5.1. Overall results</title>
        <p>Table 3 shows the overall results of all participants. In general, the effectiveness of all submissions
is quite low, reflecting the difficulty of the task. The approaches of najafi22, galicia22,
and jinli22 clearly outperform the rest of the submissions. It is also surprising that a naive
baseline achieved the best overall score, despite the fact that most participant models are quite
sophisticated. On the other hand, the most effective method submitted (najafi22) outperforms
all other submissions and baselines in three out of five evaluation measures. Its main weakness
seems to be the low Brier score, which means that its probabilistic predictions are in need of
improvement (even if its binary class assignments are relatively strong).</p>
        <p>[Table 4: results per discourse-type combination, including panels such as (d) Business memo–Text message and (f) Essay–Business memo; columns: AUROC, F1, F0.5, Brier, Overall.]</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Results by discourse type</title>
        <p>Table 4 breaks down the results by combination of discourse types. Recall
that each discourse type comes with different average text lengths (see Table 1). For instance,
essays are much longer than the rest of the examined discourse types. As Tables 4b, c, and f
show, when essays are part of a pairing, the submission of galicia22 is the most effective
system in terms of overall effectiveness. Where essays are excluded (Tables 4a, e, and d), their
approach is outperformed by that of najafi22. On the shortest discourse types (business
memos and text messages; Table 4d), the submission of jinli22 seems to be the most effective.
This pairing of discourse types also has the lowest overall effectiveness, indicating that text
length (in addition to cross-discourse verification) remains a crucial factor in authorship verification.
The baseline-cngdist22 is relatively stable across combinations of discourse types, while
baseline-compressor22 achieves its optimal results when the longest discourse types (essays
in particular) are involved.</p>
      </sec>
      <sec id="sec-5-2-1">
        <title>5.3. Bias</title>
        <p>[Table: positive, negative, and unanswered cases per submission.]</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.4. Efficiency</title>
        <p>
          Beyond effectiveness, another criterion for evaluating an authorship verification system is
its efficiency, i.e., its runtime cost. Depending on the application, this is a significant
criterion, especially when large volumes of text have to be
analyzed. Table 5b shows the elapsed runtime of each submitted method on TIRA. As can
be seen, the approaches that avoid the use of pre-trained language models [
          <xref ref-type="bibr" rid="ref24 ref25">25, 24</xref>
          ] achieve the
lowest runtime by a large margin. The highest runtime is required by the approach of huang22
that splits texts into segments and examines all combinations of segments.
        </p>
      </sec>
      <sec id="sec-5-4">
        <title>5.5. A Transfer-learning Experiment</title>
        <p>
          We applied the top-performing approaches from the previous 2021 edition of PAN [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] to
the current test dataset. Thanks to software submissions at TIRA, this can be accomplished
with relative ease. This amounts to a transfer-learning experiment, since the three models are
trained and fine-tuned on a cross-fandom authorship verification dataset but now tested on our
cross-discourse type dataset. The following methods have been employed:
• boenninghoff21 [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]: A deep learning-based approach including neural feature
extraction and deep metric learning, deep Bayes factor scoring, uncertainty modeling and
adaptation, a combined loss function, and an additional out-of-distribution detector for
non-responses. In its final step, the model was extended to a majority-voting ensemble.
• embarcaderoruiz21 [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ]: Its main idea is similar to that of galicia22. A graph-based
representation approach is combined with a Siamese network.
• weerasinghe21 [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ]: A variety of stylometric features, including character and POS
n-grams, function words, and vocabulary richness measures, and a logistic regression
classifier fed with the absolute differences of these features for each text pair.
        </p>
        <p>We made no attempt to modify these methods before applying them to the new cross-discourse
type test dataset.</p>
        <p>
          The effectiveness of the above-mentioned methods on the PAN 2021 test data was exceptional:
all of them obtained an overall score (over the same five evaluation measures used in this paper)
of greater than 0.93 [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]. Table 6a shows the effectiveness of the 2021 models on the 2022 test
data. Unsurprisingly, the three models perform much worse. Their overall effectiveness on the
cross-discourse type dataset is very low, much lower than all but one of the seven submissions
and the two baselines shown in Table 3. This suggests that fine-tuning such models to particular
datasets hurts their generalizability. Moreover, cross-fandom verification and cross-discourse
type verification have different characteristics, as reflected in the two available datasets.
        </p>
        <p>Table 6b shows the number of positive and negative answers as well as non-answers for each
of the three 2021 models, which reveals a clear bias of the models towards negative answers. Note
that in the 2021 cross-fandom dataset, all texts have similar length. Likely, this factor, along
with other substantial differences between fanfiction and the discourse types considered in the
cross-discourse type dataset, confuses these models (or at least indicates that they need appropriate
fine-tuning to improve the scaling of the produced verification scores). Note that the AUROC scores
(which do not depend on the scaling of verification scores) are also quite low.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>
        Previous shared tasks on authorship attribution at PAN have played a crucial role in advancing research
in the field of authorship analysis; modern methods have used the PAN datasets extensively for
evaluation purposes and have incrementally improved the state of the art [
        <xref ref-type="bibr" rid="ref6 ref8">6, 8</xref>
        ].
Recent editions of PAN focused on fanfiction. The very good results obtained by the
top-performing submissions there may have given the false impression that authorship verification
is an almost solved problem [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ]. This is in fact not the case, as our experiment shows.
      </p>
      <p>This year, we focused on a very challenging version of the authorship verification task where
text pairs of different discourse types are used. When texts differ in communicative purpose,
intended audience, or level of formality, it is very challenging to identify stable characteristics
associated with authors across these discourse types. The effectiveness of all submissions on the
cross-discourse type dataset was comparatively low, some as low as a random-guess baseline.</p>
      <p>It is also surprising that all submissions, despite their increased level of sophistication in most
of the cases, were outperformed by a naive baseline based on character n-grams and cosine
similarity (at least according to the overall effectiveness across all five evaluation measures).
This suggests that traditional methods based on well-known stylometric features could still
be more effective than deep learning approaches using modern pre-trained language models
for this challenging task. Another factor is the volume of data available for training (roughly
12,000 instances), which can be considered too little for deep learning-based approaches.</p>
      <p>Another crucial issue is text length. It seems that when the relatively long essays were used
as inputs, the graph-based approach of galicia22 was more effective. When shorter texts from
discourse types like emails, business memos, and text messages were used, the pre-trained
language-model-based approaches of najafi22 and jinli22 were more effective.</p>
      <p>The overall low effectiveness achieved shows that there is a lot of room for improvement in
cross-discourse type authorship verification. All submitted approaches adopted a unified model
that predicts authorship disregarding combinations of discourse types. Having separate models
for each combination of discourse types is an obvious next step. This would mean, however, that
the training data would also have to be split into smaller parts based on the combinations of discourse
types. An ensemble method combining traditional stylometric models and pre-trained language
models appears to be a promising approach in this regard.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <article-title>A survey of modern authorship attribution methods</article-title>
          ,
          <source>JASIST</source>
          <volume>60</volume>
          (
          <year>2009</year>
          )
          <fpage>538</fpage>
          -
          <lpage>556</lpage>
          . URL: https://doi.org/10.1002/asi.21001. doi:
          <volume>10</volume>
          .1002/asi.21001.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Koppel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Argamon</surname>
          </string-name>
          ,
          <article-title>Computational methods in authorship attribution</article-title>
          ,
          <source>Journal of the American Society for Information Science and Technology</source>
          <volume>60</volume>
          (
          <year>2009</year>
          )
          <fpage>9</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Koppel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Argamon</surname>
          </string-name>
          ,
          <article-title>Authorship attribution in the wild</article-title>
          ,
          <source>Language Resources and Evaluation</source>
          <volume>45</volume>
          (
          <year>2011</year>
          )
          <fpage>83</fpage>
          -
          <lpage>94</lpage>
          . doi:
          <volume>10</volume>
          .1007/s10579-009-9111-2.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <article-title>Authorship verification: A review of recent advances</article-title>
          ,
          <source>Research in Computing Science</source>
          <volume>123</volume>
          (
          <year>2016</year>
          )
          <fpage>9</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Halvani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Graner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Regev</surname>
          </string-name>
          ,
          <article-title>Taveer: an interpretable topic-agnostic authorship verification method</article-title>
          , in: M.
          <string-name>
            <surname>Volkamer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          Wressnegger (Eds.),
          <source>ARES 2020: The 15th International Conference on Availability, Reliability and Security</source>
          , ACM,
          <year>2020</year>
          , pp.
          <volume>41</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>41</lpage>
          :
          <fpage>10</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N.</given-names>
            <surname>Potha</surname>
          </string-name>
          , E. Stamatatos,
          <article-title>Improving author verification based on topic modeling</article-title>
          ,
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>70</volume>
          (
          <year>2019</year>
          )
          <fpage>1074</fpage>
          -
          <lpage>1088</lpage>
          . doi:https://doi.org/10.1002/asi.24183.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Koppel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Winter</surname>
          </string-name>
          ,
          <article-title>Determining if two documents are written by the same author</article-title>
          ,
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>65</volume>
          (
          <year>2014</year>
          )
          <fpage>178</fpage>
          -
          <lpage>187</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Fung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Iqbal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Cheung</surname>
          </string-name>
          ,
          <article-title>Learning stylometric representations for authorship analysis</article-title>
          ,
          <source>IEEE Transactions on Cybernetics</source>
          <volume>49</volume>
          (
          <year>2019</year>
          )
          <fpage>107</fpage>
          -
          <lpage>121</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Boenninghoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Nickel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zeiler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kolossa</surname>
          </string-name>
          ,
          <article-title>Similarity learning for authorship verification in social media</article-title>
          ,
          <source>in: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>2457</fpage>
          -
          <lpage>2461</lpage>
          . doi:
          <volume>10</volume>
          .1109/ICASSP.
          <year>2019</year>
          .
          <volume>8683405</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gollub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Beyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Busse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M. R.</given-names>
            <surname>Pardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Recent trends in digital text forensics and its evaluation - plagiarism detection, author identification, and author profiling</article-title>
          , in: P. Forner,
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Paredes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          Stein (Eds.),
          <source>Information Access Evaluation</source>
          . Multilinguality, Multimodality, and Visualization - 4th
          <source>International Conference of the CLEF Initiative, CLEF</source>
          <year>2013</year>
          , Valencia, Spain,
          <source>September 23-26</source>
          ,
          <year>2013</year>
          . Proceedings, volume
          <volume>8138</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2013</year>
          , pp.
          <fpage>282</fpage>
          -
          <lpage>302</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gollub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M. R.</given-names>
            <surname>Pardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Improving the reproducibility of pan's shared tasks: - plagiarism detection, author identification, and author profiling</article-title>
          , in: E. Kanoulas,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lupu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. D.</given-names>
            <surname>Clough</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanderson</surname>
          </string-name>
          ,
          <string-name>
            <surname>M. M. Hall</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Hanbury</surname>
          </string-name>
          , E. G. Toms (Eds.),
          <source>Information Access Evaluation</source>
          . Multilinguality, Multimodality, and Interaction - 5th
          <source>International Conference of the CLEF Initiative, CLEF</source>
          <year>2014</year>
          ,
          <article-title>Sheffield</article-title>
          , UK,
          <source>September 15-18</source>
          ,
          <year>2014</year>
          . Proceedings, volume
          <volume>8685</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2014</year>
          , pp.
          <fpage>268</fpage>
          -
          <lpage>299</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M. R.</given-names>
            <surname>Pardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the PAN/CLEF 2015 evaluation lab</article-title>
          , in: J.
          <string-name>
            <surname>Mothe</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Savoy</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Kamps</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Pinel-Sauvagnat</surname>
            ,
            <given-names>G. J. F.</given-names>
          </string-name>
          <string-name>
            <surname>Jones</surname>
          </string-name>
          , E. SanJuan, L. Cappellato, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality</source>
          , Multimodality, and Interaction - 6th
          <source>International Conference of the CLEF Association, CLEF</source>
          <year>2015</year>
          , Toulouse, France, September 8-
          <issue>11</issue>
          ,
          <year>2015</year>
          , Proceedings, volume
          <volume>9283</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2015</year>
          , pp.
          <fpage>518</fpage>
          -
          <lpage>538</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ghanem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Giachanou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Manjavacas</surname>
          </string-name>
          , I. Markov,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M. R.</given-names>
            <surname>Pardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Specht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          , E. Zangerle, Overview of PAN 2020:
          <article-title>Authorship verification, celebrity profiling, profiling fake news spreaders on twitter, and style change detection</article-title>
          , in: A.
          <string-name>
            <surname>Arampatzis</surname>
            , E. Kanoulas,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Tsikrika</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Vrochidis</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Joho</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Lioma</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Eickhoff</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Névéol</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Cappellato</surname>
          </string-name>
          , N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality</source>
          , Multimodality, and Interaction - 11th
          <source>International Conference of the CLEF Association, CLEF</source>
          <year>2020</year>
          , Thessaloniki, Greece,
          <source>September 22-25</source>
          ,
          <year>2020</year>
          , Proceedings, volume
          <volume>12260</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2020</year>
          , pp.
          <fpage>372</fpage>
          -
          <lpage>383</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L. D. la Peña</given-names>
            <surname>Sarracén</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Manjavacas</surname>
          </string-name>
          , I. Markov,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wolska</surname>
          </string-name>
          , E. Zangerle, Overview of PAN 2021:
          <article-title>Authorship verification, profiling hate speech spreaders on twitter, and style change detection</article-title>
          , in: K. S. Candan,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ionescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Larsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maistro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piroi</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality</source>
          , Multimodality, and Interaction - 12th
          <source>International Conference of the CLEF Association, CLEF</source>
          <year>2021</year>
          ,
          <string-name>
            <given-names>Virtual</given-names>
            <surname>Event</surname>
          </string-name>
          ,
          <source>September 21-24</source>
          ,
          <year>2021</year>
          , Proceedings, volume
          <volume>12880</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2021</year>
          , pp.
          <fpage>419</fpage>
          -
          <lpage>431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bischoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Deckers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schliebs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <article-title>The Importance of Suppressing Domain Style in Authorship Analysis</article-title>
          , CoRR abs/
          <year>2005</year>
          .14714 (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/
          <year>2005</year>
          .14714.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>G. W.</given-names>
            <surname>Brier</surname>
          </string-name>
          , et al.,
          <article-title>Verification of forecasts expressed in terms of probability</article-title>
          ,
          <source>Monthly weather review 78</source>
          (
          <year>1950</year>
          )
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Peñas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rodrigo</surname>
          </string-name>
          ,
          <article-title>A simple measure to assess non-response, in: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1</article-title>
          , HLT '11,
          <string-name>
            <surname>Association</surname>
          </string-name>
          for Computational Linguistics, USA,
          <year>2011</year>
          , p.
          <fpage>1415</fpage>
          -
          <lpage>1424</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <article-title>Generalizing unmasking for short texts</article-title>
          , in: J.
          <string-name>
            <surname>Burstein</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Doran</surname>
          </string-name>
          , T. Solorio (Eds.),
          <source>Proceedings of the</source>
          <year>2019</year>
          <article-title>Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis</article-title>
          , MN, USA, June 2-7,
          <year>2019</year>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <source>Association for Computational Linguistics</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>654</fpage>
          -
          <lpage>659</lpage>
          . URL: https://doi.org/10.18653/v1/n19-
          <fpage>1068</fpage>
          . doi:
          <volume>10</volume>
          .18653/v1/n19-
          <fpage>1068</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          , E. Duchesnay,
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>W. J.</given-names>
            <surname>Teahan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Harper</surname>
          </string-name>
          ,
          <source>Using Compression-Based Language Models for Text Categorization</source>
          , Springer Netherlands, Dordrecht,
          <year>2003</year>
          , pp.
          <fpage>141</fpage>
          -
          <lpage>165</lpage>
          . URL: https://doi.org/10.1007/978-94-017-0171-6_7. doi:10.1007/978-94-017-0171-6_7.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stover</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Koppel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Karsdorp</surname>
          </string-name>
          , W. Daelemans,
          <article-title>Authenticating the writings of Julius Caesar</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>63</volume>
          (
          <year>2016</year>
          )
          <fpage>86</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gollub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>TIRA integrated research architecture</article-title>
          , in:
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Peters</surname>
          </string-name>
          (Eds.),
          <source>Information Retrieval Evaluation in a Changing World - Lessons Learned from 20 Years of CLEF</source>
          , volume
          <volume>41</volume>
          of
          <source>The Information Retrieval Series</source>
          , Springer,
          <year>2019</year>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>160</lpage>
          . URL: https://doi.org/10.1007/978-3-030-22948-1_5. doi:10.1007/978-3-030-22948-1_5.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Konstantinou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zinonos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Different Encoding Approaches for Authorship Verification</article-title>
          , in:
          <source>CLEF 2022 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Martinez-Galicia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Embarcadero-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ríos-Orduña</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gómez-Adorno</surname>
          </string-name>
          ,
          <article-title>Graph-Based Siamese Network for Authorship Verification</article-title>
          , in:
          <source>CLEF 2022 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M.</given-names>
            <surname>Crespo-Sanchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gómez-Adorno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Lopez-Arevalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Aldana-Bobadilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Salas-Jimenez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cortes-Lopez</surname>
          </string-name>
          ,
          <article-title>A Content Spectral-based Analysis for Authorship Verification</article-title>
          , in:
          <source>CLEF 2022 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>M.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          , Z. Han,
          <article-title>Authorship Verification Based on Fully Interacted Text Segments</article-title>
          , in:
          <source>CLEF 2022 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Qi</surname>
          </string-name>
          , H. Y.,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Application of BERT in Author Verification Task</article-title>
          , in:
          <source>CLEF 2022 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>M.</given-names>
            <surname>Najafi</surname>
          </string-name>
          , E. Tavan,
          <article-title>Text-to-Text Transformer in Authorship Verification Via Stylistic and Semantical Analysis</article-title>
          , in:
          <source>CLEF 2022 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ye</surname>
          </string-name>
          , H. Y.,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kong</surname>
          </string-name>
          , Z. Han,
          <article-title>Authorship Verification Using Convolutional Neural Network</article-title>
          , in:
          <source>CLEF 2022 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Manjavacas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the Authorship Verification Task at PAN 2021</article-title>
          , in:
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maistro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piroi</surname>
          </string-name>
          (Eds.),
          <source>CLEF 2021 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>B.</given-names>
            <surname>Boenninghoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Nickel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kolossa</surname>
          </string-name>
          ,
          <article-title>O2D2: Out-of-distribution detector to capture undecidable trials in authorship verification</article-title>
          , in:
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maistro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piroi</surname>
          </string-name>
          (Eds.),
          <source>CLEF 2021 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>D.</given-names>
            <surname>Embarcadero-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gómez-Adorno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Reyes-Hernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Embarcadero-Ruiz</surname>
          </string-name>
          ,
          <article-title>Graph-based Siamese network for authorship verification</article-title>
          , in:
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maistro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piroi</surname>
          </string-name>
          (Eds.),
          <source>CLEF 2021 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>J.</given-names>
            <surname>Weerasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Greenstadt</surname>
          </string-name>
          ,
          <article-title>Feature vector difference based authorship verification for open-world settings</article-title>
          , in:
          <source>CLEF 2021 Labs and Workshops, Notebook Papers</source>
          , CEUR-WS.org,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>