<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the Author Identification Task at PAN 2015</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Efstathios Stamatatos</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Walter Daelemans</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ben Verhoeven</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Patrick Juola</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aurelio López-López</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Potthast</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benno Stein</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bauhaus-Universität Weimar</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Duquesne University</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>INAOE</institution>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Antwerp</institution>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of the Aegean</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
<p>This paper presents an overview of the author identification task at the PAN-2015 evaluation lab. Similar to previous editions of PAN, this shared task focuses on the problem of author verification: given a set of documents by the same author and another document of unknown authorship, the task is to determine whether or not the known and unknown documents have the same author. However, in contrast to the setup of PAN-2013 and PAN-2014, as well as most previous work in this area, it is no longer assumed that all documents match in genre and topic. In other words, we study cross-topic and cross-genre author verification, a challenging, yet realistic, task. A new evaluation corpus was built, covering the four languages Dutch, English, Greek, and Spanish and comprising a variety of genres and topics. A total of 18 teams participated in this task. Following the practice of previous PAN editions, software submissions were required and were evaluated within the evaluation-as-a-service platform TIRA. Based on TIRA, we were able to define challenging baseline models using submissions from the corresponding shared tasks at PAN-2013 and PAN-2014. Detailed evaluation results are given, including statistical significance tests. Moreover, we examine the performance of a heterogeneous ensemble that combines all participant models, and we present a comprehensive review of the submitted methods.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>
        The main idea behind author identification relies on the assumption that it is possible to
reveal the author of a text given (i) a set of candidate authors and (ii) a set of undisputed
text samples for each one of them [
        <xref ref-type="bibr" rid="ref16 ref44">16, 44</xref>
        ]. Writing style is the most crucial
information source to solve this task, and it is essential to be able to quantify stylistic choices
in texts and measure stylistic similarity between texts. Beyond its traditional literary
applications (e.g., verifying the authorship of disputed novels, identifying the author of
works published anonymously, etc.) [
        <xref ref-type="bibr" rid="ref17 ref48">17, 48</xref>
        ] author identification is associated with
important forensic applications (e.g. revealing the author of harassing messages in social
media, linking terrorist proclamations by their author, etc.) [
        <xref ref-type="bibr" rid="ref1 ref25">1, 25</xref>
        ].
      </p>
      <p>
        Author identification can be formulated in various ways, depending on the
number of candidate authors and whether the set of candidate authors is closed or open.
One particular variation is the task of authorship verification, where there is exactly
one candidate author with undisputed text samples, and the task is to decide whether an
unknown text was written by that author or not [
        <xref ref-type="bibr" rid="ref12 ref23 ref27">12, 23, 27</xref>
        ]. In more detail, the authorship
verification task corresponds to a one-class classification problem, where the samples of
known authorship by the author in question form the target class. All texts written by
other authors are viewed as the outlier class, a huge and heterogeneous class, which
renders finding representative samples difficult. However challenging, authorship
verification is a fundamental problem since any given author identification problem can
be decomposed into a set of authorship verification problems. Therefore, it provides an
excellent research field to examine competitive approaches aiming at the extraction of
reliable and general conclusions [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ].
      </p>
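      <p>
        To make the decomposition argument concrete, the sketch below reduces a closed-set
attribution problem to one verification problem per candidate author and attributes the
unknown text to the highest-scoring candidate. The verify function and the toy
vocabulary-overlap scorer are illustrative stand-ins, not any participant's method.
      </p>
```python
from typing import Callable, Dict, List

def attribute(candidates: Dict[str, List[str]],
              unknown: str,
              verify: Callable[[List[str], str], float]) -> str:
    # One verification problem per candidate author; the unknown
    # text is attributed to the candidate with the highest score.
    scores = {author: verify(known, unknown)
              for author, known in candidates.items()}
    return max(scores, key=scores.get)

def overlap_verify(known: List[str], unknown: str) -> float:
    # Toy verifier: fraction of the unknown text's word types that
    # also occur in the known texts (illustration only).
    known_vocab = set(" ".join(known).lower().split())
    unknown_vocab = set(unknown.lower().split())
    return len(known_vocab & unknown_vocab) / max(len(unknown_vocab), 1)
```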
      <p>
        Previous PAN editions focused on the authorship verification task; a number of
evaluation corpora covering several natural languages and genres have been
created [
        <xref ref-type="bibr" rid="ref18 ref46">18, 46</xref>
        ]. Moreover, a suitable evaluation framework was developed, highlighting
the ability of methods to leave problems unanswered when there is high uncertainty,
as well as to assign probability scores to their answers. However, the previous editions
of PAN, as well as most work in the literature, assume that all texts within a
verification case match in both genre and topic. This assumption simplifies the problem:
beyond the personal style of each author, style is also affected by genre. Moreover,
low-frequency stylistic features are heavily affected by topic nuances. Thus, when all
documents match in genre and topic, the personal style of the authors becomes the
major discriminating factor between texts.
      </p>
      <p>
        PAN-2015 still focuses on authorship verification, but it is no longer assumed that
all texts within a verification problem match for genre and topic. This cross-genre and
cross-topic variation of the verification task corresponds to a more realistic view of the
problem at hand, since, in many applications, it is not possible to obtain text samples
of undisputed authorship by certain authors in specific genres and topics. For instance,
verifying the authorship of a suicide note, it does not make sense to look for samples
of suicide notes by the suspects [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In addition, the author of an anonymously published
crime fiction novel might be a famous child fiction author who has never published a
crime fiction novel before [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        A new cross-genre and cross-topic corpus was built, covering four languages and
a variety of genres and topics. We received 18 software submissions that were
evaluated on the TIRA experimentation platform [
        <xref ref-type="bibr" rid="ref37 ref9">9, 37</xref>
        ]. Following the practice of previous
PAN editions, we also examine the performance of baseline models, based on
submissions to the corresponding tasks at PAN-2013 and PAN-2014, and train a heterogeneous
ensemble classifier that fuses the outputs of all submitted methods as if they were one.
      </p>
      <p>The remainder of this paper is organized as follows: the next section describes
related work in cross-genre and cross-topic author identification. Section 3 presents the
evaluation framework of our shared task on author identification at PAN-2015, and
Section 4 describes the new evaluation corpus. Section 5 reports on evaluation results
obtained, including tests of statistical significance. Then, Section 6 presents a review of
the submitted methods, and Section 7 summarizes the main conclusions and discusses
directions for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        A review of related work on authorship verification, including the results of previous
editions of PAN with respect to this task, is given in [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ]. Most of the related work
on authorship verification—and author identification in general—concerns only cases
where the examined documents match for genre and topic [
        <xref ref-type="bibr" rid="ref12 ref24 ref27 ref47">12, 24, 27, 47</xref>
        ]. A notable
exception has been reported by Koppel et al. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], who apply unmasking to authorship
verification problems where multiple topics were covered by each author, producing
very reliable results. Kestemont et al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] use the same method in a cross-genre
authorship verification experiment on a corpus of prose and theatrical works by a number
of authors, demonstrating that unmasking (with default settings) is ineffective in such
difficult cases.
      </p>
      <p>
A study focusing on cross-genre and cross-topic authorship attribution,
where a closed set of candidate authors is used (a simpler setting than
authorship verification), is presented in [
        <xref ref-type="bibr" rid="ref45">45</xref>
        ]: a corpus of opinion articles covering multiple
topics and book reviews, all published in a UK newspaper, was used, and experimental
results revealed that character n-gram features are more robust than word
features in cross-topic and cross-genre conditions. More recently, Sapkota et al. [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ] show
that character n-grams corresponding to word affixes, including punctuation marks, are
the most significant features in cross-topic authorship attribution. In addition, Sapkota
et al. [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ] demonstrate that using training texts from multiple topics instead of a single
topic can significantly help to correctly recognize the author of texts on another topic.
      </p>
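      <p>
        As an illustration of the affix-restricted character n-grams discussed above, the
following sketch counts word-initial and word-final character n-grams. It is a
simplification: the typology of Sapkota et al. [39] distinguishes more categories
(including punctuation-related ones), and n = 3 is an assumed default.
      </p>
```python
from collections import Counter

def affix_ngrams(text: str, n: int = 3) -> Counter:
    # Character n-grams restricted to word prefixes and suffixes,
    # an approximation of affix n-gram features; the exact feature
    # typology in the cited work is richer.
    counts = Counter()
    for token in text.split():
        if len(token) > n:  # only tokens longer than n have proper affixes
            counts[("prefix", token[:n])] += 1
            counts[("suffix", token[-n:])] += 1
    return counts
```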
    </sec>
    <sec id="sec-3">
      <title>Evaluation Setup</title>
      <p>The evaluation setup for this task is essentially identical to the one used at PAN-2014.
Given a set of documents known to be written by the same author, and exactly one
document of unknown authorship, the task is to determine whether the latter document is
written by the same author as the former ones. Text length varies from a few hundred to
a few thousand words, depending on genre. It is also assumed that positive and negative
answers have equal prior probabilities. The only difference to PAN-2014 is that texts
within a problem do not necessarily match for genre and/or topic.</p>
      <p>
        Participants are asked to submit software that provides a [0,1]-normalized score
corresponding to the probability of a positive answer (i.e., the known documents and the
questioned document are by the same author) for each verification problem. It is
possible to leave some problems unanswered by assigning a probability score of exactly 0.5.
The evaluation of the provided answers is based on two scalar measures: the Area
Under the Receiver Operating Characteristic Curve (AUC) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and c@1 [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. The former
tests the ability of methods to rank scores appropriately, assigning low values to
negative problems and high values to positive problems. The latter rewards methods that
leave problems unanswered rather than providing wrong answers. Finally, the
participating teams are ranked by the final score, the product AUC · c@1.
      </p>
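      <p>
        The two measures and the final score can be computed from first principles. The
sketch below follows the definitions given above (an unanswered problem is marked by a
score of exactly 0.5, and AUC is computed by pairwise comparison of positive and
negative problems); the official evaluator may differ in implementation details.
      </p>
```python
def c_at_1(scores, truth):
    # c@1 [34]: rewards leaving problems unanswered (score == 0.5)
    # over answering them wrongly.
    n = len(scores)
    nc = sum(1 for s, t in zip(scores, truth)
             if (s > 0.5 and t) or (s < 0.5 and not t))  # correct answers
    nu = sum(1 for s in scores if s == 0.5)              # unanswered
    return (nc + nu * nc / n) / n

def auc(scores, truth):
    # Area under the ROC curve via pairwise comparison of the scores
    # of positive and negative problems (ties count one half).
    pos = [s for s, t in zip(scores, truth) if t]
    neg = [s for s, t in zip(scores, truth) if not t]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def final_score(scores, truth):
    # Teams are ranked by the product of the two measures.
    return auc(scores, truth) * c_at_1(scores, truth)
```
      <p>
        Note that a random-guessing model obtains AUC = 0.5 and c@1 = 0.5, hence the
final score of 0.25 used as a reference point below.
      </p>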
      <p>Baselines One of the advantages of using TIRA for the evaluation of software
submissions is its support for the continuous evaluation of software against newly developed
corpora. This enables us to apply software that has been submitted to previous editions
of PAN to the cross-genre and cross-topic corpora of PAN-2015. Furthermore, we can
avoid the use of simplistic random-guess baselines (corresponding to a final score of
0.25) and establish more challenging baselines, adapted to the difficulty of the corpus.
These baselines reveal whether a newly submitted software performs better than
state-of-the-art models. We employ the following three baselines:
– PAN13-BASELINE: The best-performing software submitted to PAN-2013 by</p>
      <p>
        Jankowska et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. This software also served as baseline in PAN-2014 [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ].
– PAN14-BASELINE-1: The second-best software submitted to PAN-2014 by Fréry
et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].1
– PAN14-BASELINE-2: The third-best software submitted to PAN-2014 by Castillo
et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        In addition, following previous PAN editions, we train a meta-model that combines
all participant approaches [
        <xref ref-type="bibr" rid="ref18 ref38 ref46">18, 38, 46</xref>
        ]. A heterogeneous ensemble is built based on the
average of scores returned by participants for the verification problems of our evaluation
corpus (hereafter called PAN15-ENSEMBLE). Note that the baselines obtained from
PAN-2013 and PAN-2014 have been trained and fine-tuned on different corpora, and
under the assumption that all documents within a problem instance match for genre and
topic. Therefore, their performance on cross-genre and cross-topic author verification
corpora will not be optimal.
      </p>
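      <p>
        The PAN15-ENSEMBLE described above amounts to a per-problem unweighted mean of
all participants' scores; a minimal sketch:
      </p>
```python
def ensemble(score_lists):
    # Heterogeneous ensemble fusion as described in the text: the
    # unweighted mean of the scores returned by all participants.
    # Rows are participants, columns are verification problems.
    return [sum(column) / len(column) for column in zip(*score_lists)]
```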
    </sec>
    <sec id="sec-4">
      <title>Evaluation Corpus</title>
      <p>Although it is rather simple to compile a corpus of texts by different authors that belong
to different genres/topics (i.e., negative instances of the authorship verification task), it
is a lot more challenging to populate the corpus with corresponding positive instances
(i.e., texts in different genres/topics by the same author). A new corpus was built that
matches the size of the PAN-2014 evaluation corpus, and that covers the same four
languages: Dutch, English, Greek, and Spanish. The corpus is divided into a training
part, which is released to participants, and a test part, which is used to compute the
official evaluation results. Table 1 shows key figures of the corpus.</p>
      <p>There are notable differences between the sub-corpora for each language. In the
English part, only one known document per problem is provided. In the Dutch and Greek
parts, the number of known documents per problem varies, whereas, in the Spanish part,
there are always four known documents per problem. The documents of the Greek and
Spanish parts are, on average, longer than those of the Dutch and English parts. For all
languages, positive and negative instances are equally distributed.</p>
      <p>
        The Dutch part of our evaluation corpus is a modified version of Verhoeven and
Daelemans [
        <xref ref-type="bibr" rid="ref50">50</xref>
        ]’s CLiPS Stylometry Investigation corpus, which comprises documents
from two genres (essays and reviews), written by language students at the University
of Antwerp between 2012 and 2014. The English part is a collection of dialogue lines from
plays, excluding speaker names, stage directions, lists of characters, and so on. All
positive verification instances comprise parts from different plays by the same author.
1 Due to some technical problems, it was not possible to also test the winner of PAN-2014.
The English part is the largest one in terms of verification problems. The Greek part
is a collection of opinion articles, published at the online forum Protagon,2 where all
documents are categorized into several categories (e.g., Politics, Economy, Science,
Health, Media, Sports, etc.). For all verification problems of the Greek part, the known
documents share the same category, which differs from that of the questioned document. The
Spanish part consists of opinion articles taken from a variety of online newspapers and
magazines, as well as personal web pages or blogs, each covering a variety
of topics. It also includes literary essays. This part mixes cross-topic and cross-genre
problems, where some problems comprise documents that are noticeably different in
both topic and genre, and others match in genre but differ in topic.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Evaluation Results</title>
      <p>In total, 18 teams submitted their software for this task. The submitted author
verification approaches processed each language of the corpus separately using the TIRA
experimentation platform. During evaluation, participants did not have access to the
standard output, the standard error, or the evaluation results of their systems. The PAN
organizers served as moderators to verify the successful execution of each participant’s
software. The majority of the 18 teams were able to process all four language parts of
the evaluation corpus.</p>
      <p>
        Table 2 compiles the final score (AUC · c@1) of all teams for each language of our
corpus, alongside micro-averaged and macro-averaged scores. The performances of the
three baselines and that of the ensemble can also be seen. Since the English part is much
larger with respect to the number of problems, the macro-averaged score provides a
fairer overall picture of the capabilities of each team’s approach across all languages. On
average, the best results were achieved for the cross-topic Greek part. Quite predictably,
the cross-genre Dutch part proved to be the most challenging one, followed by the
English part (this can be explained by the low number of known documents per problem).
Note that the Greek and Spanish parts comprise longer texts (on average more than 500
words).
2 http://www.protagon.gr
[Table 2: final scores (AUC · c@1) of all teams, baselines, and the PAN15-ENSEMBLE per language, with micro- and macro-averaged scores.]
Bagnall [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and Moreau et al. [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] clearly outperform the rest of the participants.
The former seems to be particularly effective for cross-topic verification, but
seems to be affected by differences in genre, judging by the low performance on the
Dutch part. The latter is very effective for cross-genre verification on the Dutch part,
whereas its performance is worse on the English part where only one known document
per problem is available. Most of the remaining participants did not manage to achieve
notable performance across all four corpora. For example, Bartoli et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] achieves
good results for Dutch and Spanish, but fails to be competitive in English and Greek,
while the picture is reversed for Kocher and Savoy [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Exceptions to the rule are the
approaches of Pacheco et al. [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] and Hürlimann et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        Unlike the evaluation results of PAN-2013 and PAN-2014 [
        <xref ref-type="bibr" rid="ref18 ref46">18, 46</xref>
        ], the ensemble of
all participants is not the best-performing approach. With respect to the micro-averaged
and macro-averaged final scores, the ensemble is outperformed by 5 and 4 participants,
respectively. An explanation for the mediocre performance of the meta-model can be
found in the low average performances of the submitted approaches. This is
demonstrated by the fact that all PAN-2014 participants achieved a micro-averaged final score
greater than 0.3, while 6 out of 18 PAN-2015 participants achieved a micro-averaged
final score lower than 0.3, a score close to the 0.25 of a random-guessing
model.
      </p>
      <p>A more detailed picture of the evaluation results can be found in Table 3, where,
apart from the final score (FS), also ROC AUC, c@1, the number of Unanswered
Problems (UP), and the runtime are reported for Dutch, English, Greek, and Spanish,
respectively.</p>
      <p>Table 3: evaluation results per language in terms of the final score (FS = AUC ·
c@1), area under the curve (AUC) of the receiver operating characteristic
(ROC), c@1, unanswered problems (UP), and runtime.</p>
      <p>(a) Dutch</p>
      <sec id="sec-5-1">
        <title>Team FS</title>
        <p>
          Moreau et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] 0.635 0.825 0.770 0 08:09:35
Pacheco et al. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] 0.624 0.822 0.759 30 00:05:08
Hürlimann et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] 0.616 0.808 0.762 1 00:00:38
Maitra et al. [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] 0.518 0.759 0.683 4 02:32:48
Bartoli et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] 0.518 0.751 0.689 1 00:07:01
Halvani [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] 0.455 0.709 0.642 8 00:00:09
Bagnall [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] 0.451 0.700 0.644 2 12:00:43
PAN15-ENSEMBLE 0.426 0.696 0.612 0 –
Gómez-Adorno et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] 0.390 0.625 0.624 0 83:58:15
Sari and Stevenson [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ] 0.381 0.613 0.621 4 00:02:04
Gutierrez et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] 0.329 0.592 0.556 5 00:40:32
Vartapetiance and G. [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ] 0.262 0.512 0.512 1 00:44:51
Pimas et al. [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ] 0.262 0.508 0.515 0 00:02:27
PAN14-BASELINE-1 0.255 0.506 0.503 0 00:00:17
Castro-Castro et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] 0.247 0.503 0.491 0 00:05:51
PAN13-BASELINE 0.242 0.506 0.479 0 00:00:47
Kocher and Savoy [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] 0.218 0.449 0.484 18 00:00:07
PAN14-BASELINE-2 0.191 0.422 0.452 16 00:02:10
Solórzano et al. [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ] 0.153 0.397 0.385 4 00:10:25
Posadas-Durán et al. [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ] 0.132 0.382 0.346 54 36:39:07
Nikolov et al. [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] 0.089 0.256 0.348 1 00:00:47
Mechti et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] 0.000 0.500 0.000 165 –
(c) Greek
Bagnall [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] 0.750 0.882 0.851 5 10:07:49
Moreau et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] 0.693 0.887 0.781 10 07:07:42
Kocher and Savoy [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] 0.631 0.822 0.768 20 00:00:11
Hürlimann et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] 0.599 0.788 0.760 0 00:01:01
Gutierrez et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] 0.581 0.802 0.725 5 00:28:32
PAN15-ENSEMBLE 0.537 0.779 0.690 0 –
Pacheco et al. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] 0.517 0.773 0.670 3 00:02:01
Halvani [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] 0.493 0.767 0.643 9 00:00:17
Bartoli et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] 0.458 0.698 0.657 1 00:07:45
Nikolov et al. [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] 0.454 0.709 0.640 0 00:01:01
PAN14-BASELINE-2 0.412 0.634 0.650 0 00:01:22
Castro-Castro et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] 0.391 0.621 0.630 0 00:17:59
PAN13-BASELINE 0.384 0.641 0.600 0 00:01:46
Maitra et al. [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] 0.357 0.613 0.582 4 06:22:48
Gómez-Adorno et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] 0.348 0.590 0.590 0 00:09:22
Solórzano et al. [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ] 0.330 0.590 0.560 0 00:12:56
Pimas et al. [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ] 0.230 0.480 0.480 0 00:03:58
Vartapetiance and G. [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ] 0.212 0.460 0.460 0 00:36:30
PAN14-BASELINE-1 0.198 0.484 0.410 28 00:00:30
Mechti et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] 0.000 0.500 0.000 100 –
Posadas-Durán et al. [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ] 0.000 0.500 0.000 100 –
Sari and Stevenson [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ] 0.000 0.500 0.000 100 –
(b) English
Bagnall [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] 0.614 0.811 0.757 3 21:44:03
Castro-Castro et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] 0.520 0.750 0.694 0 02:07:20
Gutierrez et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] 0.513 0.739 0.694 39 00:37:06
Kocher and Savoy [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] 0.508 0.738 0.689 94 00:00:24
PAN15-ENSEMBLE 0.468 0.786 0.596 0 –
Halvani [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] 0.458 0.762 0.601 25 00:00:21
Moreau et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] 0.453 0.709 0.638 0 24:39:22
Pacheco et al. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] 0.438 0.763 0.574 2 00:15:01
Hürlimann et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] 0.412 0.648 0.636 5 00:01:46
PAN14-BASELINE-2 0.409 0.639 0.640 0 00:26:19
PAN13-BASELINE 0.404 0.654 0.618 0 00:02:44
Posadas-Durán et al. [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ] 0.400 0.680 0.588 0 01:41:50
Maitra et al. [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] 0.347 0.602 0.577 10 15:19:13
Bartoli et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] 0.323 0.578 0.559 3 00:20:33
Gómez-Adorno et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] 0.281 0.530 0.530 0 07:36:58
Solórzano et al. [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ] 0.259 0.517 0.500 0 00:29:48
Nikolov et al. [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] 0.258 0.493 0.524 16 00:01:36
Pimas et al. [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ] 0.257 0.507 0.506 0 00:07:22
PAN14-BASELINE-1 0.249 0.537 0.464 159 00:01:11
Mechti et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] 0.247 0.489 0.506 0 00:04:59
Sari and Stevenson [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ] 0.201 0.401 0.500 0 00:05:47
Vartapetiance and G. [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ] 0.000 0.500 0.000 500 –
(d) Spanish
Bartoli et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] 0.773 0.932 0.830 0 00:09:16
Bagnall [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] 0.721 0.886 0.814 10 11:21:41
PAN15-ENSEMBLE 0.715 0.894 0.800 0 –
PAN14-BASELINE-2 0.683 0.823 0.830 0 00:04:03
Pacheco et al. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] 0.663 0.908 0.730 0 00:04:23
Moreau et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] 0.661 0.853 0.775 25 15:27:31
Hürlimann et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] 0.539 0.739 0.730 0 00:01:29
Gutierrez et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] 0.509 0.755 0.674 7 00:24:20
Sari and Stevenson [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ] 0.485 0.724 0.670 0 00:03:48
Posadas-Durán et al. [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ] 0.462 0.680 0.680 0 02:20:35
PAN14-BASELINE-1 0.443 0.692 0.640 0 00:00:45
Halvani [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] 0.441 0.704 0.627 23 00:00:14
PAN13-BASELINE 0.367 0.656 0.560 0 00:02:37
Kocher and Savoy [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] 0.366 0.650 0.564 20 00:00:22
Maitra et al. [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] 0.352 0.610 0.577 3 10:36:31
Vartapetiance and G. [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ] 0.348 0.590 0.590 0 00:48:37
Castro-Castro et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] 0.329 0.558 0.590 0 00:23:54
Gómez-Adorno et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] 0.281 0.530 0.530 0 00:50:41
Pimas et al. [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ] 0.240 0.490 0.490 0 00:04:12
Solórzano et al. [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ] 0.218 0.454 0.480 0 00:11:18
Nikolov et al. [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] 0.095 0.280 0.340 0 00:01:09
Mechti et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] 0.000 0.500 0.000 100 –
        </p>
      </sec>
      <sec id="sec-5-2">
<title>Evaluation Results</title>
<p>In these tables, participants as well as the baseline models and the
PAN15-ENSEMBLE are ranked according to their final score.</p>
        <p>
          The performance of the baseline models reflects the difficulty of the evaluation
corpora. In the Dutch cross-genre part, all three baselines do not improve much above
a random-guessing classifier. The PAN13-BASELINE and the PAN14-BASELINE-2
provide relatively good results for the English and Greek cross-topic parts, while the
performance of PAN14-BASELINE-1 is considerably lower. This may be explained
by the fact that the latter method is based on eager supervised learning, so that it
depends too much on the properties of its original training corpus [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Both the
PAN14-BASELINE-1 and the PAN14-BASELINE-2 perform significantly better when
applied to the mixed Spanish part, where some verification problems match the
properties of the PAN-2014 corpora. On average, the PAN13-BASELINE and the
PAN14-BASELINE-2 outperform almost half of the participating teams, demonstrating their
potential as generic approaches that can be used on any given corpus. On the other hand,
the average performance of the PAN14-BASELINE-1 resembles random guessing.
        </p>
        <p>Based on the performance of the baseline models and the PAN15-ENSEMBLE, we
can divide the 18 submitted approaches into three rough categories for each language:
– Remarkable. Approaches whose performance is better than PAN15-ENSEMBLE.
– Good. Approaches whose performance is higher than PAN13-BASELINE and
lower than PAN15-ENSEMBLE.
– Poor. Approaches whose performance is lower than PAN13-BASELINE.</p>
        <p>
          ROC Curves To obtain more insight into the performance of the submitted
methods, Figure 1 shows the ROC curves of the top-4 participants alongside the convex
hull of all 18 participants, and the ROC curve of PAN15-ENSEMBLE for the entire
evaluation corpus (865 verification problems). As can be seen, the convex hull of the
submitted methods is almost completely dominated by the winning approach of Bagnall
[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. However, at very low/high FPR values, the approach of Pacheco et al. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] performs
better. These points correspond to cases where false positives/negatives have a very
high cost. The performance of the PAN15-ENSEMBLE is also very competitive in
such extreme cases, especially at low FPR values.
        </p>
        <p>
Statistical significance tests Following PAN-2014 [
          <xref ref-type="bibr" rid="ref46">46</xref>
          ], we compute the statistical
significance of performance differences between all examined approaches using
approximate randomization testing [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ]. This non-parametric test makes no
assumptions that are violated by the performance measures used, and it can handle complicated
distributions. We performed pairwise comparisons of the accuracy of all approaches based on
this method; the results are shown in Table 4. The null hypothesis is that there is no
difference between the outputs of two approaches. When the probability of accepting the null
hypothesis is p &gt; 0.05, we consider there to be no significant difference between the outputs
of two approaches (denoted as =). When 0.01 &lt; p &lt; 0.05 the difference is significant
(denoted as *), when 0.001 &lt; p &lt; 0.01 the difference is very significant (denoted as **),
and when p &lt; 0.001 the difference is highly significant (denoted as ***).
        </p>
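The approximate randomization test described above can be sketched as follows. This is an illustrative re-implementation under stated assumptions (paired per-problem 0/1 correctness as the performance measure, a fixed number of shuffles), not the code used in the evaluation:

```python
import random

def approx_randomization_test(scores_a, scores_b, trials=10_000, seed=0):
    """Two-sided approximate randomization test on paired per-problem scores.

    Null hypothesis: systems A and B are interchangeable, i.e. randomly
    swapping their outputs on problems does not change the observed
    difference in mean accuracy.
    """
    rng = random.Random(seed)
    observed = abs(sum(scores_a) - sum(scores_b)) / len(scores_a)
    at_least_as_extreme = 0
    for _ in range(trials):
        sum_a = sum_b = 0.0
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:      # randomly swap the paired outputs
                a, b = b, a
            sum_a += a
            sum_b += b
        if abs(sum_a - sum_b) / len(scores_a) >= observed:
            at_least_as_extreme += 1
    # add-one smoothing keeps the estimated p-value strictly positive
    return (at_least_as_extreme + 1) / (trials + 1)

# Hypothetical per-problem correctness (1 = correct, 0 = wrong) for two systems
p = approx_randomization_test([1, 1, 1, 0, 1, 1, 0, 1, 1, 1],
                              [0, 1, 0, 0, 1, 0, 0, 1, 0, 1])
```

Because the test only permutes the observed outputs, it needs no distributional assumptions, which is why it suits compound measures such as the final score.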
        <p>
          The overall performance of the winning approach of Bagnall [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] is significantly
better compared to the rest of the submissions as well as the baseline methods, and
the ensemble of all submissions. It should be noted that in most cases the difference
is highly significant. The second best-performing approach by Moreau et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] is
also significantly better compared to the remaining approaches, with two exceptions:
Castro-Castro et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] and Hürlimann et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. Beyond the first two winners, it is
noteworthy that the approach of Hürlimann et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] is significantly different from
the rest of the submitted approaches. Moreover, the group of methods from Bartoli et al.
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], Gutierrez et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], Halvani [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], Kocher and Savoy [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], Maitra et al. [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], and
Pacheco et al. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] achieves reasonably good performance, but in most of their
pairwise comparisons, no statistically significant difference between them can be observed.
        </p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Review of Submitted Methods</title>
      <p>
        The overall best-performing approach of Bagnall [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] in terms of both micro-averaged
and macro-averaged final score introduces a character-level Recurrent Neural Network
model. The success of this model demonstrates that character-level information can be
used in elaborate models to enhance performance compared to naive character n-gram
frequencies. The second best-performing approach by Moreau et al. [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] is based on
a heterogeneous ensemble combined with stacked generalization. The success of this
model verifies the conclusions of previous editions of PAN that different verification
models, when combined, can achieve very good results [
        <xref ref-type="bibr" rid="ref18 ref46">18, 46</xref>
        ]. It should be noted that
both winning approaches require remarkably high computational cost. To allow for a
quick comparison between the submitted approaches, Table 5 compiles an overview of
their basic characteristics. In the remainder of this section, we review the submitted
approaches in closer detail.
      </p>
      <p>
        Verification Model There are two main categories of verification models, namely
intrinsic and extrinsic models. Intrinsic models use only the texts within a verification
problem (the known documents by one author and the unknown document) to arrive
at their decision. Usually, they handle the verification task as a one-class classification
problem [
        <xref ref-type="bibr" rid="ref15 ref4 ref8">15, 4, 8</xref>
          ]. In addition, the extrinsic models also use texts by
different authors and attempt to transform the verification task to a binary classification
problem [
        <xref ref-type="bibr" rid="ref20 ref24 ref42">42, 20, 24</xref>
        ].
      </p>
      <sec id="sec-6-1">
        <title>Team (alphabetically)</title>
        <p>
          Bagnall [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] extrinsic
Bartoli et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] intrinsic
Castro-Castro et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] extrinsic
Gómez-Adorno et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] intrinsic
Gutierrez et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] extrinsic
Halvani [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] intrinsic
Hürlimann et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] intrinsic
Kocher and Savoy [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] extrinsic
Maitra et al. [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] intrinsic
Mechti et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] extrinsic
Moreau et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] extrinsic
Nikolov et al. [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] intrinsic
Pacheco et al. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] extrinsic
Pimas et al. [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ] intrinsic
Posadas-Durán et al. [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ] intrinsic
Sari and Stevenson [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ] intrinsic
Solórzano et al. [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ] intrinsic
Vartapetiance and Gillam [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ] intrinsic
        </p>
        <p>
          Both in PAN-2013 and PAN-2014, the overall best-performing approach employed
extrinsic models; more specifically, variations of the impostors method [
          <xref ref-type="bibr" rid="ref20 ref42">20, 42</xref>
          ].
Likewise, most of the best-performing submissions to PAN-2015, including the two
top-performing ones, employ extrinsic models [
          <xref ref-type="bibr" rid="ref11 ref2 ref21 ref30 ref33 ref5">2, 5, 11, 21, 30, 33</xref>
          ]. The impostors method
is part of the approach of Moreau et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ], while Gutierrez et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] and Kocher
and Savoy [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] propose modifications thereof. The best-performing intrinsic models
are proposed by Bartoli et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], Halvani [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], Hürlimann et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], and Maitra et al.
[
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. It should be noted that the performance of the latter approaches on the cross-genre
Dutch corpus was remarkable.
        </p>
        <p>
          Learning algorithm The submitted author verification methods can be further
distinguished by their approach to supervised learning, namely eager methods and lazy
methods. The former make use of supervised learning algorithms to extract a general model
of the verification problems, based on the training data. Such methods strongly depend
on the size, quality and representativeness of the training data. Only a few eager
methods were submitted in previous PAN editions, including that of Fréry et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], whereas
the majority of submissions to PAN-2015 belong to this category. Well-known and
popular supervised machine learning algorithms were used, like SVMs [
          <xref ref-type="bibr" rid="ref14 ref29 ref31 ref35 ref41">14, 29, 31, 35, 41</xref>
          ],
random forest [
          <xref ref-type="bibr" rid="ref28 ref3 ref33">3, 28, 33</xref>
          ], and genetic algorithms [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ].
        </p>
        <p>
          Lazy methods do not apply any eager supervised learning algorithm, but make a
decision based on information extracted for each verification problem separately. The
winning approach of Bagnall [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], as well as some other submissions that achieve very
good performance [
          <xref ref-type="bibr" rid="ref11 ref13 ref21">11, 13, 21</xref>
          ], belong to this category.
        </p>
        <p>
          Attribution Paradigm In author identification two attribution paradigms are
distinguished [
          <xref ref-type="bibr" rid="ref44">44</xref>
          ]: the instance-based paradigm attempts to capture the style of documents
by representing each document separately [
          <xref ref-type="bibr" rid="ref11 ref14 ref2 ref3 ref30">2, 3, 11, 14, 30</xref>
          ]. The profile-based paradigm
attempts to capture the style of authors by computing a single representation for all texts
written by the same author, a so-called author profile. The latter approach is generally
more robust when few texts (in quantity or length) of known authorship are available. In
comparison to PAN-2013 and PAN-2014, an increased number of participants followed
the profile-based paradigm [
          <xref ref-type="bibr" rid="ref10 ref13 ref21 ref33">10, 13, 21, 33</xref>
          ]. Moreover, in a hybrid of the two paradigms,
separate representations are extracted for each document written by the same author
which are then combined into a single representation [
          <xref ref-type="bibr" rid="ref31 ref41">31, 41</xref>
          ].
        </p>
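The distinction between the two paradigms can be made concrete with a small sketch. Character-trigram counts serve as a stand-in feature set here; this is illustrative, not the representation of any particular participant:

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Bag of character n-grams as a simple stand-in feature set."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def instance_representations(known_docs):
    """Instance-based paradigm: one feature vector per known document."""
    return [char_ngrams(d) for d in known_docs]

def profile_representation(known_docs):
    """Profile-based paradigm: a single vector for the concatenation of
    all known documents (the author profile) -- generally more robust
    when the individual known documents are few or short."""
    return char_ngrams(" ".join(known_docs))
```

A hybrid, as noted above, would first build the per-document vectors and then merge them into a single representation.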
        <p>
          Text Representation Following the practice of participants in previous PAN editions,
low-level and language-independent measures are the main kinds of features used to
represent the writing style of documents. Typical examples are lengths of words,
sentences, and paragraphs, type-token ratio, hapax legomena, and other vocabulary
richness and readability measures. Very popular features are the frequencies of character n-grams
(including unigrams), words, punctuation marks, stopwords, etc. Many submitted
approaches rely exclusively on such text representation features, disregarding features that
require more sophisticated text analyses [
          <xref ref-type="bibr" rid="ref11 ref13 ref14 ref2 ref21 ref31 ref41 ref49">2, 11, 13, 14, 21, 31, 41, 49</xref>
          ].
        </p>
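A few of the low-level, language-independent measures listed above can be computed as in the following sketch; the tokenization and the particular feature set are illustrative choices, not those of any submission:

```python
import re

def stylometric_features(text):
    """Compute a handful of low-level, language-independent measures:
    average word length, average sentence length, type-token ratio,
    and the proportion of hapax legomena (words occurring only once)."""
    words = re.findall(r"\w+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    freqs = {}
    for w in words:
        freqs[w] = freqs.get(w, 0) + 1
    return {
        "avg_word_len": sum(map(len, words)) / len(words),
        "avg_sentence_len": len(words) / len(sentences),
        "type_token_ratio": len(freqs) / len(words),
        "hapax_ratio": sum(1 for c in freqs.values() if c == 1) / len(words),
    }
```

Such measures require no linguistic resources, which is what makes them attractive for a multilingual corpus like the one used at PAN-2015.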
        <p>
          Regarding more sophisticated features, the most popular ones are part-of-speech
(POS) n-grams mainly due to the availability of POS taggers of acceptable
performance for all four languages of the PAN-2015 corpus [
          <xref ref-type="bibr" rid="ref28 ref29 ref3 ref33 ref43 ref5">3, 5, 28, 29, 33, 43</xref>
          ]. A few
participants apply full syntactic parsing, achieving moderate performances at the cost
of considerably increased runtime cost [
          <xref ref-type="bibr" rid="ref10 ref36">10, 36</xref>
          ]. Other features requiring more elaborate
text analysis are related to lemmatization [
          <xref ref-type="bibr" rid="ref33 ref5">5, 33</xref>
          ], style and grammar checking [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ], and
Latent Dirichlet Allocation [
          <xref ref-type="bibr" rid="ref30 ref33">30, 33</xref>
          ].
        </p>
        <p>
          The majority of participants attempt to combine different types of features. Some
approaches, however, use only one type of features, for example, the most frequent
terms [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], stopword n-grams [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ], and character sequences [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Also, some of the
proposed features do not refer to a single document but capture the difference of a
certain feature between two documents (typically one of known and one of unknown
authorship). These features are called differential features [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] or joint features [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], and
they are used in combination with eager supervised learning models.
        </p>
        <p>
          Handling Ambiguous Cases An important issue in author verification is the
ability to leave problems unanswered when unsure, rather than providing wrong answers.
This capability of an approach is directly measured using c@1. Some of the
participants, including the two top-performing ones, attempt to focus on this issue and leave
some problems unanswered when the confidence of their answers is low. The most basic
approach is to examine the score of each problem and leave it unanswered if it lies in a
specified range around 0.5 [
          <xref ref-type="bibr" rid="ref2 ref21 ref30">2, 21, 30</xref>
          ]. A more sophisticated model is proposed by Moreau
et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ], whose classifier determines ambiguous cases that are left unanswered to
improve c@1.
        </p>
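The c@1 measure and the basic abstention scheme described above can be sketched as follows. The formula is that of Peñas and Rodrigo's c@1; the width of the band around 0.5 is an illustrative value, not one taken from any submission:

```python
def c_at_1(answers, truth):
    """c@1: unanswered problems (None) are credited at the system's
    overall accuracy rate, c@1 = (n_c + n_u * n_c / n) / n, where n_c
    is the number of correct answers and n_u the number left open."""
    n = len(answers)
    n_correct = sum(1 for a, t in zip(answers, truth)
                    if a is not None and a == t)
    n_unanswered = sum(1 for a in answers if a is None)
    return (n_correct + n_unanswered * n_correct / n) / n

def abstain(score, band=0.05):
    """Basic scheme: answer 'Y'/'N' only when the verification score
    lies outside a band around 0.5; otherwise leave it unanswered."""
    if abs(score - 0.5) <= band:
        return None
    return "Y" if score > 0.5 else "N"
```

Under this measure, abstaining on the two borderline scores in the example below raises c@1 from 0.5 (both guessed wrong) to 0.75, which is exactly the incentive the task design intends.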
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Conclusions</title>
      <p>The shared task on author identification at PAN-2015 focused on the authorship
verification problem. In contrast to previous editions of PAN, a major novelty was that
cross-genre and cross-topic verification cases were considered. This challenging, yet
realistic variation of the problem allowed us to examine whether authorship verification
methods are heavily affected by variations in genre and topic among the documents of
a verification case. The evaluation results indicate that the cross-genre scenario is more
difficult. However, the performance of top-ranked approaches on each language of our
corpus is surprisingly high in terms of both AUC and c@1.</p>
      <p>
        The two top-performing methods introduce significant novelties. The winning
approach of Bagnall [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is based on a character-level neural network language model that
is used for the first time in authorship verification. The success of this model indicates
that, beyond the well-known and simplistic character n-gram features, more complex
approaches can better exploit character-level information in authorship analysis tasks.
The second winning approach of Moreau et al. [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] takes advantage of a heterogeneous
ensemble that combines different authorship verification methods, which implements
and verifies one of the major conclusions drawn at PAN-2013 and PAN-2014 [
        <xref ref-type="bibr" rid="ref18 ref46">18, 46</xref>
        ].
Both of these approaches are computationally expensive. However, the increase in
runtime cost is not caused by elaborate text analysis methods, such as syntactic parsing
or semantic analysis. Rather, their runtime is spent on fine-tuning parameters of the
learning algorithms.
      </p>
      <p>
        We received 18 submissions, which compares to the corresponding tasks at
PAN-2013 (18 participants) and PAN-2014 (13 participants). Among them, only five
participants also took part in the shared task at PAN-2013 and/or PAN-2014 [
        <xref ref-type="bibr" rid="ref11 ref13 ref30 ref35 ref49">11, 13, 30, 35,
49</xref>
        ]. These figures verify that there is a lively research community working on author
identification tasks, and that PAN has become the major forum of this research. We may
also claim that the focus of PAN on the authorship verification problem helped to raise
the interest of researchers in this fundamental problem and to significantly advance the
state of the art. Moreover, the availability of the PAN corpora allowed the development
of novel methods that are based on eager supervised learning algorithms. Future PAN
editions will examine in more detail whether the size, diversity, and quality of training
corpora strongly affect the performance of verification models.
      </p>
      <p>
Text length is an important issue that has not been thoroughly studied within
authorship verification research. How long should the texts of known authorship be in
order to allow for training reliable verification models? How many words of the unknown
documents are really needed to allow for computing an accurate answer? Answers to
such questions are critical if we wish to apply this technology to short
texts, like tweets and SMS messages. Another interesting future direction is to study the
relationship of authorship verification with other author identification tasks, like author
clustering (grouping documents by authorship) [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] and author diarization (segmenting
a multi-author document into authorial components) [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
      <sec id="sec-7-1">
        <title>Acknowledgements</title>
        <p>This work was partially supported by the WIQ-EI IRSES project (Grant No. 269180)
within the FP7 Marie Curie action.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Abbasi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
          </string-name>
          , H.:
          <article-title>Applying authorship analysis to extremist-group web forum messages</article-title>
          .
          <source>Intelligent Systems, IEEE</source>
          <volume>20</volume>
          (
          <issue>5</issue>
          ),
          <fpage>67</fpage>
          -
          <lpage>75</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Bagnall</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Author Identification using multi-headed Recurrent Neural Networks</article-title>
          . In: Cappellato,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Gareth</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          , San Juan, E. (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Bartoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dagri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lorenzo</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medvet</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tarlao</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>An Author Verification Approach Based on Differential Features</article-title>
          . In: Cappellato,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Gareth</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          , San Juan, E. (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Castillo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cervantes</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vilariño</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinto</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>León</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Unsupervised method for the authorship identification task</article-title>
          .
          <source>In: CLEF 2014 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings, CLEF and CEUR-WS.org</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Castro-Castro</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pelaez-Brioso</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adame-Arcia</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>noz</surname>
            <given-names>Guillena</given-names>
          </string-name>
          , R.M.:
          <article-title>Authorship Verification, Combining Linguistic Features and Different Similarity Functions</article-title>
          . In: Cappellato,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Gareth</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          , San Juan, E. (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Chaski</surname>
            ,
            <given-names>C.E.</given-names>
          </string-name>
          :
          <article-title>Who's at the keyboard: Authorship attribution in digital evidence investigations</article-title>
          .
          <source>International Journal of Digital Evidence</source>
          <volume>4</volume>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Fawcett</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>An introduction to roc analysis</article-title>
          .
          <source>Pattern Recogn. Lett</source>
          .
          <volume>27</volume>
          (
          <issue>8</issue>
          ),
          <fpage>861</fpage>
          -
          <lpage>874</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Fréry</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Largeron</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Juganaru-Mathieu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ujm at clef in author identification</article-title>
          .
          <source>In: CLEF 2014 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings, CLEF and CEUR-WS.org</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Gollub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burrows</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : Ousting Ivory Tower Research:
          <article-title>Towards a Web Framework for Providing Experiments as a Service</article-title>
          . In: Hersh,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Callan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Maarek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Sanderson</surname>
          </string-name>
          , M. (eds.) 35th
          <source>International ACM Conference on Research and Development in Information Retrieval (SIGIR 12)</source>
          . pp.
          <fpage>1125</fpage>
          -
          <lpage>1126</lpage>
          . ACM (Aug
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Gómez-Adorno</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sidorov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinto</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>A Graph Based Authorship Identification Approach</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Gutierrez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casillas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ledesma</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fuentes</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meza</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Homotopy Based Classification for Author Verification Task</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>van Halteren</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Linguistic profiling for author recognition and verification</article-title>
          .
          <source>In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics. ACL '04</source>
          . Association for Computational Linguistics, Stroudsburg, PA, USA (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Halvani</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>A Generic Authorship Verification Scheme Based on Equal Error Rates</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Hürlimann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name><surname>Weck</surname>, <given-names>B.</given-names></string-name>
          ,
          <string-name><surname>van den Berg</surname>, <given-names>E.</given-names></string-name>
          ,
          <string-name><surname>Suster</surname>, <given-names>S.</given-names></string-name>
          ,
          <string-name><surname>Nissim</surname>, <given-names>M.</given-names></string-name>
          :
          <article-title>GLAD: Groningen Lightweight Authorship Detection</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Jankowska</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keselj</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name><surname>Milios</surname>, <given-names>E.</given-names></string-name>
          :
          <article-title>Proximity based one-class classification with Common N-Gram dissimilarity for authorship verification task</article-title>
          . In:
          <string-name><surname>Forner</surname>, <given-names>P.</given-names></string-name>
          ,
          <string-name><surname>Navigli</surname>, <given-names>R.</given-names></string-name>
          ,
          <string-name><surname>Tufis</surname>, <given-names>D.</given-names></string-name>
          (eds.)
          <source>CLEF 2013 Evaluation Labs and Workshop - Working Notes Papers</source>
          , 23-26 September, Valencia, Spain (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Juola</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Authorship Attribution</article-title>
          .
          <source>Foundations and Trends in Information Retrieval</source>
          <volume>1</volume>
          ,
          <fpage>234</fpage>
          -
          <lpage>334</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Juola</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>How a computer program helped reveal J. K. Rowling as author of A Cuckoo's Calling</article-title>
          .
          <source>Scientific American</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Juola</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name><surname>Stamatatos</surname>, <given-names>E.</given-names></string-name>
          :
          <article-title>Overview of the Author Identification Task at PAN-2013</article-title>
          . In:
          <string-name><surname>Forner</surname>, <given-names>P.</given-names></string-name>
          ,
          <string-name><surname>Navigli</surname>, <given-names>R.</given-names></string-name>
          ,
          <string-name><surname>Tufis</surname>, <given-names>D.</given-names></string-name>
          (eds.)
          <source>Notebook Papers of CLEF 2013 Labs and Workshops (CLEF-2013)</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Kestemont</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luyckx</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crombez</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Cross-genre authorship verification using unmasking</article-title>
          .
          <source>English Studies</source>
          <volume>93</volume>
          (
          <issue>3</issue>
          ),
          <fpage>340</fpage>
          -
          <lpage>356</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Khonji</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iraqi</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>A Slightly-Modified GI-Based Author-Verifier with Lots of Features (ASGALF)</article-title>
          .
          <source>In: CLEF 2014 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings, CLEF and CEUR-WS.org</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Kocher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name><surname>Savoy</surname>, <given-names>J.</given-names></string-name>
          :
          <article-title>UniNE at CLEF 2015: Author Identification</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Akiva</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dershowitz</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name><surname>Dershowitz</surname>, <given-names>N.</given-names></string-name>
          :
          <article-title>Unsupervised decomposition of a document into authorial components</article-title>
          . In:
          <string-name><surname>Lin</surname>, <given-names>D.</given-names></string-name>
          ,
          <string-name><surname>Matsumoto</surname>, <given-names>Y.</given-names></string-name>
          ,
          <string-name><surname>Mihalcea</surname>, <given-names>R.</given-names></string-name>
          (eds.)
          <source>Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics</source>
          . pp.
          <fpage>1356</fpage>
          -
          <lpage>1364</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonchek-Dokow</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Measuring differentiability: Unmasking pseudonymous authors</article-title>
          .
          <source>J. Mach. Learn. Res</source>
          .
          <volume>8</volume>
          ,
          <fpage>1261</fpage>
          -
          <lpage>1276</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Winter</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Determining if two documents are written by the same author</article-title>
          .
          <source>Journal of the American Society for Information Science and Technology</source>
          <volume>65</volume>
          (
          <issue>1</issue>
          ),
          <fpage>178</fpage>
          -
          <lpage>187</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Lambers</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veenman</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Forensic authorship attribution using compression distances to prototypes</article-title>
          . In:
          <string-name><surname>Geradts</surname>, <given-names>Z.</given-names></string-name>
          ,
          <string-name><surname>Franke</surname>, <given-names>K.</given-names></string-name>
          ,
          <string-name><surname>Veenman</surname>, <given-names>C.</given-names></string-name>
          (eds.)
          <source>Computational Forensics, Lecture Notes in Computer Science</source>
          , vol.
          <volume>5718</volume>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>24</lpage>
          . Springer Berlin Heidelberg (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Layton</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Watters</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name><surname>Dazeley</surname>, <given-names>R.</given-names></string-name>
          :
          <article-title>Automated unsupervised authorship analysis using evidence accumulation clustering</article-title>
          .
          <source>Natural Language Engineering</source>
          <volume>19</volume>
          ,
          <fpage>95</fpage>
          -
          <lpage>120</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Luyckx</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Authorship attribution and verification with many authors and limited data</article-title>
          .
          <source>In: Proceedings of the Twenty-Second International Conference on Computational Linguistics (COLING 2008)</source>
          . pp.
          <fpage>513</fpage>
          -
          <lpage>520</lpage>
          . Coling 2008 Organizing Committee, Manchester, UK (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Maitra</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Authorship Verification: An Approach based on Random Forest</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Mechti</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaoua</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faiz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bsir</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Belguith</surname>
            ,
            <given-names>L.H.</given-names>
          </string-name>
          :
          <article-title>On the Empirical Evaluation of Hybrid Author Identification Method</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>Moreau</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jayapal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lynch</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vogel</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Author Verification: Basic Stacked Generalization Applied To Predictions from a Set of Heterogeneous Learners</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tabakova</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiprov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>SU@PAN'2015: Experiments in Author Verification</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <surname>Noreen</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Computer-Intensive Methods for Testing Hypotheses: An Introduction</article-title>
          . A Wiley-Interscience publication, Wiley (
          <year>1989</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <surname>Pacheco</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fernandes</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Porco</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Random Forest with Increased Generalization: A Universal Background Approach for Authorship Verification</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <surname>Peñas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodrigo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A simple measure to assess non-response</article-title>
          .
          <source>In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1</source>
          . pp.
          <fpage>1415</fpage>
          -
          <lpage>1424</lpage>
          . HLT '11, Association for Computational Linguistics, Stroudsburg, PA, USA (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <surname>Pimas</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kröll</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name><surname>Kern</surname>, <given-names>R.</given-names></string-name>
          :
          <article-title>Know-Center at PAN 2015 Author Identification</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <surname>Posadas-Durán</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sidorov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Batyrshin</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mirasol-Meléndez</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Author Verification Using Syntactic N-grams</article-title>
          . In:
          <string-name><surname>Cappellato</surname>, <given-names>L.</given-names></string-name>
          ,
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>
          ,
          <string-name><surname>Gareth</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>San Juan</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gollub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Improving the Reproducibility of PAN's Shared Tasks: Plagiarism Detection, Author Identification, and Author Profiling</article-title>
          . In:
          <string-name><surname>Kanoulas</surname>, <given-names>E.</given-names></string-name>
          ,
          <string-name><surname>Lupu</surname>, <given-names>M.</given-names></string-name>
          ,
          <string-name><surname>Clough</surname>, <given-names>P.</given-names></string-name>
          ,
          <string-name><surname>Sanderson</surname>, <given-names>M.</given-names></string-name>
          ,
          <string-name><surname>Hall</surname>, <given-names>M.</given-names></string-name>
          ,
          <string-name><surname>Hanbury</surname>, <given-names>A.</given-names></string-name>
          ,
          <string-name><surname>Toms</surname>, <given-names>E.</given-names></string-name>
          (eds.)
          <article-title>Information Access Evaluation meets Multilinguality, Multimodality, and Visualization</article-title>
          .
          <source>5th International Conference of the CLEF Initiative (CLEF 14)</source>
          . pp.
          <fpage>268</fpage>
          -
          <lpage>299</lpage>
          . Springer, Berlin Heidelberg New York (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holfeld</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Overview of the 1st International Competition on Wikipedia Vandalism Detection</article-title>
          . In:
          <string-name>
            <surname>Braschler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pianta</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2010 Evaluation Labs</source>
          (Sep
          <year>2010</year>
          ), http://www.clef-initiative.eu/publication/working-notes
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <surname>Sapkota</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bethard</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes-y-Gómez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Solorio</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Not all character n-grams are created equal: A study in authorship attribution</article-title>
          . In:
          <source>Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the ACL</source>
          . pp.
          <fpage>93</fpage>
          -
          <lpage>102</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <surname>Sapkota</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Solorio</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes-y-Gómez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bethard</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Cross-topic authorship attribution: Will out-of-topic data help?</article-title>
          In:
          <source>COLING 2014, 25th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers</source>
          , August 23-29,
          <year>2014</year>
          , Dublin, Ireland. pp.
          <fpage>1228</fpage>
          -
          <lpage>1237</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <surname>Sari</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevenson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A Machine Learning-based Intrinsic Method for Cross-topic and Cross-genre Authorship Verification</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gareth</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>San Juan</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <surname>Seidman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Authorship Verification Using the Impostors Method</article-title>
          . In:
          <string-name>
            <surname>Forner</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Navigli</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tufis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (eds.)
          <source>CLEF 2013 Evaluation Labs and Workshop - Working Notes Papers</source>
          , 23-26 September, Valencia, Spain (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <surname>Solórzano</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mijangos</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pimentel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>López-Escobedo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sierra</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Authorship Verification by Combining SVMs with Kernels Optimized for Different Feature Categories</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gareth</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>San Juan</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>A Survey of Modern Authorship Attribution Methods</article-title>
          .
          <source>Journal of the American Society for Information Science and Technology</source>
          <volume>60</volume>
          ,
          <fpage>538</fpage>
          -
          <lpage>556</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>On the robustness of authorship attribution based on character n-gram features</article-title>
          .
          <source>Journal of Law and Policy</source>
          <volume>21</volume>
          ,
          <fpage>421</fpage>
          -
          <lpage>439</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verhoeven</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Juola</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sánchez-Pérez</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barrón-Cedeño</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Overview of the author identification task at PAN 2014</article-title>
          . In:
          <source>Working Notes for CLEF 2014 Conference</source>
          , Sheffield, UK, September 15-18,
          <year>2014</year>
          . pp.
          <fpage>877</fpage>
          -
          <lpage>897</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fakotakis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kokkinakis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Automatic text categorization in terms of genre and author</article-title>
          .
          <source>Comput. Linguist</source>
          .
          <volume>26</volume>
          (
          <issue>4</issue>
          ),
          <fpage>471</fpage>
          -
          <lpage>495</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <surname>Stover</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Winter</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kestemont</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Computational authorship verification method attributes a new work to a major 2nd century African author</article-title>
          .
          <source>Journal of the Association for Information Science and Technology</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <surname>Vartapetiance</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gillam</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Adapting for Subject-Specific Term Length using Topic Cost in Author Verification</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gareth</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>San Juan</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2015 Evaluation Labs</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <surname>Verhoeven</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>CLiPS Stylometry Investigation (CSI) corpus: A Dutch corpus for the detection of age, gender, personality, sentiment and deception in text</article-title>
          .
          In:
          <source>Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014)</source>
          . Reykjavik, Iceland (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>