<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Y.: Determining if two documents are written by the same author.
Journal of the Association for Information Science and Technology 65(1)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1109/ICMLA.2010.153</article-id>
      <title-group>
        <article-title>Author Obfuscation: Attacking the State of the Art in Authorship Verification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Martin Potthast</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthias Hagen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benno Stein</string-name>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2008</year>
      </pub-date>
      <volume>9311</volume>
      <fpage>12</fpage>
      <lpage>14</lpage>
      <abstract>
        <p>We report on the first large-scale evaluation of author obfuscation approaches built to attack authorship verification approaches: the impact of 3 obfuscators on the performance of a total of 44 authorship verification approaches has been measured and analyzed. The best-performing obfuscator successfully impacts the decision-making process of the authorship verifiers on average in about 47% of the cases, causing them to misjudge a given pair of documents as having been written by “different authors” when in fact they would have decided otherwise if one of them had not been automatically obfuscated. The evaluated obfuscators have been submitted to a shared task on author obfuscation that we organized at the PAN 2016 lab on digital text forensics. We contribute further by surveying the literature on author obfuscation, by collecting and organizing evaluation methodology for this domain, and by introducing performance measures tailored to measuring the impact of author obfuscation on authorship verification.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The development of author identification technology has reached a point at which it can
be carefully employed in the wild to resolve cases of unknown or disputed authorship.
For a recent example, state-of-the-art forensic software played a role in breaking the
anonymity of J. K. Rowling, who published her book “The Cuckoo’s Calling” under
the pseudonym Robert Galbraith in order to “liberate” herself from the pressure of
stardom, caused by her success with the Harry Potter series.1 Moreover, forensic author
identification software is part of the toolbox of forensic linguists, who employ the
technology on a regular basis to support their testimony in court as expert witnesses in cases
where the authenticity of a piece of writing is important. Despite their successful
application, none of the existing approaches has been shown to work flawlessly, a fact that
is rooted in the complexity and the ill-posedness of the problem. All approaches have
a likelihood of returning false decisions under certain circumstances, but the
circumstances under which they do are barely understood. It is hence particularly interesting
to analyze whether and how these circumstances can be controlled, since any form of
control over the outcome of an author identification software bears the risk of misuse.</p>
      <p>* A summary of this report has been published as part of [92].</p>
      <p>1 http://languagelog.ldc.upenn.edu/nll/?p=5315</p>
      <p>[Figure 1: Illustration of the two tasks. Recoverable labels: “is known to have written”, “publishes automatically obfuscated text”, “circumvents OR obstructs”, “is received by”, “automatically verifies authorship”, “Eve”.]</p>
      <p>In fiction, a number of examples can be found where authors tried to remain
anonymous, and where they, overtly or covertly, tried to imitate the writing style of others. In
fact, style imitation is a well-known learning technique in writing courses. But the
question of whether humans are ultimately capable of controlling their own writing style so
as to fool experts into believing they have not written a given piece of text, or even that
someone else has, is difficult to answer based on observation alone: are the known cases
more or less all there is, or are they just the tip of the iceberg (i.e., examples of unskilled
attempts)? And, if the “expert” to be fooled is not a human but an author identification
software, the rules are changed entirely. The fact that software is used to assist author
identification increases the attack surface of investigations to spoil the decision-making
process of the software. This is troublesome since the human operator of such
software may be ignorant of its flaws, and biased toward taking the software’s output at
face value instead of treating it with caution. After all, being convinced of the quality
of a piece of software is a necessary precondition to employing it to solve a problem.</p>
      <p>
        At PAN 2016, we organized for the first time a shared task on author obfuscation
to begin exploring the potential vulnerabilities of author identification technology. A
number of interesting subtasks related to author obfuscation can be identified, from
which we have selected that of author masking. This task is built on top of the task
of authorship verification, a subtask of author identification, which was organized at
PAN 2013 through PAN 2015 [
        <xref ref-type="bibr" rid="ref47">47, 97, 98</xref>
        ] (see Figure 1 for an illustration):
      </p>
      <sec id="sec-1-1">
        <title>Authorship verification:</title>
        <p>Given two documents, decide whether they have been written by the same author.</p>
      </sec>
      <sec id="sec-1-2">
        <title>Author masking:</title>
        <p>Given two documents by the same author,
paraphrase the designated one so that the
author cannot be verified anymore.</p>
        <p>The two tasks are diametrically opposed to each other: the success of a certain
approach for one of these tasks depends on its “immunity” against the most effective
approaches for the other. The two tasks are also entangled, since the development of a
new approach for one of them should build upon the capabilities of existing approaches
for the other. However, compared to authorship verification, author obfuscation in
general (and author masking in particular) has received little attention to date. A reason for this
may be rooted in the fact that author masking requires (automatic) paraphrasing as a
subtask, which poses a high barrier of entry to newcomers. To facilitate future research
on both tasks, we contribute the following analyses and building blocks:
1. First-time large-scale evaluation of 44 state-of-the-art authorship verification
approaches attacked by 3 author obfuscation approaches in 4 evaluation settings. This
evaluation allows for judging both the feasibility of author obfuscation as well as
the vulnerability of the state of the art in authorship verification. To cut a long story
short, it turns out that even basic author obfuscation approaches have significant
impact on many authorship verification approaches.
2. Proposal of performance measures to quantify the impact that obfuscation has on
authorship analysis technology.
3. Survey of related work on author obfuscation, and a systematic review and
organization of evaluation methodology for author obfuscation. In particular, we identify
the three performance dimensions safety, soundness, and sensibleness, wherein an
author obfuscation approach should excel before being considered fit for practical
use; we detail how obfuscation approaches can be assessed today, and what may be
useful in the future.
4. Organization of a shared task at PAN 2016 on author obfuscation to which the three
obfuscators evaluated have been submitted. Moreover, we experiment with peer
evaluation in shared tasks by inviting participants as well as interested third parties
to co-evaluate the obfuscators with regard to the three aforementioned performance
dimensions.</p>
        <p>In what follows, Section 2 surveys the related work on author obfuscation, and
Section 3 systematically reviews and organizes the corresponding evaluation methodology;
here, the obfuscation impact measures are introduced. Section 4 reviews the obfuscation
approaches that have been submitted to our shared task, and Section 5 reports on their
evaluation against the state of the art in authorship verification, including the results of
the outlined peer evaluation initiative.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>The literature on author obfuscation divides into three directions: (1) obfuscation
generation, (2) obfuscation evaluation, and (3) obfuscation detection and reversal. This section
reviews the contributions that have been made to date.</p>
      <p>
        Obfuscation generation approaches divide into manual, computer-assisted, and
automatic ones. The manual approaches include a study by Brennan and Greenstadt [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
who asked 12 laymen writers to mask their writing style and to imitate another
author’s style. The obtained results indicate that non-professional writers are capable of
influencing their style to a point at which automatic authorship attribution performs
no better than random. These results have been later replicated by Brennan et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
who additionally employed crowdsourcing via Amazon’s Mechanical Turk to increase
the number of human subjects by 45 writers. Both datasets have been published as the
(Extended) Brennan-Greenstadt Corpus.2
      </p>
      <sec id="sec-2-1">
        <title>2 https://psal.cs.drexel.edu/index.php/Main_Page</title>
        <p>Almishari et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] also employ Amazon’s Mechanical Turk to obfuscate the reviews of 40 Yelp users, where up to 5 reviews per
user were obfuscated many times over on Mechanical Turk, and up to 99 reviews per
user were used to check whether the original user could still be identified among the
40 candidate users. After obfuscation, the success rate at attributing obfuscated reviews
to the correct user dropped from 95% to 55% under a standard authorship attribution
model employing words, POS tags, POS bigrams, as well as character bigrams and
trigrams as features.</p>
        <p>
          Anonymouth is the name of a tool developed by McDonald et al. [72, 73] which
belongs to the computer-assisted approaches for obfuscation generation. It supports
its users in manually obfuscating a given document by analyzing, after each revision, whether it can still be
attributed to its original author, and by identifying which of the
underlying authorship attribution model’s features perform best, in order to suggest parts of the
given document that should be changed to maximize the next revision’s impact
on classification performance. An important component of a computer-assisted author
obfuscation tool is its underlying analysis to determine the success of incremental text
revisions; Kacmarcik and Gamon [
          <xref ref-type="bibr" rid="ref50">50</xref>
          ] focus on such a component. They also select
the best-performing features and artificially change them in order to reduce attribution
performance. This approach specifically targets and defeats Koppel and Schler’s [61]
Unmasking. Conceivably, the analysis components of both Anonymouth and
Kacmarcik and Gamon can also be applied as part of a fully automatic obfuscation generator;
however, this has not been attempted so far. Recently, Le et al. [65] introduced a
semi-automatic obfuscation approach that supersedes the aforementioned ones in terms of
safety against de-obfuscation attacks.
        </p>
        <p>
          Among the first to propose automatic obfuscation generation were Rao and
Rohatgi [91], who suggested the use of round-trip machine translation: the
to-be-obfuscated document is translated to an intermediate language and the result then back
to its original language. More than one intermediate language may be used in a row
before returning to the initial one. Supposedly, the translation round-trip distorts the
writing style of the original’s author sufficiently to confuse an authorship analysis. With the
rise of machine translation systems, this approach has been studied many times to date,
becoming a de facto baseline for author obfuscation. However, the question of whether
this approach works is still undecided; some find that it does not perform well in terms
of safety, soundness, and sensibleness of the obfuscated text [
          <xref ref-type="bibr" rid="ref10 ref13">10, 13</xref>
          ], whereas others
do find some merits [
          <xref ref-type="bibr" rid="ref3 ref56">3, 56</xref>
          ]. At any rate, the use of machine translation for
obfuscation may have limits since the existing systems must be treated as a black box and the
obfuscation results can hardly be controlled. Besides machine translation, Khosmood
and Levinson [
          <xref ref-type="bibr" rid="ref55">57, 55</xref>
          ] develop an obfuscation framework which operationalizes the
imitation of an author’s writing style by transforming the style of a given document
iteratively via style-changing text operations toward the writing style of a set of
target documents. Their approach builds on a style comparison component not unlike the
aforementioned analysis component of Anonymouth, which determines the success of
manual obfuscation and suggests where to make further manual text operations. The
style comparison component controls which of a set of style-changing text operations
are automatically applied to get closer to the target style. Khosmood [
          <xref ref-type="bibr" rid="ref54 ref56">56, 54</xref>
          ] further
proposes a number of text operations at the sentence level, including active-to-passive
transformation, diction improvement, abstractions, synonym replacement,
simplification, and round-trip translation. Independently, Xu et al. [112] study within-language
machine-translation as a way of transforming and imitating another author’s style.
Unlike the aforementioned machine learning approaches, here, the machine translation
approach is specifically trained on texts from the source style and the target style so as
to allow for accurate style paraphrases, rendering the system less of a black box than
using round-trip translations for obfuscation. However, the approach is less
practical because of its excessive resource requirements in terms of samples of writing from
the target style.
        </p>
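        <p>As a minimal sketch of the round-trip translation idea described above, the function below chains a document through a list of intermediate languages and then back to its original language. The <monospace>translate</monospace> callable, its signature, and the language codes are assumptions introduced for illustration; the sketch does not call any real machine translation system, and the stand-in translator merely records each hop so that the example is runnable.</p>
        <preformat>
```python
from typing import Callable, Sequence

# Hypothetical translator signature: (text, source_language, target_language) -> text.
Translator = Callable[[str, str, str], str]


def round_trip(text: str, source: str, pivots: Sequence[str],
               translate: Translator) -> str:
    """Translate text through each pivot language in turn, then back to source."""
    current_language = source
    for pivot in pivots:
        text = translate(text, current_language, pivot)
        current_language = pivot
    # The final hop returns to the language the document was written in.
    return translate(text, current_language, source)


def tagging_translator(text: str, src: str, dst: str) -> str:
    """Stand-in translator (assumption): records each hop, does not translate."""
    return f"{text}[{src}->{dst}]"


obfuscated = round_trip("Some text.", "en", ["de", "fr"], tagging_translator)
```
        </preformat>
        <p>Passing the translator as a parameter keeps the sketch independent of any particular system, which mirrors the black-box character of round-trip translation noted above: the caller controls only the sequence of intermediate languages, not what each translation step does to the text.</p>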
        <p>
          Obfuscation evaluation is about assessing the performance of an obfuscation
approach. All of the aforementioned obfuscation approaches have been evaluated to some
extent by their authors. However, little has been said to date on how an obfuscator
should be evaluated; rather, evaluation setups have been created individually for each
paper, rendering the reported results incomparable across papers. This is one of our
primary contributions, and Section 3 introduces a comprehensive evaluation setup for
author obfuscation under the three performance dimensions safety, soundness, and
sensibleness. Nevertheless, our setup takes inspiration from the literature by collecting
and organizing the previously employed evaluation procedures. For example, the most
common evaluation approach under the safety dimension is to employ an existing
authorship analysis approach to verify whether an obfuscated text can still be attributed
to its original’s author [
          <xref ref-type="bibr" rid="ref10 ref11 ref13 ref3 ref4 ref48 ref49">3, 4, 11, 10, 13, 48, 49, 72</xref>
          ]. Furthermore, some obfuscation
approaches have been evaluated under the dimension of soundness, to ensure that an
original’s meaning does not get distorted by its obfuscation [
          <xref ref-type="bibr" rid="ref54 ref56">56, 54</xref>
          ], and sensibleness,
to ensure that obfuscated texts are still human-readable [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. On top of that, we report on
the results of the first peer evaluation organized as part of our shared task. Participants
and volunteers independently evaluated the obfuscation approaches submitted under the
three aforementioned performance dimensions.
        </p>
        <p>
          Finally, obfuscation detection and reversal is the task of deciding whether a given
text has been obfuscated, and in that case, to undo the obfuscation in order to retrieve
as much of the original text as possible. The possibility of reversing the effects of an
(automatic) obfuscation threatens its safety. Afroz et al. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and Juola [
          <xref ref-type="bibr" rid="ref46">46</xref>
          ] have
simultaneously shown that the texts of humans trying to mask their writing style or trying to
imitate that of another author can be accurately detected as such. This gives rise to the
detection of literary fraud, where an author may attempt to publish a text under the name
of another author, imitating the latter’s style. However, against an author who simply wants to remain
anonymous, the knowledge that a given text has likely been obfuscated is of no avail, since
the obfuscation’s measurable traces in a text do not necessarily give a clue about its
original author. By contrast, when it is possible to reverse the changes made via
obfuscation, even if only partially, users of the corresponding obfuscation approach are put
at risk of being identified. In this regard, Le et al. [65] show that the semi-automatic
obfuscation approaches of McDonald et al. [72] and Kacmarcik and Gamon [
          <xref ref-type="bibr" rid="ref50">50</xref>
          ] can
be reversed and must therefore be considered unsafe.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Evaluating Author Obfuscation</title>
      <p>This section collects and organizes the evaluation methodology for author obfuscation
for the first time. We introduce three performance dimensions in which an author
obfuscation approach should excel to be considered fit for practical use. Afterwards, we
present an operationalization of each dimension based on both manual review as well
as performance measurement.</p>
      <sec id="sec-3-1">
        <title>The Three Dimensions of Obfuscation Evaluation</title>
        <p>The performance of an author obfuscation approach obviously rests with its capability
to achieve the goal of fooling forensic experts, be they software or human. However,
this disregards writers and their target audience whose primary goal is to
communicate, albeit safe from deanonymization. For them, the quality of an obfuscated text as
well as whether its semantics are preserved are also important. In our survey of related
work, these performance dimensions are often neglected or mentioned only in passing.
Altogether, we call an obfuscation software
– safe, if its obfuscated texts cannot be attributed to their original authors anymore,
– sound, if its obfuscated texts are textually entailed by their originals, and
– sensible, if its obfuscated texts are well-formed and inconspicuous.
These dimensions are orthogonal; an obfuscation software may meet each of them to
various degrees of perfection. Each dimension keeps the others in check since trivial
or flawed approaches stand no chance of achieving perfection in all three dimensions
at the same time. The operationalization of each dimension, however, poses significant
challenges in terms of resource requirements as well as scalability. In what follows, we
review in detail the challenges involved in making each dimension measurable as well
as how they have been operationalized in related work.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Evaluating Obfuscation Safety</title>
        <p>
          The safety of an obfuscation approach depends on it withstanding three kinds of
attacks: (1) manual authorship analyses, (2) automatic authorship analyses, and (3)
de-obfuscation attacks.
        </p>
        <p>(1) Manual authorship analyses. Manual authorship analyses can only be done by
trained forensic linguists, so that this kind of analysis is expensive and therefore does
not scale. Furthermore, it is unlikely that a forensic linguist would take part as a human
subject in an experiment to evaluate an obfuscation approach (not even anonymously),
since the risk of suffering reputational damage from failing to beat it is too high,
whereas they have little to gain otherwise. At any rate, beating one forensic linguist
is insufficient to establish the safety of an obfuscation approach; the approach would
have to be tested against a number of experts to raise sufficient confidence. Given these
limitations, author obfuscation approaches are probably not going to be analyzed for
safety against manual authorship analyses any time soon.
        </p>
        <p>(2) Automatic authorship analyses. Automatic authorship analyses are, by
comparison, much more straightforward, practical, and scalable to employ for obfuscation
evaluation. Dozens of approaches have been proposed for the two author identification
subtasks authorship attribution and authorship verification, so that evaluating an author
obfuscation approach boils down to running the existing, pre-trained authorship
analysis approaches against problem instances with and without obfuscated texts to observe
the difference in performance. To be considered safe against automatic authorship
analyses, an obfuscation approach should be able to defeat the best-performing authorship
analysis approaches. In this connection, the related work on author obfuscation has
employed a number of approaches for authorship attribution from the literature. Most
safety evaluations rely on two basic feature sets, namely the so-called Basic 9 features
(used in [
          <xref ref-type="bibr" rid="ref10">10, 65, 72</xref>
          ]) and the Writeprint features [
          <xref ref-type="bibr" rid="ref1">1, 114</xref>
          ] (used in [
          <xref ref-type="bibr" rid="ref10 ref2 ref3">2, 3, 10, 65, 72</xref>
          ]),
on top of which Weka classifiers are applied. Some authors also evaluate their
obfuscation approaches against other authorship analysis approaches such as the ones of
Tweedie et al. [101], Clark and Hannon [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], and Koppel’s Unmasking [61] as well
as the freely available Signature Stylometric System3 (see [
          <xref ref-type="bibr" rid="ref11 ref50">11, 50</xref>
          ]). Still, the
number of authorship analysis approaches proposed to date by far surpasses the number
of approaches employed in author obfuscation evaluations. The reason for this
shortcoming may be found in the fact that hardly any implementations of the proposed
authorship analysis approaches have surfaced to date: authors typically publish papers
about their approaches but not their code base. Nevertheless, it must be conceded that
the aforementioned approaches do not represent the landscape of authorship analysis
approaches that have been proposed to date. To mitigate this problem for future author
obfuscation evaluations, we have built two resources that allow for more comprehensive
safety evaluations for author obfuscation for both authorship attribution and authorship
verification: first, in [88], we report on the replication of 15 of the most influential
authorship attribution approaches, the implementations of which are available as source
code on GitHub.4 Second, in our shared tasks on authorship verification at PAN 2013
to PAN 2015 [
          <xref ref-type="bibr" rid="ref47">47, 97, 98</xref>
          ], we report on the performances of a total of 49 pieces of
software that have been submitted for evaluation. They are kept in working condition on
the TIRA experimentation platform [
          <xref ref-type="bibr" rid="ref32">32, 89</xref>
          ], ready to be re-run on new datasets such
as obfuscated versions of the test datasets of PAN 2013 to PAN 2015. In this paper,
we employ them for the first time to evaluate three author obfuscation approaches for
safety at scale.
        </p>
        <p>Measuring obfuscation impact. When using existing authorship analysis approaches
to measure the performance of an obfuscation approach, the impact obfuscation has
on their classification performance is of interest. More specifically, for all problem
instances where authors are correctly identified, the question is how many of them are no
longer correctly identified after obfuscation.</p>
        <p>Let $d_a$ denote a document written by author $a$, and let $D_A$ denote a set of documents
written by authors from the set of all authors $A$. We call $D = \langle d_u, D_A \rangle$ an instance of
an authorship problem, where $d_u$ is a document written by an unknown author $u$ and
$D_A$ is a set of documents of known authorship which may or may not contain a subset
of documents $D_u \subseteq D_A$ written by the author $u \in A$ of $d_u$. If $D_A$ comprises only
documents written by a single author $a$ so that $D_A = D_a$, $D$ is called an authorship
verification problem, and otherwise an authorship attribution problem. If $D_A$ comprises
documents written by more than one author, and if it can be guaranteed that one of them
is the author of $d_u$, $D$ is a closed-class classification problem, and otherwise an
open-class classification problem. Closed-class attribution problems can also be considered
ranking problems, where the authors represented by disjoint subsets of $D_A$ are to be
ranked so that the highest-ranking author $u$ is that of $d_u$.</p>
        <p>3 http://www.philocomp.net/humanities/signature.htm</p>
        <p>4 List of repositories: https://github.com/search?q=authorship+attribution+user:pan-webis-de</p>
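        <p>We denote the universe of all authorship problem instances by $\mathcal{D}$, and
$\gamma : \mathcal{D} \to A \cup \{\emptyset\}$ denotes the true mapping from $\mathcal{D}$ to the set of known authors $A$ for problems
$D \in \mathcal{D}$ whose true author $u \in A$ of $d_u$ is among the candidates found in $D_A$, and to $\emptyset$
otherwise. An authorship analysis approach $y : \mathcal{D} \to A \cup \{\emptyset\}$ is an approximation of
$\gamma$ that has been trained on $\mathcal{D}_{\mathrm{train}} \subset \mathcal{D}$. The extent to which the learned approximation of
$\gamma$ by $y$ has been successful is evaluated using $\mathcal{D}_{\mathrm{test}} \subseteq \mathcal{D} \setminus \mathcal{D}_{\mathrm{train}}$ by checking whether $y$
returns answers matching those of $\gamma$ for the problem instances in $\mathcal{D}_{\mathrm{test}}$. A basic measure
that is frequently applied to measure the performance of a given authorship analysis
approach $y$ is its accuracy $\mathrm{acc}(y, \mathcal{D}_{\mathrm{test}})$ on a given test set $\mathcal{D}_{\mathrm{test}}$:
<disp-formula><tex-math>\mathrm{acc}(y, \mathcal{D}_{\mathrm{test}}) = \frac{|\{D \in \mathcal{D}_{\mathrm{test}} : y(D) = \gamma(D)\}|}{|\mathcal{D}_{\mathrm{test}}|}.</tex-math></disp-formula></p>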
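        <p>The definitions above can be made concrete with a short sketch. The class and function names below (<monospace>Problem</monospace>, <monospace>accuracy</monospace>) are illustrative assumptions and do not correspond to any software described in this paper; an approach is modeled as a function that returns a candidate author, or <monospace>None</monospace> to stand for the empty answer.</p>
        <preformat>
```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass
class Problem:
    """One authorship problem: an unknown document plus documents of known authorship."""
    unknown: str                         # the document of unknown authorship
    known: Dict[str, Tuple[str, ...]]    # candidate author -> that author's documents
    true_author: Optional[str]           # None when the true author is not a candidate

    def is_verification(self) -> bool:
        # Exactly one candidate author makes this a verification problem;
        # otherwise it is an attribution problem.
        return len(self.known) == 1


# An approach maps a problem to a candidate author, or to None ("none of them").
Approach = Callable[[Problem], Optional[str]]


def accuracy(approach: Approach, test_set: Sequence[Problem]) -> float:
    """Fraction of test problems on which the approach matches the true mapping."""
    if not test_set:
        raise ValueError("test set must not be empty")
    correct = sum(1 for p in test_set if approach(p) == p.true_author)
    return correct / len(test_set)


# Toy usage with made-up data: a verifier that always answers "alice".
problems = [
    Problem("text one", {"alice": ("a1",)}, "alice"),
    Problem("text two", {"alice": ("a2",)}, None),
]
toy_accuracy = accuracy(lambda problem: "alice", problems)
```
        </preformat>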
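        <p>In this setting, an author obfuscation approach $o : \mathcal{D} \to \mathcal{D}$ maps the universe of
authorship problems onto itself; here, $o(D) = \langle d_o, D_A \rangle$, where $d_o$ is the obfuscated
version of $d_u \in D$ and $D_A \in D$ is kept as is. The true author of an obfuscated problem
$o(D)$ is the same as without obfuscation, i.e., $\gamma(o(D)) = \gamma(D)$. But if an obfuscator works, an
authorship analysis approach $y$ would return $y(o(D)) \neq \gamma(o(D))$. Let $o(\mathcal{D}) = \{o(D) : D \in \mathcal{D}\}$.
A straightforward way to evaluate an author obfuscation approach $o$ is to
apply it to the problem instances in $\mathcal{D}_{\mathrm{test}}$, measure the accuracy $\mathrm{acc}(y, o(\mathcal{D}_{\mathrm{test}}))$, and
to calculate the performance delta:
<disp-formula><label>(1)</label><tex-math>\Delta\mathrm{acc}(o, y, \mathcal{D}_{\mathrm{test}}) = \mathrm{acc}(y, o(\mathcal{D}_{\mathrm{test}})) - \mathrm{acc}(y, \mathcal{D}_{\mathrm{test}}).</tex-math></disp-formula>
However, in case $\mathcal{D}_{\mathrm{test}}$ comprises verification problems or open-class attribution
problems, this measure takes into account problems where obfuscation need not be applied
on the document of unknown authorship $d_u$, since its true author is not among the
candidates $D_A$. Therefore, we consider only the subset $\mathcal{D}^{+}_{\mathrm{test}} \subseteq \mathcal{D}_{\mathrm{test}}$, which comprises
only problem instances $D^{+} = \langle d_u, D_A \rangle$ where the true author $u$ of $d_u$ has written at
least one document found in $D_A$. Measuring the accuracy of an authorship analysis
approach $y$ on $\mathcal{D}^{+}_{\mathrm{test}}$ is equivalent to measuring recall, hence:
<disp-formula><label>(2)</label><tex-math>\mathrm{rec}(y, \mathcal{D}_{\mathrm{test}}) = \mathrm{acc}(y, \mathcal{D}^{+}_{\mathrm{test}}), \quad\text{and}\quad \Delta\mathrm{rec}(o, y, \mathcal{D}_{\mathrm{test}}) = \mathrm{rec}(y, o(\mathcal{D}_{\mathrm{test}})) - \mathrm{rec}(y, \mathcal{D}_{\mathrm{test}}).</tex-math></disp-formula>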
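The domain of this measure is $[-1, 1]$, where $-1$ indicates the best possible
performance of an obfuscator (i.e., flipping all decisions of an authorship analysis approach $y$
that makes no errors on unobfuscated texts), $0$ indicates the obfuscator has no
impact, and a score greater than $0$ indicates the worst case, namely that the obfuscator
somehow improves the classification performance of the given authorship analysis
approach $y$ instead of decreasing it. In practice, the range of possible scores of $\Delta\mathrm{rec}$ is
governed by the a priori performance $\mathrm{rec}(y, \mathcal{D}_{\mathrm{test}})$ of the authorship analysis approach $y$:
$[-\mathrm{rec}(y, \mathcal{D}_{\mathrm{test}}),\ 1 - \mathrm{rec}(y, \mathcal{D}_{\mathrm{test}})]$. This means that $\Delta\mathrm{rec}$ does not reveal whether an
obfuscation approach has accomplished everything it can against $y$. When achieving a
score in the interval $(-1, 0)$ it remains unclear whether the obfuscator has flipped all of
$y$’s correct authorship attributions, or not. To get an idea of the relative impact an
obfuscator has on $y$’s recall, we apply the following normalizations dependent on $\Delta\mathrm{rec}$’s
sign:
<disp-formula><label>(3)</label><tex-math><![CDATA[\mathrm{imp}(o, y, \mathcal{D}_{\mathrm{test}}) = \begin{cases} -\dfrac{\Delta\mathrm{rec}(o, y, \mathcal{D}_{\mathrm{test}})}{\mathrm{rec}(y, \mathcal{D}_{\mathrm{test}})} & \text{if } \Delta\mathrm{rec}(o, y, \mathcal{D}_{\mathrm{test}}) < 0, \\[2ex] -\dfrac{\Delta\mathrm{rec}(o, y, \mathcal{D}_{\mathrm{test}})}{1 - \mathrm{rec}(y, \mathcal{D}_{\mathrm{test}})} & \text{if } \Delta\mathrm{rec}(o, y, \mathcal{D}_{\mathrm{test}}) \geq 0. \end{cases}]]></tex-math></disp-formula></p>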
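        <p>The recall delta and its normalization can be sketched as follows. This is an illustration under stated assumptions, not the evaluation code used for the shared task: recall before and after obfuscation is assumed to have been computed already on the problems whose true author is among the candidates, each sign of the delta is divided by the largest change possible in that direction, and a delta of exactly zero is treated as “no impact” (a choice made here to avoid division by zero when the a priori recall is 0 or 1). The function names are hypothetical.</p>
        <preformat>
```python
def delta_recall(recall_before: float, recall_after: float) -> float:
    """Recall on the obfuscated problems minus recall on the original problems."""
    return recall_after - recall_before


def relative_impact(recall_before: float, recall_after: float) -> float:
    """Sign-flipped recall delta, normalized by the largest possible change."""
    delta = delta_recall(recall_before, recall_after)
    if delta == 0:
        # Assumption of this sketch: zero delta means no impact.
        return 0.0
    if delta > 0:
        # The obfuscator helped the verifier: at most 1 - recall_before
        # could have been gained, so the result lies in [-1, 0).
        return -delta / (1.0 - recall_before)
    # The obfuscator hurt the verifier: at most recall_before could have
    # been lost, so the result lies in (0, 1].
    return -delta / recall_before


# A verifier whose recall drops from 0.8 to 0.4 has lost half of its
# correct decisions; one that drops from 0.6 to 0.0 has lost all of them.
half_flipped = relative_impact(0.8, 0.4)
all_flipped = relative_impact(0.6, 0.0)
helped = relative_impact(0.5, 0.75)
```
        </preformat>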
        <p>The domain of this measure is, independent of the a priori performance of $y$, the
interval $[-1, 1]$, where $1$ indicates the best performance an obfuscator $o$ can reach by
successfully obfuscating the problem instances where $y$ made a correct attribution
before, and where $-1$ indicates that an obfuscator supports $y$ instead of obstructing it by
allowing it to correctly attribute problems it has not correctly attributed before. Note
that we change the sign of the measure to emphasize that it captures obfuscator
performance, and to allow for a more natural ordering. A potential drawback of measuring
relative impact may be that the impact measured on authorship analysis approaches with
a poor a priori performance may be overemphasized: for example, it may be much
easier to flip the few correct attributions of an a priori poor-performing authorship analysis
approach to earn a high relative impact than to flip the many correct attributions of a
well-performing one. To mitigate this issue, a least-performance threshold under Dtest
may be imposed that an authorship analysis approach y must exceed to be considered
attack-worthy by an obfuscator o.</p>
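        <p>One way to apply such a least-performance threshold is to filter the authorship analysis approaches by their a priori recall before their impact scores are aggregated. The sketch below is an assumption-laden illustration: the function name, the use of a plain mean for aggregation, the default threshold of 0.0, and the verifier names in the usage example are all hypothetical, and the paper does not prescribe a particular threshold value.</p>
        <preformat>
```python
from statistics import mean
from typing import Mapping


def average_impact(impacts: Mapping[str, float],
                   recall_before: Mapping[str, float],
                   min_recall: float = 0.0) -> float:
    """Mean impact over the approaches whose a priori recall exceeds min_recall."""
    eligible = [impacts[name] for name in impacts
                if recall_before[name] > min_recall]
    if not eligible:
        raise ValueError("no approach exceeds the recall threshold")
    return mean(eligible)


# Made-up scores for three hypothetical verifiers.
impacts = {"verifier_a": 0.5, "verifier_b": 1.0, "verifier_c": -0.5}
recalls = {"verifier_a": 0.8, "verifier_b": 0.1, "verifier_c": 0.6}

unfiltered = average_impact(impacts, recalls)
filtered = average_impact(impacts, recalls, min_recall=0.5)
```
        </preformat>
        <p>In the made-up example, the poorly performing <monospace>verifier_b</monospace> contributes a perfect impact score to the unfiltered mean; raising the threshold removes it, which illustrates how a weak approach can otherwise inflate an obfuscator’s apparent performance.</p>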
        <p>As discussed at the outset, the performance of an obfuscation approach o should
not only be evaluated against a single authorship analysis approach y, but against as
large a collection Y of approaches as possible. After all, author obfuscation approaches
are supposed to protect authors across the board of forensic analyses, and not just
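against specific specimens. Therefore, for a given collection of authorship analysis
approaches $Y$, and for a given obfuscation approach $o$, we compute its average impact
under $\mathcal{D}_{\mathrm{test}}$ as follows:
<disp-formula><label>(4)</label><tex-math>\mathrm{avg\,imp}(o, Y, \mathcal{D}_{\mathrm{test}}) = \frac{1}{|Y|} \sum_{y \in Y} \mathrm{imp}(o, y, \mathcal{D}_{\mathrm{test}}).</tex-math></disp-formula>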
The average impact of different obfuscation approaches on a large number of authorship
analysis approaches Y allows for ranking among the obfuscation approaches in order
to determine which of them performs best in terms of safety under Dtest.
        </p>
        <p>(3) De-obfuscation attacks. De-obfuscation attacks include attempts to undo the
effects an obfuscation approach has on a text, as well as analyses thereof that allow for a
(semi-)accurate attribution of authorship despite the text having been obfuscated. The
analytical nature of de-obfuscation attacks requires a clear formulation of the
assumptions under which de-obfuscation becomes possible, just like any proof of the safety
of an obfuscation approach against de-obfuscation does. Such assumptions are
sometimes enumerated as “attacker capabilities” or referred to as “threat model.” We propose
to make the following general assumptions when analyzing an obfuscation approach’s
safety against de-obfuscation:
– Kerckhoffs’ principle: the obfuscation approach used is public
– Data used during obfuscation is public except for the original text
– Seeds used to initialize pseudorandom number generators are secret
– No available meta data links an obfuscated text to its author
This way, the safety of an author obfuscation approach against de-obfuscation depends
only on its merits at generating an irreversible obfuscation, and not on the fact that the
approach or data used during obfuscation are secret. At the same time, if an obfuscation
approach is deterministic and its text operations are easily recognizable and reversible
to the original state, the approach must be considered unsafe and unfit for practical use.
To date, the only systematic analysis of obfuscation approaches has been conducted by
Le et al. [65] who show that the obfuscation approach of Kacmarcik and Gamon [
          <xref ref-type="bibr" rid="ref50">50</xref>
          ]
can be completely reversed via backtracking, and that the safety against de-obfuscation
of the approach implemented in Anonymouth by McDonald et al. [72] can be severely
reduced in a closed-set attribution, increasing the probability of picking the correct
author from 0.2 to 0.4. Given these results, the authors of a new obfuscation approach
should always analyze its safety against de-obfuscation, whereas independent analyses
of this kind are just as important to raise confidence. Unfortunately, de-obfuscation
attacks elude performance measurement, since they will typically be tailored to the
obfuscation approach attacked.
        </p>
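        <p>The average impact defined in this section, together with the least-performance threshold mentioned before it, can be sketched in a few lines of Python. This is a minimal illustration and not the evaluation code used for the shared task: the function name, the dictionary layout, the verifier names, and all numbers are assumptions made for the example, and the per-verifier impact values imp(o, y, Dtest) are taken as given inputs.</p>
        <preformat><![CDATA[
```python
from statistics import mean
from typing import Mapping, Optional


def average_impact(
    impacts: Mapping[str, float],
    prior_accuracy: Optional[Mapping[str, float]] = None,
    min_accuracy: float = 0.0,
) -> float:
    """Average imp(o, y, D_test) over the verifiers y in Y.

    impacts maps a verifier name to its impact value for one obfuscator and
    one test dataset. If prior_accuracy is given, verifiers whose a priori
    accuracy does not exceed min_accuracy are skipped (the least-performance
    threshold; its value is a free parameter of this sketch).
    """
    eligible = [
        value
        for name, value in impacts.items()
        if prior_accuracy is None or prior_accuracy[name] > min_accuracy
    ]
    if not eligible:
        raise ValueError("no verifier exceeds the performance threshold")
    return mean(eligible)


# Made-up impact and accuracy values for three hypothetical verifiers.
impacts = {"verifier_a": 0.5, "verifier_b": 0.25, "verifier_c": 0.75}
accuracy = {"verifier_a": 0.8, "verifier_b": 0.55, "verifier_c": 0.7}

print(average_impact(impacts))                 # plain average over Y
print(average_impact(impacts, accuracy, 0.6))  # verifier_b is filtered out
```
]]></preformat>
        <p>Computing this value once per obfuscator on the same collection Y and the same test dataset yields the numbers by which obfuscators can be ranked in terms of safety.</p>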
      </sec>
      <sec id="sec-3-3">
        <title>Evaluating Obfuscation Soundness</title>
        <p>The soundness of an obfuscation approach depends on its ability to transfer the
semantics of an original text to its obfuscated version. While the author of a text may value
safety highly, the goal of writing a text is still to get a message across, which
should remain untarnished by automatic obfuscation. A version of a text that conveys
the same meaning as its original is called a paraphrase, and the goal of author
obfuscation is to generate one under the constraint that the style of the original’s author is
no longer recognizable (i.e., so that it is safe against forensic authorship analyses).
Consequently, an author obfuscation approach must be evaluated as to whether and to what
extent it generates paraphrases.</p>
        <p>At the time of writing, research and development on paraphrase generation is mostly
carried out at the sentence level; hardly any approaches exist that paraphrase at the
paragraph level or even at the discourse level. Nevertheless, a paraphrase of a whole
text may not only be done sentence-by-sentence, but it may also involve rearrangement
of sentences, paragraphs, and entire lines of argumentative discourse. When
evaluating soundness, a key challenge therefore is to trace the changes made by an obfuscator
and to compare the parts of an obfuscated text to their unobfuscated counterparts in the
original text. Given the possible non-linear changes that can be made during
paraphrasing, an a posteriori comparison and judgment of the obfuscated text compared to its
original is rendered difficult, since the apparent relations between an original text and
its obfuscation may be ambiguous. While automatic obfuscation approaches can output
which part of the original text went into generating which part of its obfuscation, this
may not be as straightforward for manual obfuscations, unless the manual text editing
operations are traced minutely as they happen.</p>
        <p>
          Concerning evaluation, research on paraphrase generation relies almost
unanimously on manual review. Nevertheless, inspired by the success of the well-known
BLEU metric for machine translation evaluation, several performance measures for
paraphrasing evaluation have been proposed, namely ParaMetric by Callison-Burch
et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], PEM by Liu et al. [68], PINC by Chen and Dolan [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], PARADIGM by
Weese et al. [110], and APEM (for Korean) by Moon et al. [78]. All of these metrics
are designed for sentence-wise paraphrase evaluation; the only evaluation of
passage-length paraphrases has been reported by Burrows et al. [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] but no metrics have been
proposed, whereas Xu et al. [112] propose three metrics that measure the quality of a
paraphrase under the constraint that it is supposed to match a given author’s style. The
latter’s metrics, however, require large corpora in both the original text’s style and the
style of the author to be imitated, which limits their applicability in many scenarios. In
the literature on author obfuscation, manual reviews of soundness prevail as
well: while the approaches to manual and semi-automatic obfuscation proposed do not
require extensive soundness reviews, since the human subjects taking part in user
studies can be trusted to produce sound paraphrases, the automatic obfuscation approach
of Khosmood and Levinson [
          <xref ref-type="bibr" rid="ref56">56</xref>
          ] has been manually reviewed. The authors divide the
generated paraphrases into the three basic categories of “correct”, “passable”, and
“incorrect” paraphrases.
        </p>
        <p>In the long run, the aforementioned measures may prove to be useful also for
evaluating the soundness of author obfuscation approaches. At this time, however, we prefer
manual review despite its disadvantages in terms of overhead, since the literature has
not yet settled on a metric of choice. Instead, to facilitate and scale manual reviews of
obfuscated texts in comparison to their original texts, we develop a visual analytics tool
for text comparison. The tool features various text comparison visualizations that assist
manual soundness review: visualizations are applied to monitor the changes made by
an author obfuscation approach. Figures 2, 3, and 4 show examples of the visualizations
in action, contrasting the three approaches submitted to our shared task. As a brief
explanation, the visualizations show the original text and the obfuscated text at the same
time. The text is arranged in phrases, where each line either shows a large phrase the two
compared texts have in common, or two stacked phrases where the two texts differ. This
allows for quick comprehension of the effects an obfuscator has, as well as for quick
judgments, thus significantly decreasing the time for manual review.</p>
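        <p>The line layout described above (one line per shared phrase, two stacked lines where the texts differ) can be approximated with a word-level diff. The following Python sketch is only an illustration of that layout and not the visual analytics tool itself: it assumes whitespace tokenization and uses the standard library's difflib for alignment, and the function names and the example sentence pair are made up.</p>
        <preformat><![CDATA[
```python
from difflib import SequenceMatcher
from typing import List, Tuple


def align_phrases(original: str, obfuscated: str) -> List[Tuple[str, str]]:
    """Split two texts into aligned phrases.

    Each returned pair holds an original phrase and its obfuscated
    counterpart; both entries are identical where the texts agree.
    """
    a, b = original.split(), obfuscated.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    rows = []
    for _tag, i1, i2, j1, j2 in matcher.get_opcodes():
        rows.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return rows


def render(rows: List[Tuple[str, str]]) -> str:
    """One line per shared phrase, two stacked lines where the texts differ."""
    lines = []
    for left, right in rows:
        if left == right:
            lines.append("  " + left)
        else:
            lines.append("- " + left)   # phrase as in the original text
            lines.append("+ " + right)  # phrase as in the obfuscated text
    return "\n".join(lines)


rows = align_phrases(
    "the machine runs the Java program",
    "the car runs the coffee program",
)
print(render(rows))
```
]]></preformat>
        <p>A diff of this kind only handles linear edits. Rearranged sentences or paragraphs would show up as unrelated deletions and insertions, which is one reason why tracing the changes during obfuscation is preferable to reconstructing them afterwards.</p>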
        <p>
          Relaxing Soundness Constraints. The constraint that both original and obfuscation
possess the exact same semantics may be relaxed to some extent: it may be sufficient if
the statements made in the obfuscated version of a text will be considered true under
the presumption that the corresponding statements in the original are true (i.e., if the
obfuscated statements follow logically from their respective originals). This is called
textual entailment, and an obfuscated text would be considered entailed under the
original text in such a situation. Relaxing the soundness constraint to allow for textual
entailment opens a much wider space of possible obfuscations compared to paraphrases.
A comprehensive survey of algorithms to recognize textual entailment as well as a
series of corresponding shared tasks on this subject has been given by Dagan et al. [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ];
these algorithms may serve as a basis for the development of new obfuscation
soundness metrics. Another potential relaxation that arises from allowing textually entailed
obfuscation is the possibility of allowing for summary obfuscation: for example, by
summarizing a text or certain passages thereof, the original author’s style may be
significantly changed while maintaining at least the gist of the intended message.
Conceivably, in some situations where author obfuscation is applied, summarization may be an
acceptable route as means for safe obfuscation. In this connection, summaries are often
textually entailed by the summarized text.
        </p>
        <p>Finally, when relaxing the soundness constraints, one may also question how exactly
the original message has to be transferred into the obfuscated text. For example, in
early machine translation systems as well as the ones deployed at scale today, not all
translations are perfect, but the translation results are still useful to get a broad, and
sometimes even a detailed understanding of a text in a foreign language. The same may
apply to author obfuscation: to get a message across, it may not be necessary that every
last detail gets transmitted correctly (as in properly paraphrased or textually entailed),
but it may be sufficient for the reader to get a “readable” text whose message can be
discerned with reasonable effort. While the goal of automatic author obfuscation should
be to be perfectly sound and to generate actual paraphrases, early systems that do not
come close to this requirement may still be useful in practice despite their deficiencies.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Evaluating Obfuscation Sensibleness</title>
        <p>
          The sensibleness of an obfuscation approach depends on its ability (1) to create
readable, ideally grammatically well-formed text, and (2) to hide the fact that a given
obfuscated text has been obfuscated.
(1) Obfuscation grammaticality. Besides being safe against forensic analyses and
sound, another desired property of an obfuscated text is that it is well-formed in terms of
grammar and that it fits into its genre. Automatic grammar checking has a long history in
natural language processing and computational linguistics. Almost all approaches developed
so far are designed to find specific error types in an ungrammatical text. For evaluation,
however, the specific errors made by an obfuscator may be of less interest as opposed
to deciding which parts of an obfuscated text are grammatical and which are not, for
whatever reason. This latter task of judging grammaticality of a given piece of text is by
far less often studied [
          <xref ref-type="bibr" rid="ref19 ref21 ref25 ref41">19, 21, 25, 41, 87, 100, 108, 109, 111</xref>
          ]: the proposed approaches
to classifying a given sentence as being grammatical or not rely mostly on features
extracted from statistical natural language parsers, whose confidence in their parsing
results as well as features extracted from their resulting parse trees of grammatical and
ungrammatical sentences are used to train linear classifiers at recognizing whether a
given sentence is grammatical or not. The performances reported vary between 50% and
more than 90% detection accuracy of ungrammatical sentences, depending on the test
dataset employed, whereas the results are largely incomparable for lack of a common
baseline or a standardized benchmark dataset. Depending on how successful
grammaticality classification will become in the future, these approaches can be employed to
build an effective performance measure for author obfuscation approaches. In fact, they
may also be applied as components within an author obfuscation approach to a priori
judge whether a given change will result in a grammatical obfuscated text. Until then,
the only means left to evaluate a given author obfuscation approach is manual review.
        </p>
        <p>Again, just like for obfuscation soundness, relaxing the criteria for obfuscation
grammaticality may be reasonable in certain situations: for example, an obfuscator may
be allowed to return ungrammatical text as long as it can be understood sufficiently well,
or even as a means to mislead readers into thinking that an obfuscated text has been
written by a second-language speaker of a given language. In this connection,
manual review cannot be entirely avoided when evaluating author obfuscation approaches,
since the line between what is acceptable and what is not is much more blurry.
(2) Hiding obfuscation style. Although the safety of an obfuscated text and therefore
the obfuscation approach used to generate it cannot rely on the fact that readers of the
obfuscated text do not know that it has been obfuscated, hiding obfuscation is still an
interesting side-goal of obfuscation generation. The inconspicuousness of an obfuscated
text may serve as a first line of defense that forecloses detailed investigations. This
is particularly important for automatic author obfuscation approaches that easily defeat
automatic authorship analysis approaches but are vulnerable to manual authorship
analyses by forensic experts. Conceivably, there are not enough forensic experts to analyze
all texts that may be desirable to be analyzed, so that automatic authorship approaches
will be applied to attain scale, whereas manual analyses will only be conducted in cases
of doubt or suspicion. Avoiding suspicion is therefore a worthwhile goal for an
author obfuscation approach.</p>
        <p>
          In the related work on author obfuscation, Afroz et al. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and Juola [
          <xref ref-type="bibr" rid="ref46">46</xref>
          ] have
shown simultaneously and independently of one another that humans trying to mask their
own style, and humans trying to imitate that of another author leave measurable traces
in their writing that allow for automatically discriminating texts where authors have
tried to alter their style from texts where authors have written in their genuine style
(e.g., without making a conscious effort at altering it). The fact that humans leave such
traces is no indication of whether automatic obfuscation approaches will do so as well,
but it is very likely that they do. The question remains whether and how an automatic
author obfuscation approach can be taught either to randomize the traces, or to blend
in so that the style of the obfuscated text can no longer be distinguished
from the style of a single human writing genuinely. Evaluating this aspect of author
obfuscation approaches will rely on applying the aforementioned approaches at detecting
style deception to check whether they can be successfully led astray, whereas
manually reviewing for style deception is infeasible since subtle traces of obfuscation that
are revealed only via statistical analyses may be lost on a human reviewer.
        </p>
      </sec>
      <sec id="sec-3-5">
        <title>Obfuscation Efficiency</title>
        <p>
          Since there are typically many alternative ways to paraphrase a given statement in order
to obfuscate it, and even more so considering an entire text, the question arises which
alternatives are better than others. One possible way to decide this question is to search
for the alternative which is least different from the original but still achieves the goals
of safety, soundness, and sensibleness. In this connection, Kacmarcik and Gamon [
          <xref ref-type="bibr" rid="ref50">50</xref>
          ]
have proposed the “amount of work” as an efficiency measure for author obfuscation,
namely the number of changes per 1000 words. They claim that their approach
obfuscates a text with as little as 14 changes per 1000 words. Suppose two given obfuscation
approaches o1 and o2 achieve sufficient safety, soundness, and sensibleness, where o1
does so with the least possible number of changes to the original text, and o2 does
so by making significantly more changes, which of the two is to be preferred? There
is no straightforward answer to this question; while lazy approaches appeal by their
efficiency, investing more work into generating an obfuscation may be worthwhile to
attain an obfuscation that is not only safe against the state of the art but potentially also
against future authorship analysis approaches that apply new forms of analyses. In this
regard, alternative obfuscations that maximize the difference to the original text may
be more interesting. However, maximizing the distance of an obfuscation to its original
text under some style model may not be a strategy that is safe against de-obfuscation
attacks in all situations. For example, Le et al. [65] show that in a closed-class
attribution scenario, maximizing the distance, or moving the style towards the centroid of a
given set of authors, provides sufficient information for a de-obfuscation attack.
Altogether, while measuring obfuscation efficiency in terms of number of changes made to
the original text per unit is interesting, and while it is also interesting to know how little
work is necessary to achieve safety, soundness, and sensibleness today, this measure is
insufficient to rank two obfuscation approaches.
        </p>
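        <p>The "amount of work" measure, the number of changes per 1000 words, can be sketched as follows. How a single change is counted is not specified in this section, so the sketch makes an assumption: texts are split on whitespace, a word-level diff is computed with Python's difflib, and every replaced, deleted, or inserted word counts as one change. The function name and the example strings are likewise illustrative.</p>
        <preformat><![CDATA[
```python
from difflib import SequenceMatcher


def changes_per_1000_words(original: str, obfuscated: str) -> float:
    """Count word-level edit operations, normalized per 1000 original words."""
    a, b = original.split(), obfuscated.split()
    if not a:
        raise ValueError("original text is empty")
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A non-matching run counts as many changes as its longer side.
            changed += max(i2 - i1, j2 - j1)
    return 1000.0 * changed / len(a)


# One replaced word in a ten-word text.
original = "a b c d e f g h i j"
obfuscated = "a b c d x f g h i j"
print(changes_per_1000_words(original, obfuscated))
```
]]></preformat>
        <p>As argued above, a low value of this measure alone does not make one obfuscator preferable to another; it is informative only alongside safety, soundness, and sensibleness.</p>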
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Survey of Submitted Obfuscation Approaches</title>
      <p>The three approaches submitted to our shared task follow three rather different
strategies: round-trip translations, replacement of at most one frequent word per sentence,
and style feature changes. While replacement of one word per sentence is a rather
conservative strategy in that it changes the to-be-obfuscated text only slightly, the other two
approaches change the text more substantially.</p>
      <p>
        Keswani et al. The approach of Keswani et al. [
        <xref ref-type="bibr" rid="ref52">52</xref>
        ] is based on round-trip translation.
Since access to the translation APIs of big commercial search engines like Google,
Microsoft Bing, or Yandex is disallowed during the testing period of the author obfuscation
task to prevent the test data from leaking, Keswani et al. employ the open source Moses
SMT toolkit [60] trained on the Europarl corpus [59]. The original text is translated
from English to German, the German text is translated to French, and the French text
is translated back to English. The presumption of Keswani et al. is that the original text
will be sufficiently changed during the translation to obfuscate its author.
      </p>
      <p>As for the resulting text, our evaluation showed that hardly any human-readable text
is produced (not even on a sentence level). One reason might be that the Europarl corpus
is not particularly suited as a training corpus for the different genres of our test datasets.
In the current form of Keswani et al.’s translation obfuscation, especially the soundness
and the sensibleness suffer from severe errors that render the texts non-readable and not
understandable by a human. Figure 2 shows the output for the problem instance 429.
Especially the “missing” grammaticality of the output substantially decreases
sensibleness and soundness at the same time since the texts do not make sense at all. A
couple of translation artifacts (remaining German or French words from the intermediate
translations) would raise some suspicions even if the text was readable. However,
do-it-yourself statistical machine translation will improve over time such that the text quality
will get better with new translation approaches and larger corpora. In this respect,
employing some of the best public translation approaches in the form of the proprietary big
commercial search engine APIs might be interesting for the future although potentially
difficult with the current evaluation setting at PAN.</p>
      <p>
        Mansoorizadeh et al. The approach of Mansoorizadeh et al. [70] focuses on
attacking the feature of word frequencies that are the basis of many verification approaches.
The idea is to exchange some occurrences of the 200 most frequent words in the
to-be-obfuscated text with synonyms obtained from WordNet [76] and scored based on their
similarity to the original to-be-replaced word. For the most similar candidates, the
commonness in the context of the to-be-replaced word is measured using a 4-gram language
model trained on the Brown corpus [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. Instead of the Brown corpus, some bigger
corpora allowing for longer n-grams might be worth a look (e.g., the Google n-grams [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
underlying the Netspeak queries [90] that Stein et al. used for paraphrasing in acrostic
generation [99]). A crucial point of Mansoorizadeh et al.’s approach is that it replaces
at most one (!) word per sentence—rendering the paraphrasing rather conservative.
      </p>
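      <p>The replacement strategy described above can be illustrated with a small Python sketch. It is not the implementation of Mansoorizadeh et al.: the synonym table stands in for a WordNet lookup, the scoring function stands in for the combination of similarity score and 4-gram language model, and the sentence splitting and tokenization are deliberately naive. All names and example data are hypothetical.</p>
      <preformat><![CDATA[
```python
import re
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

# score(tokens, position, candidate) rates a candidate in its sentence context.
Scorer = Callable[[List[str], int, str], float]


def obfuscate(text: str, synonyms: Dict[str, List[str]], score: Scorer,
              top_k: int = 200) -> str:
    """Replace at most one frequent word per sentence with a synonym."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    frequent = {word for word, _ in counts.most_common(top_k)}
    result = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = sentence.split()
        best: Optional[Tuple[float, int, str]] = None
        for i, token in enumerate(tokens):
            match = re.fullmatch(r"(\W*)(\w+)(\W*)", token)
            if match is None:
                continue  # skip tokens with inner punctuation
            lead, core, trail = match.groups()
            if core.lower() not in frequent:
                continue
            for candidate in synonyms.get(core.lower(), []):
                value = score(tokens, i, candidate)
                if best is None or value > best[0]:
                    best = (value, i, lead + candidate + trail)
        if best is not None:
            tokens[best[1]] = best[2]  # the single replacement for this sentence
        result.append(" ".join(tokens))
    return " ".join(result)


# Toy stand-ins: a two-entry synonym table and a scorer preferring short words.
toy_synonyms = {"big": ["large", "huge"], "dog": ["hound"]}
toy_score = lambda tokens, i, candidate: -len(candidate)

print(obfuscate("The big dog barked. The big cat slept.", toy_synonyms, toy_score))
```
]]></preformat>
      <p>Because the scorer in this sketch sees only the candidate and its sentence, it cannot detect domain-specific senses, which is the kind of failure discussed next for words such as "machine" and "Java".</p>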
      <p>Since not that many changes are introduced overall, the texts in general remain as
readable and grammatically correct as the original text and usually are fully
understandable. Still, we spotted some issues, an example of which is shown in Figure 3. In
problem instance 5 (a text about Java programming), the word “machine” that refers
to a computer is consistently changed to “car,” the word “Java” is sometimes changed
to “coffee,” etc. This makes the obfuscations less sensible and sound than in case of
most other problem instances. Although the text quality is not reduced in most cases,
the conservative paraphrasing also is the reason for a suboptimal safety score since the
slight changes on average cannot obfuscate authorship very well. Allowing for some
more changes per sentence and taking other than the most frequent words into account
might be good future options that could be tested.</p>
      <p>
        Mihaylova et al. The approach of Mihaylova et al. [75] targets a wide range of
style-indicating features that are frequently used in author identification approaches: sentence
length, punctuation, stop words, parts of speech, all caps, and individual word
frequencies. From a background corpus (in the current approach formed by some books from
Project Gutenberg and the PAN training datasets), the expected values of the features
are computed. For a to-be-obfuscated text, the scores for the same feature types are
calculated, and the text is then transformed to move these scores toward the background average, applying
a wide range of rule-based and random text operations. In particular, the operations
employed are merging/splitting sentences, removing/inserting commas, removing stop
words, improving/impairing spelling, exchanging words with synonyms, applying rules
from the paraphrase database [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ], lowercasing long, all-caps words, switching
American and British spelling, inserting random functional words as noise, and replacing
word shortenings, abbreviations, numbers, and possessive expressions.
      </p>
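      <p>The first step of this strategy, comparing a text's style features with their background averages, can be sketched in Python as follows. This is an illustration under simplifying assumptions and not the code of Mihaylova et al.: only three of the listed feature types are computed, the stop word list is a tiny placeholder, sentence and word splitting are naive, and all names and example texts are hypothetical.</p>
      <preformat><![CDATA[
```python
import re
from statistics import mean
from typing import Dict, List

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in"}  # placeholder list


def style_features(text: str) -> Dict[str, float]:
    """Compute a few simple style features of a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text contains no words")
    stop_count = sum(1 for w in words if w.lower() in STOP_WORDS)
    return {
        "words_per_sentence": len(words) / len(sentences),
        "commas_per_word": text.count(",") / len(words),
        "stop_word_ratio": stop_count / len(words),
    }


def background_average(corpus: List[str]) -> Dict[str, float]:
    """Average each feature over the documents of a background corpus."""
    rows = [style_features(doc) for doc in corpus]
    return {name: mean([row[name] for row in rows]) for name in rows[0]}


def deviations(text: str, background: Dict[str, float]) -> Dict[str, float]:
    """Signed distance of each feature from its background average."""
    features = style_features(text)
    return {name: features[name] - background[name] for name in features}


background = background_average(["A b. C d."])
delta = deviations("One two three four. Five six.", background)
print(delta)
```
]]></preformat>
      <p>The sign of each deviation indicates which kind of operation moves the text toward the average: a positive value for words per sentence suggests splitting sentences, a negative one merging them, and analogously for inserting or removing commas and stop words.</p>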
      <p>In sum, the approach of Mihaylova et al. changes the text a lot since there are no
restrictions on the number of changes per sentence. However, the context is usually not
taken into account (i.e., whether a particular word or expression is frequent or
common enough in the given context around the replacement). Thus, a lot of the changes
look rather odd to a human—also the typos and randomly inserted words—and
sometimes even change the meaning completely (e.g., “horrible night” changed to “good
night” in problem instance 134, shown in Figure 4). Still, the soundness and
sensibleness on average are slightly better than for the round-trip machine translation approach
of Keswani et al. Also, the quality of the produced text might be improved a lot by
not overdoing the spelling errors (i.e., not introducing a spelling error in every
occurrence of some word) and probably even more by taking context into account for word
replacements that can increase the chance of a more common formulation resulting in
more “meaningful” text.</p>
    </sec>
    <sec id="sec-5">
      <title>Evaluation</title>
      <p>We automatically evaluate the safety of the three obfuscation approaches against 44
authorship verifiers which have been submitted to the previous three shared tasks on
authorship identification at PAN 2013 through PAN 2015, and we manually assess
sensibleness and soundness of the obfuscated texts of each obfuscator.
Our safety evaluation is built with an eye on reproducibility, so that future evaluations
of author obfuscators may be conducted with ease. In what follows, we detail our setup,
the datasets used, and report on the results obtained.</p>
      <p>
        Evaluation Setup. The scale of our safety evaluation is made possible by our
long-term evaluation-as-a-service initiative [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ], and the development of the
corresponding cloud-based evaluation platform TIRA [
        <xref ref-type="bibr" rid="ref32">32, 89</xref>
        ].5 TIRA facilitates software
submissions for shared task competitions so that the organizers of a shared task can ask
participants to submit their software instead of just its run output. At PAN, we have
successfully invited software submissions for various shared tasks since 2012, all of
which are still archived and available for reuse on TIRA. This is also the case for the
past three editions of the shared task on authorship verification organized at PAN 2013
through PAN 2015. A total of 49 pieces of software have been submitted over all three
years by as many research teams from all over the world, 44 of which were eligible for
      </p>
      <sec id="sec-5-1">
        <title>5 www.tira.io</title>
        <p>
          our evaluation.6 This collection of software represents the state of the art in authorship
verification, implementing many different paradigms of tackling this task as well as
hundreds of different features. The best-performing approaches of each year are those
of Seidman [95] submitted to PAN 2013, Fréry et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] submitted to PAN 2014 and
Modaresi and Gross [77] submitted to PAN 2014 (which outperforms Fréry’s approach
on a different test dataset), and the approach of Bagnall [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] submitted to PAN 2015.
Seidman implements the Impostor’s Method of Koppel and Winter [62], Fréry
implements a “traditional” classification approach based on style-indicating features and
linear classifiers, whereas Modaresi employs fuzzy clustering and Bagnall a multi-headed
recurrent deep neural network instead of a linear classifier. The range of approaches
implemented is too broad to be reviewed in detail here, but a complete survey can be found
in the overview papers of the respective shared tasks in [
          <xref ref-type="bibr" rid="ref47">47, 97, 98</xref>
          ]. This collection of
authorship verifiers is a unique resource for reproducible evaluations on authorship
verification, and it forms a solid basis for the evaluation of author obfuscators at scale.
Supported by TIRA, the total time to evaluate all 44 authorship verifiers on
obfuscated versions of PAN’s test datasets amounted to less than two man-weeks of work time.
        </p>
        <p>The participants of the shared task on author obfuscation, too, have been invited
to submit their obfuscation approaches in the form of software to TIRA. This way,
any newly submitted authorship verification software can be immediately evaluated for
vulnerabilities against the submitted obfuscators, and vice versa.</p>
        <p>
          Evaluation Datasets. The test datasets on which our evaluation is based correspond
to those used at the shared tasks on authorship verification at PAN 2013 to PAN 2015,
covering a selection of genres:
– PAN13. Collection of English computer science textbook excerpts. Formulas and
source code were removed. Problem instances comprise one document of unknown
authorship and an average of 4 documents of known authorship. The training dataset
comprises 10 problem instances at an average 1000 words per document, and the
test dataset 30 problem instances at the same average document length.
– PAN14 EE. Collection of English essays written by English-as-a-second-language
students at different language proficiency levels. Essays were divided into age-based
clusters before forming verification problems. Problem instances comprise one
document of unknown authorship and an average 2.6 documents of known authorship.
The training dataset comprises 200 problem instances at an average 848 words per
document, and the test dataset 200 problem instances at an average 833 words per
document.
– PAN14 EN. Collection of English horror fiction novel excerpts from H.P.
Lovecraft’s “Cthulhu Mythos”. The genre is selected very narrowly to strip away any
cross-topic or cross-genre effects typically found in other collections. Many
unfamiliar terms are found in these documents, creating strong indicators of shared topic
and style. Most texts’ themes have a strong negative coloring. Lovecraft’s original
writings as well as modern fan-fiction form part of the dataset. Problem instances
comprise two documents. The training dataset comprises 100 problem instances at
an average 3138 words per document, and the test dataset 200 problem instances at
an average 6104 words per document.
– PAN15. Collection of English dialog lines from plays, excluding speaker names,
stage directions, lists of characters, etc. Problem instances comprise two
documents; cases where the authors of the two documents match are from different plays
of the same author. The training dataset comprises 100 problem instances at an
average 366 words per document, and the test dataset 500 problem instances at an
average 536 words per document.
6 Five approaches had to be excluded: the three approaches submitted by Halvani et al. [
          <xref ref-type="bibr" rid="ref36 ref37 ref38">36,
37, 38</xref>
          ] have been deleted from TIRA at the request of the authors; the approach by Veenman
and Li [106] was dysfunctional, and the approach of Vartapetiance and Gillam [105] was not
submitted for evaluation on the English portion of the PAN test datasets.
        </p>
        <p>All datasets have a balanced ratio of problem instances where the author of the
documents of known authorship is the same as that of the document of unknown authorship
to problem instances where this is not the case. The training datasets were released to
participants so that they could develop their obfuscation approaches against them. The test
datasets were kept private. Participants who made a successful software submission
could run their software on TIRA against the test datasets, while TIRA prevents any
direct access of participants to the test datasets hosted there, and takes precautions against
data leaks. This way, any optimization of approaches against test datasets is rendered
impossible.</p>
        <p>
          Evaluation Results. Table 1 shows the results of our safety evaluation of the three
submitted author obfuscation approaches against a total of 44 authorship verification
approaches on the aforementioned four PAN evaluation datasets. The best-performing
approach across all performance measures and across all datasets is the author
obfuscator of Mihaylova et al. [75]. In terms of average impact (avg imp), it manages to
flip between 46% and 49% of the correct authorship attributions of the verifiers on
the datasets PAN13, PAN14 EE, and PAN14 EN, but only about 36% on the PAN15
dataset. While the approach of Keswani et al. [
          <xref ref-type="bibr" rid="ref52">52</xref>
          ] achieves an average impact close
to Mihaylova et al.’s on the former two datasets, its performance is much worse on
the latter two. The approach of Mansoorizadeh et al. [70] achieves about half the
average impact of Mihaylova et al.’s approach at best. For the sake of completeness, Table 1
also reports the performance deltas obtained for accuracy and recall as described in
Section 3, as well as the performance deltas the obfuscators achieve with respect to the
performance measures applied at PAN’s shared tasks on authorship verification. The
performance values are numerically much smaller, but their trend is the same as that of
the average impact measure.
Table 2 (caption fragment): … verifiers that were supported instead of obstructed when applying the obfuscation to a given
test dataset. The support is measured as positive performance delta and negative relative impact,
respectively. Many of these discrepancies can be explained and dismissed for various reasons
outlined in Section 5.1; for three cases, however, no explanation could be found (marked by a ?).
        </p>
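        <p>The notion of average impact used above can be illustrated with a minimal sketch. It assumes that the impact on one verifier is the fraction of its originally correct “same author” decisions that turn into “different author” after obfuscation; the exact measure is defined in Section 3 and may differ in detail. The function names and the toy data are hypothetical.</p>
        <preformat>
```python
from statistics import mean


def impact(before, after, truth):
    """Fraction of correct "same author" decisions flipped by obfuscation.

    before, after, truth: lists of booleans, True meaning "same author".
    Assumed definition for illustration only (see Section 3 of the paper).
    """
    # Indices of problem instances the verifier originally decided
    # correctly as "same author".
    correct_same = [
        i for i, (b, t) in enumerate(zip(before, truth)) if b and t
    ]
    if not correct_same:
        return 0.0
    # Count how many of these decisions became "different author".
    flipped = sum(1 for i in correct_same if not after[i])
    return flipped / len(correct_same)


def average_impact(runs):
    """Mean impact over verifiers; runs is a list of (before, after, truth)."""
    return mean(impact(b, a, t) for b, a, t in runs)


# Toy data, not taken from the paper.
truth = [True, True, True, True, False, False]
before = [True, True, True, False, False, True]
after = [False, True, False, False, False, True]
print(impact(before, after, truth))
```
        </preformat>
        <p>In the toy data, three decisions are originally correct “same author” decisions and two of them are flipped. Averaging such per-verifier values over all verifiers gives one number per obfuscator and dataset.</p>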
        <p>However, while the averaged performance deltas and the average impact all indicate
that obfuscation approaches decrease the performance of authorship
verification approaches, this is not universally the case. Tables 3, 4, and 5 show the
performances when applying the individual authorship verification approaches on the
four respective test datasets, each one obfuscated once by the approaches of Mihaylova
et al. (Table 3), Keswani et al. (Table 4), and Mansoorizadeh et al. (Table 5). Notably, all
obfuscators appear to improve a number of verifiers instead of obstructing them. Table 2
collects all cases where an authorship verification approach benefits from obfuscation.
However, not all of these cases spell out failure for the obfuscation approaches:
– Reason 1. Negative impacts greater than -0.1 can be explained by random variation
of the authorship verification approach.
– Reason 2. Authorship verification approaches which have an a priori performance
of less than 0.6 acc may perhaps not be trusted to make reasonable decisions in the
face of text alterations.
– Reason 3. High improvements on the PAN13 dataset are easily obtained by flipping
only a few decisions, since the dataset is very small compared to the others.
– Reason 4. In three cases, the high negative obfuscation impacts on Maitra’s [69]
approach go along with a high negative accuracy performance delta (Δacc), which
indicates that obfuscation has caused this verifier to shift a high number of its
decisions from “different author” to “same author,” but also on problem instances where
no obfuscation has been applied (i.e., problems whose documents have indeed been
written by different authors). This suggests that the decisions of Maitra’s approach
are not independent of one another but relative to decisions on other problem
instances, which must be considered erroneous behavior.</p>
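        <p>The first three rules of thumb above can be sketched as a simple triage over the cases in which a verifier benefits from obfuscation. The dictionary keys, the function name, and the example values are hypothetical; only the thresholds (-0.1 for the impact, 0.6 for the a priori accuracy) and the special treatment of PAN13 are taken from the reasons listed above. Reason 4 concerns a single verifier and is not modeled.</p>
        <preformat>
```python
def explain_benefit(case):
    """Return the first rule of thumb that dismisses a case, or None.

    case: dict with hypothetical keys
      "impact"  - obfuscation impact (negative: the verifier benefited),
      "acc"     - a priori accuracy on the unobfuscated dataset,
      "dataset" - name of the test dataset.
    """
    # Reason 1: negative impacts greater than -0.1 count as random variation.
    if case["impact"] > -0.1:
        return "Reason 1: within random variation"
    # Reason 2: verifiers with an a priori accuracy below 0.6 are not trusted.
    if 0.6 > case["acc"]:
        return "Reason 2: a priori accuracy below 0.6"
    # Reason 3: PAN13 is small, so a few flipped decisions weigh heavily.
    if case["dataset"] == "PAN13":
        return "Reason 3: small dataset"
    return None


# Made-up cases for illustration, not values from Table 2.
cases = [
    {"verifier": "A", "dataset": "PAN14 EN", "impact": -0.05, "acc": 0.70},
    {"verifier": "B", "dataset": "PAN15", "impact": -0.30, "acc": 0.55},
    {"verifier": "C", "dataset": "PAN13", "impact": -0.40, "acc": 0.80},
    {"verifier": "D", "dataset": "PAN14 EN", "impact": -0.25, "acc": 0.75},
]
unexplained = [c["verifier"] for c in cases if explain_benefit(c) is None]
print(unexplained)
```
        </preformat>
        <p>Cases for which no rule applies are the ones that need a closer look.</p>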
        <p>These reasons notwithstanding, some cases remain where a priori well-performing
verifiers significantly improve after obfuscation, and the question remains why this
is the case. It remains unclear whether these are random fluctuations which are to be
expected when applying many authorship verification approaches implemented by as
many different people to obfuscated datasets, or whether something like this can
be expected in practice as well. After applying all these rules of thumb, only three
cases remain unexplained, namely the negative obfuscation impact of Mihaylova et al.’s
obfuscator on Kocher’s verifier on the dataset PAN14 EN, and that of Mansoorizadeh et
al.’s obfuscator on Mechti’s verifier and Modaresi’s verifier on the dataset PAN14 EN.
Since the PAN14 EN dataset is involved in all three cases, its characteristics may cause
this behavior, but this is mere speculation.</p>
        <p>Altogether, we draw three conclusions from these results: (1) even basic author
obfuscation can already achieve some degree of safety against state-of-the-art automatic
forensic authorship analyses, (2) the state of the art in authorship verification is
extremely vulnerable to obfuscation, and (3) obfuscation approaches must be evaluated in
as many different situations as possible to identify odd behavior of both obfuscators
and authorship verifiers. The latter conclusion is particularly important since it
forces us to now take the effectiveness of authorship verification technology reported in
the literature with a grain of salt, especially in adversarial scenarios: its application in
court is rendered doubtful, since texts of disputed authorship may have been tampered
with to influence the court’s decision.</p>
        <sec id="sec-5-1-1">
          <title>Sensibleness and Soundness</title>
          <p>A human assessor skimmed through a random subset of ten obfuscated texts for each
approach using the aforementioned visual analytics tool outlining the differences of the
original and obfuscated texts (some example screenshots are given in Section 4). The
assessor’s observation on these sample cases was that most of the paraphrased texts of
each particular approach have a very similar characteristic with respect to sensibleness
and soundness. The final decision then was to base the manual assessment on just one
random text from each year of PAN’s test datasets, excluding the original texts from
language learners (their original text quality might already be suboptimal): in-depth
manual assessment was performed on problem instances 5, 134, and 429.</p>
          <table-wrap>
            <caption>
              <p>[Tables 3, 4, and 5; shared caption text, beginning missing] ... performances and performance deltas of various authorship verification approaches submitted to
PAN 2013 through PAN 2015 when run on test datasets that have been obfuscated by this
obfuscator. Verifiers that failed to process a dataset (e.g., for being incompatible or not scalable) have
been omitted from the tables. Verifiers whose optimal classification threshold for
classification accuracy acc on the unobfuscated test dataset turned out to be negative or positive
infinity (i.e., marking all problem instances “same author” or “different author”, respectively)
were omitted from forming the average performances reported in Table 1.
Column headings recoverable from the source: PAN 2014 EN test dataset; PAN measures; AUC; final; acc; Section 3.</p>
            </caption>
            <table>
              <thead>
                <tr><th>Verifier team</th><th>Reference</th><th/><th/></tr>
              </thead>
              <tbody>
                <tr><td>Bagnall</td><td>[<xref ref-type="bibr" rid="ref5">5</xref>]</td><td>0.478</td><td>0.04</td></tr>
                <tr><td>Bartoli</td><td>[<xref ref-type="bibr" rid="ref7">7</xref>]</td><td>0.647</td><td>-0.11</td></tr>
                <tr><td>Bobicev</td><td>[<xref ref-type="bibr" rid="ref8">8</xref>]</td><td>0.5144</td><td>-0.29</td></tr>
                <tr><td>Castillo</td><td>[<xref ref-type="bibr" rid="ref17">17</xref>]</td><td>0.4</td><td>-0.10</td></tr>
                <tr><td>Castro</td><td>[<xref ref-type="bibr" rid="ref18">18</xref>]</td><td>0.9</td><td>-0.04</td></tr>
                <tr><td>Feng</td><td>[<xref ref-type="bibr" rid="ref24">24</xref>]</td><td>0.46</td><td>-0.22</td></tr>
                <tr><td>Fratila</td><td>[<xref ref-type="bibr" rid="ref28">28</xref>]</td><td>0.38</td><td>-0.41</td></tr>
                <tr><td>Fréry</td><td>[<xref ref-type="bibr" rid="ref29">29</xref>]</td><td>0.333</td><td>-0.21</td></tr>
                <tr><td>Ghaeini</td><td>[<xref ref-type="bibr" rid="ref31">31</xref>]</td><td>0.46</td><td>-0.42</td></tr>
                <tr><td>Gómez-Adorno</td><td>[<xref ref-type="bibr" rid="ref33">33</xref>]</td><td>inf</td><td>0.00</td></tr>
                <tr><td>Grozea</td><td>[<xref ref-type="bibr" rid="ref34">34</xref>]</td><td>0.05</td><td>0.02</td></tr>
                <tr><td>Gutierrez</td><td>[<xref ref-type="bibr" rid="ref35">35</xref>]</td><td>0.8182</td><td>0.02</td></tr>
                <tr><td>Harvey</td><td>[<xref ref-type="bibr" rid="ref40">40</xref>]</td><td>0.001</td><td>-0.19</td></tr>
                <tr><td>Hürlimann</td><td>[<xref ref-type="bibr" rid="ref42">42</xref>]</td><td>0.6394</td><td>-0.25</td></tr>
                <tr><td>Jankowska</td><td>[<xref ref-type="bibr" rid="ref43">43</xref>]</td><td>0.59</td><td>-0.15</td></tr>
                <tr><td>Jankowska</td><td>[<xref ref-type="bibr" rid="ref44">44</xref>]</td><td>0.615</td><td>-0.16</td></tr>
                <tr><td>Jayapal</td><td>[<xref ref-type="bibr" rid="ref45">45</xref>]</td><td>1</td><td>0.00</td></tr>
                <tr><td>Kern</td><td>[<xref ref-type="bibr" rid="ref51">51</xref>]</td><td>0.5</td><td>0.15</td></tr>
                <tr><td>Khonji</td><td>[<xref ref-type="bibr" rid="ref53">53</xref>]</td><td>0.444</td><td>-0.34</td></tr>
                <tr><td>Kocher</td><td>[58]</td><td>0.484</td><td>-0.08</td></tr>
                <tr><td>Layton</td><td>[64]</td><td>inf</td><td>0.15</td></tr>
                <tr><td>Layton</td><td>[63]</td><td>0.7057</td><td>-0.08</td></tr>
                <tr><td>Ledesma</td><td>[66]</td><td>inf</td><td>0.00</td></tr>
                <tr><td>Maitra</td><td>[69]</td><td>0.8</td><td>0.04</td></tr>
                <tr><td>Mayor</td><td>[71]</td><td>0.1</td><td>-0.14</td></tr>
                <tr><td>Mechti</td><td>[74]</td><td>0.469</td><td>0.09</td></tr>
                <tr><td>Modaresi</td><td>[77]</td><td>0.392</td><td>-0.06</td></tr>
                <tr><td>Moreau</td><td>[81]</td><td>1</td><td>0.00</td></tr>
                <tr><td>Moreau</td><td>[80]</td><td>0.6215</td><td>-0.28</td></tr>
                <tr><td>Nikolov</td><td>[82]</td><td>0.448</td><td>-0.15</td></tr>
                <tr><td>Pacheco</td><td>[83]</td><td>0.7223</td><td>-0.06</td></tr>
                <tr><td>Petmanson</td><td>[84]</td><td>0.59</td><td>-0.39</td></tr>
                <tr><td>Sari</td><td>[93]</td><td>0.546</td><td>-0.02</td></tr>
                <tr><td>Satyam</td><td>[94]</td><td>0.423</td><td>0.02</td></tr>
                <tr><td>Seidman</td><td>[95]</td><td>1</td><td>-0.01</td></tr>
                <tr><td>Solórzano</td><td>[96]</td><td>0.812</td><td>-0.15</td></tr>
                <tr><td>van Dam</td><td>[102]</td><td>1</td><td>0.00</td></tr>
                <tr><td>Vartapetiance</td><td>[103]</td><td>inf</td><td>0.00</td></tr>
                <tr><td>Vartapetiance</td><td>[104]</td><td>inf</td><td>0.00</td></tr>
                <tr><td>Vilariño</td><td>[107]</td><td>1</td><td>0.00</td></tr>
                <tr><td>Zamani</td><td>[113]</td><td>0.997</td><td>0.07</td></tr>
              </tbody>
            </table>
          </table-wrap>
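          <p>The exclusion rule stated in the table caption above, namely that verifiers whose optimal classification threshold is negative or positive infinity do not enter the averages of Table 1, can be sketched as follows. The row layout (threshold, delta) and the example values are assumptions made for illustration and are not taken from the tables.</p>
          <preformat>
```python
import math
from statistics import mean


def average_delta(rows):
    """Average the deltas of verifiers with a finite optimal threshold.

    rows: list of (threshold, delta) pairs; a hypothetical layout.
    Returns None if no verifier has a finite threshold.
    """
    # A threshold of +inf or -inf means the verifier labels every problem
    # instance the same way, so it is left out of the average.
    kept = [delta for threshold, delta in rows if math.isfinite(threshold)]
    return mean(kept) if kept else None


# Made-up rows for illustration.
rows = [(0.5, 0.04), (math.inf, 0.00), (0.65, -0.11), (-math.inf, 0.15)]
print(average_delta(rows))
```
          </preformat>
          <p>Only the first and third row contribute to the average in this example.</p>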
          <p>The human assessor started by reading the obfuscated texts without knowing which
approach had produced which text. During this reading phase, the assessor marked up
errors (typos, grammar) and assigned school grades (on a scale from 1 (excellent) to
5 (fail)) for the sensibleness of each of the sample problem instances. As a result, the
obfuscated texts of Mansoorizadeh et al.’s approach got a grade 2 for all three cases
mainly due to the many punctuation problems where a white space was inserted before
every punctuation mark. The texts of Mihaylova et al.’s obfuscator all got a grade 4
due to the many grammatical errors, capitalization issues (lower-case sentence starts,
capitalized words in the middle of sentences), punctuation problems (many missing
or useless punctuations), and consistent typos (“tto” and “oof”). The assessor noted
that the texts were difficult to read due to the many problems but that grade 4 was
given to show the difference to the even worse texts of Keswani et al.’s obfuscator.
For Keswani et al.’s approach, our assessor noted that the texts were impossible to
read or understand with lots of grammatical errors, capitalization problems at sentence
beginnings, etc. The assessor even wanted to stop reading before finishing the whole
text due to the “painful” experience.</p>
          <p>After grading the sensibleness of the obfuscated texts, the assessor read the original
texts and used the visual analytics tool highlighting the textual differences in various
ways to assess the soundness of the obfuscated texts on a three-point scale as correct,
passable, or incorrect. The obfuscated texts of Mihaylova et al.’s and Keswani et al.’s
approaches were both judged “incorrect” for all three cases since they are almost
impossible to read. Mihaylova et al.’s obfuscator might produce slightly more sound texts
than Keswani et al.’s, yet the assessor did not want to assign a “passable” to any of these
and suggested further differentiating the scale as a future evaluation improvement.
Unsurprisingly, Mansoorizadeh et al.’s very conservative approach achieved
“correct” as a score for problem instances 134 and 429. However, some of the changes
for problem instance 5 did change the meaning considerably (“car” instead of “machine” when
a computer was meant, etc.), so that for problem instance 5, Mansoorizadeh et al.’s
approach only got a “passable” score.</p>
          <p>Three groups participated in the peer evaluation: one external group (Liebeck et al. [67])
and two of the author obfuscation task participants (Mansoorizadeh et al. and
Mihaylova et al.).</p>
          <p>
            As for the safety evaluation, all three groups employed the GLAD authorship
verification approach [
            <xref ref-type="bibr" rid="ref42">42</xref>
            ] and checked how many of its decisions are changed by obfuscation.
Interestingly, two of the peer review teams agree that Mihaylova et al.’s approach is
safest against the GLAD system while Keswani et al.’s obfuscator is ranked second;
in Mansoorizadeh et al.’s peer review, however, Keswani et al. is ranked before
Mansoorizadeh et al., and Mihaylova et al. is ranked third and claimed to have hardly
any effect. In our evaluation against many more than just one verification approach, we
agree with the former two peer reviews in that Mihaylova et al. is the safest approach
while Keswani et al.’s approach ranks second.
          </p>
          <p>As for soundness, Liebeck et al. suggest an automatic measure based on approaches
submitted to the shared tasks on semantic textual similarity (STS) organized at
SemEval, where an optimum soundness would achieve a score of 5 and the worst
soundness a score of 0. In Liebeck et al.’s evaluation, all three obfuscation approaches achieve
scores above 4, where Mansoorizadeh et al. performs almost perfectly and Keswani et al.
the worst. Still, given our own soundness evaluation, the scores for the approaches of
Keswani et al. and Mihaylova et al. with 4+ out of 5 seem too optimistic and should
be much closer to 0. Similarly, the peer review of Mansoorizadeh et al. scores
soundness from 5 down to 0, but by manual judgment. Here, their own obfuscation
approach outperforms the others by far (4.86), while Mihaylova et al. achieve a 3.93 and
Keswani et al. a 2.02; the same ordering as in Liebeck et al.’s evaluation but probably
also rather high scores for the two worse approaches. The peer review of Mihaylova et
al. is also based on manual scoring for soundness and seems to come to the conclusion
that their own approach is the best, followed by Mansoorizadeh et al.’s obfuscations.
However, this seems to be a rather biased outcome since our own manual soundness
evaluation showed the soundness of Mihaylova et al.’s obfuscation approach to be
inferior to Mansoorizadeh et al.’s approach.</p>
          <p>As for sensibleness, all three peer evaluators opt for a manual analysis and rank
the obfuscations of Mansoorizadeh et al. clearly more sensible than Mihaylova et al.’s
that again are more sensible than Keswani et al.’s obfuscations. Interestingly,
Mansoorizadeh et al. employ a scale from 0 to 5 with 5 as most sensible, Liebeck et al.
employ a three-point scale, while Mihaylova et al. only have a two-point scale (sensible or
not) and grade half of a sample of their own approach’s obfuscations as sensible. Just as
is the case for Mihaylova et al.’s peer review of soundness, this somewhat contradicts
our own manual sensibleness evaluation, but in sum the ordering of the approaches of
all three peer reviews is consistent with our own sensibleness grading.</p>
          <p>Not surprisingly, the approach of Mansoorizadeh et al. that hardly changes anything
in a text, except for introducing many spaces before punctuation marks, achieves good
and very good scores for sensibleness and soundness but is the least safe of the tested
obfuscators. Although hardly sound or sensible, the texts produced by the safest
obfuscator of Mihaylova et al. have a slightly better quality compared to the
roundtrip translations produced by Keswani et al.’s obfuscator. Most of the three external
peer reviews agree with these evaluation results at least on the relative ordering of the
individual obfuscators.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Outlook</title>
      <p>We have conducted the first large-scale evaluation of author obfuscation approaches
in terms of their safety against the state of the art in authorship verification. A total
of 44 verification approaches have been tested as to their vulnerability to obfuscation,
and we found that many of them are indeed vulnerable to a greater or lesser extent.
Moreover, for the first time, we have shown that author obfuscation technology can
take on many authorship verification approaches simultaneously, which is a must if this
technology is supposed to be useful in practice. The best-performing obfuscator flips on
average about 47% of an authorship verifier’s decisions towards choosing “different
author” when the opposite decision would have been correct. The obfuscation approaches
evaluated have been collected via a shared task on author obfuscation that we
organized at PAN 2016; three obfuscators have been submitted which are now hosted on the
TIRA evaluation-as-a-service platform, ready for re-evaluation against new authorship
verification approaches. Furthermore, we have systematically reviewed the literature on
author obfuscation and collected and organized for the first time its evaluation
methodology, introducing the three main performance dimensions of an author obfuscator:
safety, soundness, and sensibleness.</p>
      <p>There are still many open challenges when it comes to evaluating author obfuscation
approaches properly and at scale, some requiring original research into new
technologies that are capable of recognizing paraphrases, textual entailment, grammaticality, and
style deception. Conceivably, approaches to these problems can be devised which are
tailored to the evaluation of author obfuscation approaches and therefore exploit certain
aspects of this application domain to achieve better performance than in the general case.
We leave a more detailed investigation in this direction for future work.</p>
      <sec id="sec-6-1">
        <title>Acknowledgements</title>
        <p>We thank the participating teams of this shared task. Our special thanks go to Adobe
Systems Inc. for sponsoring the event.</p>
        <p>72. McDonald, A., Afroz, S., Caliskan, A., Stolerman, A., Greenstadt, R.: Use Fewer
Instances of the Letter "i": Toward Writing Style Anonymization. In: Fischer-Hübner, S.,
Wright, M. (eds.) Privacy Enhancing Technologies - 12th International Symposium, PETS
2012, Vigo, Spain, July 11-13, 2012. Proceedings. Lecture Notes in Computer Science,
vol. 7384, pp. 299–318. Springer (2012), http://dx.doi.org/10.1007/978-3-642-31680-7_16
73. McDonald, A., Ulman, J., Barrowclift, M., Greenstadt, R.: Anonymouth Revamped:
Getting Closer to Stylometric Anonymity. In: Kapadia, A., Caine, K., Camp, L., Lee, A.,
Patil, S., Reiter, M., Staddon, J. (eds.) Proceedings of the Workshop on Privacy Enhancing
Tools PETools, Bloomington, Indiana, USA, July 9, 2013. (2013)
74. Mechti, S., Jaoua, M., Faiz, R., Belguith, L., Bsir, B.: On the Empirical Evaluation of</p>
        <p>
          Author Identification Hybrid Method—Notebook for PAN at CLEF 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
75. Mihaylova, T., Karadjov, G., Nakov, P., Kiprov, Y., Georgiev, G., Koychev, I.:
SU@PAN’2016: Author Obfuscation—Notebook for PAN at CLEF 2016. In: [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ],
http://ceur-ws.org/Vol-1609/
76. Miller, G.A.: WordNet: A Lexical Database for English. Commun. ACM 38(11), 39–41
(Nov 1995), http://doi.acm.org/10.1145/219717.219748
77. Modaresi, P., Gross, P.: A Language Independent Author Verifier Using Fuzzy C-Means
        </p>
        <p>
          Clustering—Notebook for PAN at CLEF 2014. In: [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]
78. Moon, S.W., Gweon, G., Choi, H., Heo, J.: Apem: Automatic paraphrase evaluation using
morphological analysis for the Korean language. In: 2016 18th International Conference
on Advanced Communication Technology (ICACT). pp. 680–684. IEEE (2016)
79. Moreau, E., Jayapal, A., Vogel, C.: Author Verification: Exploring a Large set of
        </p>
        <p>
          Parameters using a Genetic Algorithm—Notebook for PAN at CLEF 2014. In: [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]
80. Moreau, E., Jayapal, A., Lynch, G., Vogel, C.: Author Verification: Basic Stacked
Generalization Applied To Predictions from a Set of Heterogeneous Learners—Notebook
for PAN at CLEF 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
81. Moreau, E., Vogel, C.: Style-based Distance Features for Author Verification—Notebook
for PAN at CLEF 2013. In: [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]
82. Nikolov, S., Tabakova, D., Savov, S., Kiprov, Y., Nakov, P.: SU@PAN’2015: Experiments in
        </p>
        <p>
          Author Verification—Notebook for PAN at CLEF 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
83. Pacheco, M., Fernandes, K., Porco, A.: Random Forest with Increased Generalization: A
Universal Background Approach for Authorship Verification—Notebook for PAN at
CLEF 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
84. Petmanson, T.: Authorship Identification using Correlations of Frequent
        </p>
        <p>
          Features—Notebook for PAN at CLEF 2013. In: [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]
85. Pimas, O., Kröll, M., Kern, R.: Know-Center at PAN 2015 Author
        </p>
        <p>
          Identification—Notebook for PAN at CLEF 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
86. Posadas-Durán, J.P., Sidorov, G., Batyrshin, I., Mirasol-Meléndez, E.: Author Verification
        </p>
        <p>
          Using Syntactic N-grams—Notebook for PAN at CLEF 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
87. Post, M.: Judging grammaticality with tree substitution grammar derivations. In:
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics:
Human Language Technologies. pp. 217–222. Association for Computational Linguistics,
Portland, Oregon, USA (June 2011), http://www.aclweb.org/anthology/P11-2038
88. Potthast, M., Braun, S., Buz, T., Duffhauss, F., Friedrich, F., Gülzow, J., Köhler, J.,
Lötzsch, W., Müller, F., Müller, M., Paßmann, R., Reinke, B., Rettenmeier, L., Rometsch,
T., Sommer, T., Träger, M., Wilhelm, S., Stein, B., Stamatatos, E., Hagen, M.: Who Wrote
the Web? Revisiting Influential Author Identification Research Applicable to Information
Retrieval. In: Ferro, N., Crestani, F., Moens, M.F., Mothe, J., Silvestri, F., Di Nunzio, G.,
Hauff, C., Silvello, G. (eds.) Advances in Information Retrieval. 38th European
Conference on IR Research (ECIR 16). Lecture Notes in Computer Science, vol. 9626, pp.
393–407. Springer, Berlin Heidelberg New York (Mar 2016)
89. Potthast, M., Gollub, T., Rangel, F., Rosso, P., Stamatatos, E., Stein, B.: Improving the
Reproducibility of PAN’s Shared Tasks: Plagiarism Detection, Author Identification, and
Author Profiling. In: Kanoulas, E., Lupu, M., Clough, P., Sanderson, M., Hall, M.,
Hanbury, A., Toms, E. (eds.) Information Access Evaluation meets Multilinguality,
Multimodality, and Visualization. 5th International Conference of the CLEF Initiative
(CLEF 14). pp. 268–299. Springer, Berlin Heidelberg New York (Sep 2014)
90. Potthast, M., Trenkmann, M., Stein, B.: Netspeak: Assisting Writers in Choosing Words.
        </p>
        <p>In: Gurrin, C., He, Y., Kazai, G., Kruschwitz, U., Little, S., Roelleke, T., Rüger, S., van
Rijsbergen, K. (eds.) Advances in Information Retrieval. 32nd European Conference on
Information Retrieval (ECIR 10). Lecture Notes in Computer Science, vol. 5993, p. 672.</p>
        <p>Springer, Berlin Heidelberg New York (Mar 2010)
91. Rao, J., Rohatgi, P.: Can Pseudonymity Really Guarantee Privacy? In: Bellovin, S., Rose,
G. (eds.) 9th USENIX Security Symposium, Denver, Colorado, USA, August 14-17, 2000.
USENIX Association (2000),
https://www.usenix.org/conference/9th-usenix-securitysymposium/can-pseudonymity-really-guarantee-privacy
92. Rosso, P., Rangel, F., Potthast, M., Stamatatos, E., Tschuggnall, M., Stein, B.: Overview of
PAN’16—New Challenges for Authorship Analysis: Cross-genre Profiling, Clustering,
Diarization, and Obfuscation. In: Fuhr, N., Quaresma, P., Larsen, B., Gonçalves, T., Balog,
K., Macdonald, C., Cappellato, L., Ferro, N. (eds.) Experimental IR Meets Multilinguality,
Multimodality, and Interaction. 7th International Conference of the CLEF Initiative (CLEF
16). Springer, Berlin Heidelberg New York (Sep 2016)
93. Sari, Y., Stevenson, M.: A Machine Learning-based Intrinsic Method for Cross-topic and</p>
        <p>
          Cross-genre Authorship Verification—Notebook for PAN at CLEF 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
94. Satyam, Anand, Dawn, A., Saha, S.: Statistical Analysis Approach to Author
Identification Using Latent Semantic Analysis—Notebook for PAN at CLEF 2014. In:
[
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]
95. Seidman, S.: Authorship Verification Using the Impostors Method—Notebook for PAN at
        </p>
        <p>
          CLEF 2013. In: [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]
96. Solórzano, J., Mijangos, V., Pimentel, A., López-Escobedo, F., Montes, A., Sierra, G.:
Authorship Verification by Combining SVMs with Kernels Optimized for Different
Feature Categories—Notebook for PAN at CLEF 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
97. Stamatatos, E., Daelemans, W., Verhoeven, B., Juola, P., López-López, A., Potthast, M., Stein,
        </p>
        <p>
          B.: Overview of the Author Identification Task at PAN 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
98. Stamatatos, E., Daelemans, W., Verhoeven, B., Potthast, M., Stein, B., Juola, P.,
Sanchez-Perez, M., Barrón-Cedeño, A.: Overview of the Author Identification Task at
PAN 2014. In: [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]
99. Stein, B., Hagen, M., Bräutigam, C.: Generating Acrostics via Paraphrasing and Heuristic
Search. In: Tsujii, J., Hajic, J. (eds.) 25th International Conference on Computational
Linguistics (COLING 14). pp. 2018–2029. Association for Computational Linguistics
(Aug 2014)
100. Sun, G., Liu, X., Cong, G., Zhou, M., Xiong, Z., Lee, J., Lin, C.: Detecting erroneous
sentences using automatically mined sequential patterns. In: Carroll, J.A., van den Bosch,
A., Zaenen, A. (eds.) ACL 2007, Proceedings of the 45th Annual Meeting of the
Association for Computational Linguistics, June 23-30, 2007, Prague, Czech Republic.
The Association for Computational Linguistics (2007),
http://aclweb.org/anthology-new/P/P07/P07-1011.pdf
101. Tweedie, F.J., Singh, S., Holmes, D.I.: Neural Network Applications in Stylometry: The
Federalist Papers. Computers and the Humanities 30(1), 1–10 (1996),
http://dx.doi.org/10.1007/BF00054024
102. van Dam, M.: A Basic Character N-gram Approach to Authorship Verification—Notebook
for PAN at CLEF 2013. In: [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]
103. Vartapetiance, A., Gillam, L.: A Textual Modus Operandi: Surrey’s Simple System for
        </p>
        <p>
          Author Identification—Notebook for PAN at CLEF 2013. In: [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]
104. Vartapetiance, A., Gillam, L.: A Trinity of Trials: Surrey’s 2014 Attempts at Author
        </p>
        <p>
          Verification—Notebook for PAN at CLEF 2014. In: [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]
105. Vartapetiance, A., Gillam, L.: Adapting for Subject-Specific Term Length using Topic
        </p>
        <p>
          Cost in Author Verification—Notebook for PAN at CLEF 2015. In: [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
106. Veenman, C., Li, Z.: Authorship Verification with Compression Features. In: [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]
107. Vilariño, D., Pinto, D., Gómez, H., León, S., Castillo, E.: Lexical-Syntactic and
Graph-Based Features for Authorship Verification—Notebook for PAN at CLEF 2013. In:
[
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]
108. Wagner, J., Foster, J.: The effect of correcting grammatical errors on parse probabilities.
        </p>
        <p>In: Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09).
pp. 176–179. Association for Computational Linguistics, Paris, France (October 2009),
http://www.aclweb.org/anthology/W09-3827
109. Wagner, J., Foster, J., van Genabith, J.: Judging grammaticality: Experiments in sentence
classification. CALICO Journal 26(3), 474–490 (2009)
110. Weese, J., Ganitkevitch, J., Callison-Burch, C.: Paradigm: Paraphrase diagnostics through
grammar matching. In: Proceedings of the 14th Conference of the European Chapter of
the Association for Computational Linguistics. pp. 192–201. Association for
Computational Linguistics, Gothenburg, Sweden (April 2014),
http://www.aclweb.org/anthology/E14-1021
111. Wong, S.M.J., Dras, M.: Parser features for sentence grammaticality classification. In:
Proceedings of the Australasian Language Technology Association Workshop 2010. pp.
67–75. Melbourne, Australia (December 2010)
112. Xu, W., Ritter, A., Dolan, B., Grishman, R., Cherry, C.: Paraphrasing for style. In:
Proceedings of COLING 2012. pp. 2899–2914. The COLING 2012 Organizing
Committee, Mumbai, India (December 2012), http://www.aclweb.org/anthology/C12-1177
113. Zamani, H., Abnar, S., Dehghani, M., Forati, M., Babaei, P.: Submission to the Author
Identification Task at PAN 2014. http://www.uni-weimar.de/medien/webis/events/pan-14
(2014), http://www.clef-initiative.eu/publication/working-notes, From the University of
Tehran, Iran
114. Zheng, R., Li, J., Chen, H., Huang, Z.: A framework for authorship identification of online
messages: Writing-style features and classification techniques. Journal of the American
Society for Information Science and Technology 57(3), 378–393 (2006)</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abbasi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Writeprints: A stylometric approach to identity-level identification and similarity detection in cyberspace</article-title>
          .
          <source>ACM Trans. Inf. Syst</source>
          .
          <volume>26</volume>
          (
          <issue>2</issue>
          ),
          <fpage>7:1</fpage>
          -
          <lpage>7:29</lpage>
          (Apr
          <year>2008</year>
          ), http://doi.acm.org/10.1145/1344411.1344413
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Afroz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brennan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greenstadt</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Detecting Hoaxes, Frauds, and Deception in Writing Style Online</article-title>
          . In:
          <source>2012 IEEE Symposium on Security and Privacy</source>
          . pp.
          <fpage>461</fpage>
          -
          <lpage>475</lpage>
          (May
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Almishari</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oguz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsudik</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Fighting Authorship Linkability with Crowdsourcing</article-title>
          . In:
          <string-name>
            <surname>Sala</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gummadi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the second ACM conference on Online social networks, COSN 2014</source>
          , Dublin, Ireland, October 1-2, 2014. pp.
          <fpage>69</fpage>
          -
          <lpage>82</lpage>
          . ACM (
          <year>2014</year>
          ), http://doi.acm.org/10.1145/2660460.2660486
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Backes</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berrang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manoharan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Poster: Assessing the effectiveness of countermeasures against authorship recognition</article-title>
          .
          <source>In: 2015 IEEE Symposium on Security and Privacy, SP 2015</source>
          , San Jose, CA, USA, May 17-21, 2015. IEEE Computer Society (
          <year>2015</year>
          ), http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=7160813
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bagnall</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Author Identification using multi-headed Recurrent Neural Networks-Notebook for PAN at CLEF 2015</article-title>
          . In: [16]
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Balog</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (eds.):
          <article-title>CLEF 2016 Evaluation Labs and Workshop - Working Notes Papers</article-title>
          , 5-8 September, Évora, Portugal. CEUR Workshop Proceedings, CEUR-WS.org (
          <year>2016</year>
          ), http://www.clef-initiative.eu/publication/working-notes
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Bartoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dagri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Lorenzo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medvet</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tarlao</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>An Author Verification Approach Based on Differential Features-Notebook for PAN at CLEF 2015</article-title>
          . In: [16]
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Bobicev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Authorship Detection with PPM-Notebook for PAN at CLEF 2013</article-title>
          . In: [26]
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Brants</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Web 1T 5-gram Version 1</article-title>
          .
          <source>Linguistic Data Consortium LDC2006T13</source>
          , Philadelphia (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Brennan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Afroz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greenstadt</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Adversarial Stylometry: Circumventing Authorship Recognition to Preserve Privacy and Anonymity</article-title>
          .
          <source>ACM Trans. Inf. Syst. Secur.</source>
          <volume>15</volume>
          (
          <issue>3</issue>
          ), 12 (
          <year>2012</year>
          ), http://doi.acm.org/10.1145/2382448.2382450
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Brennan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greenstadt</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Practical Attacks Against Authorship Recognition Techniques</article-title>
          . In:
          <string-name>
            <surname>Haigh</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rychtyckyj</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the Twenty-First Conference on Innovative Applications of Artificial Intelligence, July 14-16</source>
          ,
          <year>2009</year>
          , Pasadena, California, USA. AAAI (
          <year>2009</year>
          ), http://aaai.org/ocs/index.php/IAAI/IAAI09/paper/view/257
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Burrows</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Paraphrase Acquisition via Crowdsourcing and Machine Learning</article-title>
          .
          <source>Transactions on Intelligent Systems and Technology (ACM TIST)</source>
          <volume>4</volume>
          (
          <issue>3</issue>
          ),
          <fpage>43:1</fpage>
          -
          <lpage>43:21</lpage>
          (Jun
          <year>2013</year>
          ), http://dl.acm.org/citation.cfm?id=2483676
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Caliskan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greenstadt</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Translate once, translate twice, translate thrice and attribute: Identifying authors and machine translation tools in translated text</article-title>
          .
          <source>In: Sixth IEEE International Conference on Semantic Computing, ICSC 2012</source>
          , Palermo, Italy, September 19-21, 2012. pp.
          <fpage>121</fpage>
          -
          <lpage>125</lpage>
          . IEEE Computer Society (
          <year>2012</year>
          ), http://dx.doi.org/10.1109/ICSC.2012.46
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Callison-Burch</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lapata</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Parametric: An automatic evaluation metric for paraphrasing</article-title>
          .
          <source>In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)</source>
          . pp.
          <fpage>97</fpage>
          -
          <lpage>104</lpage>
          . Coling 2008 Organizing Committee, Manchester, UK (August
          <year>2008</year>
          ), http://www.aclweb.org/anthology/C08-1013
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halvey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kraaij</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          (eds.):
          <article-title>CLEF 2014 Evaluation Labs and Workshop - Working Notes Papers</article-title>
          , 15-18 September, Sheffield, UK. CEUR Workshop Proceedings, CEUR-WS.org (
          <year>2014</year>
          ), http://www.clef-initiative.eu/publication/working-notes
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>San Juan</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.):
          <article-title>CLEF 2015 Evaluation Labs and Workshop - Working Notes Papers</article-title>
          , 8-11 September, Toulouse, France. CEUR Workshop Proceedings, CEUR-WS.org (
          <year>2015</year>
          ), http://www.clef-initiative.eu/publication/working-notes
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Castillo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cervantes</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vilariño</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinto</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>León</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Unsupervised Method for the Authorship Identification Task-Notebook for PAN at CLEF 2014</article-title>
          . In: [15]
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Castro</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adame</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pelaez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muñoz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Authorship Verification, Combining Linguistic Features and Different Similarity Functions-Notebook for PAN at CLEF 2015</article-title>
          . In: [16]
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Linguistic steganography using automatically generated paraphrases</article-title>
          . In:
          <source>Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics</source>
          . pp.
          <fpage>591</fpage>
          -
          <lpage>599</lpage>
          . Association for Computational Linguistics, Los Angeles, California (June
          <year>2010</year>
          ), http://www.aclweb.org/anthology/N10-1084
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dolan</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Collecting Highly Parallel Data for Paraphrase Evaluation</article-title>
          . In:
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matsumoto</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mihalcea</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the Forty-Ninth Annual Meeting of the Association for Computational Linguistics: Human Language Technologies</source>
          . pp.
          <fpage>190</fpage>
          -
          <lpage>200</lpage>
          . Association for Computational Linguistics, Portland, Oregon (Jun
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Cherry</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Quirk</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Discriminative, Syntactic Language Modeling through Latent SVMs</article-title>
          .
          <source>In: Proceedings of AMTA</source>
          (
          <year>2008</year>
          ), http://research.microsoft.com/pubs/72874/lsvm_amta.pdf
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hannon</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          :
          <article-title>A Classifier System for Author Recognition Using Synonym-Based Features</article-title>
          , pp.
          <fpage>839</fpage>
          -
          <lpage>849</lpage>
          . Springer (
          <year>2007</year>
          ), http://dx.doi.org/10.1007/978-3-540-76631-5_80
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Dagan</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sammons</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zanzotto</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          :
          <article-title>Recognizing Textual Entailment: Models and Applications</article-title>
          .
          <source>Synthesis Lectures on Human Language Technologies</source>
          , Morgan &amp; Claypool Publishers (
          <year>2013</year>
          ), http://dx.doi.org/10.2200/S00509ED1V01Y201305HLT023
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Feng</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hirst</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Authorship Verification with Entity Coherence and Other Rich Linguistic Features-Notebook for PAN at CLEF 2013</article-title>
          . In: [26]
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Ferraro</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Post</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Durme</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Judging grammaticality with count-induced tree substitution grammars</article-title>
          .
          <source>In: Proceedings of the Seventh Workshop on Building Educational Applications Using NLP</source>
          . pp.
          <fpage>116</fpage>
          -
          <lpage>121</lpage>
          . Association for Computational Linguistics, Montréal, Canada (June
          <year>2012</year>
          ), http://www.aclweb.org/anthology/W12-2013
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Forner</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Navigli</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tufis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (eds.):
          <article-title>CLEF 2013 Evaluation Labs and Workshop - Working Notes Papers</article-title>
          , 23-26 September, Valencia, Spain (
          <year>2013</year>
          ), http://www.clef-initiative.eu/publication/working-notes
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Francis</surname>
            ,
            <given-names>W.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kucera</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Brown corpus manual</article-title>
          . Brown University (
          <year>1979</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Fratila</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Submission to the Author Identification Task from</article-title>
          the Polytechnic University of Bucharest, Romania. http://www.uni-weimar.de/medien/webis/events/pan-13 (
          <year>2013</year>
          ), http://www.clef-initiative.eu/publication/working-notes
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Fréry</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Largeron</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Juganaru-Mathieu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>UJM at CLEF in Author Identification-Notebook for PAN at CLEF 2014</article-title>
          . In: [15]
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Ganitkevitch</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Durme</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Callison-Burch</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>PPDB: The paraphrase database</article-title>
          .
          <source>In: Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings</source>
          , June 9-14, 2013, Westin Peachtree Plaza Hotel, Atlanta, Georgia, USA. pp.
          <fpage>758</fpage>
          -
          <lpage>764</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Ghaeini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Intrinsic Author Identification Using Modified Weighted KNN-Notebook for PAN at CLEF 2013</article-title>
          . In: [26]
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Gollub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burrows</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Ousting Ivory Tower Research: Towards a Web Framework for Providing Experiments as a Service</article-title>
          . In:
          <string-name>
            <surname>Hersh</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Callan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maarek</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanderson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (eds.)
          <source>35th International ACM Conference on Research and Development in Information Retrieval (SIGIR 12)</source>
          . pp.
          <fpage>1125</fpage>
          -
          <lpage>1126</lpage>
          . ACM (Aug
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Gómez-Adorno</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sidorov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinto</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>A Graph Based Authorship Identification Approach-Notebook for PAN at CLEF 2015</article-title>
          . In: [16]
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Grozea</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Popescu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Submission to the Author Identification Task from Fraunhofer FOKUS</article-title>
          , Germany, and the University of Bucharest, Romania. http://www.uni-weimar.de/medien/webis/events/pan-13 (
          <year>2013</year>
          ), http://www.clef-initiative.eu/publication/working-notes, From Fraunhofer FOKUS and the University of Bucharest
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Gutierrez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casillas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ledesma</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fuentes</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meza</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Homotopy Based Classification for Author Verification Task-Notebook for PAN at CLEF 2015</article-title>
          . In: [16]
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Halvani</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steinebach</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>VEBAV - A Simple, Scalable and Fast Authorship Verification Scheme-Notebook for PAN at CLEF 2014</article-title>
          . In: [15]
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <surname>Halvani</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steinebach</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zimmermann</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Authorship Verification via k-Nearest Neighbor Estimation-Notebook for PAN at CLEF 2013</article-title>
          . In: [26]
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <surname>Halvani</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Winter</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>A Generic Authorship Verification Scheme Based on Equal Error Rates-Notebook for PAN at CLEF 2015</article-title>
          . In: [16]
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <surname>Hanbury</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balog</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brodt</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cormack</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eggel</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gollub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hopfgartner</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalpathy-Cramer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kando</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krithara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mercer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Evaluation-as-a-Service: Overview and Outlook</article-title>
          . ArXiv e-prints (
          <year>Dec 2015</year>
          ), http://arxiv.org/abs/1512.07454
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          40.
          <string-name>
            <surname>Harvey</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Author Verification using PPM with Parts of Speech Tagging-Notebook for PAN at CLEF 2014</article-title>
          . In: [15]
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          41.
          <string-name>
            <surname>Heilman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cahill</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Madnani</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mulholland</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tetreault</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Predicting grammaticality on an ordinal scale</article-title>
          . In:
          <source>Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</source>
          . pp.
          <fpage>174</fpage>
          -
          <lpage>180</lpage>
          . Association for Computational Linguistics, Baltimore, Maryland (
          <year>June 2014</year>
          ), http://www.aclweb.org/anthology/P14-2029
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          42.
          <string-name>
            <surname>Hürlimann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weck</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van den Berg</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Šuster</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nissim</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>GLAD: Groningen Lightweight Authorship Detection-Notebook for PAN at CLEF 2015</article-title>
          . In: [16]
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          43.
          <string-name>
            <surname>Jankowska</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kešelj</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Milios</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Proximity based One-class Classification with Common N-Gram Dissimilarity for Authorship Verification Task-Notebook for PAN at CLEF 2013</article-title>
          . In: [26]
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          44.
          <string-name>
            <surname>Jankowska</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kešelj</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Milios</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Ensembles of Proximity-Based One-Class Classifiers for Author Verification-Notebook for PAN at CLEF 2014</article-title>
          . In: [15]
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          45.
          <string-name>
            <surname>Jayapal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goswami</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Vector space model and Overlap metric for Author Identification-Notebook for PAN at CLEF 2013</article-title>
          . In: [26]
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          46.
          <string-name>
            <surname>Juola</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Detecting stylistic deception</article-title>
          . In:
          <source>Proceedings of the Workshop on Computational Approaches to Deception Detection</source>
          . pp.
          <fpage>91</fpage>
          -
          <lpage>96</lpage>
          . Association for Computational Linguistics, Avignon, France (
          <year>April 2012</year>
          ), http://www.aclweb.org/anthology/W12-0414
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          47.
          <string-name>
            <surname>Juola</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Overview of the Author Identification Task at PAN 2013</article-title>
          . In: [26]
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          48.
          <string-name>
            <surname>Juola</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vescovi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Empirical Evaluation of Authorship Obfuscation using JGAAP</article-title>
          . In:
          <string-name>
            <surname>Greenstadt</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (ed.)
          <source>Proceedings of the 3rd ACM Workshop on Security and Artificial Intelligence, AISec 2010, Chicago, Illinois, USA, October 8, 2010</source>
          . pp.
          <fpage>14</fpage>
          -
          <lpage>18</lpage>
          . ACM (
          <year>2010</year>
          ), http://doi.acm.org/10.1145/1866423.1866427
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          49.
          <string-name>
            <surname>Juola</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vescovi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Analyzing Stylometric Approaches to Author Obfuscation</article-title>
          . In:
          <string-name>
            <surname>Peterson</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shenoi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (eds.)
          <source>Advances in Digital Forensics VII - 7th IFIP WG 11.9 International Conference on Digital Forensics, Orlando, FL, USA, January 31 - February 2, 2011, Revised Selected Papers</source>
          .
          <series>IFIP Advances in Information and Communication Technology</series>
          , vol.
          <volume>361</volume>
          , pp.
          <fpage>115</fpage>
          -
          <lpage>125</lpage>
          . Springer (
          <year>2011</year>
          ), http://dx.doi.org/10.1007/978-3-642-24212-0_9
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          50.
          <string-name>
            <surname>Kacmarcik</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gamon</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Obfuscating Document Stylometry to Preserve Author Anonymity</article-title>
          . In:
          <string-name>
            <surname>Calzolari</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cardie</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isabelle</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (eds.)
          <source>ACL 2006, 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, Sydney, Australia, 17-21 July 2006</source>
          . The Association for Computer Linguistics (
          <year>2006</year>
          ), http://aclweb.org/anthology/P06-2058
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          51.
          <string-name>
            <surname>Kern</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Grammar Checker Features for Author Identification and Author Profiling-Notebook for PAN at CLEF 2013</article-title>
          . In: [26]
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          52.
          <string-name>
            <surname>Keswani</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trivedi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Majumder</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Author Masking through Translation-Notebook for PAN at CLEF 2016</article-title>
          . In: [6], http://ceur-ws.org/Vol-1609/
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          53.
          <string-name>
            <surname>Khonji</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iraqi</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>A Slightly-modified GI-based Author-verifier with Lots of Features (ASGALF)-Notebook for PAN at CLEF 2014</article-title>
          . In: [15]
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          54.
          <string-name>
            <surname>Khosmood</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Comparison of Sentence-level Paraphrasing Approaches for Statistical Style Transformation</article-title>
          . In:
          <source>Proceedings of the 2012 International Conference on Artificial Intelligence</source>
          . CSREA Press, Las Vegas (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          55.
          <string-name>
            <surname>Khosmood</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Levinson</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Toward Automated Stylistic Transformation of Natural Language Text</article-title>
          . In:
          <source>Proceedings of the Digital Humanities 2009</source>
          , June 22-25. pp.
          <fpage>177</fpage>
          -
          <lpage>181</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref56">
        <mixed-citation>
          56.
          <string-name>
            <surname>Khosmood</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Levinson</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Automatic Synonym and Phrase Replacement Show Promise for Style Transformation</article-title>
          . In:
          <string-name>
            <surname>Draghici</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khoshgoftaar</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palade</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pedrycz</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>