<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Towards the Development of a Cyber Analysis &amp; Advisement Tool (CAAT) for Mitigating De-Anonymization Attacks</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Siobahn</forename><forename type="middle">C</forename><surname>Day</surname></persName>
							<email>scday@aggies.ncat.edu</email>
						</author>
						<author>
							<persName><forename type="first">Henry</forename><surname>Williams</surname></persName>
							<email>hcwillia@aggies.ncat.edu</email>
						</author>
						<author>
							<persName><forename type="first">Joseph</forename><surname>Shelton</surname></persName>
							<email>jashelt1@aggies.ncat.edu</email>
						</author>
						<author>
							<persName><forename type="first">Gerry</forename><surname>Dozier</surname></persName>
							<email>gvdozier@ncat.edu</email>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">North Carolina A&amp;T State University</orgName>
								<address>
									<settlement>Greensboro</settlement>
									<country>U.S</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">Center for Advanced Studies in Identity Science</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Towards the Development of a Cyber Analysis &amp; Advisement Tool (CAAT) for Mitigating De-Anonymization Attacks</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">2ED857EFBB505F6CCDC0CCB8818297DC</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-23T23:41+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We are seeing a rise in the number of Anonymous Social Networks (ASNs) that claim to provide a sense of user anonymity. However, what many users of ASNs do not know is that a person can be identified by their writing style. In this paper, we provide an overview of a number of author concealment techniques and their impact on the semantic meaning of an author's original text, and we introduce AuthorCAAT, an application for mitigating de-anonymization attacks. Our results show that iterative paraphrasing performs the best in terms of author concealment and performs well with respect to Latent Semantic Analysis.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Introduction</head><p>Anonymous Social Networks (ASNs) can provide users with a false sense of anonymity; research in the area of Author Identification (Attribution) has shown that users can be identified simply by their writing style <ref type="bibr" target="#b10">(Stamatatos 2009)</ref>. <ref type="bibr" target="#b9">Narayanan et al. (2012)</ref> introduce the concept of a de-anonymization attack, in which hackers apply sophisticated Author Identification Techniques (AITs) in an effort to uncover the identity of the author of a text. Once this occurs, the hackers can track a victim across the web and even through other ASNs.</p><p>Recently, researchers have developed a number of techniques for author concealment (M. <ref type="bibr" target="#b2">Brennan, Afroz, and Greenstadt (2012)</ref>; <ref type="bibr" target="#b8">Kacmarcik and Gamon (2006)</ref>; <ref type="bibr" target="#b9">Rao and Rohatgi (2000)</ref>). These techniques, together with their ability to conceal one's writing style, are described below: adversarial stylometry, iterative language translation, and iterative paraphrasing.</p><p>Presently there exist two forms of adversarial stylometry <ref type="bibr" target="#b0">(Afroz, Brennan, and Greenstadt 2012;</ref><ref type="bibr">M. Brennan et al. 2012</ref>; M. R. <ref type="bibr" target="#b1">Brennan and Greenstadt 2009)</ref>. The first form, obfuscation, is when an author tries not to write like themselves, while the second form, imitation, is when an author tries to 'mimic' the writing style of another author. Research shows that both of these techniques are effective in concealing one's writing style. In the case of disguising one's writing style, M. <ref type="bibr" target="#b2">Brennan et al. (2012)</ref> demonstrate that obfuscation and imitation are easy in the short term but more difficult to maintain over the long term. In Section IV, it will be shown how AuthorCAAT can be used to provide authors with the ability to perform long-term adversarial stylometry.</p><p>Another form of author concealment is Iterative Language Translation (ILT) <ref type="bibr" target="#b9">(Mack, Bowers, Williams, Dozier, and Shelton 2015)</ref>. In ILT, an original text is translated into another language and then back into its original language. This technique was first presented in <ref type="bibr" target="#b9">Rao and Rohatgi (2000)</ref>, where the authors describe the approach as being "somewhat facetious" and "drastic." They believed that it would change the meaning of a message, making it impractical. <ref type="bibr" target="#b8">Kacmarcik and Gamon (2006)</ref> also mention that this approach could be a good starting point for someone looking to "scramble" their words. ILT is effective in concealing the writing style of an author; however, it is vulnerable to fingerprinting <ref type="bibr" target="#b3">(Caliskan and Greenstadt 2012)</ref>. If one knows the language used in translating the text, one can then recover the original writing style of the author.</p><p>The last form of author concealment is Iterative Paraphrasing (IP). The use of IP was originally mentioned in <ref type="bibr" target="#b8">Kacmarcik and Gamon (2006)</ref>. In IP, one takes the original text and uses a paraphrasing tool to convert it into a paraphrased text. To the authors' knowledge, no one has yet analyzed IP's effectiveness in author concealment, its effect on semantics, or its vulnerability to fingerprinting (this will be discussed in Section III).</p><p>The remainder of the paper is organized as follows. In Section II, we discuss our experiments. In Section III, we discuss our results. In Section IV, we provide a brief discussion of AuthorCAAT. In Section V, we provide our conclusion and future work.</p><note place="foot">Copyright held by the author(s).</note></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Author Concealment &amp; Fingerprinting Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Our Dataset</head><p>The dataset we used for our experiments was gathered from blogs written by 100 different authors. For every author in our dataset, there are 4 instances: the first instance served as the probe and the remaining 3 instances served as the gallery. This results in 100 instances in the probe set and 300 instances in the gallery set.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Our Translators &amp; Paraphrasers</head><p>Our ILT datasets used Google translation tools for English to Spanish, Spanish to English, English to Chinese, and Chinese to English. The ILT text was prepared in iterations. We consider an iteration to be a full round-trip cycle of translation (e.g. English-Spanish-English and English-Chinese-English). Therefore, Iteration 1 would be E-X-E, Iteration 2 would be E-X-E-X-E, and Iteration 3 would be E-X-E-X-E-X-E, where E stands for English and X ∈ {Spanish, Chinese}. Therefore, a total of six ILT datasets were developed, each consisting of 300 gallery instances of the 100 authors.</p><p>Our IP datasets were created using an online tool known as Plagiarisma. The iterations for IP are similar to those for ILT. Combining ILT with IP, we have X ∈ {Spanish, Chinese, Paraphraser}. Therefore, three IP datasets were developed, each consisting of 300 gallery instances of the 100 authors. For ILT/IP, there were a total of nine datasets.</p></div>
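<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration only, the iteration scheme described above can be sketched in a few lines of Python. This is not the tooling used in the experiments: the names METHODS, ITERATIONS, round_trip_chain and enumerate_datasets are hypothetical, and no translation or paraphrasing service is called. The sketch only enumerates the E-X-E chains and the resulting number of datasets.</p>
<p><code lang="python">
```python
# Hypothetical names for illustration; only the E-X-E scheme and the
# dataset counts are taken from the text.
METHODS = ["Spanish", "Chinese", "Paraphraser"]
ITERATIONS = [1, 2, 3]


def round_trip_chain(method, iterations):
    """Return the chain of steps for a given number of round trips.

    One iteration is a full round trip E-X-E, so n iterations yield a
    chain with 2 * n + 1 entries that starts and ends in English.
    """
    chain = ["English"]
    for _ in range(iterations):
        chain.extend([method, "English"])
    return chain


def enumerate_datasets(methods=METHODS, iterations=ITERATIONS):
    """List every (method, iteration) pair, one per derived gallery dataset."""
    return [(m, n) for m in methods for n in iterations]


if __name__ == "__main__":
    for method, n in enumerate_datasets():
        print(method, n, "-".join(round_trip_chain(method, n)))
```
</code></p>
<p>With three methods and three iterations, the enumeration yields nine (method, iteration) pairs, six of which are ILT pairs, matching the dataset counts given above.</p></div>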
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Experiment I: Author Concealment via ILT/IP</head><p>For Experiment I, the feature extractor used in <ref type="bibr" target="#b9">Mack, Bowers, Williams, Dozier, and Shelton (2015)</ref>, referred to as the Hybrid-II Author Identification System (AIS), was applied to the instances of the nine datasets (and the probe set) to create feature vectors, each consisting of 1282 features. The Hybrid-II AIS is composed of 95 features from the Unigram feature extractor <ref type="bibr" target="#b6">(Forsyth 1997)</ref>, 170 stylometric features from the feature extractor of De Vel, Anderson, Corney, and Mohay (2001), 256 features in the form of function words, and 761 features that come from the Stanford Parser in the form of part-of-speech parent-child pairs, for a total of 1282 features.</p><p>In Experiment I, the baseline performance was the author recognition rate of the 100 authors (English only) using no ILT/IP iterations. The ILT/IP experiments were used to determine how well ILT/IP reduces the author recognition rate with respect to the baseline.</p></div>
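<div xmlns="http://www.tei-c.org/ns/1.0"><p>The composition of the 1282-dimensional feature vector can be checked with a short Python sketch. The group sizes come from the description above; the dictionary keys and the concatenate helper are hypothetical labels chosen for this illustration and do not reflect how the Hybrid-II AIS is implemented.</p>
<p><code lang="python">
```python
# Feature-group sizes as reported in the text; the keys are hypothetical
# labels chosen for this sketch.
FEATURE_GROUPS = {
    "unigram": 95,
    "stylometric": 170,
    "function_words": 256,
    "pos_parent_child_pairs": 761,
}


def concatenate(groups):
    """Concatenate per-group feature lists into one flat vector,
    checking each group against its expected size."""
    vector = []
    for name, size in FEATURE_GROUPS.items():
        values = list(groups[name])
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        vector.extend(values)
    return vector


if __name__ == "__main__":
    # Dummy all-zero features, used only to check the vector length.
    dummy = {name: [0.0] * size for name, size in FEATURE_GROUPS.items()}
    print(len(concatenate(dummy)))  # prints 1282
```
</code></p></div>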
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Experiment II: Fingerprinting the Translators and the Paraphrasers</head><p>For Experiment II, a tool known as JGAAP (Java Graphical Authorship Attribution Program) <ref type="bibr" target="#b7">(Juola, Sofko, and Brennan 2006)</ref> was used to fingerprint the translators and the paraphraser. This tool allows for text analysis using various stylometry and textometry techniques. We used the first 100 authors from each ILT/IP iteration, using the first gallery instance as the 'unknown' author and the remaining two instances from the gallery as the 'known' authors. The 'known' authors were labeled by language and/or paraphraser. This was done for all three iterations of ILT/IP. The analysis was performed using WEKA SMO, with the results ordered with event culling from most to least. Character n-grams, with n = 2, were used as the event driver.</p></div>
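<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the event driver concrete, the following Python sketch extracts overlapping character n-grams (n = 2 by default) and their relative frequencies. It is a simplified stand-in and an assumption on our part: it does not call JGAAP or WEKA, and JGAAP's own preprocessing and event culling may differ. The function names are hypothetical.</p>
<p><code lang="python">
```python
from collections import Counter


def char_ngrams(text, n=2):
    """Return all overlapping character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def relative_frequencies(text, n=2):
    """Map each character n-gram to its relative frequency in text."""
    counts = Counter(char_ngrams(text, n))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


if __name__ == "__main__":
    sample = "the cat sat on the mat"
    top = sorted(relative_frequencies(sample).items(), key=lambda kv: -kv[1])
    print(top[:5])
```
</code></p>
<p>Frequency profiles of this kind could then be passed to any classifier; the experiments above used WEKA SMO for that step.</p></div>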
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Experiment III: Fingerprinting the Number of Iterations Used to Conceal an Author's Writing Style</head><p>In Experiment III, the 'unknown' authors were chosen from the first gallery instances of all Iterations of ILT/IP. The 'known' authors were chosen from the remaining two instances of the gallery and were labeled by the number of ILT/IP Iterations that were applied. The same settings as Experiment II were used with respect to the event driver, analysis, and event culling.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Results of Experiment I</head><p>The results of Experiment I, Author Concealment via ILT/IP, are shown in Figure <ref type="figure" target="#fig_0">1</ref>. Figure <ref type="figure" target="#fig_0">1</ref> shows the effect that ILT/IP has on the accuracy of the AIS. In Figure <ref type="figure" target="#fig_0">1</ref>, the x-axis represents the iteration number (Iteration 1, Iteration 2, Iteration 3) and the y-axis represents the accuracy of the AIS.</p><p>In Figure <ref type="figure" target="#fig_0">1</ref>, the accuracy of the AIS is 54%. In the first iteration of ILT/IP, the author identification rates drop. At Iteration 1, ILT-Spanish has the best performance, reducing the AIS rate to 6%, followed by IP at 7% and ILT-Chinese at 10%. In the second iteration, IP has the best performance, reducing the AIS rate to 1%, followed by ILT-Spanish at 6% and ILT-Chinese at 11%. At Iteration 3, IP continues to outperform ILT, reducing the AIS rate to 6%, followed by ILT-Spanish at 7% and ILT-Chinese at 11%. These results show the effectiveness of ILT/IP in concealing an author's identity.</p><p>Prior research <ref type="bibr" target="#b3">(Caliskan and Greenstadt 2012;</ref> <ref type="bibr" target="#b8">Kacmarcik and Gamon 2006;</ref> <ref type="bibr" target="#b9">Rao and Rohatgi 2000)</ref> suggests that ILT/IP is naïve as well as problematic because the resulting text is unable to retain its original meaning. In order to address this issue, we applied Latent Semantic Analysis (LSA) to all iterations of the dataset. Latent Semantic Analysis (LSA) "…is a theory and method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text" <ref type="bibr" target="#b9">(Landauer, Foltz, and Laham 1998)</ref>. Using an LSA tool developed by the University of Colorado Boulder, we compared our original text with the resulting text of ILT/IP.</p><p>Table <ref type="table" target="#tab_0">1</ref> shows the results of using the LSA tool on our dataset. Given two samples of text, the LSA tool outputs 1 if the semantics of the two text samples match exactly and -1 if they do not match at all. Given the output of the LSA tool on our dataset, we ran an ANOVA test as well as a t-test to break the performances of ILT/IP into equivalence classes, as shown in Table <ref type="table" target="#tab_0">1</ref>.</p><p>In Table <ref type="table" target="#tab_0">1</ref>, the first column represents the ILT/IP method used, the second column represents the average output of the LSA tool with the standard deviation in parentheses, and the third column, labeled EC, represents the equivalence class. The equivalence classes are ordered from best to worst in terms of performance. The equivalence classes were determined by applying ANOVA and a t-test to check for statistical significance. The significance level used for the ANOVA test was 0.05.</p><p>The results displayed in Table <ref type="table" target="#tab_0">1</ref> show that the resulting text from ILT-Spanish is closest to the semantics of the original text with an output of 0.862, followed by IP at 0.802 and ILT-Chinese at 0.773. This indicates that ILT/IP is not only non-problematic but effective at preserving the semantics of the original text.</p></div>
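<div xmlns="http://www.tei-c.org/ns/1.0"><p>The output range of the LSA tool, from -1 to 1, is what one would expect from a cosine similarity between two document vectors. The text does not state how the tool computes its score, so the Python sketch below is an assumption offered only to illustrate that range. It takes two already-computed vectors and does not perform the LSA projection itself; the function name is hypothetical.</p>
<p><code lang="python">
```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))   # same direction
    print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```
</code></p>
<p>Under this assumption, vectors pointing in the same direction score close to 1 and vectors pointing in opposite directions score -1, mirroring the two extremes described above.</p></div>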
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Results of Experiment II</head><p>The results of Experiment II, Fingerprinting the Translators and the Paraphrasers, are shown in Figure <ref type="figure" target="#fig_1">2</ref>. In Figure <ref type="figure" target="#fig_1">2</ref>, the x-axis shows the iterations (Iteration 1, Iteration 2, Iteration 3) and the y-axis shows the accuracy in determining the ILT/IP method used. In Figure <ref type="figure" target="#fig_1">2</ref>, one can see that as the number of iterations increases, so does the fingerprinting accuracy for each ILT/IP method.</p><p>In Figure <ref type="figure" target="#fig_1">2</ref>, at Iteration 1, ILT-Spanish has the best fingerprinting accuracy at 93%, followed by ILT-Chinese at 90% and IP at 86%. In Iteration 2, ILT-Spanish leads at 98%, followed by ILT-Chinese at 97% and IP at 91%. In Iteration 3, ILT-Chinese comes in at 99%, followed by ILT-Spanish at 98% and IP at 95%. The results not only show that the translators can be accurately fingerprinted, but they also show that, of the three, IP is hardest to fingerprint, but only at the first iteration. On the other hand, these results show that the translators and the paraphraser can be identified, which can potentially allow for reversibility or the uncovering of the original text, thus revealing an author's writing style.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Results of Experiment III</head><p>The results of Experiment III, Fingerprinting the Number of Iterations Used to Conceal an Author's Writing Style, are shown in Figure <ref type="figure" target="#fig_2">3</ref>. In Figure <ref type="figure" target="#fig_2">3</ref>, the x-axis shows the iterations (Iteration 1, Iteration 2, Iteration 3) and the y-axis shows the accuracy with which an iteration of ILT/IP is fingerprinted. Figure <ref type="figure" target="#fig_2">3</ref> shows that determining which iteration of ILT/IP was applied to a given text proves to be more difficult; however, the accuracy rises over iterations.</p><p>In Figure <ref type="figure" target="#fig_2">3</ref>, at Iteration 1, ILT-Spanish leads at 70%, followed by ILT-Chinese at 61% and IP at 47%. At Iteration 2, IP performs best at 31%, followed by ILT-Spanish at 18% and ILT-Chinese at 15%. At Iteration 3, ILT-Chinese is the best performer at 60%, followed by ILT-Spanish at 53% and IP at 49%, making IP the worst performer. The results show that fingerprinting ILT/IP by iteration is harder but not impossible, thus allowing an original text and author to be revealed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Discussion: The Development of AuthorCAAT</head><p>The results presented earlier show that translators and paraphrasers can be fingerprinted. Even the iterations can be fingerprinted. In order to conceal one's identity in an efficient and effective way, the authors believe that a system must be developed that will allow a user to use all of the author concealment methods mentioned in this paper simultaneously while authoring a text. The Center for Advanced Studies in Identity Sciences (CASIS) has developed such a system for author concealment known as AuthorCAAT (Author Cyber Analysis &amp; Advisement Tool).</p><p>Figure <ref type="figure">5</ref> provides a screenshot of AuthorCAAT. AuthorCAAT has a window that allows an author to type in text. As the author types, their writing style is analyzed. The feature vector associated with their writing style is shown just below the window. To the right of the window is a pane that displays the author samples that match the sample written within the window, based on a percentage the user specifies with the slide bar. For example, if the slide bar is at '10', the pane will display the authors whose writing samples are within the closest 10% to the author sample that was typed in the window.</p><p>Below the 'Matches to' pane is a drop-down box that allows an author to translate what is currently in the window into Spanish or Chinese, or to paraphrase it, and then back to English. Once a language or the paraphraser has been selected, the user (author) presses the 'Translate' button to execute one cycle of ILT/IP on the text currently within the author window. In Figure <ref type="figure">5</ref>, one can see that AuthorCAAT allows a user to perform both forms of Adversarial Stylometry. If the user sees that their writing style is detected and shown in the pane, then they can choose to rewrite their text in such a way that it is not shown in the pane. A user can also monitor the pane in an effort to perform imitation authorship. As long as a particular author ID is shown in the pane (while their own author ID is not in the pane), they are writing like that particular author.</p><p>Finally, AuthorCAAT allows for ILT/IP at the sentence level. For example, an author can type in the first sentence and apply ILT/IP to that sentence. After this, the author can add a second sentence and then apply ILT/IP to both sentences in the window and/or edit the resulting sentences further (Adversarial Stylometry).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Conclusions and Future Work</head><p>In this paper, we have shown that ILT/IP dramatically reduces the author recognition rate. Secondly, the translators and the paraphraser are good enough to preserve the semantics of the original text, based on the LSA results in Table <ref type="table" target="#tab_0">1</ref>. Thirdly, not only can language translators be fingerprinted, but paraphrasers can be fingerprinted too. Lastly, we show that the iteration of a particular ILT/IP method can be fingerprinted as well. This all leads to the development of a tool, AuthorCAAT, that can apply all of these techniques at the sentence level. This will make fingerprinting more difficult. Our future work will include increasing our dataset from 100 to 1000 authors to see if the fingerprinting becomes more accurate with more authors in terms of ILT/IP. We suspect the accuracy of fingerprinting iterations at Iteration 1 and 2 will increase with the number of authors analyzed. This is in contrast to what was stated in <ref type="bibr" target="#b3">Caliskan and Greenstadt (2012)</ref>.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: A Comparison of the Effectiveness of ILT/IP on Reducing Author Recognition Rates</figDesc><graphic coords="3,66.94,71.39,235.86,121.12" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: A Fingerprinting Analysis of ILT/IP over 3 Iterations</figDesc><graphic coords="3,319.22,442.30,242.93,136.64" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: A Fingerprinting Analysis of the Number of Iterations of ILT/IP over 3 Iterations</figDesc><graphic coords="4,66.94,202.56,238.14,125.46" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: AuthorCAAT</figDesc><graphic coords="4,319.22,120.44,232.21,221.49" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>LSA Results from Comparing the Original Text with Resulting Text from ILT/IP</figDesc><table><row><cell>ILT/IP Method</cell><cell>LSA Results: mean (standard deviation)</cell><cell>EC</cell></row><row><cell>Spanish</cell><cell>0.862 (0.11)</cell><cell>1</cell></row><row><cell>Paraphraser</cell><cell>0.802 (0.09)</cell><cell>2</cell></row><row><cell>Chinese</cell><cell>0.773 (0.16)</cell><cell>3</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This research is based upon work supported by the United States Government including the National Science Foundation. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Detecting hoaxes, frauds, and deception in writing style online</title>
		<author>
			<persName><forename type="first">S</forename><surname>Afroz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Brennan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Greenstadt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Security and Privacy (SP), 2012 IEEE Symposium on</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2012-05">May 2012</date>
			<biblScope unit="page" from="461" to="475" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Practical Attacks Against Authorship Recognition Techniques</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R</forename><surname>Brennan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Greenstadt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IAAI</title>
				<imprint>
			<date type="published" when="2009-07">July 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Adversarial stylometry: Circumventing authorship recognition to preserve privacy and anonymity</title>
		<author>
			<persName><forename type="first">M</forename><surname>Brennan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Afroz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Greenstadt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Information and System Security (TISSEC)</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page">12</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Translate once, translate twice, translate thrice and attribute: Identifying authors and machine translation tools in translated text</title>
		<author>
			<persName><forename type="first">A</forename><surname>Caliskan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Greenstadt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Semantic Computing (ICSC), 2012 IEEE Sixth International Conference on</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2012-09">September 2012</date>
			<biblScope unit="page" from="121" to="125" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Mining e-mail content for author identification forensics</title>
		<author>
			<persName><forename type="first">O</forename><surname>De Vel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Anderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Corney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Mohay</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Sigmod Record</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="55" to="64" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Short substrings as document discriminators: An empirical study</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Forsyth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACH-ALLC</title>
		<imprint>
			<biblScope unit="volume">97</biblScope>
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A prototype for authorship attribution studies</title>
		<author>
			<persName><forename type="first">P</forename><surname>Juola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sofko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Brennan</surname></persName>
		</author>
		<ptr target="https://translate" />
	</analytic>
	<monogr>
		<title level="m">Teachers, Scholars, Educators, Scientists, Essayists, Writers. Free TurnItIn and Copyscape Alternative</title>
				<imprint>
			<date type="published" when="2006">February 02, 2016. February 04, 2016. 2006</date>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="page" from="169" to="178" />
		</imprint>
	</monogr>
	<note>Free Online Plagiarism Checker for Students</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Obfuscating document stylometry to preserve author anonymity</title>
		<author>
			<persName><forename type="first">G</forename><surname>Kacmarcik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gamon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the COLING/ACL on Main conference poster sessions</title>
				<meeting>the COLING/ACL on Main conference poster sessions</meeting>
		<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">The Best Way to a Strong Defense is a Strong Offense: Mitigating Deanonymization Attacks via Iterative Language Translation</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Landauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">W</forename><surname>Foltz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Laham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Narayanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Paskov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">Z</forename><surname>Gong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bethencourt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Stefanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">C R</forename><surname>Shin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nathan</forename><surname>Mack</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jasmine</forename><surname>Bowers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Henry</forename><surname>Williams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gerry</forename><surname>Dozier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joseph</forename><surname>Shelton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Rao</surname></persName>
		</author>
		<author>
			<persName><surname>Rohatgi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Can pseudonymity really guarantee privacy? Paper presented at the USENIX Security Symposium</title>
				<imprint>
			<publisher>LSA @ CU Boulder</publisher>
			<date type="published" when="1998-02-02">1998. February 02, 2016. 2012. May. 2015. 2000</date>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="409" to="413" />
		</imprint>
	</monogr>
	<note>Security and Privacy (SP), 2012 IEEE Symposium on</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">A survey of modern authorship attribution methods</title>
		<author>
			<persName><forename type="first">E</forename><surname>Stamatatos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Society for information Science and Technology</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="538" to="556" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
