<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Detection of Social Network Toxic Comments with Usage of Syntactic Dependencies in the Sentences</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Serhiy</forename><surname>Shtovba</surname></persName>
							<email>shtovba@vntu.edu.ua</email>
							<idno type="ORCID">0000-0003-1302-4899</idno>
							<affiliation key="aff0">
								<orgName type="institution">Vinnytsia National Technical University</orgName>
								<address>
									<addrLine>Khmelnytske Shose, 95</addrLine>
									<postCode>21021</postCode>
									<settlement>Vinnytsia</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Olena</forename><surname>Shtovba</surname></persName>
							<idno type="ORCID">0000-0003-1418-4907</idno>
						</author>
						<author>
							<persName><forename type="first">Mykola</forename><surname>Petrychko</surname></persName>
							<idno type="ORCID">0000-0001-6836-7843</idno>
						</author>
						<title level="a" type="main">Detection of Social Network Toxic Comments with Usage of Syntactic Dependencies in the Sentences</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">6BB2AABACC99C377A6FE4EFF8DC81146</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T08:31+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>natural language processing</term>
					<term>syntactic dependencies</term>
					<term>toxic comments</term>
					<term>social network</term>
					<term>machine learning</term>
					<term>features selection</term>
					<term>balanced accuracy</term>
					<term>decision tree</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Social networks sometimes become a medium for threats, insults and other components of cyberbullying. A huge number of people are involved in online social networks. Hence, protecting network users from anti-social behavior is an important activity. One of the major tasks of such activity is the automated detection of toxic comments containing threats, insults, obscenities, etc. Bag-of-words statistics and bag-of-symbols statistics are typical features for toxic comment detection. This article studies, for the first time, the effect of syntactic dependencies in sentences on the quality of detecting toxic comments in social networks. Syntactic dependencies are relationships with proper nouns, personal pronouns, possessive pronouns, etc. In total, twenty syntactic features of sentences have been examined. The paper shows that 3 additional specific features significantly improve the quality of toxic comment detection. These three features are: the number of dependencies with proper nouns in the singular, the number of dependencies that contain bad words, and the number of dependencies between personal pronouns and bad words. The experiments are based on data from the Kaggle competition "Toxic Comment Classification Challenge". For our experiments, the original data set of 159751 comments was reduced to 106590 comments because of problems with the automatic extraction of the syntactic features. Because the data set is unbalanced, we use the mean of the error rates for each type of misclassification as the quality metric. A decision tree is used as the classifier. The decision trees were synthesized with two splitting rules: the Gini index and the deviance criterion.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Social networks sometimes become a place for threats, insults and other components of cyberbullying. A huge number of people are involved in online social networks. Hence, protecting network users from anti-social behavior is an important activity. One of the major tasks of such activity is the automated detection of toxic comments. Toxic comments are textual comments containing threats, insults, obscenities, racism, etc.</p><p>Various techniques are used for the automatic detection of toxic comments. Bag-of-words statistics and bag-of-symbols statistics are the typical source information for toxic comment detection. Usually the following statistics-based features are used: the length of the comment, the number of capital letters, the number of exclamation marks, the number of question marks, the number of spelling errors, the number of tokens with non-alphabetic symbols, the number of abusive, aggressive, and threatening words in the comment, etc. <ref type="bibr" target="#b0">[1]</ref>. A high count of bad words in a comment increases the chance of classifying it as toxic. However, using bad-word statistics raises some difficulties. Some out-of-vocabulary words are produced by typos and spelling errors. Authors of toxic comments often distort their bad words on purpose. They convert the bad words to phonetically identical forms by replacing letter combinations: oo with u, for with 4, too with 2, etc. Another variant is distortion to visually similar forms, for example, 5h1t, b!tch, b1tch. Scientists develop special technologies for detecting masked bad words <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>, but vandals have an advantage in time and in numbers. In addition to analyzing separate keywords, some methods take into account the order of the words in sentences. 
For example, the authors of <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref> used an n-gram-based approach, but such modeling does not reflect all the relations in a sentence.</p><p>The aim of the paper is to study the effect of syntactic dependencies in sentences on the quality of detecting toxic comments in social networks. Syntactic dependencies are relationships with proper nouns, personal pronouns, possessive pronouns, etc. In contrast to the n-gram method and the naive Bayesian approach, a model based on syntactic dependencies is not tied directly to the training set vocabulary. All the various proper names, personal pronouns, and possessive pronouns are allocated to separate groups. This makes it possible to use vocabulary-free generalized features in the model. A new instance of such a group in the test set will not affect the modeling negatively. We use the information technology from <ref type="bibr" target="#b5">[6]</ref> to extract the syntactic features from the data set. We compare the results of toxic comment detection on two sets of features. The first set consists of typical features based on bag-of-words statistics and bag-of-symbols statistics. The second is an extended set that contains the typical features together with the syntactic features. The experiments are performed on the "Toxic Comment Classification Challenge" data set.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Data sets and preprocessing</head><p>The "Toxic Comment Classification Challenge" data set was collected by the Conversation AI team, a research initiative founded by Jigsaw and Google, both part of Alphabet. The data set is used in a Kaggle competition <ref type="bibr" target="#b6">[7]</ref>. It consists of 159751 Wikipedia comments that have been labeled by human raters for toxic behavior. Most of the comments are in English <ref type="bibr" target="#b7">[8]</ref>.</p><p>Each comment is manually categorized with 6 binary labels: toxic, severe toxic, obscene, threat, insult, and identity hate. Some comments have toxic multiplicity, i.e. a comment belongs to 2, 3, or even all 6 toxic categories simultaneously (Figure <ref type="figure" target="#fig_0">1</ref>). A comment may also be neutral, i.e. it does not belong to any toxic category. For example, the comment "Your vandalism to the Matt Shirvington article has been reverted. Please don't do it again, or you will be banned." is neutral. The comment "Hi! I am back again! Last warning! Stop undoing my edits or die!" is labeled toxic and threat, and the comment "Would you both shut up, you don't run Wikipedia, especially a stupid kid." is labeled toxic and insult. </p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Distribution of multiplicity (m) of toxic comments</figDesc></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Figure <ref type="figure">3</ref> shows the combinations of toxic categories within one comment. The set of toxic comments with the same category is represented by a colored square, with one color per toxic category. The area of a square equals the number of comments with that toxic category. The intersection of two squares reflects the number of comments that belong to both toxic categories simultaneously. Figure <ref type="figure">3</ref> shows that all the severe toxic comments also belong to the toxic category: the blue square is completely inside the red square. Also, almost all the severe toxic comments are labeled obscene and insult. There are 3 categories that intersect very little: severe toxic, threat, and identity hate. Few comments belong simultaneously to two of these three categories. Figure <ref type="figure">3</ref> also shows the degree of similarity of two finite sets in the form of the Jaccard index (k_j). It is calculated as the cardinality of the intersection of the sets divided by the cardinality of the union of the sets. In our case the Jaccard index corresponds to the ratio of the area of the intersection of two squares to the area of their union.</p></div>
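<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Jaccard index described above can be illustrated with a minimal Python sketch. The function name jaccard_index and the toy comment identifiers are hypothetical and are not taken from the original study; returning 0 for two empty sets is a convention chosen here.</p><p>
```python
def jaccard_index(a, b):
    """Cardinality of the intersection divided by cardinality of the union."""
    set_a, set_b = set(a), set(b)
    union = set_a.union(set_b)
    if not union:
        return 0.0  # convention assumed here for two empty sets
    return len(set_a.intersection(set_b)) / len(union)


# Toy identifiers of comments per category (illustrative only).
severe_toxic = {1, 2, 3}
toxic = {1, 2, 3, 4, 5, 6}
threat = {7, 8}

print(jaccard_index(severe_toxic, toxic))   # one set nested inside the other
print(jaccard_index(severe_toxic, threat))  # disjoint sets
```
</p><p>When one category is completely contained in another, as with severe toxic and toxic, the index reduces to the ratio of the two set sizes.</p></div>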
<figure xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Jaccard similarity indexes for various toxic categories</figDesc></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We propose to add several specific features to the typical feature set, which is based on bag-of-words and bag-of-symbols statistics. The specific features take into account some syntactic dependencies between the words in a comment. The specific features were extracted using the technology from <ref type="bibr" target="#b5">[6]</ref>. They were extracted automatically for 106590 comments. Feature extraction was unsuccessful for some comments because of non-English text and out-of-vocabulary words. As a result, the modified data set comprises 66.8% of the source data set. Neutral comments make up 87.2% of the modified data set. This is slightly less than in the source data set, where the share of neutral comments is 89.8%. The distributions of the comments over the toxic categories are almost equal for the two data sets (Table <ref type="table" target="#tab_0">1</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Features and quality metric</head><p>The following features are used for a formalized description of each comment:</p><p>x_1 is the number of words;</p><p>x_2 is the number of unique words;</p><p>x_3 is the ratio of unique words;</p><p>x_4 is the number of tokens without the stop-words;</p><p>x_5 is the number of spelling errors;</p><p>x_6 is the number of all-caps words;</p><p>x_7 is the ratio of all-caps words;</p><p>x_8 is the length of the comment;</p><p>x_9 is the number of capital letters;</p><p>x_{10} is the number of exclamation marks;</p><p>x_{11} is the number of question marks;</p><p>x_{12} is the number of punctuation marks;</p><p>x_{13} is the number of masking symbols (*, &amp;, $, %);</p><p>x_{14} is the number of happy smileys;</p><p>x_{15} is the ratio of exclamation marks;</p><p>x_{16} is the ratio of question marks;</p><p>x_{17} is the ratio of spaces;</p><p>x_{18} is the ratio of capital letters;</p><p>x_{19} is the ratio of lowercase letters;</p><p>x_{20} is the number of the comment's words that are included in the bad word list at https://www.cs.cmu.edu/~biglou/resources/bad-words.txt;</p><p>x_{21} is the number of the comment's words that are included in the swear word list at http://www.bannedwordlist.com;</p><p>x_{22} is the number of the comment's words that are included in the Facebook blacklist at https://www.frontgatemedia.com/a-list-of-723-bad-words-to-blacklist-and-how-to-use-facebooks-moderation-tool/;</p><p>x_{23} is the number of the comment's words that are included in the Google blacklist at https://www.freewebheaders.com/full-list-of-bad-words-banned-by-google/;</p><p>x_{24} is the number of the comment's words that are included in the naughty word list at https://gist.github.com/ryanlewis/a37739d710ccdb4b406d;</p><p>x_{25} is the number of the comment's words that are included in <ref type="bibr" target="#b4">5</ref> [...]</p><p>x_{35} is the number of dependencies between proper nouns in the singular and the words from dependencies with denial;</p><p>x_{36} is the number of dependencies between proper nouns in the plural and the words from dependencies with denial;</p><p>x_{37} is the number of dependencies between personal pronouns and the words from dependencies with denial;</p><p>x_{38} is the number of dependencies between possessive pronouns and the words from dependencies with denial;</p><p>x_{39} is the number of dependencies that contain the bad words;</p><p>x_{40} is the number of dependencies with denial that contain the bad words;</p><p>x_{41} is the number of dependencies between proper nouns in the singular and the bad words;</p><p>x_{42} is the number of dependencies between proper nouns in the plural and the bad words;</p><p>x_{43} is the number of dependencies between personal pronouns and the bad words;</p><p>x_{44} is the number of dependencies between possessive pronouns and the bad words;</p><p>x_{45} is the number of dependencies between pronouns and the bad words.</p></div>
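<div xmlns="http://www.tei-c.org/ns/1.0"><p>A few of the typical statistics-based features listed above can be computed as in the following minimal Python sketch. The paper does not specify the tokenization or the exact counting rules, so whitespace tokenization, case-insensitive uniqueness, and the use of string.punctuation are assumptions; the function name typical_features and the dictionary keys are hypothetical.</p><p>
```python
import string


def typical_features(comment):
    """Compute a few bag-of-words and bag-of-symbols statistics for one comment."""
    words = comment.split()  # assumption: tokens are separated by whitespace
    unique_words = {w.lower() for w in words}  # assumption: case-insensitive
    n_words = len(words)
    return {
        "x1_words": n_words,
        "x2_unique_words": len(unique_words),
        "x3_unique_ratio": len(unique_words) / n_words if n_words else 0.0,
        "x6_all_caps_words": sum(1 for w in words if w.isupper()),
        "x8_length": len(comment),
        "x9_capital_letters": sum(1 for ch in comment if ch.isupper()),
        "x10_exclamation_marks": comment.count("!"),
        "x11_question_marks": comment.count("?"),
        "x12_punctuation_marks": sum(1 for ch in comment if ch in string.punctuation),
    }


features = typical_features("Stop undoing my edits or die!")
print(features["x1_words"], features["x10_exclamation_marks"])
```
</p><p>The dictionary-based and syntactic features require external word lists and a dependency parser, so they are not covered by this sketch.</p></div>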
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Twenty specific features, x_{26} to x_{45}, are examined for toxic comment detection for the first time. Let us reduce the original Kaggle task of categorizing toxic comments to a classification task with two alternatives: a neutral comment and a general toxic comment. This makes it easy to check how informative the proposed syntactic features are.</p><p>The data set is unbalanced, with a class proportion of about 9 to 1. Hence, the misclassification rate is not a suitable quality metric for the classifier. Following <ref type="bibr" target="#b8">[9]</ref>, we use the balanced accuracy approach. The quality metric of the classifier is as follows:</p><formula xml:id="formula_0">Q_{aver} = \frac{P_{tn} + P_{nt}}{2},</formula><p>where P_{nt} denotes the probability of n→t classification errors, when a neutral comment is recognized as a general toxic comment; P_{tn} denotes the probability of t→n classification errors, when a general toxic comment is recognized as a neutral comment. Q_{aver} is the mean of the probabilities of the two types of misclassification. It is a simple and interpretable metric for examining a classifier on an unbalanced data set.</p></div>
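<div xmlns="http://www.tei-c.org/ns/1.0"><p>The mean of the two error probabilities can be sketched in Python as follows. The sketch assumes that each probability is estimated as the error rate within the corresponding true class, which is the usual reading of the balanced accuracy approach; the function name q_aver and the 0/1 label encoding are hypothetical.</p><p>
```python
def q_aver(y_true, y_pred, neutral=0, toxic=1):
    """Mean of the n-to-t and t-to-n error rates (lower is better)."""
    pairs = list(zip(y_true, y_pred))
    n_neutral = sum(1 for t, _ in pairs if t == neutral)
    n_toxic = sum(1 for t, _ in pairs if t == toxic)
    if n_neutral == 0 or n_toxic == 0:
        raise ValueError("both classes must be present in y_true")
    # Neutral comments recognized as toxic, relative to all neutral comments.
    p_nt = sum(1 for t, p in pairs if t == neutral and p == toxic) / n_neutral
    # Toxic comments recognized as neutral, relative to all toxic comments.
    p_tn = sum(1 for t, p in pairs if t == toxic and p == neutral) / n_toxic
    return (p_tn + p_nt) / 2


# A classifier that labels everything neutral on a 9-to-1 data set.
y_true = [0] * 9 + [1]
y_pred = [0] * 10
print(q_aver(y_true, y_pred))
```
</p><p>In this toy case the plain misclassification rate is only 0.1, while the mean of the per-class error rates is 0.5, which shows why the latter is more informative on unbalanced data.</p></div>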
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Computational experiments</head><p>A decision tree is used as the classifier. We chose this kind of classifier for the following reasons: 1) synthesis of a decision tree is fast even for a large training set, so several experiments can be carried out; 2) feature selection is carried out during decision tree synthesis, so it is easy to check how informative the proposed syntactic features are. We divide the data set into training data and test data. The test set consists of every sixth comment. The remaining comments form the training set. Thus, the test set contains 17765 comments and the training set contains 88825 comments. We use the training data for decision tree synthesis. After this, the decision tree is pruned to minimize Q_{aver} on the test set. We check two sets of features: the typical set, x_1 to x_{25}, and the extended set, x_1 to x_{44}. The class distribution is rebalanced by sampling in a way that increases the weight of minority class objects. We suppose that correct classification of a comment with high toxic multiplicity is more important than that of a comment with low toxic multiplicity. The weight w of a toxic comment C is defined by the following heuristic formula:</p><formula xml:id="formula_1">w(C) = b + m(C),</formula><p>where b denotes the bias of the toxic comment weight and <formula xml:id="formula_2">m(C) \in \{1, 2, \ldots, 6\}</formula> denotes the toxic multiplicity of comment C. Figure <ref type="figure" target="#fig_1">4</ref> shows that the extended set of features significantly improves the classifier quality. The best tree correctly detects almost all the comments with high and average toxic multiplicities (Figure <ref type="figure">6</ref>). The best tree also correctly detects almost all the toxic comments with the labels severe toxic, obscene, and identity hate (Figure <ref type="figure">7</ref>).</p><p>Let us analyze the 5 best trees. 
All the trees use the following features: x_3 to x_9, x_{15}, x_{17} to x_{19}, x_{22}, x_{24} to x_{26}, x_{39}, and x_{43}. Four out of the five trees additionally use feature x_1. Among their most important features are 3 new syntactic ones: the number of dependencies with proper nouns in the singular (x_{26}); the number of dependencies that contain the bad words (x_{39}); and the number of dependencies between personal pronouns and the bad words (x_{43}). We also point to the following 4 slightly less important features. Typical features x_2, x_{10}, and x_{12} appear in 2 out of the 5 best trees. Syntactic feature x_{28} is selected in 1 out of the 5 best trees. These 4 extra features may be used in more complicated models for toxic comment detection. </p></div>
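<div xmlns="http://www.tei-c.org/ns/1.0"><p>The data split and the comment weighting used in the experiments can be sketched in Python as follows. The sketch rests on several assumptions: the heuristic weight is read as w(C) = b + m(C), neutral comments are given weight 1, and "every sixth comment" is taken to mean the positions 6, 12, 18 and so on. The function names split_every_sixth and comment_weight are hypothetical.</p><p>
```python
def split_every_sixth(n_comments):
    """Return (train_indices, test_indices); every sixth comment is a test item."""
    test_idx = [i for i in range(n_comments) if i % 6 == 5]
    train_idx = [i for i in range(n_comments) if i % 6 != 5]
    return train_idx, test_idx


def comment_weight(multiplicity, bias):
    """Weight of a comment given its toxic multiplicity m(C) and the bias b."""
    if multiplicity == 0:
        return 1.0  # assumption: neutral comments keep unit weight
    return bias + multiplicity  # assumed reading of the heuristic formula


train_idx, test_idx = split_every_sixth(106590)
print(len(train_idx), len(test_idx))  # sizes of the training and test sets
print(comment_weight(3, 5.0))         # toxic comment with m(C) = 3 and b = 5.0
```
</p><p>For 106590 comments this split reproduces the reported set sizes of 88825 and 17765. In a library such as scikit-learn, weights of this kind could be passed through the sample_weight argument of DecisionTreeClassifier.fit; the paper does not state which software was used.</p></div>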
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>The problem of detecting toxic comments in social networks was considered. For our experiments we used the Kaggle data set "Toxic Comment Classification Challenge". Bag-of-words statistics and bag-of-symbols statistics are typical features for detecting toxic comments. The article studied the effect of syntactic dependencies in sentences on the quality of detecting toxic comments in social networks. Syntactic dependencies are relationships with proper nouns, personal pronouns, possessive pronouns, etc. In total, 20 syntactic features of sentences were checked. The novelty of the research lies in the experimental confirmation that 3 additional specific features significantly improve the quality of toxic comment detection. These three features are: the number of dependencies with proper nouns in the singular, the number of dependencies that contain bad words, and the number of dependencies between personal pronouns and bad words. Selecting only 3 specific features significantly reduces the computational complexity of comment preprocessing, since calculating all 20 specific features requires a lot of resources. Accordingly, with 3 specific features added to the typical set, toxic comments can be identified in real time with good quality.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Categories of the first 115 non-neutral comments. 16225 comments have toxic labels. The rest of the comments are neutral. The distribution of the comments over toxic multiplicities is presented in Figure 2. It shows that only comments with high toxic multiplicity are rarely encountered. Most of the toxic comments (60.8%) belong to several toxic categories (m&gt;1).</figDesc><graphic coords="3,255.60,494.40,83.88,91.56" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 4</head><label>4</label><figDesc>Figure 4 shows the dependence of the classifier quality on the bias of the toxic comment weight. The decision trees were synthesized with two splitting rules: the Gini index-based rule and the deviance criterion-based rule. The experiments show that the Gini index-based rule provides better decision trees. Q_{aver} is low when the bias of the toxic comment weight belongs to [4.5, 5.8]. Minimal value of Q_{aver}</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Experimental dependencies of toxic comments classifier quality</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 5 .</head><label>5</label><figDesc>Fig. 5. The best decision tree (1 -neutral comment, 2 -general toxic comment)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="4,161.64,255.36,271.92,233.04" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Source data sets and modified data sets</figDesc><table><row><cell>Category</cell><cell>Comments in source data set</cell><cell>Comments in modified data set</cell><cell>Share of source data set, %</cell></row><row><cell>Toxic</cell><cell>15294</cell><cell>12948</cell><cell>84.7</cell></row><row><cell>Severe toxic</cell><cell>1595</cell><cell>1492</cell><cell>93.5</cell></row><row><cell>Obscene</cell><cell>8449</cell><cell>7303</cell><cell>86.4</cell></row><row><cell>Threat</cell><cell>478</cell><cell>442</cell><cell>92.5</cell></row><row><cell>Insult</cell><cell>7877</cell><cell>6943</cell><cell>88.1</cell></row><row><cell>Identity hate</cell><cell>1405</cell><cell>1251</cell><cell>89.0</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgements. The authors thank Olexandr Yahimovych for extracting the syntactic features from the data set of toxic comments. This research is supported by the government scientific project 46-G-388 «Fuzzy logic and computational linguistics based the identification of hidden dependencies in online social networks».</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Anatomy of online hate: developing a taxonomy and machine learning models for identifying and classifying hate in online news media</title>
		<author>
			<persName><forename type="first">J</forename><surname>Salminen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceeding of the Twelfth International AAAI Conference on Web and Social Media</title>
				<meeting>the Twelfth International AAAI Conference on Web and Social Media</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="330" to="339" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Identifying Aggression and Toxicity in Comments using Capsule Network</title>
		<author>
			<persName><forename type="first">S</forename><surname>Srivastava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Khurana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tewari</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceeding of the First Workshop on Trolling, Aggression and Cyberbullying</title>
				<meeting>the First Workshop on Trolling, Aggression and Cyberbullying<address><addrLine>TRAC-</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="98" to="105" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Using Crowdsourcing to Improve Profanity Detection</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">O</forename><surname>Sood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Antin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">F</forename><surname>Churchill</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceeding of Association for the Advancement of Artificial Intelligence. Spring Symposium: Wisdom of the Crowd</title>
				<meeting>Association for the Advancement of Artificial Intelligence. Spring Symposium: Wisdom of the Crowd</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="69" to="74" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Is preprocessing of text really worth your time for toxic comment classification?</title>
		<author>
			<persName><forename type="first">F</forename><surname>Mohammad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceeding of International Conference on Artificial Intelligence</title>
				<meeting>International Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="447" to="453" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Detecting hate speech on the world wide web</title>
		<author>
			<persName><forename type="first">W</forename><surname>Warner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hirschberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second Workshop on Language in Social Media. Association for Computational Linguistics</title>
				<meeting>the Second Workshop on Language in Social Media. Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="19" to="26" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Development of the method for filtering verbal noise while search keywords for the English text</title>
		<author>
			<persName><forename type="first">O</forename><surname>Bisikalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yahimovich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yahimovich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Technology Audit and Production Reserves</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="33" to="41" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<ptr target="https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge" />
		<title level="m">Toxic Comment Classification Challenge</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Stop Illegal Comments: A Multi-Task Deep Learning Approach</title>
		<author>
			<persName><forename type="first">A</forename><surname>Elnaggar</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.06665</idno>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">The balanced accuracy and its posterior distribution</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">H</forename><surname>Brodersen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">S</forename><surname>Ong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">E</forename><surname>Stephan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Buhmann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th IEEE International Conference on Pattern Recognition</title>
				<meeting>the 20th IEEE International Conference on Pattern Recognition</meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="3121" to="3124" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
