<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Huiping Shi@HASOC 2020: Multi-Top 𝑘 Self-Attention with K-Max pooling for discrimination between Hate, profane and offensive posts</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Huiping</forename><surname>Shi</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">School of Information Science and Engineering</orgName>
								<orgName type="institution">Yunnan University</orgName>
								<address>
									<country>Yunnan, P.R. China</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Xiaobing</forename><surname>Zhou</surname></persName>
							<email>zhouxb@ynu.edu.cn</email>
							<affiliation key="aff0">
								<orgName type="department">School of Information Science and Engineering</orgName>
								<orgName type="institution">Yunnan University</orgName>
								<address>
									<country>Yunnan, P.R. China</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Huiping Shi@HASOC 2020: Multi-Top 𝑘 Self-Attention with K-Max pooling for discrimination between Hate, profane and offensive posts</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">199758D34A8BB6C5432D71DF483859FE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T13:49+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>HASOC 2020</term>
					<term>offensive language</term>
					<term>multi-Top 𝑘 Self-Attention</term>
					<term>K-Max pooling</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper describes our system submitted to HASOC 2020: Hate Speech and Offensive Content Identification. The purpose of the task is to identify offensive language in social media. We participated in subtasks A and B for English and German. Subtask A is a coarse-grained identification of hate and offensive content; subtask B distinguishes hate speech, offensive speech, and profane speech from a fine-grained perspective. To accomplish these subtasks, we propose a system based on a Multi-Top 𝑘 Self-Attention and K-Max pooling model, trained with the 𝑘-fold method. Our model achieves macro F1-scores of 0.5042 on subtask A in English, 0.2396 on subtask B in English, 0.5121 on subtask A in German, and 0.2736 on subtask B in German.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>With the development of information technology, social media provides people with increasingly convenient forms of communication. People can freely express their opinions on social networks. At the same time, some people use social networks to release one-sided emotions, guiding the public to attack the innocent, undermining objective discussion, and intensifying conflicts. A healthy language environment is the foundation of a harmonious atmosphere and the cornerstone of universal progress <ref type="bibr" target="#b0">[1]</ref>. To purify the Internet environment, we need to identify negative emotions on the Internet. Identifying hate speech, insulting, derogatory, or obscene content, and other negative emotional speech on social networks belongs to the research direction of emotion classification in natural language processing. Sentiment classification <ref type="bibr" target="#b1">[2]</ref> is one of the main classification tasks in natural language processing, and it has been a research hot spot at home and abroad in recent years. The task of sentiment classification is to help researchers quickly obtain, organize, and analyze relevant text information, and to analyze, summarize, and infer the emotion contained in the text. Traditionally, regular text is more amenable to the analysis, processing, induction, and reasoning of natural language processing: because regular text conforms to specific rules, finding those rules in a document is critical for natural language processing.</p><p>However, in emotion classification, the language used is more perceptual than rational, and many sensitive texts differ somewhat from regular texts. Most of the language used on social networks is related to personal life experience, and these texts are even affected by the writer's native language, yielding different modes of expression. Although texts on social media may be in the same language, their meanings can be quite different. Besides, the language on social media is updated very quickly, because frequently updated hot events lead netizens to reach a consensus on certain events. Facing this ultra-fast updating of social media texts, some scholars in natural language processing still insist on studying the characteristics of hate texts. From initial manual feature extraction, to the rule-based feature extraction of machine learning, and now to the feature extraction of neural networks, feature extraction has gone through a long and complicated research process.</p><p>The HASOC competition is an evaluation task to identify hate speech and offensive content in Indo-European languages <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4]</ref>. This year, HASOC provides two subtasks for each language. Subtask A: coarse-grained classification of hate, offensive, and profane content. Subtask B: fine-grained distinction between hate, profane, and offensive posts.</p><p>We participated in subtasks A and B in English and German, and the datasets discussed in this article come from HASOC. Based on deep learning, we developed an end-to-end neural network model that takes Multi-Top 𝑘 Self-Attention as its core and adds K-Max pooling. During training, we use 𝑘-fold cross-validation, which can alleviate data imbalance and overfitting, together with a batch generator. This model achieved good results on subtasks A and B in English and German at HASOC 2020.</p><p>The structure of this paper is as follows: in Section 2, we introduce related work on identifying hate speech and offensive language; in Section 3, we describe the dataset and the model structure in detail; in Section 4, we present the experimental results and data analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Natural language processing research on hate speech on social networks can be traced back to 2010. Gries et al. collected and annotated text on social media (tweets), completed a dataset of social media speech, and provided it to scholars who needed it for research <ref type="bibr" target="#b4">[5]</ref>. The scholars who made the dataset also held regular competitions and improved the dataset regularly to encourage more people to participate in this task. Dhillon et al. proposed the SAS model <ref type="bibr" target="#b5">[6]</ref>. SAS provides a direction for studying computer network methods, technologies, and systems. SAS uses natural language processing technology to identify sentence components (such as subject, verb, and object), disambiguate, and identify entities, so that it can recognize whether a basic relationship exists; it also provides one or more user interface tools for sentiment analysis. In 1997, Hochreiter et al. proposed Long Short-Term Memory (LSTM) <ref type="bibr" target="#b6">[7]</ref>, a milestone for natural language processing technology. Malmasi et al. combined feature representations (character n-grams, word n-grams, and word skip-grams) with a Support Vector Machine (SVM) classifier to distinguish hate speech on social media from general profanity, obtaining an accuracy of 0.78 <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b8">9]</ref>. Feature representation plays an important role in natural language processing. In 2018, building on this research, Malmasi et al. tried to integrate multiple natural language classification models on top of the feature representation model and achieved an accuracy of 0.80 on the three-way classification task <ref type="bibr" target="#b9">[10]</ref>.</p><p>The study of hate speech is not limited to feature representation <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref>. Research on negative polarity and emotional intensity <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14]</ref> and on hate classification features <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16]</ref> has also made great progress. In deep learning, there are some similarities between natural language processing and image processing. Some neural network researchers have applied the attention mechanism originally used for image processing to natural language processing, such as <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b17">18]</ref>. The attention mechanism turns out to be highly adaptable in natural language processing: it can be applied to various tasks and achieves better results than the neural network models originally used for natural language processing. In the attention mechanism, in addition to determining the 'value', the initial values of the 'key' and the 'query' must be set manually, which increases experimental error. In 2017, the self-attention mechanism was first proposed by Lin et al. <ref type="bibr" target="#b18">[19]</ref>; they set the value, key, and query to the same value, thereby reducing external influence and making the model self-optimizing. The self-attention mechanism is now used in various tasks. In 2019, building on the self-attention mechanism, Child et al. proposed a sparse self-attention mechanism <ref type="bibr" target="#b17">[18]</ref> that uses a filtering function to reduce computation time while maintaining good performance on various tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methods</head><p>We preprocess the dataset using word representations (FastText) to convert the text into vectors that the computer can process. Two Multi-Top 𝑘 Self-Attention layers are applied to refine the representative features of the text. After the model obtains the text features using K-Max pooling and the Tanh function, softmax and Tanh are used to calculate the score of each class. The model is shown in figure <ref type="figure" target="#fig_0">1</ref> </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Input Layer</head><p>The input layer accepts the preprocessed text data. The sorted data, formatted as the model expects, is fed into the model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Embedding Layer</head><p>This layer accepts the output of the Input layer and vectorizes the words in the existing dictionary using the pre-trained vector model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Encoder Layer</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.1.">Multi-Top 𝑘 Self-Attention</head><p>By 2019, the success of the Transformer <ref type="bibr" target="#b19">[20]</ref> was obvious to all, but its problems are also very obvious: the Transformer takes a non-discriminatory approach to feature representation. We first compute the attention scores</p><formula>𝑃 = 𝑄𝐾^𝑇 /√𝑑<label>(1)</label></formula><p>where 𝑑 is the first dimension of the Embedding. 𝑄, 𝐾, and 𝑉 all come from the input, but the weight matrices of their linear transformations are different. We multiply 𝑄 and 𝐾 by dot product to obtain the dependencies between input words; finally, all values are mapped to a space of dimension 𝑑. This gives the scores of the attention mechanism, and 𝑃 contains all the feature merit scores. Up to this point, our model is the same as the Transformer.</p><p>Then we start to filter the feature scores in 𝑃.</p><formula xml:id="formula_0">𝑃 ′ = 𝑇 𝑜𝑝 𝑘 − ℎ𝑒𝑎𝑝(𝑃)<label>(2)</label></formula><p>Here 𝑃 ′ is the filtered matrix. We use a heap algorithm <ref type="bibr" target="#b20">[21]</ref>: we maintain a min-heap of size 𝑘 and put the data into the heap in turn. Once the heap is full, we compare each new number with the top element; if it is larger, the current top element is discarded and the new number is inserted into the heap. Finally, all of the 𝑡𝑜𝑝 𝑘 values are in the heap. Note that 𝑃 ′ contains the same values as 𝑃: the purpose of this 𝑡𝑜𝑝 𝑘 operation is simply to mark the 𝑡𝑜𝑝 𝑘 elements in 𝑃. We maintain a matrix 𝑀 of size 𝑞 × 𝑘.</p><formula xml:id="formula_1">𝑀(𝑖, 𝑗) = { 1 𝑖𝑓 𝑃(𝑖, 𝑗) 𝑖𝑛 𝑡𝑜𝑝 𝑘 −∞ 𝑖𝑓 𝑃(𝑖, 𝑗) 𝑛𝑜𝑡 𝑖𝑛 𝑡𝑜𝑝 𝑘<label>(3)</label></formula><formula xml:id="formula_2">𝑆𝑐𝑜𝑟𝑒 = 𝑃 × 𝑀<label>(4)</label></formula><p>When 𝑃(𝑖, 𝑗) is in the 𝑡𝑜𝑝 𝑘 , 𝑀(𝑖, 𝑗) is 1; otherwise it is negative infinity. We then filter out all values except the 𝑡𝑜𝑝 𝑘 .</p><p>𝑇 𝑜𝑝 𝑘 𝑆𝑒𝑙𝑓 𝐴𝑡𝑡𝑒𝑛𝑡𝑖𝑜𝑛 = 𝑆𝑜𝑓 𝑡𝑚𝑎𝑥(𝑆𝑐𝑜𝑟𝑒) × 𝑉 (5)</p><p>At this point, we have completed one Attention block. We use the output of one Attention block as the input of the next. Finally, all values are combined.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><formula xml:id="formula_4">𝑀𝑢𝑙𝑡𝑖 − 𝑇 𝑜𝑝 𝑘 𝑆𝑒𝑙𝑓 𝐴𝑡𝑡𝑒𝑛𝑡𝑖𝑜𝑛 = ∑ 0&lt;ℎ&lt;=𝐻 𝑇 𝑜𝑝 𝑘 𝑆𝑒𝑙𝑓 𝐴𝑡𝑡𝑒𝑛𝑡𝑖𝑜𝑛 ℎ<label>(6)</label></formula></div>
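As an illustration, the top-𝑘 masking of Eqs. (2)-(5) can be sketched in NumPy. This is a minimal sketch under our stated assumptions (per-row top-𝑘 selection, with non-selected scores masked to −∞ before the softmax); the weight matrices and shapes are hypothetical, not the paper's actual configuration.

```python
import numpy as np

def topk_self_attention(x, k, w_q, w_k, w_v):
    # Q, K, V all come from the same input x, via different linear maps
    q, k_mat, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]
    p = q @ k_mat.T / np.sqrt(d)                  # attention scores (Eq. 1)
    # Keep only the k largest scores per row; mask the rest to -inf (Eqs. 2-4)
    kth = np.sort(p, axis=-1)[:, -k][:, None]
    p_masked = np.where(p >= kth, p, -np.inf)
    # Softmax over the surviving scores, then weight the values (Eq. 5)
    a = np.exp(p_masked - p_masked.max(axis=-1, keepdims=True))
    a = a / a.sum(axis=-1, keepdims=True)
    return a @ v
```

Summing this block over 𝐻 heads, as in Eq. (6), gives the Multi-Top 𝑘 variant.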
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.2.">K-Max pooling</head><p>In the previous operation, we concatenated the parameters from two Multi-𝑇 𝑜𝑝 𝑘 Self-Attention layers. At this point, we take advantage of the 𝑇 𝑜𝑝 𝑘 filtering mechanism again. The difference is that here we use the merge-sort algorithm <ref type="bibr" target="#b21">[22]</ref> to take the first 𝑘 values: the parameters along the dimension are combined and sorted by divide-and-conquer. That is,</p><formula xml:id="formula_5">𝑜𝑢𝑡𝑝𝑢𝑡 = 𝐾 − 𝑀𝑎𝑥𝑃𝑜𝑜𝑙𝑖𝑛𝑔(𝑀𝑢𝑙𝑡𝑖 − 𝑇 𝑜𝑝 𝑘 𝑆𝑒𝑙𝑓 𝐴𝑡𝑡𝑒𝑛𝑡𝑖𝑜𝑛)<label>(7)</label></formula></div>
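For illustration, K-Max pooling can be sketched as follows, assuming (as is common for this operation) that the 𝑘 largest activations per feature dimension are kept in their original sequence order:

```python
import numpy as np

def k_max_pooling(x, k):
    # Positions of the k largest values in each column (feature dimension)
    top = np.argsort(x, axis=0)[-k:, :]
    # Re-sort the positions so the kept values stay in sequence order
    top = np.sort(top, axis=0)
    return np.take_along_axis(x, top, axis=0)
```

For example, with x = [[1, 5], [3, 2], [2, 9], [4, 1]] and k = 2, the result is [[3, 5], [4, 9]].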
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Output Layer</head><p>In the output layer, to improve training efficiency, we use the 𝑇 𝑎𝑛ℎ function followed by the softmax function to calculate the probability output of each class. </p></div>
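Under our reading of this section, the output layer is a 𝑇 𝑎𝑛ℎ-activated projection whose outputs are normalized by softmax; a minimal sketch (weight shapes are hypothetical) is:

```python
import numpy as np

def output_layer(features, w, b):
    # Tanh-activated linear projection to per-class scores
    logits = np.tanh(features @ w + b)
    # Numerically stable softmax turns the scores into class probabilities
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)
```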
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiment and Result</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Result</head><p>In this experiment, we took part in subtasks A and B in English and German, as provided by HASOC, completing a total of four subtasks. We submitted the results on the platform given by HASOC <ref type="bibr" target="#b3">[4]</ref>, which uses macro F1 as the evaluation criterion. In these tasks, our model achieves macro F1-scores of 0.5042 on subtask A in English, 0.2396 on subtask B in English, 0.5121 on subtask A in German, and 0.2736 on subtask B in German. The results and rankings are shown in table <ref type="table" target="#tab_0">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Data distribution</head><p>The distribution of the data is shown in table <ref type="table">2</ref>. All data come from the HASOC platform. Subtask A in English and German identifies hate versus non-hate posts; posts fall into two categories: hate and offensive content (HOF) and normal text (NOT). Subtask B in English and German identifies hate, profane, and offensive posts; this subtask is a fine-grained classification of the English, German, and Hindi datasets, with four categories: hate speech (HATE), offensive (OFFN), profanity (PRFN), and normal text (NONE) <ref type="bibr" target="#b3">[4]</ref>. We can see that the total amount of data is small and unevenly distributed, which can easily lead to model overfitting. Because the amount of data is small, too few representative features can be obtained during training, so the model learns weak representations or features specific to individual texts, which are often irrelevant to the features we expect to extract. To address the irregular amount and distribution of the data, we use cross-validation: we shuffle the dataset into five parts and validate five times, each time taking one fifth of the dataset as the validation set.</p></div>
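The five-fold protocol just described can be sketched as follows (an illustrative pure-Python version; the seed and the handling of remainders are our assumptions):

```python
import random

def five_fold_splits(n_samples, seed=0):
    # Shuffle the sample indices once, then use each fifth as a validation set
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    fold = n_samples // 5
    for i in range(5):
        # The last fold absorbs the remainder when n_samples is not divisible by 5
        val = idx[i * fold:(i + 1) * fold] if i < 4 else idx[4 * fold:]
        val_set = set(val)
        train = [j for j in idx if j not in val_set]
        yield train, val
```

Each sample thus appears in exactly one validation set, and the corresponding training set contains all remaining samples.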
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Conclusion</head><p>In this paper, we describe an attention-mechanism model based on Multi-Top 𝑘 Self-Attention and K-Max pooling, which we applied to subtasks A and B in English and German at HASOC 2020, achieving good results in these subtasks. Unbalanced data distribution leads to poor model generalization and easy over-fitting; therefore, in the future, we will focus on research on handling data imbalance.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The architecture of Multi-Top 𝑘 Self-Attention with K-Max pooling for Subtask B in English</figDesc><graphic coords="4,129.64,84.19,336.00,187.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: The architecture of Multi-Top 𝑘 Self-Attention in figure 1</figDesc><graphic coords="4,175.49,310.80,244.30,73.85" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: The architecture of Attention in figure 2</figDesc><graphic coords="4,186.86,423.66,221.55,90.65" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc></figDesc><table><row><cell>Result</cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell>Tasks</cell><cell cols="4">precision recall f1-macro ranking</cell></row><row><cell>Task A in English</cell><cell>-</cell><cell>-</cell><cell>0.5042</cell><cell>6</cell></row><row><cell>Task B in English</cell><cell>-</cell><cell>-</cell><cell>0.2396</cell><cell>13</cell></row><row><cell>Task A in German</cell><cell>-</cell><cell>-</cell><cell>0.5121</cell><cell>7</cell></row><row><cell>Task B in German</cell><cell>-</cell><cell>-</cell><cell>0.2736</cell><cell>4</cell></row><row><cell>Table 2</cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell>Data distribution</cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell>Label</cell><cell cols="3">English TaskA German TaskA</cell><cell></cell></row><row><cell>HOF</cell><cell>1861</cell><cell></cell><cell>601</cell><cell></cell></row><row><cell>NOT</cell><cell>1953</cell><cell></cell><cell>1851</cell><cell></cell></row><row><cell>Label</cell><cell cols="3">English TaskB German TaskB</cell><cell></cell></row><row><cell>HATE</cell><cell>154</cell><cell></cell><cell>102</cell><cell></cell></row><row><cell>OFFN</cell><cell>311</cell><cell></cell><cell>126</cell><cell></cell></row><row><cell>PRFN</cell><cell>1374</cell><cell></cell><cell>364</cell><cell></cell></row><row><cell>NONE</cell><cell>1953</cell><cell></cell><cell>1860</cell><cell></cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Ynu_wb at hasoc 2019: Ordered neurons lstm with attention for identifying hate speech and offensive language</title>
		<author>
			<persName><forename type="first">B</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">FIRE (Working Notes)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="191" to="198" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Nlp based sentiment analysis on twitter data using ensemble classifiers</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kanakaraj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">M R</forename><surname>Guddeti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2015 3rd International Conference on Signal Processing, Communication and Networking (ICSCN), IEEE</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1" to="5" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Patel</surname></persName>
		</author>
		<title level="m">Overview of the hasoc track at fire 2019: Hate speech and offensive content identification in indo-european languages</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Overview of the HASOC track at FIRE 2020: Hate Speech and Offensive Content Identification in Indo-European Languages)</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Modha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Shahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Jaiswal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Nandini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schäfer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of FIRE 2020 -Forum for Information Retrieval Evaluation</title>
				<imprint>
			<publisher>CEUR</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Corpus linguistics and theoretical linguistics: A love-hate relationship? not necessarily</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">T</forename><surname>Gries</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Corpus Linguistics</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page" from="327" to="343" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Nlp-based sentiment analysis</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">S</forename><surname>Dhillon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">US Patent</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page">633</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Long short-term memory</title>
		<author>
			<persName><forename type="first">S</forename><surname>Hochreiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schmidhuber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural computation</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="1735" to="1780" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Malmasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1712.06427</idno>
		<title level="m">Detecting hate speech in social media</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Visualizing and understanding neural models in nlp</title>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hovy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1506.01066</idno>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Challenges in discriminating profanity from hate speech</title>
		<author>
			<persName><forename type="first">S</forename><surname>Malmasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Experimental &amp; Theoretical Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="187" to="202" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Do characters abuse more than words?</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Mehdad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tetreault</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue</title>
				<meeting>the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="299" to="303" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Abusive language detection in online user content</title>
		<author>
			<persName><forename type="first">C</forename><surname>Nobata</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tetreault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Thomas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Mehdad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Chang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th international conference on world wide web</title>
				<meeting>the 25th international conference on world wide web</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="145" to="153" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Common sense reasoning for detection, prevention, and mitigation of cyberbullying</title>
		<author>
			<persName><forename type="first">K</forename><surname>Dinakar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Havasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lieberman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Picard</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Interactive Intelligent Systems (TiiS)</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="1" to="30" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Automatic identification of personal insults on social news sites</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">O</forename><surname>Sood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">F</forename><surname>Churchill</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Antin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Society for Information Science and Technology</title>
		<imprint>
			<biblScope unit="volume">63</biblScope>
			<biblScope unit="page" from="270" to="285" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Detection and fine-grained classification of cyberbullying events</title>
		<author>
			<persName><forename type="first">C</forename><surname>Van Hee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Lefever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verhoeven</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mennes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Desmet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>De Pauw</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Daelemans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Hoste</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference Recent Advances in Natural Language Processing (RANLP)</title>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="672" to="680" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Detecting tension in online communities with computational twitter analysis</title>
		<author>
			<persName><forename type="first">P</forename><surname>Burnap</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">F</forename><surname>Rana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Avis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Williams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Housley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Edwards</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Morgan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sloan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Technological Forecasting and Social Change</title>
		<imprint>
			<biblScope unit="volume">95</biblScope>
			<biblScope unit="page" from="96" to="108" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">A capsule network for recommendation and explaining what you like and dislike</title>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Quan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval</title>
		<meeting>the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="275" to="284" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Child</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1904.10509</idno>
		<title level="m">Generating long sequences with sparse transformers</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">N D</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Xiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1703.03130</idno>
		<title level="m">A structured self-attentive sentence embedding</title>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ł</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Polosukhin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="5998" to="6008" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">An efficient agglomerative clustering algorithm using a heap</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kurita</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page" from="205" to="209" />
			<date type="published" when="1991">1991</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Segmentation by texture using a co-occurrence matrix and a split-and-merge algorithm</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">C</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Pavlidis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Graphics and Image Processing</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="172" to="182" />
			<date type="published" when="1979">1979</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
