<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Aggression Identification in Posts - two machine learning approaches</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Faneva</forename><surname>Ramiandrisoa</surname></persName>
							<email>faneva.ramiandrisoa@irit.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">IRIT</orgName>
								<orgName type="institution" key="instit2">Université de Toulouse</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Université d&apos;Antananarivo</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Aggression Identification in Posts - two machine learning approaches</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">F43A59024EC2462FA8BBEE5045952E62</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T05:11+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Social media</term>
					<term>Social media analysis</term>
					<term>Cyber-aggression</term>
					<term>TRAC Trolling, Aggression and Cyberbullying</term>
					<term>Machine learning based model</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Social media have changed the way people communicate. One aspect of this change is cyber-aggression and interpersonal aggression, which can be catalyzed by perceived anonymity. Automatically monitoring user-generated content to help moderate it is thus an active research topic. In this paper, we present and evaluate two supervised machine learning models to identify aggressive content and the level of aggressiveness. The first model uses random forest and logistic regression while the second model uses deep learning techniques.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Social media have changed the way people communicate <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b12">13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b4">5]</ref>. One aspect of this change is cyber-aggression and interpersonal aggression, which can be catalyzed by perceived anonymity <ref type="bibr" target="#b15">[16]</ref>. Automatically monitoring user-generated content to help moderate social media is thus an important, although difficult, topic <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b16">17]</ref>.</p><p>In 2018, the Shared Task on Aggression Identification was organised as part of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-1) at COLING 2018 <ref type="bibr" target="#b8">[9]</ref>. The objective of this task was to detect aggressive content and the level of aggressiveness. Thirty teams submitted their test runs. The best system obtained a weighted F-score of 0.64 on a data set composed of annotated Facebook comments.</p><p>In this paper, we report two models we developed to address the aggression identification task. The first model uses random forest and logistic regression, which can be considered relatively mature approaches, while the second model combines two more recent deep learning techniques, CNN and LSTM. No strong conclusion can be drawn on the superiority of one model over the other, since their ranking depends on the collection. This paper is organized as follows: Section 2 reports related work, Section 3 describes our two approaches, Section 4 describes the dataset used in this work, reports the results and discusses them, while Section 5 concludes this paper and presents future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related work</head><p>Approaches based on features and supervised classifiers such as Support Vector Machines (SVM) are often used to learn to detect whether a text contains aggressiveness <ref type="bibr" target="#b23">[24]</ref>; in recent years, deep learning has also been employed for this task <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b1">2]</ref>.</p><p>Deep learning has also been used by TRAC challenge participants. The TRAC <ref type="bibr" target="#b8">[9]</ref> challenge is the first to focus on detecting aggressive text. The task training set is composed of Facebook posts/comments; there are also two test sets: one from Facebook and another from Twitter.</p><p>Among the thirty participants, Saroyehun <ref type="bibr" target="#b1">[2]</ref> obtained the best results. The authors investigated the efficacy of deep neural networks by experimenting with different models: CNN, LSTM, BiLSTM, and combinations thereof. In their experiments they used a translation technique to enlarge the training set and added an external dataset on hate speech <ref type="foot" target="#foot_0">3</ref>. The LSTM model, which was trained on the augmented training set only, achieved the best weighted F1 score of 0.6425 on the Facebook test set; it is the first-ranked system in the TRAC challenge. The same system did not perform as well on the Twitter data set. The other system of the same team, which implements a combination of CNN and LSTM and was trained on the augmented training set and the additional dataset, achieved a weighted F1 score of 0.5920 and the third rank on the Twitter test set.</p><p>Raiyani et al. <ref type="bibr" target="#b19">[20]</ref>, meanwhile, tested different models for text classification in TRAC, from classic machine learning models to deep learning models. In the end, they kept three models: a FastText model, a dense neural network, and a voting combination of the two. The dense neural network gave better performance than the two others and achieved a weighted F1 score of 0.5813 on the Facebook test set, ranking fourteenth in the TRAC challenge. It nevertheless achieved the best weighted F1 score of 0.6009 and the first rank on the Twitter test set, although it was trained on a Facebook dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Machine learning based models</head><p>We developed two supervised machine learning based models that we evaluated in this paper. The first method combines random forest and logistic regression while the second approach is deep learning based. We also developed a model based on CNN only for which results can be found in <ref type="bibr" target="#b20">[21]</ref>; it performs in between the two models reported in this paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Trac-RF LR: combination of two classifiers</head><p>In this model we combined random forest (RF) based on surface features and linguistic features with logistic regression (LR) based on document vectorization. We chose this combination because combinations of multiple machine learning models have placed first in many prestigious machine learning competitions <ref type="bibr" target="#b17">[18]</ref>, such as the Netflix Competition and Kaggle competitions. Moreover, in the case of TRAC, the non-combined models gave lower results on the training dataset, which was confirmed on the test set (see Section 4.3).</p><p>RF Classifier. The random forest model uses different features extracted from the comments, as presented in Table <ref type="table" target="#tab_1">1</ref>. Some are adapted from <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b21">22]</ref>, where the authors tried to detect depression from texts; another source of inspiration is <ref type="bibr" target="#b6">[7]</ref>, where the authors suggested an information nutritional label for describing text qualities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Emotions</head><p>Frequency of emotions from specific categories: anger, fear, surprise, sadness and disgust. The idea is to check the categories related to aggressiveness.</p><p>Gunning Fog Index: estimate of the years of education that a person needs to understand the text at first reading.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Flesch Reading Ease</head><p>Measures how difficult a text is to understand.</p><p>Linsear Write Formula: developed for the U.S. Air Force to calculate the readability of their technical manuals <ref type="foot" target="#foot_1">5</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>New Dale-Chall Readability</head><p>Measures the difficulty of comprehension that readers encounter when reading a text. It is inspired by the Flesch Reading Ease measure.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Swear words</head><p>The intuition is that texts containing insults are often aggressive.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Lexical analysis with python library empath</head><p>Empath is a tool for analyzing text across lexical categories. By default, it has 194 lexical categories and each category is considered as a feature. Some of these features are used for abusive language detection, hate speech and cyberbullying; the others are used for sentiment or personality analysis, which we judged useful for aggression detection.</p><p>An RF classifier was trained on the train and validation sets by representing each text (Facebook comment or tweet) with a vector composed of the features listed in Table <ref type="table" target="#tab_1">1</ref>.</p><p>The following parameters were used during training: class_weight="balanced", max_features="sqrt", n_estimators=60, min_weight_fraction_leaf=0.0, criterion='entropy', random_state=2.</p><p>At prediction time, a text from the test set is represented with the same features and passed to the trained model. The output is the estimated probabilities for the three classes (overtly aggressive, covertly aggressive and non-aggressive).</p><p>LR Classifier. This model is based on document vectorization using Doc2vec <ref type="bibr" target="#b11">[12]</ref>. Doc2vec represents sentences, paragraphs, or whole documents as vectors and can be trained on small corpora, which is the case for the task datasets.</p><p>Before building the LR classifier, we first trained two separate Doc2vec models: a Distributed Bag of Words model and a Distributed Memory model <ref type="bibr" target="#b11">[12]</ref>. For the training, we used the same configuration as in <ref type="bibr" target="#b24">[25]</ref> for representing users' texts. The two Doc2vec models were trained on the train and validation sets. We used the Python package gensim <ref type="bibr" target="#b22">[23]</ref>. We also concatenated the output vectors of these two models, as done in <ref type="bibr" target="#b24">[25]</ref>, resulting in a 200-dimension vector per text.</p><p>Then a logistic regression classifier was trained on the vectors for both the train and validation sets with the following parameters: class_weight="balanced", random_state=1, max_iter=100, solver="liblinear".</p><p>At prediction time, the texts from the test set were vectorized using the two Doc2vec models and the 200-dimension vectors were given as input to the trained classifier. The output is also a set of class probabilities.</p><p>Combination of two classifiers. The class probabilities obtained from the RF classifier and the LR classifier were averaged, and the class with the highest averaged probability was taken as the class the text belongs to. We also tested other ways to combine the output probabilities of the two classifiers, such as maximum, minimum, etc., but the average method gave the best results.</p></div>
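<div xmlns="http://www.tei-c.org/ns/1.0"><p>The averaging step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the label order and the two example probability vectors are hypothetical, and the probabilities stand in for the outputs of the trained RF and LR classifiers.</p>
<p><code lang="python"><![CDATA[
```python
from typing import List, Sequence

# Assumed label order; the paper only names the three classes.
LABELS = ["OAG", "CAG", "NAG"]


def average_probabilities(rf_probs: Sequence[float],
                          lr_probs: Sequence[float]) -> List[float]:
    """Average the class probabilities of the two classifiers, class by class."""
    if len(rf_probs) != len(lr_probs):
        raise ValueError("both classifiers must score the same classes")
    return [(a + b) / 2.0 for a, b in zip(rf_probs, lr_probs)]


def predict_label(rf_probs: Sequence[float],
                  lr_probs: Sequence[float],
                  labels: Sequence[str] = LABELS) -> str:
    """Return the label whose averaged probability is the highest."""
    averaged = average_probabilities(rf_probs, lr_probs)
    best_index = max(range(len(averaged)), key=averaged.__getitem__)
    return labels[best_index]


# Hypothetical outputs for one text (not taken from the paper).
rf_example = [0.50, 0.30, 0.20]
lr_example = [0.10, 0.55, 0.35]
combined = average_probabilities(rf_example, lr_example)
predicted = predict_label(rf_example, lr_example)
```
]]></code></p>
<p>In this made-up case the RF classifier alone would pick OAG, while the averaged scores favour CAG, which shows how the second classifier can overturn a decision of the first.</p></div>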
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Trac-CNN LSTM: Combination of CNN and LSTM</head><p>This model combines two deep learning techniques: CNN and LSTM. The main idea is to pass the input representation (sentence matrix in Figure <ref type="figure">1</ref>) to the CNN and to pass the local features learnt by the CNN (concatenated vectors in Figure <ref type="figure">1</ref>) to the LSTM. Indeed, CNN and LSTM are complementary because each of them captures information at a different scale <ref type="bibr" target="#b1">[2]</ref>.</p><p>The architecture of our combined model is illustrated in Figure <ref type="figure">1</ref>. It is as follows: first, we convert sentences/texts into a sentence matrix <ref type="foot" target="#foot_2">7</ref> where each row is a vector representation <ref type="foot" target="#foot_3">8</ref> of a word in the sentence/text. Then, convolutions are applied on the sentence matrix using three filter region sizes: bigrams (height = 2), trigrams (height = 3) and fourgrams (height = 4). Each region size has 100 filters; thus, in total there are 300 filters. The results of the convolutions are called feature maps: vectors whose length varies according to the filter region size, with 100 feature maps per region size. Afterwards, 1-max pooling is performed over the feature maps. More precisely, for each region size the largest value of each feature map is kept and these values are concatenated to form a vector. As a result, we obtain one vector of size 100<ref type="foot" target="#foot_4">9</ref> per region size. Then, these three vectors are concatenated to form a feature vector and dropout is applied on this feature vector. The concatenated feature vector is passed to the LSTM layer. Then, we added one fully connected hidden layer to reduce the dimension of the concatenated vector, followed by a dropout. Finally, an output layer, which is also a fully connected layer with three possible output states, is added. On the output layer, the activation function used is the softmax function.</p><p>The architecture of our model is inspired by the CNN architecture that Zhang et al. <ref type="bibr" target="#b25">[26]</ref> proposed for sentence classification. In that task, their CNN architecture outperforms baseline methods that use SVM as well as the CNN used in <ref type="bibr" target="#b7">[8]</ref>.</p></div>
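<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the dimensions of the convolution and pooling stage concrete, the sketch below reproduces it in plain Python. It is an illustration under stated assumptions, not the authors' code: the sentence length (12), the embedding dimension (8) and the random filter weights are hypothetical, bias terms and non-linearities are omitted, and the sketch stops before dropout, the LSTM layer and the fully connected layers.</p>
<p><code lang="python"><![CDATA[
```python
import random
from typing import Dict, List

Matrix = List[List[float]]


def convolve(sentence: Matrix, kernel: Matrix) -> List[float]:
    """Slide one filter of height h over the word rows of the sentence matrix.

    The resulting feature map has l - h + 1 entries for a sentence of l words.
    """
    height = len(kernel)
    feature_map = []
    for start in range(len(sentence) - height + 1):
        total = 0.0
        for offset in range(height):
            for x, w in zip(sentence[start + offset], kernel[offset]):
                total += x * w
        feature_map.append(total)
    return feature_map


def one_max_pool(feature_maps: List[List[float]]) -> List[float]:
    """Keep only the largest value of each feature map."""
    return [max(feature_map) for feature_map in feature_maps]


def cnn_features(sentence: Matrix,
                 filters_by_height: Dict[int, List[Matrix]]) -> List[float]:
    """Concatenate the pooled vectors of all filter region sizes."""
    features: List[float] = []
    for height in sorted(filters_by_height):
        maps = [convolve(sentence, k) for k in filters_by_height[height]]
        features.extend(one_max_pool(maps))
    return features


rng = random.Random(0)
SENT_LEN, EMB_DIM, N_FILTERS = 12, 8, 100  # SENT_LEN and EMB_DIM are made up
sentence = [[rng.uniform(-1, 1) for _ in range(EMB_DIM)]
            for _ in range(SENT_LEN)]
filters = {
    h: [[[rng.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(h)]
        for _ in range(N_FILTERS)]
    for h in (2, 3, 4)  # bigrams, trigrams, fourgrams
}
feature_vector = cnn_features(sentence, filters)
map_lengths = {h: len(convolve(sentence, filters[h][0])) for h in (2, 3, 4)}
```
]]></code></p>
<p>Whatever the sentence length, each region size contributes exactly 100 pooled values, so the vector handed to the next layer always has 300 entries; only the length of the intermediate feature maps depends on the filter height.</p></div>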
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Evaluation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Data set</head><p>The evaluation is based on the TRAC 2018 shared task <ref type="bibr" target="#b8">[9]</ref>. The task dataset is a subset of the corpus of Kumar et al. <ref type="bibr" target="#b9">[10]</ref> and consists of randomly sampled English and Hindi Facebook comments. In this study, we focused on the English part of the dataset, which is detailed in Table <ref type="table" target="#tab_2">2</ref>. It is composed of (a) 11,999 Facebook comments for training and 3,001 comments for validation, annotated with 3 levels of aggression, Overtly Aggressive (OAG), Covertly Aggressive (CAG) and Non-Aggressive (NAG), and (b) 916 English comments for test. Additionally, 1,257 English tweets were given as a second test set.</p></div>
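<div xmlns="http://www.tei-c.org/ns/1.0"><p>The class imbalance in Table 2 explains why both classifiers of Section 3.1 are trained with balanced class weights. Assuming the parameter names of Section 3.1 are those of scikit-learn, the "balanced" setting weights each class by n_samples / (n_classes * class_count). The sketch below applies that formula to the train and validation counts of Table 2; the function name is hypothetical and the resulting weights are an illustration, not values reported in the paper.</p>
<p><code lang="python"><![CDATA[
```python
from typing import Dict

# Class counts copied from Table 2 (English Facebook data).
TRAIN = {"OAG": 2708, "CAG": 4240, "NAG": 5051}
VALIDATION = {"OAG": 711, "CAG": 1057, "NAG": 1233}


def balanced_class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Both classifiers were trained on the train and validation sets together.
combined_counts = {label: TRAIN[label] + VALIDATION[label] for label in TRAIN}
weights = balanced_class_weights(combined_counts)
```
]]></code></p>
<p>With these counts the rarest class (OAG) receives the largest weight and the most frequent class (NAG) the smallest, so that each class contributes the same total weight during training.</p></div>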
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Evaluation measure</head><p>The evaluation metric used in this paper is the weighted F1 which was also used in the TRAC shared task. The weighted F1 is equal to the average, weighted by the number of instances for each label, of the F1 (given by equation 1) of each class label.</p><formula xml:id="formula_0">F1 = 2 \frac{R \cdot P}{R + P}<label>(1)</label></formula><p>where $P = \frac{t_p}{t_p + f_p}$ is the precision, $R = \frac{t_p}{t_p + f_n}$ is the recall, $t_p$ denotes the true positives, $f_p$ the false positives, and $f_n$ the false negatives.</p><p>Fig. <ref type="figure">1</ref>: Illustration of a CNN + LSTM architecture for aggression detection inspired from <ref type="bibr" target="#b25">[26]</ref>.</p></div>
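<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a worked example of the measure, the sketch below computes the per-class F1 of Equation 1 and the support-weighted average on a toy set of eight labels. The function names and the toy labels are hypothetical, and returning 0 when precision or recall is undefined is an assumed convention that the paper does not specify.</p>
<p><code lang="python"><![CDATA[
```python
from collections import Counter
from typing import Sequence


def f1_per_class(gold: Sequence[str], predicted: Sequence[str],
                 label: str) -> float:
    """F1 of one class, computed from true/false positives and false negatives."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0  # assumed convention
    recall = tp / (tp + fn) if tp + fn else 0.0     # assumed convention
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_f1(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Average the per-class F1, weighted by the number of gold instances."""
    support = Counter(gold)
    total = len(gold)
    return sum(f1_per_class(gold, predicted, label) * count / total
               for label, count in support.items())


# Toy labels, not taken from the TRAC data.
gold = ["OAG", "OAG", "CAG", "CAG", "NAG", "NAG", "NAG", "NAG"]
pred = ["OAG", "CAG", "CAG", "CAG", "NAG", "NAG", "NAG", "OAG"]
score = weighted_f1(gold, pred)
```
]]></code></p>
<p>Because the weights are the class supports in the gold labels, a frequent class such as NAG influences the final score more than a rare one.</p></div>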
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Results</head><p>Table <ref type="table" target="#tab_3">3</ref> reports the results we obtained with the two models presented above. For comparison, we also report the results obtained with the RF classifier only and with the LR classifier only. The baseline in the first row was given by the TRAC shared task organizers, while the second row is the best result from participants in the TRAC workshop. We can see that our two models outperform the baseline on both the Facebook and Twitter subsets. Trac-RF-LR is better than Trac-CNN-LSTM on the Facebook collection, while the opposite holds on the Twitter collection. This could be due to the training dataset, which is composed only of texts crawled from Facebook. Indeed, we can observe the same behaviour for the other systems that participated in the challenge <ref type="bibr" target="#b8">[9]</ref>. The only exception is the Saroyehun <ref type="bibr" target="#b1">[2]</ref> system, which performs better on the Twitter dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>In this paper, we presented two different supervised machine learning approaches for aggression identification on the TRAC 2018 English collections (Facebook and Twitter based). The combination of random forest and logistic regression classifiers, based on a set of surface features and document vectorization, led to the sixteenth-ranked system out of thirty on the Facebook collection. The combination of CNN and Long Short-Term Memory was ranked fifteenth out of thirty systems.</p><p>To extend this work, we plan to update our models by adding new features such as bag of words or features more specific to aggression. We also plan to apply feature engineering to the features used in this paper in order to see which ones are the most useful. In addition, feature selection could be applied to build models that use as few features as possible <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b5">6]</ref>. Finally, we will investigate deep learning models with different architectures, such as hierarchical attention networks. We believe that these directions can help design better-performing models.</p><p>Ethical issue. While the TRAC challenge has its own ethical policies, detecting aggressive content from users' posts raises ethical issues that are beyond the scope of this paper.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 :</head><label>1</label><figDesc>List of features used in RF to represent texts (Facebook comments or tweets).</figDesc><table><row><cell>4 http://saifmohammad.com/WebPages/NRC-Emotion-Lexicon.htm, accessed on 2017-02-23</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 :</head><label>2</label><figDesc>Distribution of training, validation and testing data on TRAC 2018 data collection.</figDesc><table><row><cell></cell><cell>Train</cell><cell>Validation</cell><cell>Test (Facebook)</cell><cell>Test (Twitter)</cell></row><row><cell>Number of texts (=posts+comments)</cell><cell>11,999</cell><cell>3,001</cell><cell>916</cell><cell>1,257</cell></row><row><cell>Overt aggression</cell><cell>2,708</cell><cell>711</cell><cell>144</cell><cell>361</cell></row><row><cell>Covert aggression</cell><cell>4,240</cell><cell>1,057</cell><cell>142</cell><cell>413</cell></row><row><cell>No aggression</cell><cell>5,051</cell><cell>1,233</cell><cell>630</cell><cell>483</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3 :</head><label>3</label><figDesc>Results for the English (Facebook and Twitter) task. Bold value is the best performance for our approaches.</figDesc><table><row><cell>System</cell><cell>Weighted F1 (Facebook)</cell><cell>Weighted F1 (Twitter)</cell></row><row><cell>Random Baseline</cell><cell>0.354</cell><cell>0.348</cell></row><row><cell>Saroyehun [2]</cell><cell>0.642</cell><cell>0.592</cell></row><row><cell>Trac-RF LR</cell><cell>0.581</cell><cell>0.409</cell></row><row><cell>Trac-CNN LSTM</cell><cell>0.559</cell><cell>0.511</cell></row><row><cell>Trac-RF only</cell><cell>0.573</cell><cell>0.397</cell></row><row><cell>Trac-LR only</cell><cell>0.569</cell><cell>0.452</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0">https://github.com/ZeerakW/hatespeech, accessed on January 10, 2020</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_1">http://www.streetdirectory.com/travel_guide/15675/writing/how_to_choose_the_best_readability_formula_for_your_document.html, accessed on 2018-02-25</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_2">The dimension of a sentence matrix is l × d, where l is the length of the longest text/sentence in the dataset and d is the dimension of word vector representation.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_3">The word vector representation is obtained with a word2vec model <ref type="bibr" target="#b14">[15]</ref> trained on the training and validation sets.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_4">Because there are 100 feature maps.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgement. This work has been partially funded by the European Union's Horizon 2020 H2020-SU-SEC-2018 under the Grant Agreement n°833115 (PREVISION project). This work has also been partially supported by the Ministère des Affaires étrangères et du Développement international under the scholarship EIFFEL-DOCTORAT 2017/ n°P707544H for Faneva Ramiandrisoa's PhD thesis.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">IRIT at e-Risk (regular paper)</title>
		<author>
			<persName><forename type="first">I</forename><surname>Abdou Malam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Arziki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nezar Bellazrak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Benamara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>El Kaidi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Es-Saghir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Housni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Moriceau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mothe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ramiandrisoa</surname></persName>
		</author>
		<ptr target="http://CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">International Conference of the CLEF Association, CLEF 2017 Labs Working Notes</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">1866</biblScope>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Aggression detection in social media: Using deep neural networks, data augmentation, and pseudo labeling</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">T</forename><surname>Aroyehun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gelbukh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying</title>
				<meeting>the First Workshop on Trolling, Aggression and Cyberbullying<address><addrLine>TRAC-</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="90" to="97" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">&quot;Social media has opened a world of &apos;open communication:&apos;&quot; experiences of adults with cerebral palsy who use augmentative and alternative communication and social media</title>
		<author>
			<persName><forename type="first">J</forename><surname>Caron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Light</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Augmentative and Alternative Communication</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="25" to="40" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Moderated online communities and quality of user-generated content</title>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">B</forename><surname>Whinston</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Management Information Systems</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="237" to="268" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Social media and its role in friendship-driven interactions among young people: A mixed methods study</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>Décieux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Heinen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Willems</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">YOUNG</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="18" to="31" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Forward and Backward Feature Selection for Query Performance Prediction</title>
		<author>
			<persName><forename type="first">S</forename><surname>Déjean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">T</forename><surname>Ionescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mothe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">Z</forename><surname>Ullah</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACM Symposium on Applied Computing (SAC)</title>
				<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">An information nutritional label for online documents</title>
		<author>
			<persName><forename type="first">N</forename><surname>Fuhr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Giachanou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Grefenstette</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hanselowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Jarvelin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mothe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Nejdl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACM SIGIR Forum</title>
				<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">51</biblScope>
			<biblScope unit="page" from="46" to="66" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Convolutional neural networks for sentence classification</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Kim</surname></persName>
		</author>
		<idno>CoRR abs/1408.5882</idno>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Benchmarking Aggression Identification in Social Media</title>
		<author>
			<persName><forename type="first">R</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Ojha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Malmasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC)</title>
				<meeting>the First Workshop on Trolling, Aggression and Cyberbullying (TRAC)<address><addrLine>Santa Fe, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Aggression-annotated corpus of Hindi-English code-mixed data</title>
		<author>
			<persName><forename type="first">R</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Reganti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bhatia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Maheshwari</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1803.09402</idno>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Non-convex Regularizations for Feature Selection in Ranking with Sparse SVM</title>
		<author>
			<persName><forename type="first">L</forename><surname>Laporte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Flamary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Canu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Déjean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mothe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Neural Networks and Learning Systems</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="1118" to="1130" />
			<date type="published" when="2014-06">June 2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Distributed representations of sentences and documents</title>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 31st International Conference on Machine Learning, ICML 2014</title>
				<meeting>the 31st International Conference on Machine Learning, ICML 2014<address><addrLine>Beijing, China</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014-06">June 2014</date>
			<biblScope unit="page" from="1188" to="1196" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Lipschultz</surname></persName>
		</author>
		<title level="m">Social media communication: Concepts, practices, data, law and ethics</title>
				<imprint>
			<publisher>Routledge</publisher>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Intimate partner violence victimization in the cyber and real world: Examining the extent of cyber aggression experiences and its association with in-person dating violence</title>
		<author>
			<persName><forename type="first">A</forename><surname>Marganski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Melander</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of interpersonal violence</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">7</biblScope>
			<biblScope unit="page" from="1071" to="1095" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Distributed representations of words and phrases and their compositionality</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">S</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013</title>
				<meeting><address><addrLine>Lake Tahoe, Nevada, United States</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013">December 5-8, 2013</date>
			<biblScope unit="page" from="3111" to="3119" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Social media, cyber-aggression and student mental health on a university campus</title>
		<author>
			<persName><forename type="first">F</forename><surname>Mishna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Regehr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lacombe-Duncan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Daciuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Fearing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Van Wert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of mental health</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="222" to="229" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Censored, suspended, shadowbanned: User interpretations of content moderation on social media platforms</title>
		<author>
			<persName><forename type="first">S</forename><surname>Myers West</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">New Media &amp; Society</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="4366" to="4383" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A transfer learning approach for emotion intensity prediction in microblog text</title>
		<author>
			<persName><forename type="first">M</forename><surname>Osama</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>El-Beltagy</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-31129-2_47</idno>
		<ptr target="https://doi.org/10.1007/978-3-030-31129-2_47" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2019</title>
				<meeting>the International Conference on Advanced Intelligent Systems and Informatics 2019<address><addrLine>AISI; Cairo, Egypt</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019-10">October 2019</date>
			<biblScope unit="page" from="512" to="522" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">A pragmatic supervised learning methodology of hate speech detection in social media</title>
		<author>
			<persName><forename type="first">G</forename><surname>Priyadharshini</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Fully connected neural network with advance preprocessor to identify aggression over facebook and twitter</title>
		<author>
			<persName><forename type="first">K</forename><surname>Raiyani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gonçalves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Quaresma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">B</forename><surname>Nogueira</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying, TRAC@COLING</title>
				<meeting>the First Workshop on Trolling, Aggression and Cyberbullying, TRAC@COLING<address><addrLine>Santa Fe, New Mexico, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="28" to="41" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">IRIT at TRAC</title>
		<author>
			<persName><forename type="first">F</forename><surname>Ramiandrisoa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mothe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying, TRAC@COLING</title>
				<meeting>the First Workshop on Trolling, Aggression and Cyberbullying, TRAC@COLING<address><addrLine>Santa Fe, New Mexico, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="19" to="27" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">IRIT at e-Risk</title>
		<author>
			<persName><forename type="first">F</forename><surname>Ramiandrisoa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mothe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Benamara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Moriceau</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Conference and Labs of the Evaluation Forum, Living Labs (CLEF 2018)</title>
				<meeting><address><addrLine>Avignon, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018-09-10">10-14 September 2018</date>
		</imprint>
	</monogr>
	<note>Regular paper (online). CEUR-WS Workshop Proceedings</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Software framework for topic modelling with large corpora</title>
		<author>
			<persName><forename type="first">R</forename><surname>Rehurek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sojka</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks</title>
				<meeting>the LREC 2010 Workshop on New Challenges for NLP Frameworks</meeting>
		<imprint>
			<publisher>Citeseer</publisher>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">A Survey on Hate Speech Detection Using Natural Language Processing</title>
		<author>
			<persName><forename type="first">A</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegand</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media. SocialNLP@EACL 2017</title>
				<meeting>the Fifth International Workshop on Natural Language Processing for Social Media. SocialNLP@EACL 2017<address><addrLine>Valencia, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Linguistic metadata augmented classifiers at the CLEF 2017 task for early detection of depression</title>
		<author>
			<persName><forename type="first">M</forename><surname>Trotzek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Koitka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Friedrich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum</title>
				<meeting><address><addrLine>Dublin, Ireland</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017">September 11-14, 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<title level="m" type="main">A sensitivity analysis of (and practitioners&apos; guide to) convolutional neural networks for sentence classification</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">C</forename><surname>Wallace</surname></persName>
		</author>
		<idno>CoRR abs/1510.03820</idno>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
