<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">AMI at IberEval2018 Automatic Misogyny Identification in Spanish and English Tweets</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Victor</forename><surname>Nina-Alcocer</surname></persName>
							<email>vicnial@inf.upv.es</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Systems and Computation</orgName>
								<orgName type="institution">Universitat Politècnica de València</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">AMI at IberEval2018 Automatic Misogyny Identification in Spanish and English Tweets</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">7820A1FB0EA10A0DDD80755C321955C1</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:26+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper we describe our submission to the Automatic Misogyny Identification in Spanish and English Tweets shared task organized at IberEval. This work proposes an approach based on weights of n-grams, word categories, structural information and lexical analysis, to discover whether these components allow us to discriminate between misogynous and non-misogynous tweets and, for misogynous tweets, their respective categories and targets. Moreover, we analyze the use of some features created by these components to investigate their impact.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>AMI is the first shared task on automatic misogyny identification <ref type="bibr" target="#b1">[2]</ref>. Its aim is to identify cases of aggressiveness and hate speech towards women in social media <ref type="bibr" target="#b0">[1]</ref>. Poland's work <ref type="bibr" target="#b2">[3]</ref> was the first attempt to manually classify misogynous tweets. The shared task considers two subtasks for this classification, the second of which has two parts:</p><p>-subtask1: Misogyny Identification.</p><p>-subtask2a: Misogynistic Behaviour.</p><p>-subtask2b: Target Classification.</p><p>The aim of subtask1 is to identify whether a tweet is misogynous or not, while subtask2a aims to identify the category a misogynous tweet belongs to: discredit, dominance, sexual harassment, stereotype, or derailing. Finally, subtask2b identifies whether a misogynous tweet is active or passive, i.e. whether its target is generic (women in general) or an individual. In this work, each of these tasks is approached as a classification task, using natural language processing (NLP), feature engineering and machine learning to identify patterns and learn classification models respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Approach</head><p>This section describes the main approaches used. Generally, misogyny can be expressed in writing or orally, in a subtle or an explicit way, and addressed to someone either directly or indirectly. In order to investigate how people may express misogyny in tweets, we propose an approach that allows us to discover some aspects of how misogyny is expressed in the corpus provided by the organizers. Hence, this approach takes into account some features that we considered important in order to understand whether they contribute to recognizing misogynous content and its respective category.</p><p>Structure (str): Knowing how many words a tweet uses, whether most of those words are written in capital letters, or whether punctuation marks are used excessively could reveal important information. A tweet is composed of words, punctuation, mentions, URLs, etc. In this approach, we pay attention to these aspects to see whether they help to better discriminate between misogynous and non-misogynous tweets. A summary of these features is given below:</p><p>-The number of symbols or punctuation marks (!' ?,.").</p><p>-The number of words written in capital letters.</p><p>-The number of words and characters, including stop words.</p><p>-The mean number of words and characters.</p><p>-The number of mentions, URLs, and hashtags.</p></div>
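The structural counts listed above can be sketched with a few regular expressions. This is an illustrative sketch only; the function name and exact patterns are our own choices, not the authors' code:

```python
import re

def structural_features(tweet):
    """Toy version of the str component: counts of punctuation,
    capitalized words, words/characters, mentions, URLs, hashtags."""
    words = tweet.split()
    return {
        "n_punct": len(re.findall(r"[!?,.'\"]", tweet)),
        "n_caps_words": sum(1 for w in words if w.isupper() and w.isalpha()),
        "n_words": len(words),
        "n_chars": len(tweet),
        "mean_word_len": len(tweet) / max(len(words), 1),
        "n_mentions": len(re.findall(r"@\w+", tweet)),
        "n_urls": len(re.findall(r"https?://\S+", tweet)),
        "n_hashtags": len(re.findall(r"#\w+", tweet)),
    }
```

Each count then becomes one numeric feature appended to the tweet's feature vector.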
<div xmlns="http://www.tei-c.org/ns/1.0"><head>LIWC categories (lc):</head><p>Another component that we consider important is the possibility of deriving features from Linguistic Inquiry and Word Count (LIWC)<ref type="foot" target="#foot_1">2</ref>. We took into account only some categories related to misogynous emotions, such as anger, sexual, swear, positive, and negative <ref type="bibr" target="#b3">[4]</ref>. The idea behind this component is to calculate, for instance, the percentage of positive or negative emotions, or whether a tweet has sexual content, as we can see in Figure <ref type="figure" target="#fig_0">1</ref>.</p></div>
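Since the LIWC lexicon itself is licensed, the sketch below uses an invented toy lexicon purely to illustrate how per-category percentages of this kind could be computed; the category word lists are made up for the example:

```python
# Toy stand-in for LIWC-style category counting; the word lists
# below are invented for illustration, not the real LIWC lexicon.
TOY_LEXICON = {
    "anger": {"hate", "angry", "kill"},
    "sexual": {"sexy", "slut"},
    "swear": {"damn", "bitch"},
    "negemo": {"hate", "ugly", "bad"},
    "posemo": {"love", "nice", "good"},
}

def category_percentages(tweet):
    """Percentage of tokens in the tweet that fall in each category."""
    tokens = tweet.lower().split()
    total = max(len(tokens), 1)
    return {cat: 100.0 * sum(t in words for t in tokens) / total
            for cat, words in TOY_LEXICON.items()}
```

The resulting percentages (one per category) are used as additional numeric features.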
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Ngrams (ng):</head><p>In this component, Term Frequency-Inverse Document Frequency schemes based on words (TFIDFW) or characters (TFIDFC) are used. For instance, in the misogynous TFIDFW (see Table <ref type="table" target="#tab_0">1</ref>) the term bitch (first place) is used more among misogynous tweets than among non-misogynous ones (fourth place); i.e., the unigram bitch has one weight in misogynous tweets and a different weight in non-misogynous ones. The same logic is followed for subtask2a and subtask2b, using the TFIDFW of their categories and targets respectively.</p></div>
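A minimal sketch of the two weighting schemes using scikit-learn's TfidfVectorizer; the four-tweet mini-corpus is invented for illustration, and the paper does not specify which library was used:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical mini-corpus standing in for the training tweets
tweets = [
    "stupid bitch shut up",
    "what a nice day",
    "bitch please go away",
    "good morning everyone",
]

# TFIDFW: word n-grams (unigrams to trigrams)
tfidf_w = TfidfVectorizer(analyzer="word", ngram_range=(1, 3))
Xw = tfidf_w.fit_transform(tweets)

# TFIDFC: character n-grams, which can capture obfuscated spellings
tfidf_c = TfidfVectorizer(analyzer="char", ngram_range=(2, 4))
Xc = tfidf_c.fit_transform(tweets)
```

Fitting one scheme per class (misogynous vs. non-misogynous) then yields the class-specific term weights shown in Table 1.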
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Part of Speech (pos):</head><p>The last component of our approach takes into account part-of-speech information, i.e. tagging each word in a sentence with its appropriate part of speech: noun, adjective, verb, etc. Using this component we can identify some patterns; for instance, in our corpus some nouns are followed by punctuation marks, e.g. bitch!!!!!!.</p></div>
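The noun-plus-punctuation pattern can be illustrated with a small sketch. A real system would use an off-the-shelf tagger (e.g. NLTK's pos_tag); the tiny tag map here is invented so the example is self-contained:

```python
import re

# Invented toy tag map standing in for a real POS tagger's output
TOY_TAGS = {"bitch": "NOUN", "go": "VERB", "away": "ADV", "stupid": "ADJ"}

def noun_plus_punct(tweet):
    """Return tokens tagged NOUN that are followed by two or more
    punctuation marks, e.g. 'bitch!!!!!!'."""
    hits = []
    for m in re.finditer(r"([a-z]+)([!?.]{2,})", tweet.lower()):
        if TOY_TAGS.get(m.group(1)) == "NOUN":
            hits.append(m.group(1) + m.group(2))
    return hits
```

The presence (or count) of such matches can then be added as one more feature per tweet.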
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Experiments and Results</head><p>The organizers provided us with datasets of 3307 Spanish and 3251 English tweets respectively. Each tweet is labeled as misogynous (1) or non-misogynous (0), and both datasets are balanced. Regarding the type of misogyny and the target, each misogynous tweet is labeled as discredit, dominance, sexual harassment, stereotype, or derailing, and as active or passive in the case of the target. With respect to category and target information, the corpus is unbalanced: categories are biased in favor of discredit (60%), and targets are biased in favor of active (almost 75%). Moreover, to evaluate our system, test datasets with 831 and 726 unlabeled tweets in Spanish and English respectively were provided.</p><p>For the experiments, we employed a set of feature combinations which were used to feed several classifiers: Support Vector Machine (SVM), Multi-layer Perceptron (MLP) and Multinomial Naive Bayes (MNB). SVM and 10-fold cross-validation were used: the former was chosen because its performance was good enough with thousands of features, and the latter allows us to avoid over-fitting in all the experiments. The main goal was first to face the classification of misogynous tweets in Spanish, in order to apply the best performing approach to the rest of the subtasks in English or Spanish. Table <ref type="table" target="#tab_1">2</ref> shows how the experiments were set up. Approaches ap1 and ap2 had the aim of finding out whether features created by TFIDFW, TFIDFC, Bag of word-grams (BOW) or Bag of char-grams (BOC) are useful. 
ap1 uses the whole group of features (thousands of them) created by TFIDFW or TFIDFC, while ap2 obtains the 60 best features using truncated singular value decomposition (SVD) on TFIDFW and TFIDFC and then combines them with BOW. Although interesting, these approaches did not give us results above our baselines with any of the classifiers (MLP, SVM, MNB). ap3 tries to reduce the number of features: first we classified a tweet using MNB and took the resulting class probabilities as features (2); additionally we obtained the best 20 features using SVD on TFIDFW; lastly, we added the str (5) and lc (10) features. Unfortunately, with these 37 features we did not achieve results above our baselines on subtask1 and subtask2ab either. We now proceed to analyze the results obtained with the approach proposed in Section 2. ap4 and ap5 follow the same logic, but ap5 obtains better results than ap4 because it uses TFIDFW. Tables <ref type="table" target="#tab_3">3 and 4</ref> show the best values that we achieved: run4 in Table <ref type="table" target="#tab_2">3</ref> uses TFIDFW plus structure, category and weight of ngrams (unigrams+bigrams+trigrams) as features, and we obtained an accuracy of 0.782 applying a linear SVM on subtask1; for subtask2a, we added part of speech as a feature and obtained an F1-macro of 0.370. Looking at Table <ref type="table" target="#tab_3">4</ref>, we may observe that in run2 we obtained an F1-macro of 0.780 on subtask2b using just lc as a feature, and that using just str and ng(bigram) we obtained an F1-macro of 0.503 on subtask2a.</p></div>
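The evaluation setup described above (a linear SVM over TFIDFW features with 10-fold cross-validation) can be sketched as follows; the toy corpus is invented, not the task data, and the library choice is our assumption:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

# Tiny synthetic stand-in for the training corpus (invented examples);
# the real task used 3307 Spanish / 3251 English labelled tweets.
pos = ["stupid bitch shut up", "go back to the kitchen"] * 10
neg = ["what a lovely day", "great match last night"] * 10
tweets = pos + neg
labels = [1] * len(pos) + [0] * len(neg)

# Linear SVM over TFIDFW features (unigrams to trigrams),
# evaluated with 10-fold cross-validation as in the experiments
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 3)), LinearSVC())
scores = cross_val_score(clf, tweets, labels, cv=10, scoring="accuracy")
print(round(scores.mean(), 3))
```

Swapping the vectorizer or appending the str/lc/pos feature columns reproduces the ap1-ap5 configurations of Table 2.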
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Official ranking</head><p>We did not expect good results in English (see Table <ref type="table" target="#tab_4">5</ref>), but we obtained scores slightly above the average F1 baseline (0.3374) on subtask2a and subtask2b (see run3 and run4), while on subtask1 we were below the accuracy baseline (0.7837). These results may be due to a poor combination of our features. Table <ref type="table" target="#tab_5">6</ref> shows the better results we obtained in Spanish (among the top five teams). However, we think that classifying misogynous tweets in this corpus was quite difficult, because the performance of the teams was approximately 80% in terms of accuracy. Similarly, on subtask2a and subtask2b, most teams were not far from the baseline.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. The content (words) of a tweet belongs to some category: death, anger, etc.</figDesc><graphic coords="2,221.22,449.71,172.91,115.27" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Weight of uni-grams and bi-grams</figDesc><table><row><cell></cell><cell>uni-grams</cell><cell></cell><cell></cell><cell></cell><cell>bi-grams</cell><cell></cell></row><row><cell></cell><cell cols="2">misogynous no-misogynous</cell><cell></cell><cell></cell><cell cols="2">misogynous no-misogynous</cell></row><row><cell cols="2">N term weights</cell><cell>N term</cell><cell>weights</cell><cell>N term</cell><cell>weights</cell><cell>N term</cell><cell>weights</cell></row><row><cell cols="2">1 bitch 0.054913</cell><cell>1 rape</cell><cell cols="3">0.021782 1 stupid bitch 0.010204</cell><cell>1 stupid cunt</cell><cell>0.006159</cell></row><row><cell>2 dick</cell><cell>0.027398</cell><cell>2 dick</cell><cell cols="2">0.019902 2 ass bitch</cell><cell>0.006658</cell><cell>2 son bitch</cell><cell>0.002429</cell></row><row><cell cols="2">3 stupid 0.024436</cell><cell>3 cunt</cell><cell cols="2">0.019422 3 suck dick</cell><cell>0.004807</cell><cell>3 men rights</cell><cell>0.002079</cell></row><row><cell>4 like</cell><cell>0.024388</cell><cell>4 bitch</cell><cell>0.018755</cell><cell></cell><cell></cell><cell></cell></row><row><cell cols="2">5 woman 0.023752</cell><cell>5 hoe</cell><cell>0.017120</cell><cell></cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Configuration of the main experiments</figDesc><table><row><cell>Name</cell><cell>Set up</cell></row><row><cell>ap1 .</cell><cell>TFIDFW + TFIDFC + BOW + BOC</cell></row><row><cell>ap2 .</cell><cell>SVD30(TFIDFW) + SVD30(TFIDFC) + BOW</cell></row><row><cell cols="2">ap3 . MNB(PREDICTED) + SVD20(TFIDFW) + str + lc</cell></row><row><cell>ap4 .</cell><cell>BOW + str + lc + ng + pos</cell></row><row><cell>ap5 .</cell><cell>TFIDFW + str + lc + ng + pos</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 .</head><label>3</label><figDesc>Results with ap5 on English training tweets</figDesc><table><row><cell>run</cell><cell>subtask1</cell><cell></cell><cell cols="2">subtask2a subtask2b</cell></row><row><cell></cell><cell></cell><cell>Accuracy</cell><cell cols="2">F1-macro F1-macro</cell></row><row><cell>run1SVM on TFIDFW</cell><cell>+str+lc</cell><cell>0.733 +pos</cell><cell>0.299</cell><cell>0.721</cell></row><row><cell>run2SVM on TFIDFW</cell><cell>+str+lc+ng(u)</cell><cell>0.781 +pos</cell><cell>0.302</cell><cell>0.762</cell></row><row><cell>run3SVM on TFIDFW</cell><cell>+str+lc+ng(u+b)</cell><cell>0.781 +pos</cell><cell>0.343</cell><cell>0.763</cell></row><row><cell cols="4">run4SVM on TFIDFW +str+lc+ng(u+b+t) 0.782 +pos 0.370</cell><cell>0.764</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4 .</head><label>4</label><figDesc>Results with ap5 on Spanish training tweets</figDesc><table><row><cell>run</cell><cell>subtask1</cell><cell></cell><cell cols="2">subtask2a subtask2b</cell><cell></cell></row><row><cell></cell><cell></cell><cell>Accuracy</cell><cell>F1-macro</cell><cell></cell><cell>F1-macro</cell></row><row><cell>run1SVM on TFIDFW</cell><cell>+str+lc</cell><cell>0.804</cell><cell>0.472</cell><cell>-str</cell><cell>0.781</cell></row><row><cell cols="2">run2SVM on TFIDFW +str+lc+ng(b)</cell><cell>0.860 -lc</cell><cell cols="2">0.503 -str-ng(b)</cell><cell>0.780</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5 .</head><label>5</label><figDesc>Official results for English subtask1, subtask2a and subtask2b</figDesc><table><row><cell></cell><cell>subtask1</cell><cell></cell><cell></cell><cell>subtask2ab</cell></row><row><cell>Rank</cell><cell>Run</cell><cell cols="3">Accuracy Rank Average F1-macro</cell></row><row><cell>16</cell><cell cols="2">Our approach.run2 0.7809</cell><cell>17</cell><cell>0.336433966</cell></row><row><cell>17</cell><cell cols="2">Our approach.run3 0.7809</cell><cell>14</cell><cell>0.33914113</cell></row><row><cell>18</cell><cell cols="2">Our approach.run4 0.7809</cell><cell>13</cell><cell>0.339590051</cell></row><row><cell>26</cell><cell cols="2">Our approach.run1 0.7094</cell><cell>23</cell><cell>0.316368399</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 6 .</head><label>6</label><figDesc>Official results for Spanish subtask1, subtask2a and subtask2b</figDesc><table><row><cell></cell><cell>subtask1</cell><cell></cell><cell>subtask2ab</cell></row><row><cell>Rank</cell><cell>Run</cell><cell cols="2">Accuracy Rank Average Macro F1</cell></row><row><cell>9</cell><cell cols="2">Our approach.run1 0.805054152 8</cell><cell>0.42722476</cell></row><row><cell>20</cell><cell cols="2">Our approach.run2 0.76654633 13</cell><cell>0.41174962</cell></row><row><cell>22</cell><cell cols="2">Our approach.run3 0.65944645 21</cell><cell>0.27271983</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://sites.google.com/view/ibereval-2018</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://www.receptiviti.ai/liwc-api-get-started</note>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusions</head><p>In this work, we proposed an approach that takes into account several aspects: weights of n-grams, LIWC categories, structural information and lexical analysis. We observed that each aspect contributes in some way to the different subtasks. Moreover, we noticed that all four aspects contributed to obtaining a better accuracy and F1-macro on the corpus of English tweets, whereas only the first three were useful for the Spanish tweets. As future work, it would be interesting to use techniques for handling unbalanced datasets and to explore other features. Moreover, we plan to use deep learning to see what performance this technique could achieve.</p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Haters: Harassment, Abuse, and Violence Online by Bailey Poland</title>
		<author>
			<persName><forename type="first">M</forename><surname>Bailey</surname></persName>
		</author>
		<idno type="DOI">10.1086/693771</idno>
		<ptr target="https://doi.org/10.1086/693771" />
	</analytic>
	<monogr>
		<title level="j">Signs: Journal of Women in Culture and Society</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="495" to="497" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note>Signs</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of the task on automatic misogyny identification at ibereval</title>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Anzovino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), co-located with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018) CEUR Workshop Proceedings</title>
				<meeting>the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), co-located with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018) CEUR Workshop Proceedings<address><addrLine>Seville, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018-09-18">September 18, 2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The problem of identifying misogynist language on twitter (and other online social spaces)</title>
		<author>
			<persName><forename type="first">S</forename><surname>Hewitt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Tiropanis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bokhove</surname></persName>
		</author>
		<idno type="DOI">10.1145/2908131.2908183</idno>
		<ptr target="http://doi.acm.org/10.1145/2908131.2908183" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 8th ACM Conference on Web Science</title>
				<meeting>the 8th ACM Conference on Web Science</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page" from="333" to="335" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">B</forename><surname>Poland</surname></persName>
		</author>
		<ptr target="http://www.jstor.org/stable/j.ctt1fq9wdp" />
		<title level="m">Haters: Harassment, Abuse, and Violence Online</title>
				<imprint>
			<publisher>University of Nebraska Press</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
