<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic Identification of Misogyny in English and Italian Tweets at EVALITA 2018 with a Multilingual Hate Lexicon</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Endang Wahyu Pamungkas</string-name>
          <email>pamungka@di.unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandra Teresa Cignarella</string-name>
          <email>cigna@di.unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Basile</string-name>
          <email>basile@di.unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viviana Patti</string-name>
          <email>patti@di.unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dipartimento di Informatica, Università degli Studi di Torino</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>PRHLT Research Center, Universitat Politècnica de València</institution>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>7</lpage>
      <abstract>
        <p>In this paper we describe our submission to the shared task on Automatic Misogyny Identification in English and Italian Tweets (AMI) organized at EVALITA 2018. Our approach is based on SVM classifiers enhanced by stylistic and lexical features. Additionally, we analyze the use of the novel HurtLex multilingual linguistic resource, developed by enriching, from a computational and multilingual perspective, the Italian lexicon of hate words compiled by the linguist Tullio De Mauro, in order to investigate its impact on this task.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Hate Speech (HS) can be based on race, skin color,
ethnicity, gender, sexual orientation, nationality,
or religion; it incites violence and
discrimination, and it is abusive, insulting, intimidating, and
harassing. Hateful language is becoming a huge
problem on social media platforms such as Twitter and
Facebook
        <xref ref-type="bibr" rid="ref15">(Poland, 2016)</xref>
        . In particular, a type
of cyberhate that is increasingly worrying
is the use of hateful language that specifically
targets women, which is normally referred to as
misogyny
        <xref ref-type="bibr" rid="ref2">(Bartlett et al., 2014)</xref>
        .
      </p>
      <p>
        Misogyny can be linguistically manifested in
numerous ways, including social exclusion,
discrimination, hostility, threats of violence and
sexual objectification
        <xref ref-type="bibr" rid="ref1 ref8">(Anzovino et al., 2018)</xref>
        . Many
Internet companies and micro-blogging platforms have already tried
to block this kind of
online content but, unfortunately, the issue is
far from being solved because of the complexity
of natural language
        <xref ref-type="bibr" rid="ref16">(Schmidt and Wiegand,
2017)</xref>
        . For these reasons, it has
become necessary to develop targeted NLP
techniques that can automatically address online hate speech
and misogyny.
      </p>
      <p>
        The first shared task specifically aimed at
Automatic Misogyny Identification (AMI) took place
at IberEval 2018, within SEPLN 2018, and considered
English and Spanish tweets
        <xref ref-type="bibr" rid="ref1 ref8 ref9">(Fersini et al., 2018a)</xref>
        .
The aim of the shared task is
to encourage participating teams to propose the
best automatic system, first to distinguish
misogynous from non-misogynous tweets, and second
to classify the type of misogynistic behaviour and
judge whether the target of the misogynistic
behaviour is a specific woman or a group of women.
In this paper, we describe our submission to the
second shared task on Automatic Misogyny
Identification (AMI), organized at EVALITA 2018
in the same manner but covering Italian and English
tweets, rather than Spanish and English as in the
IberEval task.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 Task Description</title>
      <p>
        The aim of the AMI task is to detect
misogynous tweets written in English and Italian (Task
A)
        <xref ref-type="bibr" rid="ref1 ref8 ref9">(Fersini et al., 2018b)</xref>
        . Furthermore, in Task
B, each system should also classify each
misogynous tweet into one of five misogyny
behaviors (STEREOTYPE, DOMINANCE,
DERAILING, SEXUAL HARASSMENT, and DISCREDIT)
and one of two target classes (active and
passive). Participants were allowed to submit up to
three runs for each language. Table 1 shows the
dataset label distribution for each class. Accuracy
is used as the evaluation metric for Task A,
while the macro F-score is used for Task B.
      </p>
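      <p>As an illustration of the two evaluation metrics named above, the following minimal sketch computes accuracy and a macro-averaged F-score in plain Python. It assumes that the macro F-score is the unweighted mean of the per-class F1 values over the classes present in the gold labels; the official scorer may differ in details, and the toy labels are invented for the example.</p>
      <preformat><![CDATA[
```python
def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return correct / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores over the gold classes."""
    scores = []
    for label in sorted(set(gold)):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Toy target labels, invented for illustration only.
gold = ["active", "active", "active", "passive"]
pred = ["active", "active", "passive", "active"]

print(accuracy(gold, pred))
print(macro_f1(gold, pred))
```
]]></preformat>
      <p>In this toy case the minority class is never predicted correctly, so its F1 of zero pulls the macro average well below the accuracy, which is why macro averaging is a natural choice for unbalanced label distributions.</p>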
      <p>The organizers provided the same amount of
data for both languages: 4,000 tweets in the
training set and 1,000 in the test set. The label
distribution for Task A is balanced, while in Task B the
distribution is highly unbalanced for both
misogyny behaviors and targets.
</p>
      <p>3 Description of the System</p>
      <p>We used two Support Vector Machine (SVM)
classifiers with different kernels: linear and
radial basis function (RBF).</p>
      <p>SVM with Linear Kernel. The linear kernel was
used to find the optimal hyperplane when the SVM
was first introduced in 1963 by Vapnik et al.,
long before the kernel trick was applied to SVMs
in the 1990s (Cortes and Vapnik, 1995). Joachims (1998)
recommends a linear kernel for text classification,
based on the observation that text representation
features are frequently linearly separable.</p>
      <p>SVM with RBF Kernel. Choosing the kernel
is usually challenging, because its
performance is dataset dependent. Therefore, we
also experimented with a Radial Basis Function
(RBF) kernel, which has already proven
effective in text classification
problems. The drawback of RBF kernels is that they
are computationally expensive and tend to perform
worse on large, sparse feature matrices.</p>
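      <p>To make the difference between the two kernels concrete, the sketch below writes both as plain functions over feature vectors. It only illustrates the standard kernel definitions; the value of gamma and the toy vectors are assumptions made for the example, not settings of the submitted systems.</p>
      <preformat><![CDATA[
```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of the two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Two toy count vectors, e.g. counts of three n-grams in two tweets.
u = [1.0, 0.0, 2.0]
v = [1.0, 1.0, 0.0]

print(linear_kernel(u, v))
print(rbf_kernel(u, v))
```
]]></preformat>
      <p>The linear kernel grows with the number of shared features, while the RBF kernel depends only on the distance between the two vectors and equals 1 when they coincide.</p>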
      <p>3.1 Features</p>
      <p>We employed several lexical features, performing
a simple preprocessing step including
tokenization and stemming, using the NLTK (Natural
Language Toolkit) library<sup>4</sup>. A detailed description of
the features employed by our model follows.</p>
      <p><sup>4</sup> https://www.nltk.org/</p>
      <p>Bag of Words (BoW). We used bags of words
to build the tweet representation.
Before producing the word vectors, we converted all
characters to lower case. Our
vector space consists of the counts of the unigrams and
bigrams in the tweet. In
addition, we also employed Bag of Hashtags (BoH)
and Bag of Emojis (BoE) features, which are built
with the same technique as BoW, focusing on
the presence of hashtags and emojis.</p>
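      <p>The three bag representations can be sketched as follows. This is a simplified illustration: it uses a regular expression in place of the NLTK tokenization and stemming step, and a rough Unicode range test as a stand-in for emoji detection; both are assumptions of the example, as are the function names.</p>
      <preformat><![CDATA[
```python
import re
from collections import Counter


def ngram_counts(text):
    """Bag of Words: count lower-cased unigrams and bigrams in one tweet."""
    tokens = re.findall(r"\w+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


def hashtag_counts(text):
    """Bag of Hashtags: count each lower-cased hashtag."""
    return Counter(re.findall(r"#\w+", text.lower()))


def is_emoji(ch):
    """Rough emoji test on two common Unicode ranges (not exhaustive)."""
    cp = ord(ch)
    return 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def emoji_counts(text):
    """Bag of Emojis: count each character that passes is_emoji."""
    return Counter(ch for ch in text if is_emoji(ch))


tweet = "So TIRED of this #Monday \U0001F62D \U0001F62D"
print(ngram_counts(tweet))
print(hashtag_counts(tweet))
print(emoji_counts(tweet))
```
]]></preformat>
      <p>Each counter can then be mapped onto a fixed vocabulary to obtain the sparse count vectors that are passed to the classifier.</p>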
      <p>Swear Words. This feature takes into account the
presence of swear words and the number of their
occurrences in the tweet. For English, we took a list
of swear words from www.noswearing.com,
while for Italian we gathered the swear word list
from several sources<sup>5</sup>, including a translated
version of the www.noswearing.com list and a list
of swear words from Capuano (2007).</p>
      <p>Sexist Slurs. Besides swear words, we also
considered sexist words that specifically
target women. We used a small set of sexist slurs
from previous work by Fasoli et al. (2015). We
translated and expanded that list manually for our
Italian systems. This feature is binary: 1
when at least one sexist slur is present in the tweet and
0 otherwise.</p>
      <p>
        Women Words. We manually built a small set of
words containing synonyms of, and several words
related to, the word “woman” in English and “donna” in
Italian. Based on our previous work
        <xref ref-type="bibr" rid="ref13">(Pamungkas
et al., 2018)</xref>
        , these words were effective in
detecting the target of misogyny in the tweet.
Like the sexist slur feature, this feature is
binary and indicates the presence of women words in the
tweet.
      </p>
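      <p>The three lexicon-based features described above (swear words, sexist slurs and women words) could be computed along the following lines. The word sets are small hypothetical placeholders rather than the actual lists, and the regular-expression tokenizer is an assumption of the example.</p>
      <preformat><![CDATA[
```python
import re

# Hypothetical placeholder lexicons; the real lists are those described
# in the text and are not reproduced here.
SWEAR_WORDS = {"damn", "hell"}
SEXIST_SLURS = {"slur_a", "slur_b"}
WOMEN_WORDS = {"woman", "women", "girl", "lady"}


def lexicon_features(text):
    """Swear word count and presence, plus two binary lexicon flags."""
    tokens = re.findall(r"\w+", text.lower())
    swear_count = sum(1 for t in tokens if t in SWEAR_WORDS)
    return {
        "swear_count": swear_count,
        "swear_presence": int(swear_count != 0),
        "sexist_slur": int(any(t in SEXIST_SLURS for t in tokens)),
        "women_word": int(any(t in WOMEN_WORDS for t in tokens)),
    }


features = lexicon_features("Damn, that woman is damn good")
print(features)
```
]]></preformat>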
      <p>Surface Features. We also considered several
surface-level features: upper-case
character count, number of hashtags, number of
URLs, and the length of the tweet in
characters.</p>
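      <p>A minimal sketch of these surface features is given below. The regular expressions used to spot hashtags and URLs are assumptions of the example and may not match the patterns used in the submitted systems.</p>
      <preformat><![CDATA[
```python
import re


def surface_features(text):
    """Upper-case count, hashtag count, URL count and length in characters."""
    return {
        "upper_case_count": sum(1 for ch in text if ch.isupper()),
        "hashtag_count": len(re.findall(r"#\w+", text)),
        "url_count": len(re.findall(r"https?://\S+", text)),
        "text_length": len(text),
    }


example = "STOP it #now see https://t.co/abc"
print(surface_features(example))
```
]]></preformat>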
      <p>
        Hate Words Lexicon. HurtLex
        <xref ref-type="bibr" rid="ref3">(Bassignana et
al., 2018)</xref>
        is a multilingual lexicon of hate words,
built starting from a list of words compiled
manually
        <xref ref-type="bibr" rid="ref6">(De Mauro, 2016)</xref>
        . The lexicon is
semiautomatically translated into 53 languages, and the
lexical items are divided into 17 categories (see
Table 2). For our system configuration, we
exploited the presence of the words in each category
as a single feature, thus obtaining 17 single
features, one for each HurtLex category.
      </p>
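      <p>The 17 HurtLex features can be illustrated with the sketch below, which counts, for each category, the tweet tokens found in the lexicon. The category codes are the ones listed in the feature table; the lexicon fragment is a hypothetical placeholder, and the sketch assumes for simplicity that each word belongs to a single category.</p>
      <preformat><![CDATA[
```python
import re
from collections import Counter

# The 17 HurtLex category codes, in the order of the feature table.
CATEGORIES = ["ASF", "PR", "OM", "DDF", "CDS", "DDP", "AN", "ASM", "DMC",
              "IS", "OR", "PA", "PS", "QAS", "RCI", "RE", "SVP"]

# Hypothetical lexicon fragment mapping a word to its category.
TOY_LEXICON = {"word_a": "PR", "word_b": "CDS", "word_c": "PR"}


def hurtlex_features(text, lexicon=TOY_LEXICON):
    """Return one count per category, in the order of CATEGORIES."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    return [counts[c] for c in CATEGORIES]


vec = hurtlex_features("word_a and word_c then word_b")
print(dict(zip(CATEGORIES, vec)))
```
]]></preformat>
      <p>Selecting a subset of categories for a given run then amounts to keeping only the corresponding positions of this vector.</p>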
      <p><sup>5</sup> https://www.parolacce.org/2016/12/20/dati-frequenza-turpiloquio/ and https://it.wikipedia.org/wiki/Turpiloquio_nella_lingua_italiana</p>
      <p>[Table 1: dataset label distribution for each class, for English and Italian]</p>
      <p>For English, the best result was obtained by run #1 with the RBF
kernel (0.765 accuracy), while for Italian the best
result was obtained by runs #2 and #3 with
the linear kernel (0.893 accuracy). Different sets
of categories from HurtLex were able to improve
the classifier performance, depending on the
language.</p>
      <p>In order to classify the category and target of
misogyny (Task B), we adopted the same set of
features as in Task A. Therefore, we did not build
new systems specifically for Task B.</p>
      <p>We experimented with different selections of
categories from the HurtLex lexicon, and
identified the most useful ones for
misogyny identification. As can be seen in Table 3, the
main categories are physical disabilities and
diversity (DDP), words related to prostitution (PR),
and words referring to male genitalia (ASM) and
female genitalia (ASF), followed by derogatory words
(CDS) and words related to felonies, crime
and immoral behavior (RE).</p>
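      <table-wrap id="tab3">
        <label>Table 3</label>
        <caption>
          <p>Accuracy, HurtLex category features and kernel of each submitted run (X: feature used; -: feature not used; empty cell: no mark in the source).</p>
        </caption>
        <table>
          <thead>
            <tr><th>Language</th><th colspan="3">English</th><th colspan="3">Italian</th></tr>
            <tr><th>Systems</th><th>run1</th><th>run2</th><th>run3</th><th>run1</th><th>run2</th><th>run3</th></tr>
          </thead>
          <tbody>
            <tr><td>Accuracy</td><td>0.765</td><td>0.72</td><td>0.744</td><td>0.786</td><td>0.893</td><td>0.893</td></tr>
            <tr><td>ASF Count</td><td>X</td><td>X</td><td>-</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>PR Count</td><td>-</td><td>-</td><td>-</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>OM Count</td><td>X</td><td>X</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>DDF Count</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>CDS Count</td><td>X</td><td>X</td><td>-</td><td>X</td><td>X</td><td></td></tr>
            <tr><td>DDP Count</td><td>X</td><td>X</td><td>-</td><td>-</td><td>-</td><td>X</td></tr>
            <tr><td>AN Count</td><td>X</td><td>X</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>ASM Count</td><td>-</td><td>-</td><td>-</td><td>X</td><td>X</td><td></td></tr>
            <tr><td>DMC Count</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>IS Count</td><td>X</td><td>X</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>OR Count</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>PA Count</td><td>X</td><td>X</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>PS Count</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>QAS Count</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>RCI Count</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>RE Count</td><td>-</td><td>-</td><td>-</td><td>X</td><td>X</td><td></td></tr>
            <tr><td>SVP Count</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td></td></tr>
            <tr><td>Kernel</td><td>RBF</td><td>Linear</td><td>RBF</td><td>RBF</td><td>Linear</td><td>Linear</td></tr>
          </tbody>
        </table>
        <table-wrap-foot>
          <p>The marks for the remaining feature rows (Bag of Words, Bag of Hashtags, Bag of Emojis, S.W. Count, S.W. Presence, Sexist Slurs, Woman Word, Hashtag, Link Presence, Upper Case Count, Text Length) are not recoverable from the source layout.</p>
        </table-wrap-foot>
      </table-wrap>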
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>[...] the test sets. Our best system in Task A ranked 3rd
in Italian (0.839 accuracy for run3) and 13th
in English (0.621 accuracy for run3).
Interestingly, our best results in both languages were
obtained by the best configuration submitted to the
IberEval campaign. However, our English system
performed considerably worse than at
IberEval (accuracy = 0.814). We
analyze this problem in Section 6.
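      </p>
      <table-wrap>
        <caption>
          <p>Results of Task B for English: average (Avg.), category (Cat.) and target (Targ.) scores of each submitted run.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Team</th><th>Avg.</th><th>Cat.</th><th>Targ.</th></tr>
          </thead>
          <tbody>
            <tr><td>himani.c.run3.tsv</td><td>0.406</td><td>0.361</td><td>0.451</td></tr>
            <tr><td>himani.c.run2.tsv</td><td>0.377</td><td>0.323</td><td>0.431</td></tr>
            <tr><td>AMI-BASELINE</td><td>0.370</td><td>0.342</td><td>0.399</td></tr>
            <tr><td>hateminers.c.run3</td><td>0.369</td><td>0.302</td><td>0.435</td></tr>
            <tr><td>hateminers.c.run1</td><td>0.348</td><td>0.264</td><td>0.431</td></tr>
            <tr><td>SB.c.run2.tsv</td><td>0.344</td><td>0.282</td><td>0.407</td></tr>
            <tr><td>himani.c.run1.tsv</td><td>0.342</td><td>0.280</td><td>0.403</td></tr>
            <tr><td>SB.c.run1.tsv</td><td>0.335</td><td>0.282</td><td>0.389</td></tr>
            <tr><td>hateminers.c.run2</td><td>0.329</td><td>0.229</td><td>0.430</td></tr>
            <tr><td>SB.c.run3.tsv</td><td>0.328</td><td>0.269</td><td>0.387</td></tr>
            <tr><td>resham.c.run2</td><td>0.322</td><td>0.246</td><td>0.399</td></tr>
            <tr><td>resham.c.run1</td><td>0.316</td><td>0.235</td><td>0.397</td></tr>
            <tr><td>bakarov.c.run1</td><td>0.309</td><td>0.260</td><td>0.357</td></tr>
            <tr><td>resham.c.run3</td><td>0.283</td><td>0.214</td><td>0.353</td></tr>
            <tr><td>RCLN.c.run1</td><td>0.280</td><td>0.165</td><td>0.395</td></tr>
            <tr><td>ITT.c.run2.tsv</td><td>0.276</td><td>0.173</td><td>0.379</td></tr>
            <tr><td>bakarov.c.run2</td><td>0.275</td><td>0.176</td><td>0.374</td></tr>
            <tr><td>14-exlab.c.run1</td><td>0.260</td><td>0.124</td><td>0.395</td></tr>
            <tr><td>bakarov.c.run3</td><td>0.254</td><td>0.151</td><td>0.356</td></tr>
            <tr><td>14-exlab.c.run3</td><td>0.239</td><td>0.107</td><td>0.371</td></tr>
            <tr><td>ITT.c.run1.tsv</td><td>0.238</td><td>0.140</td><td>0.335</td></tr>
            <tr><td>ITT.c.run3.tsv</td><td>0.237</td><td>0.138</td><td>0.335</td></tr>
            <tr><td>14-exlab.c.run2</td><td>0.232</td><td>0.205</td><td>0.258</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>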
[...] categories and discriminate whether the target is
active or passive. Both subtasks for both languages
have very low baselines (below 0.4 for English and
around 0.5 for Italian). Several under-represented
classes such as DERAILING and DOMINANCE are
very difficult to detect in category
classification (see Table 1 for details). Similarly, the label
distribution was very unbalanced for target
classification, where most of the misogynous tweets
attack a specific target (ACTIVE).</p>
      <p>Several features which focus on the use of
offensive words proved useful in English.
For Italian, a simple tweet representation consisting of
Bag of Words, Bag of Hashtags, and Bag
of Emojis already produced a better result than
the baseline. Some of the HurtLex categories that
improved the system’s performance during
training did not help the prediction on the test set
([...] CDS, ASM for Italian). However, similarly to the
Spanish case, the system configuration which
utilized ASF, PR, and DDP obtained the best result
in Italian.</p>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>We performed an error analysis on the gold
standard test set, and analyzed 160 Italian tweets that
our best system configuration mislabelled. The
label “misogynistic” was wrongly assigned to 147
instances (false positives, 91.9% of the errors),
while the opposite happened only 13 times (false
negatives, 8.1% of the errors). The same pattern
emerged in the English dataset, although less
markedly, with 228 false positives (60.2% of
the errors) and 151 false negatives (39.8% of the
errors). In this section we conduct a qualitative error
analysis, identifying and discussing several factors
that contribute to the misclassification.</p>
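      <p>The percentages above follow directly from the error counts, as the short check below shows. The helper name is hypothetical, and the sketch rounds halves upwards, which is an assumption about how the reported figures were rounded.</p>
      <preformat><![CDATA[
```python
from decimal import Decimal, ROUND_HALF_UP


def share(part, total):
    """Percentage of part in total, rounded half up to one decimal."""
    pct = Decimal(100 * part) / Decimal(total)
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Italian: 147 false positives and 13 false negatives (160 errors).
print(share(147, 160), share(13, 160))

# English: 228 false positives and 151 false negatives (379 errors,
# consistent with 0.621 accuracy on the 1,000 test tweets).
print(share(228, 379), share(151, 379))
```
]]></preformat>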
      <p>
        Presence of swear words. We encountered many
“bad words” in the dataset of this shared task
for both English and Italian. In an abusive
context, the presence of swear words can help to
spot abusive content such as misogyny. However,
they can also lead to false positives when the
swear word is used in a casual, non-offensive
context
        <xref ref-type="bibr" rid="ref1 ref11 ref12 ref17 ref3 ref8 ref9">(Malmasi and Zampieri, 2018; Van Hee et
al., 2018; Nobata et al., 2016)</xref>
        . Consider the
following two examples containing the swear word
“bitch” in different contexts:
      </p>
      <p>1. Im such a fucking cunt bitch and i dont
even mean to be goddammit</p>
      <p>2. Bitch you aint the only one who hate
me, join the club, stand in the corner, and
stfu.
      </p>
      <p>
        In Example 1, the swear word “bitch” is used
just to arouse interest or show off, without directly
insulting another person. This is a case of
idiomatic swearing
        <xref ref-type="bibr" rid="ref14">(Pinker, 2007)</xref>
        . In Example 2,
the swear word “bitch” is used to insult a specific
target in an abusive context, an instance of abusive
swearing
        <xref ref-type="bibr" rid="ref14">(Pinker, 2007)</xref>
        . Resolving the context of swearing
is still a challenging task for automatic systems,
which contributes to the difficulty of this task.
      </p>
      <p>Reported speech. Tweets may contain
misogynistic content as an indirect quote of someone
else’s words, as in the following example:</p>
      <p>3. Quella volta che mia madre mi ha detto
quella cosa le ho risposto "Mannaggia! Non
sarò mai una brava donna schiava zitta e
lava! E adesso?!" Potrei morire per il
dispiacere.</p>
      <p>→ That time when my mom told me that thing
and I answered “Holy s**t! I will never be
a good slave who shuts up and cleans! What
now?”</p>
      <p>According to the task guidelines this should not be
labeled as a misogynistic tweet, because it is not
the user who is being misogynistic.
Instances of this type tend to confuse a classifier
based on lexical features.
      </p>
      <p>
        Irony and world knowledge. In Example 3, the
sentence “Potrei morire per il dispiacere.”<sup>6</sup> is
ironic. Humor is very hard to model for automatic
systems; sometimes, the presence of figurative
language baffles even human annotators.
Moreover, external world knowledge is often required
in order to infer whether an utterance is ironic
        <xref ref-type="bibr" rid="ref18">(Wallace et al., 2014)</xref>
        .
      </p>
      <p>Preprocessing and tokenization. In
computer-mediated communication, and specifically on
Twitter, users often resort to a type of language that
is closer to speech than to written language.
This is reflected in less-than-clean orthography,
with forms and expressions that imitate
face-to-face conversation.</p>
      <p>4. @ XXXXXXXXX @ XXXXXXXXXX
@ XXXXXXX @ XXXXXX x me glob
prox2aa colpiran tutti incluso nemicinterno..
esterno colpopiúduro saràculogrande che
bevetropvodka e inoltre x questiondisoldi
progetta farmezzofallirsudfinitestampe: ciò
nnvàben xrchèindebolis</p>
      <p>→ 4 me glob next2aa will hit everyone included
internalenemy.. external harderhit willbebigass
who drinkstoomuchvodka and also 4
mattersofmoney isplanning
tomakethesouthfailwithprintings: dis notgood causeweaken</p>
      <p>In Example 4, preprocessing steps like
tokenization and stemming are particularly hard to
perform, because of the missing spaces between
words and the confused
orthography. Consequently, the whole classification pipeline
is compromised and error-prone.</p>
      <p>Gender of the target. As stated in the
Introduction, misogyny is a specific type
of hateful language, targeting women. However,
detecting the gender of the target is a challenging
task in itself, especially in Twitter datasets.</p>
      <p>5. @realDonaldTrump shut the FUCK up
you infected pussy fungus.</p>
      <p>6. @TomiLahren You’re a fucking skank!</p>
      <p>Both examples use bad words to abuse their
targets. However, the first example is labeled as not
misogynous since the target is Donald Trump (a man),
while the second example is labeled as misogynous,
the target being Tomi Lahren (a woman).</p>
      <p><sup>6</sup> Translation: I could die of heartbreak.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>Here we draw some considerations based on the
results of our participation in the EVALITA 2018
AMI shared task. In order to test the
multilingual potential of our model, one of the
systems we submitted for Italian at EVALITA (run
#3) was based on our best model for Spanish at
IberEval. According to the official results, this system,
which relies on features
such as BoW, BoE, BoH and several HurtLex
categories specifically related to hate against
women, performed well for Italian. Concerning English, we obtained lower
results at EVALITA than at IberEval
with the same system configuration. It is worth
mentioning that, even though the training set for the AMI
EVALITA task was substantially bigger, in
absolute terms all AMI participants at EVALITA
obtained worse scores than
the IberEval teams.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>Valerio Basile and Viviana Patti were partially
supported by Progetto di Ateneo/CSP 2016
(Immigrants, Hate and Prejudice in Social
Media - IhatePrejudice, S1618_L2_BOSC_01).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Maria</given-names>
            <surname>Anzovino</surname>
          </string-name>
          , Elisabetta Fersini, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Automatic Identification and Classification of Misogynistic Language on Twitter</article-title>
          .
          <source>In Proc. of the 23rd Int. Conf. on Applications of Natural Language &amp; Information Systems</source>
          , pages
          <fpage>57</fpage>
          -
          <lpage>64</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Jamie</given-names>
            <surname>Bartlett</surname>
          </string-name>
          , Richard Norrie, Sofia Patel, Rebekka Rumpel, and
          <string-name>
            <given-names>Simon</given-names>
            <surname>Wibberley</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Misogyny on twitter</article-title>
          .
          <source>Demos.</source>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Elisa</given-names>
            <surname>Bassignana</surname>
          </string-name>
          , Valerio Basile, and
          <string-name>
            <given-names>Viviana</given-names>
            <surname>Patti</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Hurtlex: A Multilingual Lexicon of Words to Hurt</article-title>
          .
          <source>In Proc. of the 5th Italian Conference on Computational Linguistics</source>
          (CLiC-it
          <year>2018</year>
          ), Turin, Italy. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Romolo Giovanni</given-names>
            <surname>Capuano</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Turpia: sociologia del turpiloquio e della bestemmia</article-title>
          .
          <source>Riscontri (Milano</source>
          , Italia).
          <source>Costa &amp; Nolan.</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Corinna</given-names>
            <surname>Cortes</surname>
          </string-name>
          and
          <string-name>
            <given-names>Vladimir</given-names>
            <surname>Vapnik</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>Support-vector networks</article-title>
          .
          <source>Machine learning</source>
          ,
          <volume>20</volume>
          (
          <issue>3</issue>
          ):
          <fpage>273</fpage>
          -
          <lpage>297</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Tullio</given-names>
            <surname>De Mauro</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Le parole per ferire</article-title>
          .
          <source>Internazionale</source>
          .
          27 September 2016
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Fasoli</surname>
          </string-name>
          , Andrea Carnaghi, and Maria Paola Paladino.
          <year>2015</year>
          .
          <article-title>Social acceptability of sexist derogatory and sexist objectifying slurs across contexts</article-title>
          .
          <source>Language Sciences</source>
          ,
          <volume>52</volume>
          :
          <fpage>98</fpage>
          -
          <lpage>107</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Fersini</surname>
          </string-name>
          , Maria Anzovino, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          . 2018a.
          <article-title>Overview of the Task on Automatic Misogyny Identification at IberEval</article-title>
          .
          <source>In Proceedings of 3rd Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval</source>
          <year>2018</year>
          ), pages
          <fpage>57</fpage>
          -
          <lpage>64</lpage>
          . CEUR-WS.org, September.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Fersini</surname>
          </string-name>
          , Debora Nozza, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          . 2018b.
          <article-title>Overview of the EVALITA 2018 Task on Automatic Misogyny Identification (AMI)</article-title>
          .
          <source>In Proceedings of the 6th evaluation campaign of Natural Language Processing and Speech tools for Italian (EVALITA'18)</source>
          , Turin, Italy. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Thorsten</given-names>
            <surname>Joachims</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>Text categorization with support vector machines: Learning with many relevant features</article-title>
          .
          <source>In European conference on machine learning</source>
          , pages
          <fpage>137</fpage>
          -
          <lpage>142</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Shervin</given-names>
            <surname>Malmasi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Marcos</given-names>
            <surname>Zampieri</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Challenges in discriminating profanity from hate speech</article-title>
          .
          <source>Journal of Experimental &amp; Theoretical Artificial Intelligence</source>
          ,
          <volume>30</volume>
          (
          <issue>2</issue>
          ):
          <fpage>187</fpage>
          -
          <lpage>202</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Chikashi</given-names>
            <surname>Nobata</surname>
          </string-name>
          , Joel Tetreault, Achint Thomas,
          <string-name>
            <given-names>Yashar</given-names>
            <surname>Mehdad</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Yi</given-names>
            <surname>Chang</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Abusive language detection in online user content</article-title>
          .
          <source>In Proceedings of the 25th international conference on world wide web</source>
          , pages
          <fpage>145</fpage>
          -
          <lpage>153</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Endang Wahyu</given-names>
            <surname>Pamungkas</surname>
          </string-name>
          , Alessandra Teresa Cignarella, Valerio Basile, and
          <string-name>
            <given-names>Viviana</given-names>
            <surname>Patti</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>14-ExLab@UniTo for AMI at IberEval2018: Exploiting Lexical Knowledge for Detecting Misogyny in English and Spanish Tweets</article-title>
          .
          <source>In Proc. of 3rd Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval</source>
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Steven</given-names>
            <surname>Pinker</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>The stuff of thought: Language as a window into human nature</article-title>
          .
          <source>Penguin.</source>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Bailey</given-names>
            <surname>Poland</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <source>Haters: Harassment, Abuse, and Violence Online</source>
          . Potomac Press.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Anna</given-names>
            <surname>Schmidt</surname>
          </string-name>
          and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Wiegand</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A survey on hate speech detection using natural language processing</article-title>
          .
          <source>In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Cynthia</given-names>
            <surname>Van Hee</surname>
          </string-name>
          , Gilles Jacobs, Chris Emmery, Bart Desmet, Els Lefever, Ben Verhoeven, Guy De Pauw, Walter Daelemans, and
          <string-name>
            <given-names>Véronique</given-names>
            <surname>Hoste</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Automatic detection of cyberbullying in social media text</article-title>
          . arXiv preprint arXiv:1801.05617.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Byron C.</given-names>
            <surname>Wallace</surname>
          </string-name>
          , Laura Kertz,
          <string-name>
            <given-names>Eugene</given-names>
            <surname>Charniak</surname>
          </string-name>
          , et al.
          <year>2014</year>
          .
          <article-title>Humans require context to infer ironic intent (so computers probably do, too)</article-title>
          .
          <source>In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</source>
          , volume
          <volume>2</volume>
          , pages
          <fpage>512</fpage>
          -
          <lpage>516</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>