<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>PoliTeam @ AMI: Improving Sentence Embedding Similarity with Misogyny Lexicons for Automatic Misogyny Identification in Italian Tweets</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Giuseppe Attanasio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eliana Pastor</string-name>
          <email>eliana.pastor@polito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Control and Computer Engineering Politecnico di Torino</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present a multi-agent classification solution for identifying misogynous and aggressive content in Italian tweets. A first agent uses modern Sentence Embedding techniques to encode tweets and an SVM classifier to produce initial labels. A second agent, based on TF-IDF and Italian misogyny lexicons, is jointly adopted to improve the first agent on uncertain predictions. We evaluate our approach in the Automatic Misogyny Identification Shared Task of the EVALITA 2020 campaign. Results show that TF-IDF and lexicons effectively improve the supervised agent trained on sentence embeddings.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Copyright © 2020 for this paper by its authors. Use
permitted under Creative Commons License Attribution 4.0
International (CC BY 4.0).</p>
      <p>The increasing adoption of online
communication systems we experienced in the last decades
brought the rise of many public venues for voicing our
own opinions, such as forums, blogs, and social
networks. On these platforms, where access
cannot - and must not - be restricted to anyone, the
problem of misconduct and hateful content
soon became compelling. The protection of the most
targeted groups, such as races, ethnicities,
religious groups, genders, and sexual orientations,
is of paramount importance. Violence against
women manifests in social networks every time
offensive language targets women directly or
indirectly
        <xref ref-type="bibr" rid="ref9">(Ellsberg et al., 2005)</xref>
        . We refer to these
cases as misogynous speech. While platform owners
are updating their regulatory terms at an
increasing pace (see, e.g., https://www.theverge.com/2020/3/5/21166940/twitter-hate-speech-ban-age-disability-disease-dehumanize, https://www.theverge.com/2020/8/11/21363890/facebook-blackface-antisemitic-stereotypes-ban-misinformation, https://www.theguardian.com/technology/2020/jun/29/reddit-the-donald-twitch-social-media-hate-speech), the high volume of content produced at a
fast publication rate still poses a challenge to
monitoring systems.
      </p>
      <p>
        Many recent works in the NLP community
show effective results in online monitoring of hate
speech
        <xref ref-type="bibr" rid="ref10 ref2">(Fortuna and Nunes, 2018)</xref>
        and
misogynous content (Pamungkas et al. (2020), Frenda et
al. (2019), Anzovino et al. (2018)). Furthermore,
research communities propose evaluation
initiatives (Basile et al. (2019), Bosco et al. (2018)) to
challenge NLP practitioners in finding novel
solutions to shared tasks. Among these, the AMI
shared task proposed at EVALITA 2020
        <xref ref-type="bibr" rid="ref16 ref4">(Basile et
al., 2020)</xref>
        focuses on automatic identification of
misogynous content on Twitter in Italian
        <xref ref-type="bibr" rid="ref8">(Elisabetta Fersini, 2020)</xref>
        .
      </p>
      <p>The task comprises two main subtasks. The goal of
the first subtask, Subtask A - Misogyny &amp;
Aggressive Behaviour Identification, is the identification
of misogynous speech in tweets and, in case of
misogyny, the classification of aggressive
language. Subtask B - Unbiased Misogyny
Identification, aims at classifying misogynous speech while
guaranteeing the fairness of the model (in terms of
unintended bias) on a synthetic dataset. The
unintended bias is a known phenomenon in natural
language models, and recent works address its
identification and mitigation (Dixon et al. (2018), Nozza
et al. (2019), Kennedy et al. (2020)).</p>
      <p>In this work, we describe our solution to
address the AMI shared task. We propose a
multi-agent classification system. The system uses recent
Sentence Embedding techniques to encode tweets and
an SVM classifier to produce initial labels. A
second agent, based on TF-IDF and Italian
misogyny lexicons, is jointly adopted to improve the first
agent on uncertain predictions. Results show that
the TF-IDF and misogyny lexicons effectively
improve the agent trained on sentence embeddings. For both subtasks, we
chose the constrained approach, using
only the data provided by the organizers.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Description of the system</title>
      <p>Recent work has pointed out the effectiveness of
sentence embeddings in many downstream tasks,
such as sentiment classification. Meanwhile, NLP
practitioners strive to migrate existing
solutions to languages other than English. To this
end, language models are trained on
large parallel corpora, and multi-lingual,
pre-trained models are published for later use.</p>
      <p>In this work, we adopt a multi-agent
classification procedure to address each proposed subtask.
Firstly, we encode tweets to their sentence
embeddings using a pre-trained multi-lingual sentence
encoder. Next, we train a supervised classifier (the
first agent) on the latent embedding space. In
parallel, we extract the smoothed TF-IDF of tweets
and enhance the representation with features built
upon Hate Speech and Misogyny lexicons. This
representation is then used to train a supervised
classifier (the second agent). Finally, we propose a
classification schema where uncertain predictions
from the first agent are corrected with certain ones
from the second agent.</p>
      <p>The following paragraphs describe the data
preprocessing step, expand on the classification
system, and provide insights on its application to
subtasks A and B.</p>
      <sec id="sec-2-1">
        <title>2.1 Sentence embedding</title>
        <p>
          Researchers devoted significant work to the
empirical construction of sentence embeddings for the
English language (Giorgi et al. (2020), Wang and
Kuo (2020), Reimers and Gurevych (2019), Cer
et al. (2018)). The most recent studies leverage
high-quality language models, such as the BERT
or Transformer-XL families, to build embeddings
that properly transfer to several downstream tasks.
Extending monolingual models, other works
assess the generalization performance of language
models pre-trained on multi-lingual corpora,
producing sentence embeddings either aligned
between languages
          <xref ref-type="bibr" rid="ref16 ref18 ref21">(Reimers and Gurevych, 2020)</xref>
          or not
          <xref ref-type="bibr" rid="ref1">(Aluru et al., 2020)</xref>
          .
        </p>
        <p>
          We build sentence embeddings by testing two
models. On the one hand, we use the model of
          <xref ref-type="bibr" rid="ref1">(Aluru et al.,
2020)</xref>
          , a monolingual BERT-based model
originally fine-tuned from multilingual BERT on an
Italian corpus for hate-speech detection tasks. The
model is then fine-tuned on our specific
subtasks. On the other hand, we choose the
multilingual adaptation of Sentence-BERT (Reimers
and Gurevych (2020)), which is based on the
DistilBERT architecture (Sanh et al. (2019)). We
use the implementation built on top of the
transformers library (https://github.com/UKPLab/sentence-transformers). Since results for the monolingual
BERT were not encouraging from the beginning
in any of the subtasks, we focus the discussion
on multi-lingual Sentence-BERT.
        </p>
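        <p>As an illustration, the following sketch shows this encoding step with the sentence-transformers library. The checkpoint name is our assumption: the paper only states that the multi-lingual, DistilBERT-based adaptation of Sentence-BERT is used.</p>
        <preformat>
# Minimal sketch of the tweet-encoding step (checkpoint name is an assumption).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("distiluse-base-multilingual-cased")

tweets = [
    "Le donne non sono intelligenti",  # example phrase from Section 2.2
    "Un tweet qualsiasi",              # any other tweet
]

# One dense vector per tweet (512 dimensions for this checkpoint).
embeddings = model.encode(tweets)
print(embeddings.shape)  # (2, 512)
</preformat>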
        <p>Further, we run a fine-tuning round to adapt
multi-lingual Sentence-BERT to our specific subtasks.
To tune the initial embeddings, we optimize a
contrastive loss on pairs generated from the training
set. For any pair of tweets, if the ground truth
labels are the same (e.g. both misogynous or both
non-aggressive) the distance between the two
embeddings is decreased, while it is increased
otherwise. Since computing the full set of potential pairs
is computationally expensive, we sample only 20% of the initial tweets,
namely S, compute all the possible pairs P among
those, where |P| = |S|(|S| - 1)/2, and use
them for fine-tuning. We anticipate that this partial
fine-tuning achieved worse results than the
original model and leave other fine-tuning strategies
as future work.</p>
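        <p>A minimal sketch of this pair-based fine-tuning follows, assuming the ContrastiveLoss implementation from the sentence-transformers library; the batch size, epochs, and warm-up steps are illustrative, and tweets, labels, and model come from the surrounding context.</p>
        <preformat>
# Sketch of the contrastive fine-tuning on sampled pairs.
import random
from itertools import combinations
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, losses

# tweets: list of training texts; labels: their 0/1 ground-truth classes;
# model: the SentenceTransformer instance from the previous sketch.
sample = random.sample(list(zip(tweets, labels)), k=int(0.2 * len(tweets)))

# |P| = |S|(|S| - 1)/2 pairs; label 1.0 pulls a pair together, 0.0 pushes apart.
pairs = [
    InputExample(texts=[t1, t2], label=float(y1 == y2))
    for (t1, y1), (t2, y2) in combinations(sample, 2)
]

loader = DataLoader(pairs, shuffle=True, batch_size=16)
loss = losses.ContrastiveLoss(model=model)
model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)
</preformat>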
        <p>The final agent is then a supervised classifier
trained on multi-lingual sentence embeddings
(referred to as the SE agent). We use a Support Vector
Machine (SVM) with Radial Basis Function
kernel, which achieves the best results on our
validation set. Please refer to Section 3 for more details
on parameter configuration and performance.
</p>
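        <p>A sketch of the SE agent with scikit-learn follows, using the kernel, C, and gamma values reported in Section 3; probability estimates are enabled because the multi-agent schema of Section 2.3 needs confidence scores. The embeddings_train, embeddings_test, and y_train names are placeholders.</p>
        <preformat>
# Sketch of the SE agent: an RBF-kernel SVM over the sentence embeddings.
from sklearn.svm import SVC

se_agent = SVC(kernel="rbf", C=10, gamma="scale", probability=True)
se_agent.fit(embeddings_train, y_train)  # embeddings of the training tweets

# Per-class probability scores, later used as prediction confidence.
proba = se_agent.predict_proba(embeddings_test)
</preformat>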
      </sec>
      <sec id="sec-2-2">
        <title>2.2 TF-IDF and Misogyny Lexicons</title>
        <sec id="sec-2-2-1">
          <title>2https://github.com/UKPLab/sentence-transformers</title>
          <p>Pre-processing. We first pre-process the data
by replacing every URL found in tweets with the
meta-token LINK. Next, we perform tokenization
and lemmatization using spaCy's (https://spacy.io/) pre-trained
Italian core model it_core_news_lg.</p>
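          <p>A minimal sketch of this pre-processing step; the URL pattern is our simplification of what counts as a URL.</p>
          <preformat>
# Sketch: replace URLs with the LINK meta-token, then lemmatize with spaCy.
import re
import spacy

nlp = spacy.load("it_core_news_lg")
URL_RE = re.compile(r"https?://\S+")  # simplified URL pattern (assumption)

def preprocess(tweet):
    text = URL_RE.sub("LINK", tweet)
    return [tok.lemma_ for tok in nlp(text)]

print(preprocess("Le donne non sono intelligenti https://t.co/xyz"))
</preformat>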
          <p>
            Input features. We use a smoothed TF-IDF
vectorization of pre-processed tweets. We then
enrich word representations using lexicons to encode
misogynous speech and tweet sentiment.
(i) Misogynous lexicon. Misogynous tweets
often contain sexist slurs, swear words, and sexual
references. We include specific lexicons as
input features for dealing with hate and misogynous
speech
            <xref ref-type="bibr" rid="ref11">(Frenda et al., 2018)</xref>
            . We collect Italian
lexicons from multiple online sources and
divide them into the following categories:
sexist, profanity, sexuality, and female body, as
described in Table 1. The complete list of Italian
lexicons and sources is available in our repository
(https://github.com/g8a9/ami20-improving-embedding). As
for the text of the tweet, lexicons are first
lemmatized using spaCy. We then derive 4 features, one
for each misogynous lexicon category. For a given
category, we first count the occurrences of the
corresponding lexicon entries in each tweet. We then
normalize the count by the tweet word count.
(ii) Sentiment Lexicon. We use a sentiment
lexicon to characterize the polarity of tweets. The
sentiment of words in a tweet is obtained with the
OpeNER Italian Sentiment Lexicon
            <xref ref-type="bibr" rid="ref19">(Russo et al.,
2016)</xref>
            . This sentiment lexicon consists of 24,293
lexical entries annotated with positive, negative
and neutral polarity. In our analysis, we consider
only positive and negative polarity.
          </p>
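          <p>A sketch of the four lexicon-category features described above; the lexicon entries shown are placeholders, not the actual collected lexicons.</p>
          <preformat>
# Sketch: per-category occurrence counts, normalized by the tweet word count.
LEXICONS = {
    "sexist": {"..."},       # placeholders: the real entries come from the
    "profanity": {"..."},    # collected (and lemmatized) Italian lexicons
    "sexuality": {"..."},
    "female_body": {"..."},
}

def lexicon_features(lemmas):
    """One feature per category for a tweet given as a list of lemmas."""
    n = max(len(lemmas), 1)
    return [
        sum(lemma in LEXICONS[cat] for lemma in lemmas) / n
        for cat in ("sexist", "profanity", "sexuality", "female_body")
    ]
</preformat>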
          <p>Evaluating the polarity of an individual word
in a tweet without considering its context,
however, prevents us from considering the role of
negation in sentence polarity. To address this issue,
we adopt the following negation handling
technique based on the dependency parse tree.
We search the parse tree extracted by spaCy for
words affected by negation. For these words, we
invert the polarity, when available. As an
example, consider the phrase “le donne non sono
intelligenti” (women are not intelligent). Figure 1 shows
the extracted parse tree. The polarity of the word
“intelligenti” (intelligent) is inverted, from
positive to negative, since it is affected by negation.</p>
          <p>[Figure 1: dependency parse of “le donne non sono intelligenti”: the root “intelligenti” (ADJ) governs “donne” (NOUN, nsubj), “non” (ADV, advmod), and “sono” (AUX, cop); “Le” (DET, det) attaches to “donne”.]</p>
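          <p>A simplified sketch of the negation check on the spaCy parse tree; the exact traversal rule is not spelled out in the paper, so we approximate it by looking for the adverb "non" among a token's children, which covers the example of Figure 1.</p>
          <preformat>
# Sketch: mark a token as negated if "non" (ADV, advmod) is among its children.
def is_negated(token):
    return any(
        child.lemma_ == "non" and child.dep_ == "advmod"
        for child in token.children
    )

doc = nlp("le donne non sono intelligenti")  # nlp from the pre-processing sketch
for tok in doc:
    if is_negated(tok):
        # e.g. "intelligenti": its positive polarity would be inverted here
        print(tok.text, "is negated")
</preformat>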
          <p>Note that, as for the tweet text, we lemmatize
sentiment lexicons. Finally, we extract 2 features
that capture the tweet polarity. These are obtained
by counting the number of words with positive and
negative polarity respectively and then
normalizing them by the tweet word count.
(iii) Additional features. Tweets may contain
quotations of misogynous content, without being
misogynous themselves. We hence consider as an
additional feature the relative frequency of
quotation marks. We also consider as a feature the
length of the tweet (i.e. number of characters).</p>
          <p>Finally, we train a supervised classifier (the
second agent, referred to as the Lex agent) on the TF-IDF
representation enriched with the additional
features previously described. As for the first agent,
we use an SVM with a Radial Basis Function kernel.
We refer the reader again to Section 3 for
details on the experimental setting.</p>
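          <p>Putting the pieces together, a sketch of the Lex agent with scikit-learn follows, using the TF-IDF configuration reported in Section 3; preprocessed_tweets (space-joined lemma strings), extra_features (the lexicon, polarity, quotation, and length features), and y_train are placeholders.</p>
          <preformat>
# Sketch of the Lex agent: smoothed TF-IDF stacked with the engineered features.
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

vectorizer = TfidfVectorizer(smooth_idf=True, ngram_range=(1, 1),
                             max_features=10_000)
X_tfidf = vectorizer.fit_transform(preprocessed_tweets)  # space-joined lemmas

# One row per tweet: 4 lexicon features, 2 polarity features,
# quotation-mark frequency, and tweet length.
X_train = hstack([X_tfidf, csr_matrix(extra_features)])

lex_agent = SVC(kernel="rbf", C=10, gamma="scale", probability=True)
lex_agent.fit(X_train, y_train)
</preformat>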
      </sec>
      <sec id="sec-2-3">
        <title>2.3 Multi-agent prediction</title>
        <p>We designed the multi-agent system to maximize
prediction confidence by using only predictions
with a high probability score. Specifically, we
deem a prediction as confident if its associated
probability score is above a given threshold.</p>
        <p>We produce the final classification label by
combining the outcomes of the two agents as
follows. We first generate a prediction label and an
associated score using the first agent. This
entails encoding a given test point with
Sentence-BERT and running the inference with the SVM (SE
agent). Afterward, we use the confidence
threshold to decide whether to keep the label or not.
If the SE agent's prediction is not confident, we probe
the second agent, which is built upon TF-IDF and
misogyny lexicons (Lex agent). Finally, if the Lex agent's
prediction is confident, we choose its label as the
final one. If this is not the case, we roll back to
the SE agent's class label. We keep the confidence threshold
value as a hyper-parameter of the system.</p>
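        <p>The decision rule above can be summarized in a few lines; the sketch below assumes both agents expose scikit-learn-style predict_proba, and uses the threshold of 0.9 selected in Section 3. E_test and X_test are placeholder names for the test-set representations.</p>
        <preformat>
# Sketch of the multi-agent decision rule.
def combine(se_proba, lex_proba, threshold=0.9):
    se_label, se_conf = se_proba.argmax(), se_proba.max()
    lex_label, lex_conf = lex_proba.argmax(), lex_proba.max()
    if se_conf >= threshold:   # SE agent is confident: keep its label
        return se_label
    if lex_conf >= threshold:  # otherwise defer to a confident Lex agent
        return lex_label
    return se_label            # last resort: roll back to the SE label

labels = [
    combine(se, lex)
    for se, lex in zip(se_agent.predict_proba(E_test),
                       lex_agent.predict_proba(X_test))
]
</preformat>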
        <p>By design, the proposed solution favors
confident prediction labels, either from the SE or
the Lex agent. We applied the multi-agent
classification procedure to both subtasks.</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4 Approach to subtask A</title>
        <p>In this task, participants have to assign a label
indicating whether a tweet is misogynous or not.
Then, limited to the misogynous ones, a second
label should tell if the tweet is also aggressive.</p>
        <p>We apply our multi-agent classification in a
chained-prediction fashion. Specifically, we train
a first instance of the system on the binary
misogyny problem and label every tweet. In this step,
we use the complete corpus. Next, we train a
second instance on the binary aggressiveness
problem. We feed the model with tweets predicted as
misogynous in the previous step and produce a
class label for those only. Finally, we label all the
non-misogynous tweets as non-aggressive.</p>
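        <p>A sketch of the chained prediction follows; misogyny_system and aggressiveness_system are hypothetical stand-ins for the two independently trained instances of the multi-agent predictor.</p>
        <preformat>
# Sketch of the chained prediction for subtask A.
def predict_subtask_a(tweets):
    predictions = []
    for tweet in tweets:
        misogynous = misogyny_system(tweet)  # 0/1, trained on the whole corpus
        # Aggressiveness is only predicted for tweets labeled misogynous;
        # everything else is non-aggressive by construction.
        aggressive = aggressiveness_system(tweet) if misogynous else 0
        predictions.append((misogynous, aggressive))
    return predictions
</preformat>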
        <p>This strategy presents advantages and
drawbacks since the predictions are chained. On the
one hand, the two models are independent and can
separately learn a simpler problem. On the other
hand, this design lets errors on the misogyny
prediction propagate to the aggressiveness one. We
further discuss the matter in Section 4.
</p>
      </sec>
      <sec id="sec-2-5">
        <title>2.5 Approach to subtask B</title>
        <p>For this task, we employ our multi-agent model
(SE+Lex agents) with no modifications. Since we
want the model to also encode the structure and
form of synthetic sentences, we train the model
using the whole corpus.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 Results</title>
      <p>In this section, we first describe the
experimental setting and the hyper-parameter tuning. We
then report and comment on the experimental results of
our multi-agent system. Further, to evaluate the
effects of the two agents, we report the results of
the system using only the SE or the Lex agent. The
versions using only the SE agent or the Lex agent
correspond to ids run1 and run2 respectively. The
id run3 is assigned to the multi-agent system.</p>
      <p>Table 3 shows the F1 scores for the misogyny and
aggressiveness classes on the test set.</p>
      <sec id="sec-3-0">
        <title>3.1 Experimental setting</title>
        <p>To perform hyper-parameter optimization and
model selection, we split the input data into training
and validation sets using random stratified
sampling on both misogyny and aggressiveness labels.
We used 20% of the data for validation.</p>
        <p>We ran a grid search over multiple classifiers, including
Support Vector Machines (SVM), a Deep Feed-Forward
Neural Network, Random Forest, and Logistic
Regression, together with their input parameters. The
evaluation was performed using the first agent as
a reference. An SVM with a Radial Basis Function kernel,
gamma="scale", and C=10 achieved the highest
F1 score for the misogynous class on
the validation set. We used this configuration for
the supervised classifier of the second agent as well.</p>
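        <p>A sketch of this model-selection step with scikit-learn's GridSearchCV follows; the paper selects models on a 20% stratified validation split, which we approximate here with cross-validation, and the grid values are illustrative rather than the exhaustive ones explored.</p>
        <preformat>
# Sketch of the grid search over the SVM configuration (illustrative grid).
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

grid = GridSearchCV(
    SVC(probability=True),
    param_grid={
        "kernel": ["rbf", "linear"],
        "C": [1, 10, 100],
        "gamma": ["scale", "auto"],
    },
    scoring="f1",  # F1 score on the misogynous (positive) class
    cv=5,
)
grid.fit(embeddings_train, y_train)  # SE-agent features as the reference
print(grid.best_params_)  # e.g. {'C': 10, 'gamma': 'scale', 'kernel': 'rbf'}
</preformat>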
        <p>For the TF-IDF, we tuned the n-grams from n=1
to n=3, and the maximum number of tokens from
5,000 to 10,000. To estimate the best
configuration, we trained the SVM classifier with tuned
parameters on the vectorized data, and evaluated the
classification F1 score on the binary misogyny
detection problem on the validation set. We achieved
the highest F1 score with unigrams and 10,000
tokens as the maximum vocabulary size.</p>
      <p>The last hyper-parameter is the confidence
threshold value for the multi-agent system. We
evaluated the F1 score for the misogynous class
on validation data, varying the confidence
threshold in the range [0.6, 0.95]. The best performance is
obtained with a confidence threshold of 0.9.</p>
        <p>The hyper-parameter settings resulting from the
experimental tuning are used for both subtasks.</p>
      </sec>
      <sec id="sec-3-1">
        <title>3.2 Subtask A</title>
        <p>The score for subtask A is computed by averaging
the F1 measures estimated for the misogynous and
aggressiveness classes. Table 2 shows the official
results. Our multi-agent system (run3) achieves
our highest result. It is ranked 12th out of all
submissions and 7th if we consider just constrained
ones. While our TF-IDF and misogyny lexicon
agent (run2) reaches our worst result, its
introduction improves the agent trained on sentence
embeddings. The average F1 score increases from
0.6809 of the SE agent (run1) to 0.6835.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.3 Subtask B</title>
        <p>
          The score for subtask B is the weighted
combination of the AUC computed on the test tweets and three
per-term AUC-based bias scores computed on the
synthetic dataset. We refer the reader to
          <xref ref-type="bibr" rid="ref8">(Elisabetta Fersini, 2020)</xref>
          for the complete description
of the evaluation metrics.
        </p>
        <p>Table 4 shows the official results. Our
multi-agent system is ranked 2nd out of all submissions
and 1st if only constrained runs are considered. As
for subtask A, the Lex agent improves the
performance of the SE one.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Discussion and Conclusions</title>
      <p>Results show that the introduction of the
TF-IDF and lexicons effectively improves the solution
based on sentence embeddings. This finding stands
as the most significant contribution of this work,
and we believe that it can drive future system
designs. However, results on the test set reveal that
some of our choices were wrong and affected the
final performance.</p>
      <sec id="sec-4-1">
        <title>4.1 Analysis on subtask A</title>
        <p>Our multi-agent system missed the target on the
aggressiveness detection task. As reported in
Table 3, aggressiveness has a notably low F1 score.
We think this is due to bad choices in training
the system. (i) For the aggressiveness
task, we used only the misogynous portion of the input
data. This subset has an imbalanced class
distribution with a prevalence of aggressive tweets.
We did not re-balance the dataset, and our
predictions produced many false positives on the test set. (ii)
Since we did not train the aggressiveness system
on non-misogynous (and non-aggressive) tweets,
whenever the misogyny system produces a false
positive, the aggressiveness detector faces a
completely new data point, out of its training
distribution. (iii) Finally, we naively replicated the best
algorithm and configuration found on the
misogyny task to the aggressiveness one.</p>
        <p>Notably, the number of misogynous false
negatives which forced an aggressive tweet to be
classified as non-aggressive by our chained approach
(see Section 2.4) is 16 out of 365 total errors. This
further supports the conclusion that the majority
of errors were due to bad training choices on the
aggressiveness task and not to the chained approach.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2 Analysis on subtask B</title>
        <p>The multi-agent (SE+Lex) errors are 72 false
negatives and 157 (roughly 2.2 times as many) false positives. Through a
posterior error analysis on the test tweets, we identified
several factors that contribute to misclassification.</p>
        <p>Bias on parts of the body. Our system
struggles with parts of the body that carry sexual and
misogynous references depending on the context. These
words polarize the assignment to the misogynous
class. As an example, 15% of false positives
contain the word “gola” (throat). This behavior
somewhat mimics the bias of models towards specific
identity terms.</p>
        <p>Self-mocking references. Another category
hard to model is self-referencing text
containing misogynous speech. While the tone of these
tweets is auto-ironic or self-mocking, the model
decontextualizes them and produces false positives.</p>
        <p>Targeted gender. In these tweets, the model
correctly detects the hateful tone of voice but fails
at identifying the gender of the target subject. As
such, it predicts tweets attacking males as
misogynous. This problem gets harder when the targeted
gender can be only inferred by prior knowledge of
tagged profiles (e.g. @bonucci_leo19, a male
Italian football player).</p>
        <p>Reported misogynous speech. Another
difficult scenario to model is reported or quoted
misogynous speech. Frequently, users quote an
unpleasant, misogynous passage while trying to
support the exact opposite message. It can
happen directly, using quotation marks, or indirectly
by citing the original speaker.</p>
        <p>We provide a list of tweets for each of the
aforementioned categories as supplementary material
(https://github.com/g8a9/ami20-improving-embedding).</p>
        <p>Conclusion. In this work, we presented our
solution to the AMI shared task at the EVALITA
2020 evaluation campaign. Our system is based on
two models, the SE and Lex agents, which we built
using sentence embedding techniques and TF-IDF
enriched with misogyny lexicons respectively. We
addressed both subtasks A and B, limited to
constrained runs. The approach fell short on
subtask A, while showing promising results on subtask
B. Moreover, results show the Lex agent effectively
improves the performance of the SE agent.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was supported by the DataBase and
Data Mining Group of Politecnico di Torino.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Sai Saketh</given-names>
            <surname>Aluru</surname>
          </string-name>
          , Binny Mathew, Punyajoy Saha, and
          <string-name>
            <given-names>Animesh</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Deep Learning Models for Multilingual Hate Speech Detection</article-title>
          . arXiv:2004.06465 [cs], April.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Maria</given-names>
            <surname>Anzovino</surname>
          </string-name>
          , Elisabetta Fersini, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Automatic identification and classification of misogynistic language on twitter</article-title>
          .
          <source>In International Conference on Applications of Natural Language to Information Systems</source>
          , pages
          <fpage>57</fpage>
          -
          <lpage>64</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Valerio</given-names>
            <surname>Basile</surname>
          </string-name>
          , Cristina Bosco, Elisabetta Fersini, Nozza Debora, Viviana Patti, Francisco Manuel Rangel Pardo, Paolo Rosso,
          <string-name>
            <given-names>Manuela</given-names>
            <surname>Sanguinetti</surname>
          </string-name>
          , et al.
          <year>2019</year>
          .
          <article-title>SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter</article-title>
          .
          <source>In SemEval-2019</source>
          , pages
          <fpage>54</fpage>
          -
          <lpage>63</lpage>
          . ACL.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Valerio</given-names>
            <surname>Basile</surname>
          </string-name>
          , Danilo Croce, Maria Di Maro, and
          <string-name>
            <given-names>Lucia C.</given-names>
            <surname>Passaro</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Evalita 2020: Overview of the 7th evaluation campaign of natural language processing and speech tools for italian</article-title>
          .
          In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors,
          <source>Proceedings of Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020)</source>
          , Online. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Cristina</given-names>
            <surname>Bosco</surname>
          </string-name>
          , Felice Dell'Orletta, Fabio Poletto, Manuela Sanguinetti, and
          <string-name>
            <given-names>Maurizio</given-names>
            <surname>Tesconi</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Overview of the EVALITA 2018 hate speech detection task</article-title>
          .
          <source>In EVALITA 2018</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          . CEUR.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Cer</surname>
          </string-name>
          , Yinfei Yang,
          <string-name>
            <surname>Sheng-yi Kong</surname>
          </string-name>
          , Nan Hua, Nicole Limtiaco, Rhomni St John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar,
          <string-name>
            <surname>Yun-Hsuan</surname>
            <given-names>Sung</given-names>
          </string-name>
          , Brian Strope, and
          <string-name>
            <given-names>Ray</given-names>
            <surname>Kurzweil</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Universal Sentence Encoder</article-title>
          . arXiv:1803.11175 [cs], April.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Lucas</given-names>
            <surname>Dixon</surname>
          </string-name>
          , John Li, Jeffrey Sorensen, Nithum Thain, and
          <string-name>
            <given-names>Lucy</given-names>
            <surname>Vasserman</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Measuring and Mitigating Unintended Bias in Text Classification</article-title>
          .
          <source>In AAAI/ACM AIES</source>
          <year>2018</year>
          , pages
          <fpage>67</fpage>
          -
          <lpage>73</lpage>
          , December.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Fersini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Debora</given-names>
            <surname>Nozza</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>AMI @ EVALITA2020: Automatic Misogyny Identification</article-title>
          . In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors,
          <source>Proceedings of the 7th evaluation campaign of Natural Language Processing and Speech tools for Italian (EVALITA 2020)</source>
          , Online. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Mary</given-names>
            <surname>Ellsberg</surname>
          </string-name>
          , Lori Heise, World Health Organization, et al.
          <year>2005</year>
          .
          <article-title>Researching violence against women: a practical guide for researchers and activists</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Paula</given-names>
            <surname>Fortuna</surname>
          </string-name>
          and Sérgio Nunes.
          <year>2018</year>
          .
          <article-title>A Survey on Automatic Detection of Hate Speech in Text</article-title>
          .
          <source>ACM Computing Surveys</source>
          ,
          <volume>51</volume>
          (
          <issue>4</issue>
          ):
          <fpage>85:1</fpage>
          -
          <lpage>85:30</lpage>
          , July.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Simona</given-names>
            <surname>Frenda</surname>
          </string-name>
          , Bilal Ghanem, Estefanía Guzmán-Falcón, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda, et al.
          <year>2018</year>
          .
          <article-title>Automatic expansion of lexicons for multilingual misogyny detection</article-title>
          .
          <source>In EVALITA 2018</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . CEUR-WS.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Simona</given-names>
            <surname>Frenda</surname>
          </string-name>
          , Bilal Ghanem, Manuel Montes-y Go´mez, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Online hate speech against women: Automatic identification of misogyny and sexism on twitter</article-title>
          .
          <source>Journal of Intelligent &amp; Fuzzy Systems</source>
          ,
          <volume>36</volume>
          (
          <issue>5</issue>
          ):
          <fpage>4743</fpage>
          -
          <lpage>4752</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>John M.</given-names>
            <surname>Giorgi</surname>
          </string-name>
          , Osvald Nitski,
          <string-name>
            <given-names>Gary D.</given-names>
            <surname>Bader</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Bo</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations</article-title>
          . arXiv:2006.03659 [cs], June.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Brendan</given-names>
            <surname>Kennedy</surname>
          </string-name>
          , Xisen Jin, Aida Mostafazadeh Davani, Morteza Dehghani, and
          <string-name>
            <given-names>Xiang</given-names>
            <surname>Ren</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Contextualizing hate speech classifiers with post-hoc explanation</article-title>
          . In ACL 2020, pages
          <fpage>5435</fpage>
          -
          <lpage>5442</lpage>
          ,
          <year>July</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Debora</given-names>
            <surname>Nozza</surname>
          </string-name>
          , Claudia Volpetti, and
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Fersini</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Unintended bias in misogyny detection</article-title>
          .
          <source>In IEEE/WIC/ACM WI</source>
          <year>2019</year>
          , pages
          <fpage>149</fpage>
          -
          <lpage>155</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Endang Wahyu</given-names>
            <surname>Pamungkas</surname>
          </string-name>
          , Valerio Basile, and
          <string-name>
            <given-names>Viviana</given-names>
            <surname>Patti</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Misogyny detection in twitter: a multilingual and cross-domain study</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>57</volume>
          (
          <issue>6</issue>
          ):
          <fpage>102360</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Nils</given-names>
            <surname>Reimers</surname>
          </string-name>
          and
          <string-name>
            <given-names>Iryna</given-names>
            <surname>Gurevych</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks</article-title>
          . arXiv:1908.10084 [cs], August.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Nils</given-names>
            <surname>Reimers</surname>
          </string-name>
          and
          <string-name>
            <given-names>Iryna</given-names>
            <surname>Gurevych</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation</article-title>
          . arXiv:2004.09813 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Irene</given-names>
            <surname>Russo</surname>
          </string-name>
          , Francesca Frontini, and
          <string-name>
            <given-names>Valeria</given-names>
            <surname>Quochi</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>OpeNER sentiment lexicon italian - LMF.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Victor</given-names>
            <surname>Sanh</surname>
          </string-name>
          , Lysandre Debut, Julien Chaumond, and
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Wolf</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter</article-title>
          . arXiv preprint arXiv:1910.01108.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Bin</given-names>
            <surname>Wang</surname>
          </string-name>
          and
          <string-name>
            <surname>C.-C. Jay Kuo</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models</article-title>
          . arXiv:2002.06652 [cs], June.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>