<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>MDD @ AMI: Vanilla Classifiers for Misogyny Identification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Samer El Abassi</string-name>
          <email>samer.el-abassi@s.unibuc.ro</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergiu Nisioi</string-name>
          <email>sergiu.nisioi@unibuc.ro</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Mathematics and Computer Science</institution>
          ,
          <institution>University of Bucharest</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Human Language Technologies Research Center, University of Bucharest</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this report, we present a set of vanilla classifiers that we used to identify misogynous and aggressive texts in Italian social media. Our analysis shows that simple classifiers with little feature engineering have a strong tendency to overfit and yield strongly biased results on the test set. Additionally, we investigate the usefulness of function words, pronouns, and shallow syntactic features to observe whether misogynous or aggressive texts have specific stylistic elements.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        This paper discusses our submission (team MDD)
to the Evalita 2020 Automatic Misogyny
Identification Shared Task
        <xref ref-type="bibr" rid="ref14 ref2 ref7">(Elisabetta Fersini, 2020;
Basile et al., 2020)</xref>
        (Task A). Our methods consist
of a set of simple vanilla classifiers that we employ
to assess their effectiveness on the datasets
provided by the organizers. The systems we
submitted for evaluation use a logistic regression
classifier with little hyperparameter tuning or feature
engineering, trained on tf-idf features and
average-pooled word embeddings. Previous reports on
misogyny
        <xref ref-type="bibr" rid="ref10 ref11">(Fersini et al., 2018b,a)</xref>
        and aggressiveness
        <xref ref-type="bibr" rid="ref1">(Basile et al., 2019)</xref>
        detection indicate that
support vector machines and logistic regression
classifiers effectively identify these patterns in social
media posts. Furthermore, vanilla classifiers with
little feature engineering were successfully used
for other shared tasks, such as identifying
dialectal varieties
        <xref ref-type="bibr" rid="ref17 ref6">(Ciobanu et al., 2016; Zampieri et al.,
2017)</xref>
        or native language identification
        <xref ref-type="bibr" rid="ref13 ref17">(Malmasi
et al., 2017)</xref>
        , where high scores were obtained by
simple approaches using SVMs or logistic
regression classifiers.
      </p>
      <p>Copyright © 2020 for this paper by its authors. Use
permitted under Creative Commons License Attribution 4.0
International (CC BY 4.0).</p>
      <p>The classifiers we built achieved relatively
good accuracy in our cross-validation tests;
however, in this competition, the results obtained by
our systems are not among the top-scoring ones
and prove to be overfit, with a significant tendency
towards biased predictions.</p>
      <p>In addition to the description of our
submissions, in this report we analyze the errors of our
systems and bring into discussion several
topic-independent features to: 1) test the
effectiveness of part-of-speech n-grams, function words,
and pronouns on the task of identifying
misogynous and aggressive texts on social media and 2)
observe whether texts labeled as misogynous or
aggressive have a particular bias towards certain
grammatical structures.</p>
    </sec>
    <sec id="sec-2">
      <title>2 System Description</title>
      <p>
        At the basis of our submissions is a logistic
regression classifier with the liblinear
        <xref ref-type="bibr" rid="ref8">(Fan et al., 2008)</xref>
        solver, l2 penalty, and regularization constant C =
3, chosen based on different cross-validation
iterations. In addition, we introduced a heuristic at
prediction time: we predict a text not
to be aggressive if it was not categorized as
misogynous.
      </p>
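      <p>As a minimal sketch of this setup (assuming scikit-learn's LogisticRegression; the variable names are illustrative, not from our actual code):</p>
      <preformat>
import numpy as np
from sklearn.linear_model import LogisticRegression

# Logistic regression with the liblinear solver, l2 penalty, and C=3,
# as described above; one model per label.
misogyny_clf = LogisticRegression(solver="liblinear", penalty="l2", C=3)
aggressive_clf = LogisticRegression(solver="liblinear", penalty="l2", C=3)

def predict(X):
    # Prediction-time heuristic: a text is never predicted aggressive
    # unless it was also predicted misogynous.
    miso = misogyny_clf.predict(X)
    aggr = aggressive_clf.predict(X)
    return miso, np.where(miso == 1, aggr, 0)
      </preformat>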
      <p>
        The difference between our three submissions
for Task A consists in the feature extraction
process:
- MDD.A.r.c.run1 is the logreg model trained on
tf-idf of word n-grams, with n ranging from 1 to 5;
- MDD.A.r.u.run2 is the logreg model trained on
pre-trained GloVe Twitter embeddings of size 200,
trained on 27 billion words (English model
GloVe.twitter.27B.200d, https://nlp.stanford.edu/projects/GloVe/);
- MDD.A.r.u.run3 is the logreg model using
spaCy
        <xref ref-type="bibr" rid="ref12">(Honnibal and Montani, 2017)</xref>
        FastText CBOW embeddings pre-trained on Wikipedia and
OSCAR (Common Crawl), model it_core_news_lg,
version 2.3.0, https://spaCy.io/models/it.
      </p>
      <p>The second run is trained on English GloVe
embeddings that, surprisingly, contain
representations for more than half of our Italian vocabulary,
i.e., approximately 9,500 words out of the total
15,000 words in the vocabulary of our data. The
English GloVe embeddings cover code-switching,
emojis, and basic Italian words. Despite having
the lowest evaluation score of our submissions
(0.666 macro F1), we believe it provides a decent
estimation for identifying non-misogynous texts.</p>
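      <p>The vocabulary coverage above can be estimated with a short sketch along these lines (the embedding_vocab set and the variable names are hypothetical; scikit-learn 1.x API assumed):</p>
      <preformat>
from sklearn.feature_extraction.text import CountVectorizer

def coverage(texts, embedding_vocab):
    # Fraction of the dataset vocabulary that has a pre-trained
    # embedding; embedding_vocab is the set of words read from the
    # GloVe vectors file (loader not shown).
    vectorizer = CountVectorizer()
    vectorizer.fit(texts)
    data_vocab = set(vectorizer.get_feature_names_out())
    covered = data_vocab.intersection(embedding_vocab)
    return len(covered) / len(data_vocab)

# e.g. coverage(train_texts, glove_vocab) would be roughly
# 9,500 / 15,000 for the English GloVe Twitter vectors.
      </preformat>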
      <sec id="sec-2-1">
        <title>2.1 Feature Extraction</title>
        <p>
          Our feature extraction processes for the
submissions are simple. The first uses the tf-idf
vectorizer
          <xref ref-type="bibr" rid="ref5">(Buitinck et al., 2013)</xref>
          on word n-grams,
with n ranging from 1 to 5, to cover more of the word
context. Tf-idf features were used for their
ability to weigh the importance of an n-gram with
respect to the entire corpus. The second feature
set is based on pre-trained word representations,
obtained by looking up the embedding of every word
in the text and averaging them into a single
representation. For words not present in the
embeddings, an array of zeroes with the same
dimensions was used.
        </p>
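        <p>A minimal sketch of the two feature extractors, assuming scikit-learn for tf-idf and a plain word-to-vector dictionary for the embeddings (names are illustrative):</p>
        <preformat>
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# First feature set: tf-idf over word n-grams, n ranging from 1 to 5.
tfidf = TfidfVectorizer(ngram_range=(1, 5))

def average_embedding(tokens, lookup, dim=200):
    # Second feature set: average of the pre-trained embeddings of all
    # words in the text; words missing from the embeddings contribute
    # a zero vector of the same dimensionality.
    vectors = [lookup.get(tok, np.zeros(dim)) for tok in tokens]
    return np.mean(vectors, axis=0)
        </preformat>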
      </sec>
      <sec id="sec-2-2">
        <title>Preprocessing</title>
        <p>Our submissions use raw, unprocessed texts,
including tags and URLs. We have also
experimented with different preprocessing and feature
extraction steps for which we did not make any
submission. We consider multiple approaches in
this direction:
1. clean - changing the entire text to lowercase and
removing hashtags and links (a sketch follows this list)
2. nps - replacing the text with its noun
phrases; these features contain the nouns
and surrounding attributes that can highlight
misogynous remarks
3. fct words - classification based on function
word occurrences; these words carry stylistic
information about the texts. We have collected a
list of conjunctions, prepositions, connectors,
etc. for Italian for this purpose.
4. POS n-grams - n-grams with n ranging from
1 to 5 over part-of-speech tags; these features
would indicate certain syntactic and
stylistic patterns in misogynous or aggressive texts
5. pronouns - n-grams with n ranging from 1
to 5 over the pronouns and pronoun
properties from the texts; we observed an increased
usage of second-person pronouns in aggressive
expressions
6. filter POS - n-grams over a filtered set of
words and POS tags.</p>
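        <p>As referenced in item 1, a sketch of the clean step; the exact regular expressions are our illustration, not necessarily those used in the experiments:</p>
        <preformat>
import re

def clean(text):
    # Lowercase, then strip links, user tags, and hashtags.
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # links
    text = re.sub(r"@\w+", " ", text)          # user tags
    text = re.sub(r"#\w+", " ", text)          # hashtags
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace
        </preformat>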
        <p>For POS tagging and noun phrase
extraction, we use the default outputs of the
Italian model for spaCy, trained on the dataset
provided by Bosco et al. (2014). In addition, we
use, for each word, the tag that covers an
entire set of morphological features, separated by
whitespace; e.g., "Gender=Masc, Number=Sing,
Person=2, PronType=Prs" becomes: "Masc Sing 2 Prs".</p>
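        <p>A sketch of this flattening step; we used spaCy 2.3, but the illustration below assumes spaCy 3's token.morph API, so the exact attribute access is an assumption:</p>
        <preformat>
import spacy

nlp = spacy.load("it_core_news_lg")  # Italian model, assumed installed

def flatten_morph(text):
    # Turn each token's morphological feature string, e.g.
    # "Gender=Masc|Number=Sing|Person=2|PronType=Prs",
    # into its whitespace-separated values: "Masc Sing 2 Prs".
    pieces = []
    for token in nlp(text):
        feats = str(token.morph).split("|")
        values = [f.split("=")[1] for f in feats if "=" in f]
        pieces.append(" ".join(values))
    return " ".join(pieces)
        </preformat>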
        <p>We expect the noun phrases to be less effective
at detecting aggressive behaviour because
aggressiveness often involves verbal constructs and
actions.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 Results and Discussion</title>
      <p>In this work, we only describe the submissions for
Task A of the competition, which is a
classification task for the identification of misogynous and
aggressive texts. Task B measures the bias of such
classifiers with respect to certain concepts. Our
submissions for Task B are extracted from tf-idf
representations of word n-grams and obtained the
lowest scores in the competition.</p>
      <p>Table 1 contains the submitted runs for Task A
and the experiments we did to get a better
understanding of the subtleties misogynistic and/or
aggressive tweets contain. The CV F1 columns
contain the average F1 scores computed for 10-fold
cross-validation carried out for ten iterations. Each
cross-validation train-test split is stratified to
preserve the proportions of misogynous and/or
aggressive texts in both splits. The Test F1 columns
are the results obtained on the gold standard test
set. In the last column, we provide the macro F1
resulting from averaging the F1 of the
aggressiveness and misogyny predictions.</p>
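      <p>A sketch of this evaluation protocol using scikit-learn (pipeline, texts, and labels are placeholders for any of the feature/classifier combinations above):</p>
      <preformat>
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Stratified 10-fold cross-validation repeated for ten iterations,
# reporting the average F1 score.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
scores = cross_val_score(pipeline, texts, labels, scoring="f1", cv=cv)
print(scores.mean())
      </preformat>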
      <p>The submitted runs show that the tf-idf
vectorizer from run1, although it scored better during
the cross-validation stage, ended up being
outperformed by the word embeddings extracted from
spaCy (run3, 0.684 macro F1), being unable to
generalize to the new texts. The second run (run2,
0.666 macro F1) uses the GloVe pre-trained
embeddings for English. This result represents the
biggest surprise of the three, since it did not use
Italian embeddings. We observe that the English
GloVe representations cover more than 60% of our
vocabulary.</p>
      <p>[Table 1: average cross-validation F1 and test set
F1 for the submitted runs and additional feature sets:
tf-idf (run1), glove (run2), spacy (run3), clean tf-idf,
clean glove, clean spacy, nps clean tf-idf, nps clean
spacy, fct words, POS n-grams, pronouns, filter POS.]</p>
      <p>Cleaned texts aid the classifier by a significant
margin. In our experiments, removing tags
and URLs led to a significant increase in
macro scores for the same approaches over the
cleaned texts. The best result we obtained so far
(0.7 macro score) uses the Italian spaCy average
vector representations extracted from clean texts.
Noun phrases extracted from each cleaned text
do not bring significant increases in the detection of
misogynous or aggressive texts. Using these
features yields scores comparable to the best of
our methods, surpassing the classification attempts
on uncleaned texts. This indicates that noun
phrases alleviate the noise picked up by the tf-idf
vectorizer: the model was less prone to
overfitting and, therefore, better able to adapt to the
unseen data.</p>
      <p>Function words are features with grammatical
roles, consisting of conjunctions, prepositions,
articles, etc., encompassing stylistic aspects of the
texts. We tested the accuracy of a simple
logistic regression using function words, and the
results were higher than 50% by a non-trivial
amount. This is a potential indicator that
misogynistic and/or aggressive tweets have a slightly
different syntax from texts that fit neither of
the two categories. Moreover, using the tf-idf vectorizer on
plain function words achieved 0.628 F1 on the test
set for misogyny identification, a result that is not
at all negligible, given that these words do not
encapsulate topical meaning.</p>
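      <p>A sketch of the function-word classifier, restricting the tf-idf vocabulary to a fixed word list (the list below is a small illustrative subset, not our full Italian list):</p>
      <preformat>
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Illustrative subset of Italian function words; the full list of
# conjunctions, prepositions, connectors, etc. is not reproduced here.
FUNCTION_WORDS = ["e", "ma", "o", "di", "a", "da", "in", "con", "su",
                  "per", "che", "se", "non", "come", "anche", "quindi"]

fct_clf = make_pipeline(
    TfidfVectorizer(vocabulary=FUNCTION_WORDS),  # count only these words
    LogisticRegression(solver="liblinear", penalty="l2", C=3),
)
      </preformat>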
      <p>POS n-grams are yet another set of features
capable of capturing shallow syntactic constructs.
Using this feature set, we observed a strong
overfitting tendency in the cross-validation scenarios
(average F1 0.754 for misogyny and 0.723 for
aggressiveness), while on the gold test set the macro
F1 score is 0.59. This is an indicator that
certain syntactic patterns do occur in
misogynistic and aggressive texts, weakly
differentiating them from other types of texts. However,
these features have little power to generalize to
new samples.</p>
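      <p>A sketch of the POS n-gram features: each text is mapped to its sequence of coarse POS tags, then vectorized with the same 1-to-5 n-gram tf-idf setup (spaCy model assumed installed):</p>
      <preformat>
import spacy
from sklearn.feature_extraction.text import TfidfVectorizer

nlp = spacy.load("it_core_news_lg")

def to_pos_sequence(text):
    # Replace each token with its coarse POS tag, e.g.
    # "una bella giornata" becomes "DET ADJ NOUN".
    return " ".join(token.pos_ for token in nlp(text))

pos_tfidf = TfidfVectorizer(ngram_range=(1, 5))
# X = pos_tfidf.fit_transform(to_pos_sequence(t) for t in train_texts)
      </preformat>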
      <p>Pronouns reveal the most interesting result for
two reasons: 1) the features did not overfit the
data, as indicated by the cross-validation F1 scores,
which are close to the actual scores on the gold test
set; 2) aggressive texts can be differentiated
using only pronouns, with an F1
score (0.636) that is comparable to more
advanced methods that use richer features such as
embeddings (0.655, for the embeddings over clean
texts) or the tf-idf vectorizer (0.669, for tf-idf over
clean texts). Therefore, in terms of aggressiveness,
it is clear that certain expressions using forms of
second-person pronouns are typically used to
construct call-out phrases or curse-word expressions.
The most common pronoun observed in aggressive
texts is ti - the second person singular accusative of
the pronoun tu ('you').</p>
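      <p>The pronoun features can be sketched similarly, keeping only pronoun tokens and representing them by their morphological values (again assuming spaCy 3's morph API; this reuses the nlp pipeline loaded in the sketches above):</p>
      <preformat>
def to_pronoun_sequence(text):
    # Keep only pronouns, each represented by its morphological values;
    # e.g. a second-person clitic may yield values like "Sing 2 Prs".
    kept = []
    for token in nlp(text):
        if token.pos_ == "PRON":
            feats = str(token.morph).split("|")
            values = [f.split("=")[1] for f in feats if "=" in f]
            kept.append(" ".join(values))
    return " ".join(kept)
      </preformat>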
      <p>Filter POS accounts for the n-grams of words and
POS tags extracted from the following categories:
nouns, adverbs, adpositions, determiners,
adjectives, verbs, pronouns, and auxiliary verbs. These
features obtain the second-best result (0.694 F1
macro score) of all our attempts. Again, in this
situation, we are facing a big difference
between the cross-validation results and the released
test set.</p>
    </sec>
    <sec id="sec-4">
      <title>4 Discussion</title>
      <p>The results show that the vanilla feature
extraction methods suffered from a non-trivial amount of
overfitting. Despite the fact that we carried out a
stratified 10-fold cross-validation over ten iterations,
the average F1 scores obtained on the test set were
considerably lower than the ones we obtained in
our separate experiments.</p>
      <p>The evaluation scores of said methods were over
88% in our cross-validation splits. In the
cross-validation evaluation on the training set, tf-idf
produced the best results. On the test set,
embeddings proved to have better
generalization power. Preprocessing the texts by removing
stopwords, hashtags, links, and other types of noise
proved to be beneficial for the classifier. The best
results were obtained by extracting average
embeddings from clean texts. Overall, word embeddings were
more consistent when comparing cross-validation
results with the test ones for misogyny detection.</p>
      <p>At a quick visual inspection, we noticed in the test
set several examples labeled as misogynous for
no apparent reason: "troppo acida... non mangio
yogurt" ('too sour... I don't eat yogurt'),
"Impiccati" ('hang yourself'), "#nome?" ('#name?').
We can only assume that the misogynistic character
of these comments is given by the context in which
they were posted. On the test set, it also appears
that the majority of misogynistic comments are remarks
on different body parts, most likely posted as comments
to pictures. It is, therefore, difficult to assess
the misogynistic character of a short text without
having at hand the full multi-modal context: to
whom it was addressed, what kind of relation exists
between the "commenter" and the "commentee", whether
the tweet is a reply or a single post, and so forth.</p>
      <p>
        It is worth noting that most text classification
papers mention or use BERT (Bidirectional
Encoder Representations from Transformers), as it
has proven to be one of the most accurate when
facing different types of data
        <xref ref-type="bibr" rid="ref14">(Pamungkas et al.,
2020)</xref>
        . Other state-of-the-art methods are LSTMs
(Long Short-Term Memory networks) and XLNet, the latter
overtaking BERT on various tasks
        <xref ref-type="bibr" rid="ref16">(Yang et al.,
2019)</xref>
        . A current issue with such methods and
word embeddings is that they transfer the
human bias present in large corpora. This is
becoming a bigger problem as AI filters are
prevalent in today’s society and therefore
discriminatory traits of the models become discriminatory
real-world actions. For example, textual
embeddings trained on Wikipedia data show
discriminatory traits towards minorities, such as
associating foreigners with criminals and homosexuality
with corruption, and linking men to aggression and
women to the idea of the loving wife
        <xref ref-type="bibr" rid="ref15">(Papakyriakopoulos et al., 2020)</xref>
        . Basta et al. (2019) find
that word embeddings are more likely to be
discriminatory and biased than their contextualized
counterparts, implying that state-of-the-art
methods are moving in the right direction.
However, as the models get closer to
understanding language, one cannot help but wonder
whether their bias will worsen if
precautions are not taken, as they might be overly
impacted by the ubiquitous bias humans carry. Due
to the widespread automation of daily tasks
using machine learning models, mitigating prejudice
becomes a responsibility of the developers, as it is
crucial for ensuring equal opportunities and
treatment of minorities.
      </p>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusions</title>
      <p>Our results indicate that simple feature
engineering and vanilla classifiers cannot identify
misogynistic/aggressive tweets with
reliable accuracy and that more research is needed
to understand the important features for
this task. However, the experiments suggest a
correlation between a text's syntax and its
misogynistic/aggressive character. This supports the
idea that texts falling into either category (or
perhaps hate speech in general) have a slightly
more recognisable grammatical pattern than texts
that do not. Whether through POS n-grams,
pronouns, or just function words, the wording
matters and is worth looking into for more advanced
feature engineering.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Valerio</given-names>
            <surname>Basile</surname>
          </string-name>
          , Cristina Bosco, Elisabetta Fersini, Debora Nozza, Viviana Patti, Francisco Manuel Rangel Pardo, Paolo Rosso,
          <string-name>
            <given-names>Manuela</given-names>
            <surname>Sanguinetti</surname>
          </string-name>
          , et al.
          <year>2019</year>
          .
          <article-title>Semeval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter</article-title>
          .
          <source>In 13th International Workshop on Semantic Evaluation</source>
          , pages
          <fpage>54</fpage>
          -
          <lpage>63</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Valerio</given-names>
            <surname>Basile</surname>
          </string-name>
          , Danilo Croce, Maria Di Maro, and
          <string-name>
            <given-names>Lucia C.</given-names>
            <surname>Passaro</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Evalita 2020: Overview of the 7th evaluation campaign of natural language processing and speech tools for italian</article-title>
          .
          <source>In Proceedings of Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA</source>
          <year>2020</year>
          ),
          <article-title>Online</article-title>
          . CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Christine</given-names>
            <surname>Basta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Marta Ruiz</given-names>
            <surname>Costa-jussà</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Noe</given-names>
            <surname>Casas</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Evaluating the underlying gender bias in contextualized word embeddings</article-title>
          .
          <source>CoRR</source>
          , abs/1904.08783.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Cristina</given-names>
            <surname>Bosco</surname>
          </string-name>
          , Felice Dell'Orletta, Simonetta Montemagni, Manuela Sanguinetti, and
          <string-name>
            <given-names>Maria</given-names>
            <surname>Simi</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>The evalita 2014 dependency parsing task</article-title>
          .
          <source>In EVALITA 2014 Evaluation of NLP and Speech Tools for Italian</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . Pisa University Press.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Lars</given-names>
            <surname>Buitinck</surname>
          </string-name>
          , Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          , Alexandre Gramfort, Jaques Grobler, Robert Layton,
          <string-name>
            <given-names>Jake</given-names>
            <surname>VanderPlas</surname>
          </string-name>
          , Arnaud Joly, Brian Holt, and Gaël Varoquaux.
          <year>2013</year>
          .
          <article-title>API design for machine learning software: experiences from the scikit-learn project</article-title>
          .
          <source>In ECML PKDD Workshop: Languages for Data Mining and Machine Learning</source>
          , pages
          <fpage>108</fpage>
          -
          <lpage>122</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Alina Maria</given-names>
            <surname>Ciobanu</surname>
          </string-name>
          , Sergiu Nisioi, and Liviu P Dinu.
          <year>2016</year>
          .
          <article-title>Vanilla classifiers for distinguishing between similar languages</article-title>
          .
          <source>In Proceedings of the VarDial Workshop.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Fersini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Debora</given-names>
            <surname>Nozza</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Ami @ evalita2020: Automatic misogyny identification</article-title>
          .
          <source>In Proceedings of the 7th evaluation campaign of Natural Language Processing</source>
          and
          <article-title>Speech tools for Italian (EVALITA 2020), Online</article-title>
          . CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Rong-En</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kai-Wei</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Cho-Jui</given-names>
            <surname>Hsieh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xiang-Rui</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Chih-Jen</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Liblinear: A library for large linear classification</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>9</volume>
          :
          <fpage>1871</fpage>
          -
          <lpage>1874</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Fersini</surname>
          </string-name>
          , Debora Nozza, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          . 2018a.
          <article-title>Overview of the evalita 2018 task on automatic misogyny identification (ami)</article-title>
          .
          <source>EVALITA Evaluation of NLP and Speech Tools for Italian</source>
          ,
          <volume>12</volume>
          :
          <fpage>59</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Fersini</surname>
          </string-name>
          , Paolo Rosso, and
          <string-name>
            <given-names>Maria</given-names>
            <surname>Anzovino</surname>
          </string-name>
          . 2018b.
          <article-title>Overview of the task on automatic misogyny identification at ibereval 2018</article-title>
          . IberEval@ SEPLN,
          <volume>2150</volume>
          :
          <fpage>214</fpage>
          -
          <lpage>228</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Honnibal</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ines</given-names>
            <surname>Montani</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing</article-title>
          . To appear.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Shervin</given-names>
            <surname>Malmasi</surname>
          </string-name>
          , Keelan Evanini, Aoife Cahill,
          <string-name>
            <given-names>Joel R.</given-names>
            <surname>Tetreault</surname>
          </string-name>
          , Robert A. Pugh, Christopher Hamill, Diane Napolitano, and
          <string-name>
            <given-names>Yao</given-names>
            <surname>Qian</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A report on the 2017 native language identification shared task</article-title>
          .
          <source>In BEA@EMNLP</source>
          , pages
          <fpage>62</fpage>
          -
          <lpage>75</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Endang Wahyu</given-names>
            <surname>Pamungkas</surname>
          </string-name>
          , Valerio Basile, and
          <string-name>
            <given-names>Viviana</given-names>
            <surname>Patti</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Misogyny detection in twitter: a multilingual and cross-domain study</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>57</volume>
          (
          <issue>6</issue>
          ):
          <fpage>102360</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Orestis</given-names>
            <surname>Papakyriakopoulos</surname>
          </string-name>
          , Simon Hegelich, Juan Carlos Medina Serrano, and
          <string-name>
            <given-names>Fabienne</given-names>
            <surname>Marco</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Bias in word embeddings</article-title>
          .
          <source>In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency</source>
          , FAT* '20, pages
          <fpage>446</fpage>
          -
          <lpage>457</lpage>
          , New York, NY, USA. Association for Computing Machinery.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Zhilin</given-names>
            <surname>Yang</surname>
          </string-name>
          , Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, and
          <string-name>
            <given-names>Quoc V.</given-names>
            <surname>Le</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Xlnet: Generalized autoregressive pretraining for language understanding</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Marcos</given-names>
            <surname>Zampieri</surname>
          </string-name>
          , Shervin Malmasi, Nikola Ljubešić,
          <string-name>
            <given-names>Preslav</given-names>
            <surname>Nakov</surname>
          </string-name>
          , Ahmed Ali, Jörg Tiedemann, Yves Scherrer, and Noëmi Aepli.
          <year>2017</year>
          .
          <article-title>Findings of the vardial evaluation campaign 2017</article-title>
          .
          <source>In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          , Valencia, Spain. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>