Leveraging Bias in Pre-Trained Word Embeddings for Unsupervised
                    Microaggression Detection

        Tolúlọpẹ́ Ògúnrẹ̀mí¹, Nazanin Sabri², Valerio Basile³, Tommaso Caselli⁴

           1. Stanford University, United States, tolulope@stanford.edu
           2. Independent Researcher, nazanin.sabrii@gmail.com
           3. University of Turin, Italy, valerio.basile@unito.it
           4. University of Groningen, Netherlands, t.caselli@rug.nl

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

                        Abstract

    Microaggressions are subtle manifestations of bias (Breitfeller et al., 2019). These demonstrations of bias can often be classified as a subset of abusive language. However, not as much focus has been placed on the recognition of these instances. As a result, limited data is available on the topic, and only in English. Being able to detect microaggressions without the need for labeled data would be advantageous, since it would allow content moderation also for languages lacking annotated data. In this study, we introduce an unsupervised method to detect microaggressions in natural language expressions. The algorithm relies on pre-trained word embeddings, leveraging the bias encoded in the model in order to detect microaggressions in unseen textual instances. We test the method on a dataset of racial and gender-based microaggressions, reporting promising results. We further run the algorithm on out-of-domain unseen data with the purpose of bootstrapping corpora of microaggressions “in the wild”, and discuss the benefits and drawbacks of our proposed method.

1   Introduction

The growth of social media platforms has been accompanied by an increased visibility of expressions of socially unacceptable language online. In a 2016 Eurobarometer survey, 75% of people who follow or participate in online discussions had witnessed or experienced abuse or hate speech. Under this umbrella term, different phenomena can be identified, ranging from offensive language to more complex and dangerous ones, such as hate speech or doxing. Recently, there has been a growing interest by the Natural Language Processing community in the development of language resources and systems to counteract socially unacceptable language online. Most previous work has focused on a few, easy-to-model phenomena, ignoring more subtle and complex ones, such as microaggressions (Jurgens et al., 2019).

Microaggressions are brief, everyday exchanges that denigrate stigmatised and culturally marginalised groups (Merriam-Webster, 2021). They are not always perceived as hurtful by either party, and they can often be detected as positive statements by current hate-speech detection systems (Breitfeller et al., 2019). The occasionally unintentional hurt caused by such comments is a reflection of how certain stereotypes of others are baked into society. Sue et al. (2007) define microaggressions in the racial context, particularly when directed toward people of color, as “brief and commonplace daily verbal, behavioral, or environmental indignities”, such as: “you are a credit to your race.” (intended message: it is unusual for someone of your race to be intelligent) or “do you think you’re ready for college?” (intended message: it is unusual for people of color to succeed). The need for moderation of hateful content has previously been explored. For instance, Mathew et al. (2019b) analyse the temporal effects of allowing hate speech on Gab, and find that the language of its users tends to become more and more similar to that of hateful users over time. Mathew et al. (2019a) further highlight that the spreading speed and reach of hateful content are much higher than those of non-hateful content. As a result, being able to remove instances of hateful language, such as microaggressions, is of great importance.
Previous work on microaggressions with computational methods is quite recent. Breitfeller et al. (2019) is one of the first works to address microaggressions in a systematic way, also introducing a first dataset, SelfMA. A further contribution, specifically focused on racial microaggressions, is Ali et al. (2020), where the authors focus on the development of machine learning systems.

In this study we introduce an unsupervised method for microaggression detection. Our method utilizes the existing bias in word embeddings to detect words with biased connotations in a message. Although unsupervised approaches tend to be less competitive than their supervised counterparts, our method is language-independent and can therefore be applied to any language for which embedding representations exist. Furthermore, the reliance of our method on specific lexical items and their context of occurrence makes the flagging of a message as an instance of a microaggression transparent. In addition to its usefulness for languages with no labeled data, the reliance of our model on the words in a sentence makes it interpretable, as it allows human moderators to understand what the system has based its decision on.

Our contributions can be summarised as follows:

    • we introduce a new unsupervised method for the detection of microaggressions which builds on top of pre-trained word embeddings;

    • we compare the performance of our model using different pre-trained word embeddings (GloVe, FastText, and word2vec) and discuss the potential reasons behind the differences;

    • we test the proposed algorithm on unseen data from a different domain (i.e., Twitter), in order to qualitatively evaluate its efficacy in discovering new instances of microaggressions.

The rest of this paper is structured as follows: we introduce our method in Section 2. The data and our results are reported in Section 3. We deploy our model and discuss its limitations in Section 4. Finally, we present the conclusion and future work in Section 5.

2   Use the Bias Against the Bias

Embedded representations, either from pre-trained word embeddings or pre-trained language models, have been shown to contain and amplify the biases present in the data used to generate them (Bolukbasi et al., 2016; Lauscher and Glavaš, 2019; Bhardwaj et al., 2020). As such, they often exhibit gender and racial bias (Swinger et al., 2019). Many studies have attempted to reduce this bias (Yang and Feng, 2020; Zhao et al., 2018; Manzini et al., 2019). In this work, we take a different turn by using this bias to our advantage: rather than taming the hurtfulness of the representations (Schick et al., 2021), we actively use it to promote social good. In this first study, we employ word representations derived from generic textual corpora of English, in order to capture the background knowledge needed to disambiguate instances of microaggressions in text. Recently, however, there have been studies involving word representations created from tailored collections of social media content aimed at capturing abusive phenomena like verbal aggression (Dynel, 2021) and hate speech (Caselli et al., 2020).

We devise a simple and effective method that exploits the existing bias in word embeddings and identifies words in a message that are related to particular and distant semantic areas in the embedding space. Messages are analysed in three steps: first, for each token t_i we compute its relatedness to a list of manually curated seed words s = s_1, ..., s_n denoting potential targets of microaggressions; second, we consider only the similarities of the pairs (t_i, s_j) above an empirical similarity threshold ST and compute their variance v_i; finally, we classify the token t_i as a microaggression trigger, and consequently the message as a microaggression, if v_i is above an empirically determined variance threshold VT.

The intuitive idea behind this algorithm is that some lexical elements in a verbal microaggression often (yet sometimes subtly) hint at specific features of the recipient of the message, in an otherwise neutral lexical context.

In this work, we choose to focus on microaggressions related to race and gender; the seed words therefore have to be chosen accordingly. The seed word lists for race and gender are, respectively, [white, black, asian, latino, hispanic, arab, african, caucasian] and [girl, boy, man, woman, male, female]. There are also practical reasons to focus on gender and race, namely the scarcity of data available for other categories of microaggressions, and other idiosyncrasies of the available datasets — the religion class was specific to different religions, therefore hard to generalise; sexuality and gender presented a large overlap; and so on.
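The three steps can be summarised in a short sketch. This is a minimal illustration, assuming gensim-style KeyedVectors with cosine similarity as the relatedness measure; the function and variable names are ours, not from a released implementation:

    import numpy as np
    from gensim.models import KeyedVectors

    # Seed word lists from this section.
    RACE_SEEDS = ["white", "black", "asian", "latino",
                  "hispanic", "arab", "african", "caucasian"]
    GENDER_SEEDS = ["girl", "boy", "man", "woman", "male", "female"]

    def trigger_words(tokens, seeds, vectors, st, vt):
        """Return the tokens of a message that trigger a microaggression:
        their similarity profile over the seed words is both strong
        (above ST) and skewed (variance above VT)."""
        triggers = []
        for tok in tokens:
            if tok not in vectors:
                continue
            # Step 1: relatedness of the token to each seed word.
            sims = [vectors.similarity(tok, s)
                    for s in seeds if s in vectors]
            # Step 2: keep only the similarities above the threshold ST.
            high = [x for x in sims if x > st]
            # Step 3: flag the token if the retained similarities are
            # unevenly distributed across the seeds.
            if high and np.var(high) > vt:
                triggers.append(tok)
        return triggers

    def is_microaggression(tokens, seeds, vectors, st, vt):
        # A message is flagged if any of its tokens is a trigger.
        return len(trigger_words(tokens, seeds, vectors, st, vt)) > 0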
Figure 1: Worked example of the unsupervised method for the word “chopsticks” in the message “Ford: Built With Tools, Not With Chopsticks”.


An example of how the proposed method works is illustrated in Figure 1. In the example, consider the word “chopsticks” in the message “Ford: Built With Tools, Not With Chopsticks” (from the SelfMA dataset, described in Section 3). The target word exhibits a much higher relatedness to the seed word asian (0.237) than to any other seed word. Even considering only the seed words with a similarity above a fixed threshold (white, asian, and african), the variance of their similarity scores with respect to chopsticks is still higher than the variance threshold, and therefore this target word, in this context, triggers a microaggression according to the algorithm. This process is repeated for all the words in a message in order to detect microaggressions. Some categories of words are bound to exhibit a high relatedness to all the seed words, e.g., “people” or “human”. This is the reason for introducing the variance threshold in the final step of our algorithm: it filters out these cases when classifying a given message, focusing instead on words that are related to the different races (or genders) unevenly, with a skewed distribution of similarity scores.

An important by-product of this algorithm is that the output is one or more trigger words, in addition to the microaggression label — in the example, the trigger word is indeed chopsticks — therefore enabling a more informative and interpretable decision process.
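Continuing the sketch above, the worked example of Figure 1 would run as follows. The vector file is one possible choice, and the expected output is illustrative, since the exact similarity scores depend on the embedding model:

    # Pre-trained word2vec vectors (Google News), as one option.
    vectors = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)

    message = "Ford: Built With Tools, Not With Chopsticks"
    tokens = [t.strip(":,.").lower() for t in message.split()]

    # Thresholds as reported for racial microaggressions in Section 4.
    print(trigger_words(tokens, RACE_SEEDS, vectors, st=0.12, vt=0.014))
    # Expected, per Figure 1: ['chopsticks']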
3   Experiments

To test our method, we use two subsets of the SelfMA: microaggressions.com dataset (Breitfeller et al., 2019), comprising 1,314 and 1,278 Tumblr posts respectively¹. The posts in SelfMA are all instances of microaggressions, manually tagged with one or more of four categories: race, gender, sexuality and religion. Since posts can be tagged with more than one form of microaggression, certain instances appear in both the race and the gender subsets used for the purposes of this study. The dataset consists of first- and second-hand accounts of microaggressions, as well as direct quotes of phrases or sentences said to the person posting. In order to reduce the linguistic perturbation introduced by accounts of a situation, we only take the direct quotes found in the dataset as instances of microaggressions to be detected with our unsupervised method: we pull out the direct quotes from the gender (561) and racial (519) subsets to test the algorithm. In order to balance the dataset, we scraped 2,021 random Tumblr posts, for a total of 4,612 instances. Table 1 summarises the composition of our dataset.

¹ Tumblr is a popular American microblogging platform: https://www.tumblr.com

        Source           Number of posts
        SelfMA Gender              1,314
        SelfMA Racial              1,278
        Tumblr                     2,021

Table 1: Statistics of the two subsets of the SelfMA dataset used in this paper, and the extra data downloaded to balance the dataset.

It is important to note that a microaggression can have multiple tags, so there is an overlap of instances. However, the seed words used to detect microaggression types are different for each target phenomenon (e.g., race, gender).

We ran the algorithm on the SelfMA dataset, empirically optimising the two thresholds on the training split, for each word embedding type and each microaggression category, filtering by the seed words listed in Section 2. We test the algorithm with three pre-trained word embedding models for English, namely FastText (Joulin et al., 2016) (trained on Wikipedia and Common Crawl), word2vec (Mikolov et al., 2013) (trained on Google News), and GloVe (Pennington et al., 2014) (trained on Wikipedia, the GigaWord corpus, and Common Crawl). The optimisation is performed by exhaustive grid search over the hyperparameter space.

The results, shown in Table 2, indicate that FastText obtains a better F1 score on racial microaggressions, while word2vec performs better on gender microaggressions. The difference in performance between FastText and word2vec is not major, and we attribute it to the difference between the corpora on which the two models were trained (i.e., web crawl and Wikipedia for FastText vs. news data for word2vec). The GloVe pre-trained model, trained on a combination of newswire texts, encyclopedic entries and texts from the Web, underperforms in both experiments. In general, the absolute figures are encouraging, especially considering the simplicity of this unsupervised approach.
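A sketch of this tuning step, continuing the code above; the grid ranges below are illustrative, as the paper does not report the exact search space:

    from itertools import product
    from sklearn.metrics import f1_score

    def tune_thresholds(token_lists, labels, seeds, vectors,
                        st_grid, vt_grid, metric=f1_score):
        """Exhaustive grid search over (ST, VT), maximising the given
        metric (F1 here; Section 4 swaps in precision_score)."""
        best_st, best_vt, best_score = None, None, -1.0
        for st, vt in product(st_grid, vt_grid):
            preds = [is_microaggression(toks, seeds, vectors, st, vt)
                     for toks in token_lists]
            score = metric(labels, preds)
            if score > best_score:
                best_st, best_vt, best_score = st, vt, score
        return best_st, best_vt, best_score

    # Illustrative grids around the values reported in Section 4.
    st_grid = [i / 100 for i in range(5, 31)]   # 0.05 .. 0.30
    vt_grid = [i / 1000 for i in range(5, 31)]  # 0.005 .. 0.030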
    Target   Model      Class        Precision   Recall   F1-Score
    Gender   FastText   not-MA            .609     .746       .671
                        MA                .714     .570       .634
                        macro avg.                             .680
    Gender   GloVe      not-MA            .692     .380       .491
                        MA                .603     .848       .705
                        macro avg.                             .598
    Gender   word2vec   not-MA            .659     .789       .718
                        MA                .769     .634       .694
                        macro avg.                             .706
    Race     FastText   not-MA            .659     .875       .654
                        MA                .814     .547       .752
                        macro avg.                             .702
    Race     GloVe      not-MA            .765     .371       .500
                        MA                .611     .896       .726
                        macro avg.                             .613
    Race     word2vec   not-MA            .640     .814       .747
                        MA                .776     .584       .667
                        macro avg.                             .692

Table 2: Results of the experiment on the Gender and Racial subsets of SelfMA, in terms of Precision (P), Recall (R), and F1-score (F1) on the positive class (MA), on the negative class (not-MA), and their macro-average. Best scores per microaggression category are in bold.


4   Discovering Microaggressions

To better understand the performance of our unsupervised model, we performed an additional experiment. Our goal is to understand the false positive results and the potential harm the model could cause. To do so, we use our unsupervised model to label unseen instances from a different domain (Twitter) than that of the SelfMA dataset (Tumblr), in order to see how the model performs in detecting microaggressions.

We begin by performing keyword searches on Twitter (using Twitter’s official API) and collect a new dataset of 3M tweets with seven keywords potentially containing race and gender expressions. Next, we set the threshold values ST and VT in our model in order to obtain the highest Precision scores, rather than the highest F1 value. This step is performed exactly like the optimisation described in Section 3, the only difference being the target metric. The aim of this step is to label tweets as microaggressions only with the highest possible degree of confidence. We set ST = 0.12 and VT = 0.014 for racial microaggressions, leading to a Precision of .931, and ST = 0.13 and VT = 0.019 for gender-based microaggressions, leading to a Precision of .912. Precision has been measured on the original SelfMA dataset used as a validation set.

We then run the unsupervised model on the new Twitter dataset, automatically labelling 256,843 tweets for gender and 373,631 tweets for race. After the data is labeled, we manually explore the positive instances in order to evaluate the performance of the model. The algorithm tuned for high precision found 6,306 gender-related and 13,004 race-related microaggression candidates in this dataset.

We find that while the model does detect actual instances of microaggressions, there is a noticeable amount of false positives. These tweets discuss race or gender in some manner; however, they do not necessarily contain microaggressions towards these groups. While the model does detect discussions of these topics, it sometimes confuses such discussions with microaggressions towards the aforementioned groups. Some examples follow, paraphrased to avoid tracking the original messages.

     Saying “Arrested Development isn’t funny” in an office full of women just to feel something

     “Men have moustaches, women have oversized bracelets”

The humorous attempts in these tweets hinge on gender stereotypes, and therefore in some contexts they could be perceived as offensive by some recipients. The high relatedness in the word embedding space between some words (moustaches and bracelets) and gender-related seed words (men and women) triggers the detection algorithm.

The automatic detection of racial microaggressions “in the wild” is more challenging than that of gender-based ones, according to our manual exploration of this automatically labeled dataset. This may be due to the difficulty of crafting a list of seed words that is sufficiently race-related but at the same time avoids generating too many false positives. We indeed found many false positives, mainly due to named entities and multi-word expressions such as “White House”, or simply because of the polysemy of color words, e.g., “black” and “white”. We did, however, still find instances of messages containing different extents of racial stereotyping.

     “why are you being so dramatic? just say I’m not originally arab, you don’t have to fight about it”

     “I will need to explain that to the chinese old lady who works at my school’s administrative office”

In summary, running the unsupervised microaggression detection algorithm on unseen data seems to represent a promising intermediate step towards the semi-automatic creation of language resources for this phenomenon. While the accuracy is not ideal, and the lists of seed words have to be hand-crafted carefully in order to avoid false positives, these drawbacks are balanced by the fairly cheap computational cost and the ease of application in a multilingual scenario.

5   Conclusion and Future Work

In this paper we introduce a novel algorithm that exploits the existing bias in pre-trained word embeddings to detect subtly abusive language phenomena such as microaggressions. While supervised detection methods in the field of natural language processing are plentiful, they are only viable for languages and topics with available labeled datasets, which is not the case for many languages. As a result, the unsupervised detection method introduced in this study could help address the need for the moderation of microaggressions in languages other than English. This is further helped by the availability of multilingual word embeddings, which would allow the method to be used in any of the languages supported by the embedding.
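As an illustration of this point (a hypothetical sketch: the vector file follows fastText’s public per-language releases, the Italian seed translations are ours, and neither is evaluated in this paper), porting the detector to another language would only require swapping the vectors and the seed lists:

    # Hypothetical port to Italian. cc.it.300.vec is fastText's
    # publicly released Common Crawl Italian vector file; the seed
    # translations are illustrative, and the thresholds carried over
    # from Section 4 would need re-tuning per language.
    vectors_it = KeyedVectors.load_word2vec_format("cc.it.300.vec")
    GENDER_SEEDS_IT = ["ragazza", "ragazzo", "uomo", "donna",
                       "maschio", "femmina"]
    tokens_it = ["esempio", "di", "messaggio"]  # placeholder message
    print(trigger_words(tokens_it, GENDER_SEEDS_IT, vectors_it,
                        st=0.13, vt=0.019))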
The method is unsupervised and only needs a small list of seed words. Considering its simplicity, the results obtained from an experiment on a dataset of manually annotated microaggressions are very promising. Further, the method is transparent, explicitly identifying the words triggering a microaggression, and thus paving the way for explainable microaggression detection.

Although the preliminary results are promising, an experiment on unseen data from a different domain shows that there is room for improvement. Given that we look at the explicit words used in each message, our method is not sensitive to implicit expressions like “you people” or “your kind”, which often occur in microaggressions. We would have to add further steps to our algorithm to catch such expressions.

Polysemy is another known issue, e.g., in words like “black” and “white”, whose relatedness to certain identified trigger words is not necessarily due to race. While a careful composition of the seed word lists helps to minimize this issue, a systematic approach to polysemy would certainly be desirable. The seed word lists may also be expanded, either manually or by exploiting existing lexicons such as HurtLex (Bassignana et al., 2018) for offensive terms (including stereotypes for several categories of individuals) or specialized lists of identity-related terms².

In future work, we plan to improve our model to account for lexical ambiguity, and for the complexity derived from the interference between pragmatic phenomena and aggression, e.g., in humorous and ironic messages, following the intuition in recent literature (Frenda, 2018) about the interconnection between irony or sarcasm and abusive language online. Our current plan is to apply the algorithm presented in this paper to bootstrap the creation of a multilingual resource of online verbal microaggressions and release it to the research community.

² See for instance this compendium of LGBTQIA+ terminology: https://www.umass.edu/stonewall/sites/default/files/documents/allyship_term_handout.pdf

Acknowledgements

The work of Valerio Basile is partially funded by the project “Be Positive!” (under the 2019 “Google.org Impact Challenge on Safety” call).

References

Omar Ali, Nancy Scheidt, Alexander Gegov, Ella Haig, Mo Adda, and Benjamin Aziz. 2020. Automated detection of racial microaggressions using machine learning. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pages 2477–2484. IEEE.

Elisa Bassignana, Valerio Basile, and Viviana Patti. 2018. Hurtlex: A multilingual lexicon of words to hurt. In 5th Italian Conference on Computational Linguistics, CLiC-it 2018, volume 2253, pages 1–6. CEUR-WS.

Rishabh Bhardwaj, Navonil Majumder, and Soujanya Poria. 2020. Investigating gender bias in BERT. arXiv preprint arXiv:2009.05021.

Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. arXiv preprint arXiv:1607.06520.

Luke Breitfeller, Emily Ahn, David Jurgens, and Yulia Tsvetkov. 2019. Finding microaggressions in the wild: A case for locating elusive phenomena in social media posts. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1664–1674.

Tommaso Caselli, Valerio Basile, Jelena Mitrović, and Michael Granitzer. 2020. HateBERT: Retraining BERT for abusive language detection in English. arXiv preprint arXiv:2010.12472.

Marta Dynel. 2021. Humour and (mock) aggression: Distinguishing cyberbullying from roasting. Language & Communication, 81:17–36.

Simona Frenda. 2018. The role of sarcasm in hate speech. A multilingual perspective. In E. Lloret, E. Saquete, P. Martínez-Barco, and I. Moreno, editors, Doctoral Symposium of the XXXIV International Conference of the Spanish Society for Natural Language Processing (SEPLN 2018), pages 13–17.

Armand Joulin, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov. 2016. Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.

David Jurgens, Libby Hemphill, and Eshwar Chandrasekharan. 2019. A just and comprehensive strategy for using NLP to address online abuse. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3658–3666, Florence, Italy. Association for Computational Linguistics.

Anne Lauscher and Goran Glavaš. 2019. Are we consistently biased? Multidimensional analysis of biases in distributional word vectors. arXiv preprint arXiv:1904.11783.

Thomas Manzini, Yao Chong Lim, Yulia Tsvetkov, and Alan W. Black. 2019. Black is to criminal as caucasian is to police: Detecting and removing multiclass bias in word embeddings. arXiv preprint arXiv:1904.04047.

Binny Mathew, Ritam Dutt, Pawan Goyal, and Animesh Mukherjee. 2019a. Spread of hate speech in online social media. In Proceedings of the 10th ACM Conference on Web Science, pages 173–182.

Binny Mathew, Anurag Illendula, Punyajoy Saha, Soumya Sarkar, Pawan Goyal, and Animesh Mukherjee. 2019b. Temporal effects of unmoderated hate speech in Gab. arXiv preprint arXiv:1909.10966.

Merriam-Webster. 2021. Merriam-Webster’s definition of microaggression. https://www.merriam-webster.com/dictionary/microaggression. Accessed: 2021-03-08.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems, volume 26. Curran Associates, Inc.

Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.

Timo Schick, Sahana Udupa, and Hinrich Schütze. 2021. Self-diagnosis and self-debiasing: A proposal for reducing corpus-based bias in NLP. arXiv preprint arXiv:2103.00453.

Derald Sue, Christina Capodilupo, Gina Torino, Jennifer Bucceri, Aisha Holder, Kevin Nadal, and Marta Esquilin. 2007. Racial microaggressions in everyday life: Implications for clinical practice. American Psychologist, 62(4):271–286.

Nathaniel Swinger, Maria De-Arteaga, Neil Thomas Heffernan IV, Mark DM Leiserson, and Adam Tauman Kalai. 2019. What are the biases in my word embedding? In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pages 305–311.

Zekun Yang and Juan Feng. 2020. A causal inference method for reducing gender bias in word embedding relations. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 9434–9441.

Jieyu Zhao, Yichao Zhou, Zeyu Li, Wei Wang, and Kai-Wei Chang. 2018. Learning gender-neutral word embeddings. arXiv preprint arXiv:1809.01496.