<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Italian Symposium on Advanced Database Systems</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Bias Score: Estimating Gender Bias in Sentence Representations</article-title>
        <subtitle>(Discussion Paper)</subtitle>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fabio Azzalini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tommaso Dolci</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mara Tanelli</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Human Technopole - Center for Analysis, Decisions and Society</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>1</volume>
      <fpage>9</fpage>
      <lpage>22</lpage>
      <abstract>
        <p>The ever-increasing number of applications based on semantic text analysis is making natural language understanding a fundamental task. Language models are used for a variety of tasks, such as parsing CVs or improving web search results. At the same time, concern is growing around embedding-based language models, which often exhibit social bias and lack of transparency, despite their popularity and widespread use. Word embeddings in particular exhibit a large amount of gender bias, and they have been shown to reflect social stereotypes. Recently, sentence embeddings have been introduced as a novel and powerful technique to represent entire sentences as vectors. However, traditional methods for estimating gender bias cannot be applied to sentence representations, because gender-neutral entities cannot be easily identified and listed. We propose a new metric to estimate gender bias in sentence embeddings, named bias score. Our solution, leveraging the semantic importance of individual words and previous research on gender bias in word embeddings, is able to discern between correct and biased gender information at sentence level. Experiments on a real-world dataset demonstrate that our novel metric identifies gender stereotyped sentences.</p>
      </abstract>
      <kwd-group>
        <kwd>Gender bias</kwd>
        <kwd>natural language processing</kwd>
        <kwd>computer ethics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Language models are used for a variety of downstream applications, such as CV parsing for a
job position, or detecting sexist comments on social networks. Recently, a big step forward in
the field of natural language processing (NLP) was the introduction of language models based
on word embeddings, i.e. representations of words as vectors in a multi-dimensional space.
These models translate the semantics of words into geometric properties, so that terms with
similar meanings tend to have their vectors close to each other, and the difference between two
embeddings represents the relationship between their respective words [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. For instance, it is
possible to retrieve the analogy man : woman = king : queen because the difference vectors
$\vec{woman} - \vec{man}$ and $\vec{queen} - \vec{king}$ share approximately the same direction.
      </p>
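      <p>As a concrete illustration of how such analogies are retrieved, the following is a minimal Python sketch of ours. It assumes the embeddings have already been loaded into a dictionary emb mapping each word to a NumPy array; the function name and the loading step are our own, not part of the cited models.</p>
      <preformat>
import numpy as np

def analogy(a, b, c, emb, topn=1):
    """Return the word(s) d such that a : b = c : d, i.e. the
    vocabulary entries closest to the vector b - a + c."""
    target = emb[b] - emb[a] + emb[c]
    scores = {}
    for w, v in emb.items():
        if w in (a, b, c):
            continue  # exclude the query words themselves
        # cosine similarity between the candidate and the target vector
        scores[w] = (v @ target) / (np.linalg.norm(v) * np.linalg.norm(target))
    return sorted(scores, key=scores.get, reverse=True)[:topn]

# analogy("man", "woman", "king", emb) should ideally return ["queen"]
      </preformat>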
      <p>
        Word embeddings boosted results in many NLP tasks, like sentiment analysis and question
answering. However, despite the growing hype around them, these models have been shown
to reflect the stereotypes of Western society, even when the training phase is performed
over text corpora written by professionals, such as news articles. For instance, they return
sexist analogies like man : computer programmer = woman : homemaker [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The social bias in the
geometry of the model is reflected in downstream applications like web search or CV parsing. In
turn, this phenomenon favours prejudice towards social categories already frequently penalised,
such as women or African Americans.
      </p>
      <p>
        Lately, sentence embeddings – vector representations of sentences based on word embeddings
– are also increasing in popularity, improving results in many language understanding tasks, such
as semantic similarity or sentiment prediction [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. Therefore, it is of the utmost importance
to expand the research to understand how language models perceive the semantics of natural
language when computing the respective embedding. A very interesting step in this direction
is to define metrics to estimate social bias in sentence embeddings.
      </p>
      <p>
        This work expands research on social bias in embedding-based models, focusing on gender
bias in sentence representations. We propose a method to estimate gender bias in sentence
embeddings and perform our experiments on InferSent, a sentence encoder designed by Facebook
AI [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] based on GloVe (https://nlp.stanford.edu/projects/glove/), a very popular word embedding model. Our solution, named bias score,
is highly flexible and can be adapted to both different kinds of social bias and different language
models. Bias score will help in researching procedures like debiasing embeddings, which require
identifying biased embeddings and therefore estimating the amount of bias they contain [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Similarly, techniques for improving training datasets also require evaluating all the sentences
they contain, to identify problematic entries to remove, change, or compensate for.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. State of the Art</title>
      <p>
        Although language models are successfully used in a variety of applications, bias and fairness in
NLP have received relatively little consideration until recent times, running the risk of favouring
prejudice and strengthening stereotypes [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <sec id="sec-2-1">
        <title>2.1. Bias in Word Embeddings</title>
        <p>
          Static word embeddings were the first to be analysed. In 2016, they were shown to exhibit
the so-called gender bias, defined as the cosine of the angle between the word embedding of a
gender-neutral word and a one-dimensional subspace representing gender [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The approach
was later adapted for non-binary social biases such as racial and religious bias [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. A debiasing
algorithm was also proposed to mitigate gender bias in word embeddings [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]; however, it was later
shown that this approach fails to entirely capture and remove the bias [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. The Word Embedding Association
Test (WEAT) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] was created to measure bias in word embeddings following the pattern of the
implicit-association test for humans. WEAT demonstrated the presence of harmful associations
in GloVe and word2vec (https://code.google.com/archive/p/word2vec/) embeddings. More recently, contextualised word embeddings like
BERT [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] proved to be very accurate language models. However, despite literature suggesting
that they are generally less biased compared to their static counterparts [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], they still display a
significant amount of social bias [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Bias in Sentence Representations</title>
        <p>
          Research on sentence-level representations is still scarce. WEAT was extended to
measure bias in sentence embedding encoders: the Sentence Encoder Association Test (SEAT)
is again based on the evaluation of implicit associations and showed that modern sentence
embeddings also exhibit social bias [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Attempts at debiasing sentence embeddings faced
the issue of not being able to recognise neutral sentences, thus debiasing every representation
regardless of the gender attributes in the original natural language sentence [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>
        As already mentioned, gender bias in word embeddings is estimated using the cosine similarity
between word vectors and a gender direction identified in the vector space [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Cosine similarity
is a popular metric to compute the semantic similarity of words based on the angle between
their embedding vectors. Given two word vectors $\vec{u}$ and $\vec{v}$, cosine similarity is expressed as:
$$\cos(\vec{u}, \vec{v}) = \cos(\theta) = \frac{\vec{u} \cdot \vec{v}}{\|\vec{u}\|\,\|\vec{v}\|},$$
where $\theta$ is the angle between $\vec{u}$ and $\vec{v}$. The closer $\cos(\theta)$ is to 1, the higher the semantic
similarity between $\vec{u}$ and $\vec{v}$. In word embedding models, similarity with respect to the gender
direction means that a word vector contains information about gender. Since only gender-neutral
words can be biased, gendered words like man or woman are assumed to contain correct
gender information.</p>
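      <p>In code, the formula above amounts to a one-liner; this small sketch of ours assumes NumPy arrays:</p>
      <preformat>
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two vectors u and v."""
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
      </preformat>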
      <p>When it comes to sentence representations, the main problem is that gender-neutral sentences
cannot be easily identified and listed like words, because sentences are infinite in number.
Moreover, sentences may contain gender bias despite being gendered. Consider the sentence
my mother is a nurse: the word mother contains correct gender semantics, but the word nurse
is female stereotyped. Table 1 shows that the gender-neutral sentence someone is a nurse still
contains a lot of gender information due to the bias associated with the word nurse.</p>
      <p>
        Therefore, it is important to distinguish between the amount of encoded gender information
coming from gendered words, and the amount coming from biased words. For this reason, we
adopt a more dynamic approach: we keep working at the word level, using the cosine similarity
between neutral word representations and the gender direction to estimate word-level gender
bias. Then, we sum the bias of all the words in the sentence, adjusted according to the length of
the sentence and to the contextualised importance of each word. This decision is grounded on
two observations: first, the semantics of a sentence depends largely on the semantics of the
words contained in it; second, sentence embedding encoders are based on previously defined
word embedding models [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. We focus our research on InferSent by Facebook AI [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], a
sentence encoder that achieved great results in many different downstream tasks [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. InferSent
encodes sentence representations starting from GloVe [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] word embeddings. Therefore, we
use GloVe for the first step of quantifying gender bias at the word level.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Gender Bias Estimation</title>
        <p>To estimate gender bias in sentence representations, we consider four elements:
• $\cos(\vec{u}, \vec{v})$: cosine similarity between two word vectors $\vec{u}$ and $\vec{v}$,
• $\vec{g}$: gender direction identified in the vector space,
• $G$: list of gendered words in English,
• $I_w$: a percentage estimating the semantic importance of word $w$ in the sentence.
Our metric, named bias score, takes a sentence as input, and returns two indicators corresponding
to the amount of female and male bias at sentence level. Respectively, they are a positive and a
negative value, obtained from the sum of the gender bias of all words, estimated from cosine
similarity with respect to the gender direction. Since gender bias is a characteristic of gender-neutral
words, gendered terms are excluded from the computation and instead their bias is
always set to zero. In detail, for each neutral word $w$ in the sentence we compute its gender
bias as the cosine similarity between its word vector $\vec{w}$ and the gender direction $\vec{g}$, and
then we multiply it by the word importance $I_w$. In particular, for a given sentence $s$:
$$\mathit{biasScore}_f(s) = \sum_{w \in s,\, w \notin G} \underbrace{\cos(\vec{w}, \vec{g})}_{&gt;0} \times I_w$$
$$\mathit{biasScore}_m(s) = \sum_{w \in s,\, w \notin G} \underbrace{\cos(\vec{w}, \vec{g})}_{&lt;0} \times I_w$$
Notice that, for each word $w$ that is gender-neutral, $w \notin G$. Also, word importance $I_w$ is always
a positive number, and the cosine similarity can be either positive or negative. Therefore, bias
score keeps the estimations of gender bias towards the male and female directions separated. In
the following sections, we go into more detail by illustrating how we derive $\vec{g}$, $G$ and $I_w$.</p>
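        <p>A minimal sketch of the metric follows, assuming that the gender direction $\vec{g}$, the gendered-word list $G$ and the per-word importances $I_w$ have already been computed (their derivation is described in the next sections); all variable names are ours:</p>
        <preformat>
import numpy as np

def bias_score(words, emb, g, gendered, importance):
    """Female and male bias of a tokenised sentence (sketch).
    emb maps words to vectors, g is the gender direction, gendered is
    the set G, importance maps each word to its percentage importance."""
    female, male = 0.0, 0.0
    for w in words:
        if w in gendered or w not in emb:
            continue  # gendered words are assumed correct: bias set to zero
        c = (emb[w] @ g) / (np.linalg.norm(emb[w]) * np.linalg.norm(g))
        if c > 0:
            female += c * importance[w]  # positive cosine: female direction
        else:
            male += c * importance[w]    # negative cosine: male direction
    return female, male
        </preformat>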
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Gender Direction</title>
        <p>The first step of our method is to identify in the vector space a single dimension comprising
the majority of the gender semantics of the model. The resulting dimension, named gender
direction, serves as the first term in the cosine similarity function, to establish the amount of
gender semantics encoded in a vector for a given word, according to the model.</p>
        <p>
          GloVe [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], the word embedding model that we use, is characterised by a vector space of 300
dimensions. Inside the vector space, the difference between two embeddings returns the direction
that connects them. In the case of the embeddings $\vec{she}$ and $\vec{he}$, their difference vector $\vec{she} - \vec{he}$
represents a one-dimensional subspace that identifies gender in GloVe. However, the
difference vector $\vec{woman} - \vec{man}$ also identifies gender, yet it represents a slightly different subspace
compared to $\vec{she} - \vec{he}$. Therefore, following the approach in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], we take into consideration
several pairs of gendered words and perform a Principal Component Analysis (PCA) to reduce
the dimensionality. We use ten pairs of gendered words: woman–man, girl–boy, she–he, mother–
father, daughter–son, gal–guy, female–male, her–his, herself–himself, Mary–John.
        </p>
        <p>As shown in Fig. 1, the top component resulting from the analysis is significantly more
important than the other components, explaining almost 60% of the variance. We use this
top component as gender direction, and we observe that embeddings of female words have a
positive cosine with respect to it, whereas for male words we have a negative cosine.</p>
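        <p>One possible implementation of this step, following the PCA recipe of [2]: centre each gendered pair, stack the resulting difference vectors, and take the top principal component. This is our sketch; the exact preprocessing (e.g. word casing) depends on the embedding model:</p>
        <preformat>
import numpy as np

PAIRS = [("woman", "man"), ("girl", "boy"), ("she", "he"),
         ("mother", "father"), ("daughter", "son"), ("gal", "guy"),
         ("female", "male"), ("her", "his"), ("herself", "himself"),
         ("mary", "john")]

def gender_direction(emb):
    """Top principal component of the centred gendered-pair vectors."""
    diffs = []
    for f, m in PAIRS:
        centre = (emb[f] + emb[m]) / 2.0
        diffs.append(emb[f] - centre)  # female side of the pair
        diffs.append(emb[m] - centre)  # male side of the pair
    X = np.stack(diffs)  # rows already have zero mean
    # the first right singular vector is the first principal component
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    g = vt[0]
    # orient g so that female words have a positive cosine with it
    return g if (emb["she"] @ g) > 0 else -g
        </preformat>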
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Gendered Words</title>
        <p>A list $G$ of gendered words is fundamental to estimate gender bias, because only gender-neutral
entities can be biased. Since the number of elements in the subset $N$ of gender-neutral words
in the vocabulary of a language is very big, while the subset $G$ of gendered words is relatively
small (especially in the case of the English language), we derive $N$ as the difference between
the complete vocabulary $V$ of the language and the subset $G$ of gendered words: $N = V \setminus G$.
To achieve this, we define a list of words containing as many of the elements of the subset $G$
as possible. Therefore, gender bias is estimated for all elements $w$ in the subset $N$ (neutral
words), whereas for all elements $w$ in the subset $G$ (gendered words) the gender bias is always
set to zero:
$$\forall\, w \in N,\ \mathit{bias}(w) \neq 0$$
$$\forall\, w \in G,\ \mathit{bias}(w) = 0$$
For this reason, all the elements from $G$ are not considered when estimating gender bias. As a
matter of fact, we consider the gender information encoded in their word embeddings to be
always correct. Examples of gendered words include he, she, sister, girl, father, man.</p>
        <p>
          Our list $G$ contains a total of 6562 gendered nouns, of which 409 and 388 are respectively
lower-cased and capitalised common nouns taken from [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Additionally, we added
5765 unique given names taken from Social Security card applications in the United States
(https://www.kaggle.com/datagov/usa-names).
        </p>
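        <p>Operationally, the list can be assembled from the curated word files and then used as a simple membership filter. The sketch below is ours, and the one-word-per-line file format is an assumption:</p>
        <preformat>
def load_gendered_words(paths):
    """Union of gendered words read from text files (one word per line)."""
    gendered = set()
    for path in paths:
        with open(path, encoding="utf-8") as f:
            gendered.update(line.strip() for line in f if line.strip())
    return gendered

def is_neutral(word, gendered):
    # N = V minus G: a vocabulary word is neutral iff it is not in G
    return word not in gendered
        </preformat>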
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Word Importance</title>
        <p>
          Following the approach in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], word importance is estimated based on the max-pooling operation
performed by the sentence encoder (in our case InferSent) using all vectors representing the
words in a given sentence. The procedure counts how many times each word representation is
selected by the sentence encoder during the max-pooling phase. In particular, we count the
number of times that the max-pooling procedure selects the hidden state $h_t$, for each time step
$t$ in the neural network underlying the language model, with $t \in [0, \ldots, T]$ and $T$ equal to the
number of words in the sentence. Note that $h_t$ can be seen as a sentence representation centred
on the word $w_t$, i.e. the word at position $t$ in the sentence.
        </p>
        <p>We consider both the absolute importance of each word, and the percentage with respect to the
total absolute importance of all the words in the sentence. For instance, in the example of Fig. 2,
the absolute importance of the word saxophone is 1106, meaning that its vector representation
is selected by the max-pooling procedure for 1106 dimensions out of the total 4096 dimensions
of the sentence embeddings computed by InferSent. The percentage importance is $1106/4096 \approx 0.27$,
meaning that the word counts for around 27% of the semantics of the sentence. In particular,
the percentage importance is also independent of the length of the sentence, despite the fact
that very long sentences generally have a more distributed semantics. For this reason, we use
the percentage importance to compute bias score.</p>
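        <p>The counting procedure can be sketched as follows, assuming access to the per-word hidden states produced by the encoder (for InferSent, D = 4096); in the saxophone example above, the word's count would be 1106 and its percentage $1106/4096 \approx 0.27$:</p>
        <preformat>
import numpy as np

def word_importance(hidden_states):
    """hidden_states: array of shape (T, D), one row per word.
    Returns how many of the D dimensions each word's hidden state wins
    during max-pooling, plus the same count as a percentage of D."""
    T, D = hidden_states.shape
    winners = hidden_states.argmax(axis=0)      # winning word per dimension
    counts = np.bincount(winners, minlength=T)  # selections per word
    return counts, counts / D                   # absolute and percentage
        </preformat>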
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Variant</title>
        <p>Bias score makes it possible to discern gender bias towards the female and male directions. However, we
can also take the absolute value of each word-level bias to derive a single estimation of gender
bias at sentence level:
$$\mathit{biasScore}_{abs}(s) = \sum_{w \in s,\, w \notin G} \underbrace{\left|\cos(\vec{w}, \vec{g}) \times I_w\right|}_{\text{word-level bias}}$$</p>
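        <p>Continuing the sketch from Section 3.1, the variant simply sums absolute word-level contributions:</p>
        <preformat>
import numpy as np

def abs_bias_score(words, emb, g, gendered, importance):
    """Single sentence-level estimate: sum of |cos(w, g) * I_w|."""
    total = 0.0
    for w in words:
        if w in gendered or w not in emb:
            continue  # gendered words contribute zero, as before
        c = (emb[w] @ g) / (np.linalg.norm(emb[w]) * np.linalg.norm(g))
        total += abs(c * importance[w])
    return total
        </preformat>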
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Results</title>
      <p>
        Table 2 illustrates an example of gender bias estimation via bias score, showing that gender
stereotyped concepts like wearing pink dresses are heavily internalised in the final sentence
representation. Additionally, we used bias score to estimate gender bias for sentences contained in
the Stanford Natural Language Inference (SNLI) corpus, a large collection of human-written
English sentences for classification training [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. SNLI contains more than 570k pairs of sentences,
and more than 600k unique sentences in the train set alone. According to our experiments,
sentences corresponding to the highest bias score towards the male direction describe situations
from popular sports like baseball and football, that are frequently associated with men and very
seldom with women. Similarly, sentences corresponding to the highest bias score in the female
direction illustrate female stereotypes, like participating in beauty pageants, applying make-up
or working as a nurse. Table 3 displays the most-biased SNLI sentences according to our metric.
Results are similar when estimating the absolute bias score. In particular, entries associated with
the highest absolute bias score include sentences with a high bias score in either the female
or male direction, like football players scoring touchdowns or the bikini is pink. Additionally,
sexualised sentences like the pregnant sexy volleyball player is hitting the ball are also present.
      </p>
      <p>Table 3 (excerpt), sentences with the highest bias score towards the male direction:
Football players scoring touchdowns (-0.149844);
Football players playing defense. (-0.140169);
A defensive player almost intercepted the football from the quarterback. (-0.139420);
Baseball players (-0.138058).</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>
        In this paper we proposed an algorithm to estimate gender bias in sentence embeddings, based on
a novel metric named bias score. We discern between gender bias and correct gender information
encoded in a sentence embedding, and weigh bias on the basis of the semantic importance of each
word. We tested our solution on InferSent [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], searching for gender biased representations in a
corpus of natural language sentences. Since gender bias has been proven to be caused by the
internalisation of gender stereotypical associations [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], our algorithm for estimating bias score
allows us to identify the sentences whose vector representations encapsulate stereotypes the most.
      </p>
      <p>Future work will include adapting the proposed solution to different language models and
different kinds of social bias. Additionally, since bias score makes it possible to identify stereotypical
entries in natural language corpora used for training language models, removing or substituting
such entries may improve the fairness of the corpus. Thus, future work also includes re-training
language models using text corpora made fairer with such a procedure, and a comparison with
the original models both from the quality and the fairness point of view.</p>
      <p>We are grateful to our mentor Letizia Tanca for her advice in the writing phase of this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-T.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zweig</surname>
          </string-name>
          ,
          <article-title>Linguistic regularities in continuous space word representations</article-title>
          ,
          <source>in: HLT-NAACL</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Bolukbasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Saligrama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kalai</surname>
          </string-name>
          ,
          <article-title>Man is to computer programmer as woman is to homemaker? debiasing word embeddings</article-title>
          ,
          <source>arXiv preprint arXiv:1607.06520</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kiela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Schwenk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Barrault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bordes</surname>
          </string-name>
          ,
          <article-title>Supervised learning of universal sentence representations from natural language inference data</article-title>
          ,
          <source>arXiv preprint arXiv:1705.02364</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>Sentence-bert: Sentence embeddings using siamese bert-networks</article-title>
          ,
          <source>arXiv preprint arXiv:1908.10084</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Prabhakaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ordonez</surname>
          </string-name>
          ,
          <article-title>Bias and fairness in natural language processing</article-title>
          ,
          <source>in: EMNLP-IJCNLP: Tutorial Abstracts</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Manzini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. C.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tsvetkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Black</surname>
          </string-name>
          ,
          <article-title>Black is to criminal as caucasian is to police: Detecting and removing multiclass bias in word embeddings</article-title>
          ,
          <source>arXiv preprint arXiv:1904.04047</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Gonen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          ,
          <article-title>Lipstick on a pig: Debiasing methods cover up systematic gender biases in word embeddings but do not remove them</article-title>
          ,
          <source>arXiv preprint arXiv:1903.03862</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Caliskan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Bryson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Narayanan</surname>
          </string-name>
          ,
          <article-title>Semantics derived automatically from language corpora contain human-like biases</article-title>
          ,
          <source>Science</source>
          <volume>356</volume>
          (
          <year>2017</year>
          )
          <fpage>183</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>Bert: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Basta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Costa-Jussà</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Casas</surname>
          </string-name>
          ,
          <article-title>Evaluating the underlying gender bias in contextualized word embeddings</article-title>
          ,
          <source>arXiv preprint arXiv:1904.08783</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>May</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bordia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Bowman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rudinger</surname>
          </string-name>
          ,
          <article-title>On measuring social biases in sentence encoders</article-title>
          ,
          <source>arXiv preprint arXiv:1903.10561</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. C.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Salakhutdinov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-P.</given-names>
            <surname>Morency</surname>
          </string-name>
          ,
          <article-title>Towards debiasing sentence representations</article-title>
          ,
          <source>arXiv preprint arXiv:2007.08100</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kiela</surname>
          </string-name>
          ,
          <article-title>Senteval: An evaluation toolkit for universal sentence representations</article-title>
          ,
          <source>arXiv preprint arXiv:1803.05449</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>Glove: Global vectors for word representation</article-title>
          ,
          <source>in: EMNLP</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <article-title>Learning gender-neutral word embeddings</article-title>
          ,
          <source>arXiv preprint arXiv:1809.01496</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Bowman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Angeli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Potts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>A large annotated corpus for learning natural language inference</article-title>
          ,
          <source>arXiv preprint arXiv:1508.05326</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>