<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Bari, Italy.
gianluca.sperduti@isti.cnr.it (G. Sperduti); alejandro.moreo@isti.cnr.it (A. Moreo);
fabrizio.sebastiani@isti.cnr.it (F. Sebastiani)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Garbled-Word Embeddings for Jumbled Text</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gianluca Sperduti</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alejandro Moreo</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabrizio Sebastiani</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Italy</string-name>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Garbled-Word Embeddings, Garbled Words, Misspellings, Distributional Semantic Models</string-name>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>“Aoccdrnig to a reasrech at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny itmopnrat tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe”. We investigate the extent to which this phenomenon applies to computers as well. Our hypothesis is that computers are able to learn distributed word representations that are resilient to character reshuffling, without incurring a significant loss in performance in tasks that use these representations. If our hypothesis is confirmed, this may form the basis for a new and more efficient way of encoding character-based representations of text in deep learning, and one that may prove especially robust to misspellings, or to corruption of text due to OCR. This paper discusses some fundamental psycho-linguistic aspects that lie at the basis of the phenomenon we investigate, and reports on a preliminary proof of concept of the above idea.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Since 2003, the sentence quoted in the abstract of the present paper has been circulating around
the Internet, and has become fairly popular in social networks and forums. Even though there is
no evidence whatsoever of any such research at Cambridge University, a lot of psycho-linguistic
literature [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4 ref5">1, 2, 3, 4, 5</xref>
        ] has shown that, though at the cost of sacrificing reading speed (a cost that
depends on the specific words that are affected), humans can actually understand garbled (a.k.a.
jumbled) text, which we here define as text containing words affected by character reshuffling,
as long as the first and last letters of each word stay in place. In this paper we put forth
our conjecture that, if humans can make sense of garbled text, it is likely that computerized
distributional semantic models can do so as well. Despite the fact that some previous work has
suggested that this may not be the case [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ], we call into question the way surface forms are
customarily represented in this field, and propose a new mechanism that natively disregards
character order.
      </p>
      <p>One of the goals of our investigation is furthering our understanding of the connections
between the capabilities of humans and those of computers. However, the vectorial
representations that derive from this investigation, called garbled-word embeddings, have an applicative
impact too, since they could also be applied to building resiliency to misspellings into
computerized models of text, to automatic misspelling correction, and to handling out-of-vocabulary
terms, and could hopefully lay the foundations of a new, more efficient way of representing
character-based textual inputs for deep neural models. While we plan to address all these
aspects in our future research, we devote this short paper to studying the viability of garbled-word
embeddings.</p>
      <p>
        This paper is organized as follows. In Section 2 we review and discuss the main works
devoted to the study of the above-mentioned psycho-linguistic phenomena, and the (few)
computational approaches that have been derived from them. In order to verify the conjecture we have described,
we subsequently devise a transformation of the word surface form (that we dub BE-sorting –
Section 3) whose output is consistent with all possible “garbled variants” of the same word. We
then learn embeddings for BE-sorted words, by using any of the publicly available techniques
(we here use word2vec [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]), from a corpus of texts in which each word occurrence has been
replaced by its corresponding BE-sorted version. We then compare the performance of these
garbled-word embeddings, across a series of downstream tasks and standard benchmarks for
them, with that of standard word embeddings. The experiments we have carried out are
presented and discussed in Section 4. We sketch future work in Section 5.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        How transposed letters and misspellings affect our reading performance has long been studied in
the field of psycho-linguistics [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4 ref5">1, 2, 3, 4, 5</xref>
        ] and, more recently, also in the area of NLP [
        <xref ref-type="bibr" rid="ref10 ref6 ref7 ref9">6, 7, 9, 10</xref>
        ].
      </p>
      <p>
        The literature distinguishes two different types of word corruption, i.e., (a) corruption due to
transpositions only (whereby an anagram of the original word is formed – this is what we have
called garbling), and (b) corruption that also involves letter insertion and/or deletion and/or
replacement. In their seminal work, Rayner et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] show that humans are able to read (and
understand) corrupted text with good reading speed (which is inversely proportional to
cognitive effort), but that this speed depends on which type of word corruption is involved,
i.e., humans deal with type (a) (“garbling”) much more easily than with type (b). In this paper
we only consider word garbling. Concerning garbled text, Rayner et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] find that reading
speed decays by 12% when the transposed letters do not involve the first and last letters of the
word, by 26% when the transposed letters are the first and the last one, and by 36% when the
transposition involves the first and second letters of the word.
      </p>
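      <p>The distinction between the two corruption types can be illustrated with a minimal Python sketch. The function name <monospace>corruption_type</monospace> and its return labels are hypothetical and introduced here only for illustration. A corrupted form is of type (a) exactly when it is an anagram of the original word, which can be checked by comparing character counts.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def corruption_type(original: str, corrupted: str) -> str:
    """Classify a corrupted form of `original` (labels are illustrative)."""
    if corrupted == original:
        return "none"
    # Type (a): transpositions only, i.e. the corrupted form is an anagram.
    if Counter(corrupted) == Counter(original):
        return "a"
    # Type (b): some letter was inserted, deleted, or replaced.
    return "b"


print(corruption_type("research", "reasrech"))  # transpositions only
print(corruption_type("research", "reserch"))   # one letter deleted
```
]]></preformat>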
      <p>
        Many studies have been devoted to identifying the factors and conditions that affect the
difficulty of reading garbled words. Andrews [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] finds that a garbled word is easier to process
when it is a highly frequent one, since, when reading, words that are common in language
surprise us the least, and thus require less cognitive effort to make sense of when they
occur in garbled form [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Many authors seem to agree that the neighbourhood size N of a
garbled word (the neighbourhood of a given word being the set containing all valid words that
can be obtained by replacing a single letter) is a good proxy for estimating the difficulty of
processing it, with higher values of N typically implying higher reading difficulty [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. English
terms tend to display higher values of N on average than terms in many other languages [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], thus making
it one of the most challenging languages for reading garbled words. We use English as our
target language in this study.
      </p>
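      <p>As a rough illustration of the neighbourhood definition given above, the following Python sketch counts the single-letter-replacement neighbours of a word. The function name <monospace>neighbourhood</monospace> and the toy lexicon are hypothetical; a real estimate would require a full dictionary of the language.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import string


def neighbourhood(word, lexicon, alphabet=string.ascii_lowercase):
    """Return the valid words obtained by replacing exactly one letter of `word`."""
    neighbours = set()
    for i, original in enumerate(word):
        for letter in alphabet:
            if letter == original:
                continue  # replacing a letter with itself yields the same word
            candidate = word[:i] + letter + word[i + 1:]
            if candidate in lexicon:
                neighbours.add(candidate)
    return neighbours


# Toy lexicon (hypothetical, for illustration only).
LEXICON = {"cat", "cot", "cut", "bat", "hat", "car", "dog", "cart"}

# The neighbourhood size is simply the number of neighbours found.
print(sorted(neighbourhood("cat", LEXICON)), len(neighbourhood("cat", LEXICON)))
print(sorted(neighbourhood("dog", LEXICON)), len(neighbourhood("dog", LEXICON)))
```
]]></preformat>
      <p>Words of a different length (such as “cart” here) are never neighbours under this definition, since only replacements are allowed.</p>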
      <p>
        In the field of NLP, several studies have been carried out in order to assess the resiliency
of embedding-generation techniques and pre-trained language models to misspellings and
garbled words. Heigold et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] test the impact that different types of noise have on the
performance of convolutional and recurrent neural architectures equipped with different input
modes (character-based and byte-pair encoding) in morphological tagging and translation tasks,
concluding that all models are highly sensitive to (even subtle) perturbations in the text, but
that such degradation in performance can be partially countered by reproducing the same
noise conditions in the training data. Yang and Gao [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] show that even stronger models like
BERT [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] cannot elude the catastrophic deterioration in performance due to the presence of
corrupted text. Nguyen and Grieve [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] compare the performance of word embeddings generated
from corrupted words using word2vec’s skip-gram with negative sampling (SGNS – [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]) and
fastText [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ], showing that SGNS tends to perform better than fastText in the majority
of intrinsic tasks<sup>1</sup>, despite the fact that SGNS is a spelling-agnostic method.<sup>2</sup> The authors also
note that both methods tend to perform fairly well in tasks characterized by the presence of
emotion-laden, intentionally elongated words (e.g., “gooood”), and they conjecture that this
may be due to the fact that such intentional misspellings tend to occur in specific and controlled
contexts. Other work has focused on creating embeddings that are resilient to misspellings.
Arguably, the most notable such example is Misspelling-Oblivious Word Embeddings (MOEs)
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. MOEs are based on the fastText algorithm extended with a redesigned loss function that
implements a distance between correct and misspelled words. MOEs have been tested in both
“intrinsic” and “extrinsic” tasks at various levels of corruption, with notable results in highly
corrupted text. Unlike this approach, we do not attempt to create word embeddings
for the (potentially many) altered surface forms of the words in the vocabulary, but instead
BE-sort each word, thus “collapsing” all possible such variations into a single canonical form
which is then handled as any standard word by any off-the-shelf word-embedding generation
technique.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. BE-sorting</title>
      <p>The method we propose is very simple, and comes down to a pre-processing step applied to the text
input. Given a word <inline-formula><tex-math>w = [c_1, c_2, \ldots, c_n]</tex-math></inline-formula>, in which <inline-formula><tex-math>c_i</tex-math></inline-formula> denotes the character at position <inline-formula><tex-math>i</tex-math></inline-formula>, we
BE-sort it, i.e., we map it into an artificial token <inline-formula><tex-math>\sigma(w) = [c_1, \mathrm{sort}([c_2, \ldots, c_{n-1}]), c_n]</tex-math></inline-formula> in which
the beginning (B) and end (E) characters are left at their original position while the other ones
(which we here call the “middle” ones) are sorted in alphabetical order. (Any deterministic
sorting method other than alphabetical would do as well, though.) For example, given the input
word “embedding” (short for [e, m, b, e, d, d, i, n, g]), the function <inline-formula><tex-math>\sigma</tex-math></inline-formula> will return the BE-sorted
token “ebddeimng”; any garbled variant of “embedding” in which the B and E characters are in
their original position (e.g., “ebdimnedg”, “edmnbdieg”) will result in the same BE-sorted token.</p>
      <p><sup>1</sup>Consistently with this literature, we call intrinsic (resp., extrinsic) those tasks in which words (resp.,
documents) are the objects of study, and which are directly (resp., indirectly) affected by how we handle words; an
example is word analogy (resp., text classification).</p>
      <p><sup>2</sup>In particular, the authors found that fastText performs better in tasks where some characters have been
deleted, while SGNS performs better in all other tasks, and especially so in tasks having to do with semantic analogy.</p>
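      <p>The BE-sorting transformation described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name <monospace>be_sort</monospace> is hypothetical, Python's built-in <monospace>sorted</monospace> (which orders characters by code point) stands in for alphabetical order, and words of at most three characters are returned unchanged because they have at most one middle character.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def be_sort(word: str) -> str:
    """BE-sort a word: keep the first and last characters, sort the middle ones."""
    if len(word) <= 3:
        # At most one middle character: nothing to reorder.
        return word
    return word[0] + "".join(sorted(word[1:-1])) + word[-1]


# The original word and two of its garbled variants collapse to one token.
for variant in ("embedding", "ebdimnedg", "edmnbdieg"):
    print(variant, "->", be_sort(variant))
```
]]></preformat>
      <p>Since the result depends only on the first character, the last character, and the multiset of middle characters, every garbled variant that keeps the B and E characters in place maps to the same token.</p>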
      <p>We apply the BE-sorting function to replace all word occurrences in the training corpus with their
BE-sorted versions, and then apply any standard method for generating
word embeddings. At test time, we BE-sort all word occurrences in the test corpus and use the
embeddings generated at training time. Note that the sorting function produces a deterministic
and unique version of the text corpus, irrespective of whether the original text was garbled
or not. Should our hypothesis be correct, the word embeddings learned from the BE-sorted
training corpus would behave no worse than a set of embeddings learned from the original
(non-BE-sorted) corpus.</p>
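      <p>The property that BE-sorting yields the same corpus whether or not the input was garbled can be checked with a small Python sketch. The helper names <monospace>be_sort</monospace> and <monospace>be_sort_corpus</monospace>, the whitespace tokenization, and the lower-casing step are assumptions made for illustration; the resulting token lists could then be passed to any embedding trainer.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def be_sort(word):
    """Keep the first and last characters, sort the middle ones."""
    if len(word) <= 3:
        return word
    return word[0] + "".join(sorted(word[1:-1])) + word[-1]


def be_sort_corpus(lines):
    """Replace every word occurrence with its BE-sorted token.

    Tokenization by whitespace and lower-casing are simplifying assumptions.
    """
    return [[be_sort(token) for token in line.lower().split()] for line in lines]


clean = ["According to a research at Cambridge University"]
garbled = ["Aoccdrnig to a reasrech at Cmabrigde Uinervtisy"]

# Both versions of the text are mapped to the same sequence of tokens.
print(be_sort_corpus(clean))
print(be_sort_corpus(clean) == be_sort_corpus(garbled))
```
]]></preformat>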
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>
        We choose the British National Corpus (BNC)<sup>3</sup> as our corpus for training word embeddings.
From the BNC, we generate different sets of embeddings:
        <list list-type="bullet">
          <list-item><p>Garbled(x%), obtained by randomly picking x% of the word occurrences in the corpus,
garbling them, and generating embeddings from the resulting (garbled or non-garbled)
words; we generate five different instances, i.e., for x ∈ {0, 5, 10, 50, 100}; note that the
x = 0 version corresponds to embeddings generated from the original version of the BNC,
while for the x = 100 version all the words from which the embeddings are generated
are garbled;</p></list-item>
          <list-item><p>BE-sorted, obtained by BE-sorting all word occurrences in the corpus and generating
embeddings from the resulting tokens;</p></list-item>
          <list-item><p>Full-sorted, obtained by sorting all characters in the word alphabetically, for each word
occurrence in the corpus; this is the same as the BE-sorted version, but for the fact that
the B and E characters of the word are not necessarily left at their original positions;</p></list-item>
          <list-item><p>RandEmbeds, a set of embeddings (one for each word in the original corpus) randomly
generated and not optimized any further; this set establishes a lower-bound baseline.</p></list-item>
        </list>
For each variant of our dataset, we lower-case all text and then generate word embeddings
using word2vec’s SGNS [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], using the same hyper-parameter values in all cases.<sup>4</sup> In order
to compensate for fluctuations in results due to randomness, we carried out 10 runs for each
model using different seeds. Table 1 reports the results of our experiments in using the sets of
embeddings learned from each corpus across a well-established battery of intrinsic-task
benchmarks<sup>5</sup>, comprising 17 different tasks dealing with semantic categorization, word similarity,
word relatedness, and word analogy (see [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] for further details).
      </p>
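      <p>The garbled and fully sorted corpus variants could be produced along the following lines. This is a hedged sketch, not the procedure actually used in the experiments: the function names, the choice to keep the first and last characters fixed when garbling (in line with the definition of garbled text adopted in this paper), the rounding of the number of garbled occurrences, and the seeding scheme are all assumptions.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random


def garble(word, rng):
    """Shuffle the middle characters, leaving the first and last in place.

    The shuffle may, by chance, return the word unchanged.
    """
    if len(word) <= 3:
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def garble_corpus(tokens, percent, seed=0):
    """Garble a randomly chosen `percent`% of the word occurrences."""
    rng = random.Random(seed)
    n_garbled = round(len(tokens) * percent / 100)
    chosen = set(rng.sample(range(len(tokens)), n_garbled))
    return [garble(tok, rng) if i in chosen else tok
            for i, tok in enumerate(tokens)]


def full_sort(word):
    """Sort all characters, including the first and last ones."""
    return "".join(sorted(word))


tokens = "the quick brown foxes jumped over lazy dogs".split()
for percent in (0, 5, 10, 50, 100):
    print(percent, garble_corpus(tokens, percent))
print([full_sort(tok) for tok in tokens])
```
]]></preformat>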
      <p><sup>3</sup>The British National Corpus was originally created by Oxford University, and contains 100 million words
of text from various genres from the later part of the 20th century, including spoken language, fiction, magazines,
newspapers, and academic writing. The corpus is available at http://www.natcorp.ox.ac.uk/corpus/index.xml.</p>
      <p><sup>4</sup>We use the Gensim implementation with hyper-parameter values min_count=1, max_vocab_size=None
(i.e., without limit), window=5, vector_size=300, sample=6e-5, alpha=0.03, min_alpha=0.0007, negative=20,
sg=1. We leave the other hyper-parameters at their default values.</p>
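      <p>For concreteness, the hyper-parameter values listed above could be passed to Gensim's <monospace>Word2Vec</monospace> class roughly as follows. The helper <monospace>train_sgns</monospace>, the use of <monospace>workers=1</monospace> to reduce run-to-run nondeterminism, and the particular seed values are assumptions made for this sketch, not details taken from the experiments. Gensim is imported lazily so that the configuration can be inspected without the library installed.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Hyper-parameter values as listed in the text; all others stay at Gensim defaults.
SGNS_PARAMS = dict(
    min_count=1,
    max_vocab_size=None,  # no limit on vocabulary size
    window=5,
    vector_size=300,
    sample=6e-5,
    alpha=0.03,
    min_alpha=0.0007,
    negative=20,
    sg=1,                 # skip-gram (with negative sampling, since negative > 0)
)


def train_sgns(sentences, seed):
    """Train SGNS embeddings on a list of token lists (hypothetical helper)."""
    from gensim.models import Word2Vec  # requires Gensim 4.x
    # workers=1 is an assumption made here to limit nondeterminism across runs.
    return Word2Vec(sentences=sentences, seed=seed, workers=1, **SGNS_PARAMS)


# Ten runs with different seeds (the seed values themselves are an assumption).
SEEDS = list(range(10))
```
]]></preformat>
      <p>A call such as <monospace>train_sgns(sentences, seed)</monospace> would then be repeated once per seed and per corpus variant.</p>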
      <p><sup>5</sup>https://github.com/kudkudak/word-embeddings-benchmarks</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>Somewhat surprisingly, humans can effectively read garbled words, provided that their first
and last letters stay in place. While this phenomenon has long attracted the attention of
psycholinguists, authors of recent computational experiments have argued that the performance of
computational models, unlike that of humans, deteriorates noticeably in the presence of garbled
input. However, we hypothesize that computational models can be made robust to garbled
text too, and argue that the key to doing this is devising word representation mechanisms that
natively disregard character order within words. As a preliminary proof of concept, we have
shown that it is indeed possible to learn word embeddings from text in which we simply sort the
middle letters of each word (thus intentionally getting rid of any character order information)
with no substantial difference in performance. We are currently investigating character-based
representations for deep learning that implement this intuition.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Rayner</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. White</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          , S. Liversedge,
          <article-title>Raeding wrods with jubmled lettres: There is a cost</article-title>
          ,
          <source>Psychological Science</source>
          <volume>17</volume>
          (
          <year>2006</year>
          )
          <fpage>192</fpage>
          -
          <lpage>3</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L. X.</given-names>
            <surname>McCusker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. B.</given-names>
            <surname>Gough</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. G.</given-names>
            <surname>Bias</surname>
          </string-name>
          ,
          <article-title>Word recognition inside out and outside in</article-title>
          ,
          <source>Journal of Experimental Psychology: Human Perception and Performance</source>
          <volume>7</volume>
          (
          <year>1981</year>
          )
          <fpage>538</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <article-title>Lexical retrieval and selection processes: Effects of transposed-letter confusability</article-title>
          ,
          <source>Journal of Memory and Language</source>
          <volume>35</volume>
          (
          <year>1996</year>
          )
          <fpage>775</fpage>
          -
          <lpage>800</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Healy</surname>
          </string-name>
          ,
          <article-title>Detection errors on the word the: Evidence for reading units larger than letters</article-title>
          ,
          <source>Journal of Experimental Psychology: Human Perception and Performance</source>
          <volume>2</volume>
          (
          <year>1976</year>
          )
          <fpage>235</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.</given-names>
            <surname>Marian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bartolotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chabal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shook</surname>
          </string-name>
          ,
          <article-title>Clearpond: Cross-linguistic easy-access resource for phonological and orthographic neighborhood densities</article-title>
          ,
          <source>PLOS ONE</source>
          <volume>7</volume>
          (
          <year>2012</year>
          )
          <elocation-id>e43230</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Heigold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Varanasi</surname>
          </string-name>
          , G. Neumann,
          <string-name>
            <surname>J. van Genabith</surname>
          </string-name>
          ,
          <article-title>How robust are character-based word embeddings in tagging and MT against wrod scramlbing or randdm nouse?</article-title>
          ,
          <source>in: Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume</source>
          <volume>1</volume>
          : Research Track), Boston, US,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <article-title>Can machines read jmulbed senetcnes?</article-title>
          (
          <year>2019</year>
          ). URL: https://runzhe-yang.science/demo/jumbled.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , I. Sutskever,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          ,
          <source>in: Proceedings of the 27th Annual Conference on Neural Information Processing Systems (NIPS</source>
          <year>2013</year>
          ), Lake Tahoe,
          <string-name>
            <surname>US</surname>
          </string-name>
          ,
          <year>2013</year>
          , pp.
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Grieve</surname>
          </string-name>
          ,
          <article-title>Do word embeddings capture spelling variation?</article-title>
          ,
          <source>in: Proceedings of the 28th International Conference on Computational Linguistics</source>
          , Barcelona,
          <string-name>
            <surname>ES</surname>
          </string-name>
          ,
          <year>2020</year>
          , pp.
          <fpage>870</fpage>
          -
          <lpage>881</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Piktus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. B.</given-names>
            <surname>Edizel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          , E. Grave,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ferreira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Silvestri</surname>
          </string-name>
          ,
          <article-title>Misspelling-oblivious word embeddings</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          ,
          <string-name>
            <surname>Minneapolis</surname>
            ,
            <given-names>US</given-names>
          </string-name>
          ,
          <year>2019</year>
          , pp.
          <fpage>3226</fpage>
          -
          <lpage>3234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL</source>
          <year>2019</year>
          ), Minneapolis,
          <string-name>
            <surname>US</surname>
          </string-name>
          ,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          , T. Mikolov,
          <article-title>Enriching word vectors with subword information</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>5</volume>
          (
          <year>2017</year>
          )
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          . doi:10.1162/tacl_a_00051.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <article-title>Bag of tricks for efficient text classification</article-title>
          ,
          <source>in: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL</source>
          <year>2017</year>
          ), Valencia,
          <string-name>
            <surname>ES</surname>
          </string-name>
          ,
          <year>2017</year>
          , pp.
          <fpage>427</fpage>
          -
          <lpage>431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lenci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sahlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Jeuniaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Gyllensten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Miliani</surname>
          </string-name>
          ,
          <article-title>A comprehensive comparative evaluation and analysis of distributional semantic models</article-title>
          ,
          <source>arXiv preprint arXiv:2105.09825</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>