<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Comparing Geospatiality of Topics between Geotag- and Geoparsing-based Geolocations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Johannes Mast</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Richard Lemoine-Rodríguez</string-name>
          <email>Richard.lemoine-rodriguez@uni-wuerzburg.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vanessa Rittlinger</string-name>
          <email>Vanessa.rittlinger@dlr.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christian Geiß</string-name>
          <email>Christian.geiss@dlr.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hannes Taubenböck</string-name>
          <email>Hannes.taubenboeck@dlr.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Geography, Chair of Georisk Research with Remote Sensing Methods, University of Bonn</institution>
          ,
          <addr-line>Meckenheimer Allee 166, 53115 Bonn</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Geolingual Studies Team, University of Würzburg</institution>
          ,
          <addr-line>Am Hubland, 97074 Würzburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>German Remote Sensing Data Center (DFD), German Aerospace Center (DLR)</institution>
          ,
          <addr-line>Münchener Straße 20, 82234 Weßling</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Institute of Geography and Geology, Chair of Global Urbanization and Remote Sensing, University of Würzburg</institution>
          ,
          <addr-line>Am Hubland, 97074 Würzburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Geolocated social media data offers the opportunity to analyze text data spatially in a wide variety of contexts. Previous work has shown that the likelihood that texts contain mentions of locations varies between topics, indicating differences in their geospatiality. Social media posts can be linked to geographic locations through two main approaches: geoparsing, which extracts geographic information for places mentioned in the text, and geotags, which are geographic coordinates explicitly attached to posts. In this study, we examine a curated data set of both geotagged and non-geotagged tweets from several thousand Nigerian Twitter users to explore differences between geotagging-based and geoparsing-based geolocation approaches in topic representation, controlling for the effects of users and time. Our findings indicate that the two approaches yield data with similar proportions of location mentions, but the interaction between topic and geospatiality varies substantially for some topics. We conclude that the method chosen to geolocate social media data can impact the number of geolocated posts differently across topics. This should be considered in research involving the identification of geolocations from social media posts.</p>
      </abstract>
      <kwd-group>
        <kwd>geographic information extraction</kwd>
        <kwd>geoparsing</kwd>
        <kwd>NLP</kwd>
        <kwd>social media</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Geolocated text data enables the application of spatial analysis methods and is valuable in the
study of several topics across multiple scientific fields [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. One form of geolocated
text data is geotagged social media text. The explicitly attached geocoordinates allow researchers
to directly query geographic text data via APIs. However, geotags are only available on some
platforms, such as Instagram or Twitter (now X), which represent only a small part of web data and
whose APIs are often restricted and not free to use. As an alternative, geoparsing makes it possible to geolocate
texts based on mentions of locations within them [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Alongside efforts to create free and open
web indices, such as those pursued by the OpenSearch Initiative [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], geoparsing approaches
have the potential to unlock a much larger variety of text data sources and contribute substantially
to open and reproducible research on geographic topics [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. However, previous work [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] has shown
that the likelihood that texts contain geoparsable geoinformation (their “geospatiality”) varies
depending on the texts’ topic. This affects geodata availability and can introduce biases. While
previous work analyzed this based on a geoparsing approach [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], differences in the geospatiality
of topics discussed in posts including geotags (i.e., geographic coordinates) have not yet been
explored. Therefore, it is not well understood to what extent geoparsing and geotagging approaches
yield similar geolocated datasets, especially regarding the topics they contain. In this study, we aim
to address this research gap and analyze whether topical geospatiality differs between geoparsed and
geotagged text data. Concretely, using Twitter posts (tweets), we seek to answer the following
research questions:
      </p>
      <p>RQ1: Do geotagged and non-geotagged tweets differ regarding the frequency of geoparsable
locations within their texts, and to what extent is this affected by the topic of the texts?</p>
      <p>RQ2: Does the effect of topical geospatiality vary between geoparsing-based and geotag-based
approaches to data selection?</p>
    </sec>
    <sec id="sec-2">
      <title>2. Data and methods</title>
      <p>2.1. Data</p>
      <p>
        We used data from Twitter (now X), a microblog platform whose salience and accessibility have
made it a popular data source in a wide variety of research fields [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Twitter offers a geotagging
feature which allows users to explicitly link their posts to a location, including geographic
coordinates [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In a previous study [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], several thousand stationary Twitter users (i.e., users
who did not relocate to another country) from Nigeria were identified based on their timelines of
geolocated tweets. We queried and selected both geotagged and non-geotagged tweets posted by
those users from official Twitter clients (excluding third-party applications) for 48 distinct and
randomly spaced one-week intervals between 2015 and 2019. A user's tweets from a given week
were used only if the user produced both geotagged and non-geotagged tweets during that week.
Because we focused on users who were stationary in Nigeria, their non-geotagged
tweets were also likely produced in that country. The geotagged and non-geotagged
datasets are thus comparable.
      </p>
      <sec id="sec-2-1">
        <title>2.2. Topic classification and geoparsing</title>
        <p>
          To assign tweets to topics, we trained a transformer-based text classification model on data from the
Nigerian web forum Nairaland using the domain adaptation approach described in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Tweets
shorter than 10 tokens were excluded because of their potentially limited thematic context. We merged the
42 topics, which were derived from the structure of Nairaland, into 17 overarching supertopics (see
Figure 2) as well as an Other category capturing generic or unassigned content.
        </p>
        <p>
          To identify geolocations within texts of both geotagged and non-geotagged tweets, we used an
ensemble of four named entity recognition (NER) models to identify entities of the GPE (geopolitical
entities), LOC (non-GPE locations), and FAC (facilities) types, and used the GeoNames API for
geocoding [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. We considered a text geolocated if at least two models detected a spatial entity in it
that could be geocoded to a real geographic location. For the ensemble, we included widely used
state-of-the-art NER models: flair-ner-english-ontonotes-large [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], SpacyNER [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], and
bert-baseNER [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], as well as masakhaNER, an NER model optimized for several African languages
[
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
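        <p>The ensemble rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name is_geolocated, the per-model input format, and the can_geocode callback (a stand-in for a gazetteer lookup such as a GeoNames query) are hypothetical. The sketch assumes that each model casts one vote if any of its spatial entities can be geocoded, and that the agreeing models need not have detected the same entity; the text does not specify this detail.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, Dict, List, Tuple

# Entity types named in the text: geopolitical entities, locations, facilities.
SPATIAL_LABELS = {"GPE", "LOC", "FAC"}

Entity = Tuple[str, str]  # (surface text, entity label)


def is_geolocated(
    detections: Dict[str, List[Entity]],
    can_geocode: Callable[[str], bool],
    min_models: int = 2,
) -> bool:
    """Return True if at least `min_models` models found a geocodable spatial entity."""
    votes = 0
    for entities in detections.values():
        # A model casts one vote if any of its spatial entities can be geocoded.
        if any(
            label in SPATIAL_LABELS and can_geocode(text)
            for text, label in entities
        ):
            votes += 1
    return votes >= min_models


# Hypothetical stand-in for a gazetteer lookup (assumption, not the GeoNames API).
TOY_GAZETTEER = {"lagos", "abuja", "kano"}


def toy_geocoder(name: str) -> bool:
    return name.lower() in TOY_GAZETTEER


# Invented detections for one text; model names are placeholders.
example = {
    "model_a": [("Lagos", "GPE")],
    "model_b": [("Lagos", "GPE"), ("Buhari", "PERSON")],
    "model_c": [],
    "model_d": [("Narnia", "LOC")],  # detected, but not geocodable
}
print(is_geolocated(example, toy_geocoder))
```
]]></preformat>
        <p>Requiring agreement between models trades recall for precision: a place name detected by only one model does not make the text count as geolocated.</p>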
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Analytical approach</title>
        <p>Based on our filtered and classified data, we performed two experiments on our geotagged and
non-geotagged sets of tweets. First, for each dataset and topic, we quantified the fraction of tweets
containing at least one geoparsed entity (FracGeo).</p>
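        <p>As a minimal sketch of this first measure, the following Python snippet computes a per-topic fraction of tweets with at least one geoparsed entity. The function name frac_geo_by_topic, the record format, and the toy records are assumptions made for illustration; they are not taken from the study's data or code.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def frac_geo_by_topic(tweets: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map each topic to the fraction of its tweets with at least one geoparsed entity."""
    totals: Dict[str, int] = defaultdict(int)
    with_entity: Dict[str, int] = defaultdict(int)
    for topic, has_geoparsed_entity in tweets:
        totals[topic] += 1
        if has_geoparsed_entity:
            with_entity[topic] += 1
    # Every topic in `totals` has at least one tweet, so the division is safe.
    return {topic: with_entity[topic] / totals[topic] for topic in totals}


# Invented toy records: (topic, tweet has at least one geoparsed entity).
toy_tweets = [
    ("Politics", True),
    ("Politics", False),
    ("Politics", True),
    ("Sports", False),
    ("Sports", False),
    ("Sports", True),
]

for topic, frac in frac_geo_by_topic(toy_tweets).items():
    print(f"{topic}: {frac:.2f}")
```
]]></preformat>
        <p>Applying the same function separately to the geotagged and the non-geotagged set gives the two per-topic profiles that are compared.</p>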
        <p>Second, we modeled the probability of geoinformation as a function of the topic and geolocation
type, controlling for effects of user, time, and text length in a mixed modeling approach with the
presence of geolocation (geoparsed or geotagged) as the binary response variable and the Other
category as the reference class. This yielded coefficients in the form of log-odds ratios for each topic,
which can be interpreted as indicators of the topics’ geospatiality.</p>
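        <p>One way to write down such a model is sketched below. It is an illustration under assumptions that the text does not state: a logit link, random intercepts for user and week, a linear fixed effect for text length, and one model fitted per geolocation approach. All symbols are introduced here for illustration only.
          <disp-formula>
            <tex-math><![CDATA[
\operatorname{logit} \Pr(y_{ijt} = 1) = \beta_0 + \sum_{k=1}^{17} \beta_k \, \mathbb{1}[\mathrm{topic}_{ijt} = k] + \gamma \, \mathrm{len}_{ijt} + u_j + v_t, \qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad v_t \sim \mathcal{N}(0, \sigma_v^2)
]]></tex-math>
          </disp-formula>
        In this notation, y indicates whether tweet i by user j in week t carries a geolocation under the given approach, the sum runs over the 17 supertopics with Other as the omitted reference class, len is the text length, and u and v are the user and week intercepts. Each topic coefficient is then a log-odds ratio relative to Other, and its exponential is the corresponding odds ratio.</p>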
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <sec id="sec-3-1">
        <title>3.1. Frequencies of geoparsed entities in geotagged and non-geotagged tweets</title>
        <p>The fraction of tweets that contained geoparsed locations was slightly higher in the geotagged tweets
(12.5%; 39,346 tweets) than in the non-geotagged tweets (11.2%; 7,891 tweets). In the geotagged tweets, FracGeo
varied strongly depending on the topic, from 33% for International Politics to 3% for Private Life,
Family &amp; Relationships (Figure 2). Overall, FracGeo was similar between the two datasets for most
topics, with two notable exceptions: geotagged Advert tweets contained far more textual spatial
references than non-geotagged Advert tweets, whereas the inverse was found for Travel, Tourism &amp;
Migration, where geotagged tweets contained relatively fewer spatial references.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Differences between topical geospatiality in geoparsing and geotagging approaches to geolocation</title>
        <p>We modeled the geospatiality of topics for both a geoparsing and a geotagging approach to
geolocation. As a first observation, the geoparsing approach yielded much higher coefficients for
most topics than the geotagging approach (Figure 3). Although the effect in the geotagging approach
was still significant (p &lt; 0.05) for 11 of the 17 topics, the modeled log-odds were generally lower than
in the geoparsing approach, with the notable exception of Travel, Tourism &amp; Migration, which
showed strong geospatiality in both approaches. The two approaches differed not only
in magnitude but also in sign: International Politics showed a strong positive effect on
the likelihood that a text contains geocodable entities, but a slight negative effect on geotagging frequency.
The correlation between the two rankings was moderate and non-significant (Spearman's ρ = 0.45,
p = 0.073).</p>
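        <p>For readers who want to reproduce this kind of ranking comparison, the following pure-Python sketch computes Spearman's rank correlation as the Pearson correlation of average ranks. The function names and the toy coefficient lists are hypothetical; the numbers are invented and are not the coefficients estimated in this study.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Invented per-topic coefficients for two approaches (not the study's values).
toy_geoparsing = [2.1, 0.4, 1.3, -0.2, 0.9]
toy_geotagging = [0.3, 0.1, 0.5, 0.2, -0.1]
print(spearman_rho(toy_geoparsing, toy_geotagging))
```
]]></preformat>
        <p>Because only ranks enter the statistic, it compares the ordering of topics under the two approaches and ignores the difference in coefficient magnitude.</p>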
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion and outlook</title>
      <p>Our results indicate that geocodable entities appeared with similar frequency in geotagged and
non-geotagged tweets. However, our findings suggest that this cannot be presumed to be true for every
topic. For example, in the topic Travel, Tourism &amp; Migration, texts contained place names more
rarely in geotagged than in non-geotagged tweets. One possible explanation is that this topic covers
a complex and diverse semantic field, and that geotagged tweets cover a different subset of this field
(e.g., posts about visiting prestigious locations without mentioning them) than non-geotagged tweets
(e.g., travel plans).</p>
      <p>The results of our mixed modeling analysis suggest that the geospatiality of topics differs between
geotagging and geoparsing approaches. The effect of topics is much stronger in the geoparsing
approach, which is plausible because in this approach location and topic are both derived from the same
(usually short) texts. By comparison, the lower modeled coefficients in the geotagging approach
seem to indicate some detachment between topic and identified location, although topical effects
remain and, for some topics, even differ in direction from the geoparsing approach. Taken
together, our findings suggest that different approaches to geographic information extraction
lead to different representations of topics within the extracted information.</p>
      <p>
        Consequently, researchers using topics as a means to structure and analyze data should consider
the impact of their geolocation method. Future assessments should expand to include other text data
types, such as news media articles and web forums. Such work would contribute to emerging initiatives aiming
to effectively integrate diverse text sources for applications where identifying geolocation is key [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
and advance openness, transparency, and reproducibility in the associated scientific disciplines.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This study was conducted as part of the project MIGRAWARE (Grant No. 01LG2082C), funded by
the German Federal Ministry of Education and Research program WASCAL WRAP 2.0 and partially
funded by the projects OpenSearch@DLR phase II (internal DLR project) and “A New Focus in
English Linguistics: Geolingual Studies”, funded by the Volkswagen Foundation (Grant No. 98 662).</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Karami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lundy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Webb</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y. K.</given-names>
            <surname>Dwivedi</surname>
          </string-name>
          , '
          <article-title>Twitter and research: A systematic literature review through text mining</article-title>
          ',
          <source>IEEE Access</source>
          , vol.
          <volume>8</volume>
          , pp.
          <fpage>67698</fpage>
          -
          <lpage>67717</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Lemoine-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Biewer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Taubenböck</surname>
          </string-name>
          , '
          <article-title>Can Social Media Data Help to Understand the Socio-spatial Heterogeneity of the Interests and Concerns of Urban Citizens? A Twitter Data Assessment for Mexico City'</article-title>
          , in Recent Developments in Geospatial Information Sciences,
          <string-name>
            <given-names>H.</given-names>
            <surname>Carlos-Martinez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tapia-McClung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Moctezuma-Ochoa</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Alegre-Mondragón</surname>
          </string-name>
          , Eds., Cham: Springer Nature Switzerland,
          <year>2024</year>
          , pp.
          <fpage>119</fpage>
          -
          <lpage>133</lpage>
          . doi: 10.1007/978-3-031-61440-8_10.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Lemoine-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mühlbauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mandery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Biewer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Taubenböck</surname>
          </string-name>
          , '
          <article-title>The voices of the displaced: Mobility and Twitter conversations of migrants of Ukraine in 2022'</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          , vol.
          <volume>61</volume>
          , p.
          <fpage>103670</fpage>
          ,
          Jan.
          <year>2024</year>
          , doi: 10.1016/j.ipm.2024.103670.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Mast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sapena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mühlbauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Biewer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Taubenböck</surname>
          </string-name>
          , '
          <article-title>The migrant perspective: Measuring migrants' movements and interests using geolocated tweets'</article-title>
          ,
          <source>Population Space and Place</source>
          , vol.
          <volume>30</volume>
          , no.
          <issue>2</issue>
          , p.
          <fpage>e2732</fpage>
          ,
          Mar.
          <year>2024</year>
          , doi: 10.1002/psp.2732.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H.</given-names>
            <surname>Senaratne</surname>
          </string-name>
          et al., '
          <article-title>The Unseen-an investigative analysis of thematic and spatial coverage of news on the ongoing refugee crisis in West Africa'</article-title>
          ,
          <source>ISPRS International Journal of Geo-Information</source>
          , vol.
          <volume>12</volume>
          , no.
          <issue>4</issue>
          , p.
          <fpage>175</fpage>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>X.</given-names>
            <surname>Hu</surname>
          </string-name>
          et al.,
          <article-title>'Location Reference Recognition from Texts: A Survey and Comparison'</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          , vol.
          <volume>56</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>37</lpage>
          ,
          <year>2023</year>
          , doi: 10.1145/3625819.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.-Q.</given-names>
            <surname>Wang</surname>
          </string-name>
          , '
          <article-title>Understanding the removal of precise geotagging in tweets'</article-title>
          ,
          <source>Nat Hum Behav</source>
          , vol.
          <volume>4</volume>
          , no.
          <issue>12</issue>
          , pp.
          <fpage>1219</fpage>
          -
          <lpage>1221</lpage>
          , Sep.
          <year>2020</year>
          , doi: 10.1038/s41562-020-00949-x.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <collab>Open Search Foundation e.V.</collab>
          , 'Home', Open Search Foundation. Accessed: Jan. 25,
          <year>2025</year>
          . [Online]. Available: https://opensearchfoundation.org/en/
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Granitzer</surname>
          </string-name>
          et al.,
          <article-title>'Impact and development of an Open Web Index for open web search'</article-title>
          ,
          <source>Journal of the Association for Information Science and Technology</source>
          , vol.
          <volume>75</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>512</fpage>
          -
          <lpage>520</lpage>
          ,
          <year>2024</year>
          , doi: 10.1002/asi.24818.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>O. W. I. Open Search Foundation e.V.</surname>
          </string-name>
          , 'The Open Web Index Dashboard'.
          Accessed: Jan. 28,
          <year>2025</year>
          . [Online]. Available: https://openwebindex.eu/%PUBLIC_URL%
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11] Open Search Foundation e.V. and
          <string-name>
            <given-names>C.</given-names>
            <surname>Plote</surname>
          </string-name>
          , 'Welcome',
          <source>Open Web Search - Promoting Europe's Independence in Web Search</source>
          . Accessed: Jan. 28,
          <year>2025</year>
          . [Online]. Available: https://openwebsearch.eu/welcome/
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Mast</surname>
          </string-name>
          et al.,
          <article-title>'Geospatiality: The effect of topics on the presence of geolocation in English text data'</article-title>
          ,
          <source>International Journal of Geographical Information Science</source>
          ,
          <year>2025</year>
          , doi: 10.1080/13658816.2025.2460051.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>X. X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          et al.,
          <article-title>'Geoinformation Harvesting From Social Media Data: A community remote sensing approach'</article-title>
          ,
          <source>IEEE Geoscience and Remote Sensing Magazine</source>
          , vol.
          <volume>10</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>150</fpage>
          -
          <lpage>180</lpage>
          , Dec.
          <year>2022</year>
          , doi: 10.1109/MGRS.2022.3219584.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14] 'GeoNames'. Accessed: Nov. 09,
          <year>2022</year>
          . [Online]. Available: https://www.geonames.org/
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Honnibal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Montani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Van Landeghem</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Boyd</surname>
          </string-name>
          ,
          <article-title>'spaCy: Industrial-strength Natural Language Processing in Python'</article-title>
          ,
          <year>2020</year>
          , doi: 10.5281/zenodo.1212303.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D. I.</given-names>
            <surname>Adelani</surname>
          </string-name>
          et al.,
          <article-title>'MasakhaNER: Named Entity Recognition for African Languages'</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          , vol.
          <volume>9</volume>
          , pp.
          <fpage>1116</fpage>
          -
          <lpage>1131</lpage>
          ,
          <year>2021</year>
          , doi: 10.1162/tacl_a_00416.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>