<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Past is a Foreign Place: Improving Toponym Linking for Historical Newspapers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mariona Coll Ardanuy</string-name>
          <email>mcoll@prhlt.upv.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Federico Nanni</string-name>
          <email>fnanni@turing.ac.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kaspar Beelen</string-name>
          <email>kaspar.beelen@sas.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luke Hare</string-name>
          <email>lhare@turing.ac.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Digital Humanities Research Hub, School of Advanced Study</institution>
          ,
          <addr-line>Senate House, London</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>PRHLT Research Center, Universitat Politècnica de València</institution>
          ,
          <addr-line>València</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>The Alan Turing Institute</institution>
          ,
          <addr-line>British Library, London</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Work conducted while at The Alan Turing Institute</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <fpage>368</fpage>
      <lpage>390</lpage>
      <abstract>
        <p>In this paper, we examine the application of toponym linking to digitised historical newspapers. These collections constitute the largest trove of historical text data available to researchers in the humanities. They contain varied, fine-grained information about the past, anchored in a specific place and time. Place names (or toponyms) are common entry points for starting to explore these collections. In this paper, we introduce a new tool for toponym linking and resolution, T-Res, a modular, flexible, and open-source pipeline, which is built on top of robust state-of-the-art approaches. We present a comprehensive step-by-step examination of this task in English, and conclude with a case study in which we show how toponym linking enables historical research in the digitised press.</p>
      </abstract>
      <kwd-group>
        <kwd>toponym resolution</kwd>
        <kwd>entity linking</kwd>
        <kwd>historical newspapers</kwd>
        <kwd>nineteenth-century</kwd>
        <kwd>toponym linking</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        newspaper texts. In this paper, we focus on the specific task of identifying and linking
geographical named entities (i.e. toponyms) in historical newspapers in English. This task presents
certain additional challenges compared to standard EL, as illustrated by the following news fragments:
• CxitiTCHUßCit, June 10—Yesterday being the day appointed for the election of taro gentlemen to tepcoeot this borough in the new imperial Parliament [...].1
• Leghorn, April 6. ETTERS from Condantinopte, dated March 3, mention, tliat an Earthquake had lately hapL 3 pened at Tauris, the Capita! of the Province of *IS )ra Ariherbigan, in Pcrfia. 2
Most visible are the errors introduced during digitisation and optical character recognition
(OCR). Such errors can occur both in the named entity itself (sometimes even rendering it
incomprehensible to the human reader) and in the context of the entity. Secondly, historical
newspapers portray a world that has changed, while at the same time often being very regional in
their focus [
        <xref ref-type="bibr" rid="ref23">20</xref>
        ]: in the first example, ‘CxitiTCHUßCit’ (i.e. Christchurch) refers to the town in
Dorset, which would have been the first referent for readers of The Dorset County
Chronicle, rather than the (today) better-known city in New Zealand. Despite their strong regional
focus, most publications also covered international news, reflecting a state (and vision) of the
world that has changed: notice the use of the toponyms ‘Leghorn’ for Livorno, ‘Condantinopte’
(i.e. Constantinople) for Istanbul, and ‘Tauris’ for Tabriz, capital of ‘*IS )ra Ariherbigan’ (i.e. East
Azerbaijan) in ‘Pcrfia’ (i.e. Persia, modern Iran).
      </p>
      <p>
        In this paper, we perform a comprehensive step-by-step examination of toponym linking
in the historical newspapers domain in English. As a result of this analysis, we
present T-Res,3 a new tool for toponym linking and resolution in historical newspapers in English, built
on top of existing robust technologies, such as transformers [
        <xref ref-type="bibr" rid="ref50">49</xref>
        ] for fine-tuning a BERT
language model for named entity recognition [1
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]; DeezyMatch [
        <xref ref-type="bibr" rid="ref26">23</xref>
        ] for candidate selection;
and the work of Le and Titov [
        <xref ref-type="bibr" rid="ref29">27</xref>
        ] and Ganea and Hofmann [17] for entity disambiguation,
via the Radboud Entity Linker (REL) implementation [2
        <xref ref-type="bibr" rid="ref5">4</xref>
        ]. T-Res has been developed to assist
researchers in exploring large collections of digitised historical newspapers, and has been designed
to tackle common problems in working with these data. It is implemented as a modular pipeline,
and is both user-friendly and flexible: users can either provide their own resources
and datasets and train their own models, or load existing models. We conclude our
paper with a preliminary but realistic case study in which we showcase how T-Res can be used
to support historical research.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Entity linking (EL) is often treated as a three-step process: (1) named entity recognition (NER)
is the task of detecting mentions, (2) candidate selection (CS) is the task of selecting a subset
of potential referents from a knowledge base (KB) for the detected mentions, and (3) entity
disambiguation (ED) finds the best match, if any, from the pool of selected candidates. EL
benchmarks in English consist mainly of texts from the general domain, which mostly feature
prominent entities [48].
1The Dorset County Chronicle, 1864-04-14.
2The Manchester Mercury, 1780-05-30.
3https://github.com/Living-with-machines/T-Res.
Therefore, tools that perform well on such datasets are often found to
deteriorate in other domains, such as on historical documents [3
        <xref ref-type="bibr" rid="ref34 ref38 ref9">8, 42, 36, 32</xref>
        ]. The HIPE-2020
shared task4 [14] was created to address some of the EL challenges that are specific to digitised
historical documents.
      </p>
      <p>
        Historical digitised data has certain traits that are typically absent from standard EL
benchmarks [13]. The presence of OCR errors is a persistent problem. In their assessment of the
impact of OCR in downstream tasks, Strien, Beelen, Coll Ardanuy, Hosseini, McGillivray, and
Colavizza [46] and Hamdi, Jean-Caurant, Sidère, Coustaty, and Doucet 1[
        <xref ref-type="bibr" rid="ref13">9</xref>
        ] observe how NER
performance decreases as text quality declines. The results of the HIPE-2020 shared task [1
        <xref ref-type="bibr" rid="ref5">4</xref>
        ]
(and its continuation HIPE-2022 [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]) point to the importance of having in-domain training
data for NER, suggesting that fine-tuning on noisy data results in better performance on
similarly noisy data. Similarly, Manjavacas and Fonteyn [
        <xref ref-type="bibr" rid="ref33">31</xref>
        ] show how NER models perform better
when they have been fine-tuned on top of base models which were originally pre-trained on
in-domain data, in this case, historical digitised texts. González-Gallardo, Boros, Girdhar, Hamdi,
Moreno, and Doucet [
        <xref ref-type="bibr" rid="ref21">18</xref>
        ] evaluated the performance of OpenAI’s ChatGPT on the task of
detecting (in a zero-shot manner) named entities in historical documents, revealing that
ChatGPT similarly struggles with identifying entities in OCR’d text.5
      </p>
      <p>
        Candidate selection is the least studied of the three sub-tasks. The identification of potential
candidates from the KB (usually based on collaboratively-built resources such as Wikipedia,
Wikidata, or Freebase) has traditionally been approached by performing exact or partial string
matching between a mention and the entries in the KB [
        <xref ref-type="bibr" rid="ref35 ref47">33, 45</xref>
        ]. Since most popular EL
benchmarks consist of very clean text, this step does not often pose an obstacle to good
EL performance. In other words, plain string matching goes a long way. However, when
working with noisy text, basic string matching is far from sufficient [
        <xref ref-type="bibr" rid="ref52">51</xref>
        ]. In the domain of digitised
historical newspapers, Linhares Pontes, Cabrera-Diego, Moreno, Boros, Hamdi, Doucet, Sidere,
and Coustaty [
        <xref ref-type="bibr" rid="ref31">29</xref>
        ] propose a series of pre-processing heuristics used in combination with a
post-correction step, based on mappings of common OCR errors observed in the data. Traditional
fuzzy string matching techniques based on edit distance (such as Levenshtein) can deal quite
accurately with OCR’d text, but they are not a viable solution for real-time EL, since they are
computationally inefficient [
        <xref ref-type="bibr" rid="ref20">10</xref>
        ]. DeezyMatch [
        <xref ref-type="bibr" rid="ref26">23</xref>
        ], a software library for neural fuzzy string
matching, was developed as a response to this problem, building on Santos, Murrieta-Flores,
Calado, and Martins [
        <xref ref-type="bibr" rid="ref45">43</xref>
        ].
      </p>
      <p>
        The last step of the pipeline is a disambiguation task, consisting of selecting the most
appropriate entity from the pool of previously selected potential candidates. The entity
disambiguation literature often distinguishes between local models, which rely only on the mention’s
context and the entities’ priors, often based on hyperlink counts from large resources such as
Wikipedia [
        <xref ref-type="bibr" rid="ref36 ref37 ref9">8, 35, 34</xref>
        ], and global models, which take interdependencies between entities into
account [
        <xref ref-type="bibr" rid="ref28 ref29 ref43">41, 25, 27</xref>
        ], with the more recent approaches learning deep representations for
relations between entities and mentions. In the domain of historical newspapers, Boros, Pontes,
Cabrera-Diego, Hamdi, Moreno, Sidère, and Doucet [6] and Linhares Pontes, Hamdi, Sidere,
and Doucet [
        <xref ref-type="bibr" rid="ref32">30</xref>
        ] build on these approaches, and emphasise the importance of good knowledge
representation.
4https://impresso.github.io/CLEF-HIPE-2020/.
5Our own experiments with ChatGPT were not more successful.
      </p>
      <p>
        EL pipelines encapsulate all steps in one toolkit. DBpedia Spotlight [33] and TagMe! [
        <xref ref-type="bibr" rid="ref18 ref42">16, 40</xref>
        ]
are two of the first and most widely used out-of-the-box linkers. More recently, REL [
        <xref ref-type="bibr" rid="ref27">24</xref>
        ] was
developed to overcome some of the shortcomings of previous systems, building on
state-of-the-art approaches. REL uses Flair [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for recognition. For candidate selection, just like most other
state-of-the-art approaches, REL employs a series of string-based heuristics to find potential
candidates, which are ranked according to a combination of entity priors and a measure of
similarity between the entity and the context of the mention, as in Ganea and Hofmann [1
        <xref ref-type="bibr" rid="ref8">7</xref>
        ], using
Wikipedia2Vec’s [
        <xref ref-type="bibr" rid="ref51">50</xref>
        ] word and entity embeddings. The local coherence between mention and
entity is computed as defined in Ganea and Hofmann [17], and REL uses the global disambiguation
strategy proposed in Le and Titov [2
        <xref ref-type="bibr" rid="ref8">7</xref>
        ]. REL is fast, user-friendly, easily customisable, and very
well documented, including tutorials, examples, and a running API.6
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Terminology and Task Definition</title>
      <p>In this paper, we use the terms toponym linking and (geographic) entity linking interchangeably
to refer to the end-to-end task of detecting mentions of places in texts and linking them to
their referent in a knowledge base.7 Formally defined, given a document d, the goal is to
detect mentions of places m1, ..., mn and resolve them to their corresponding entities e1, ..., en in
a knowledge base (KB). This is achieved in three steps. The first step, called toponym recognition
or (geographic) named entity recognition, consists of detecting mentions of places m1, ..., mn in
a document d. The second step is candidate selection, which, for a given mention mi, aims
at selecting a subset of potential candidate entities Ci = (c1, ..., ck) from the KB. The last step is entity
disambiguation, which, given the set of candidates Ci for mention mi, consists of selecting the
candidate that is the correct entity ei for mention mi, or returning NIL if there is none. Finally, we
define toponym resolution as the task of retrieving the coordinates of the predicted entities.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-1">
        <title>4.1. Knowledge Base</title>
        <p>
          As is commonly done in entity linking, we have used Wikimedia resources (in this case,
Wikipedia in combination with Wikidata) as the starting point for our KB. For each Wikipedia
page (hereafter ‘entity’), we extracted all ways of referring to it over the entire Wikipedia
collection by means of the anchor texts of the hyperlinks pointing to the page (hereafter ‘mentions’).
We then mapped Wikipedia entities to Wikidata, and kept only the subset of entities that are
6See: https://github.com/informagi/REL. While more recent approaches have now surpassed it on the leaderboard,
it is still positioned near the top, according to https://paperswithcode.com/task/entity-linking.
7Detecting and resolving mentions of places to their real-world referents is a research problem shared by two
different tasks: (1) Entity Linking, the task of linking named entities to their corresponding entries in a knowledge base,
and (2) Toponym Resolution, also called geoparsing, the task of resolving place names to their spatial footprint,
often their geographic coordinates [
          <xref ref-type="bibr" rid="ref30">28</xref>
          ]. Because of these slightly different objectives, both tasks are rarely
evaluated jointly. We treat this problem as an EL task, in part because linking to Wikidata (instead of directly providing
coordinates) gives the user access to other linked information.
geolocated on Wikidata.8 The resulting subset consists of 929,855 geolocated entities. In
addition, for each entity, we keep the absolute and normalised mention-to-entity frequencies of
all its mentions. Mention-to-entity frequencies are normalised per entity: for example, the
settlement named ‘London’ in Kiribati has an absolute mention-to-entity count of 13 and a
normalised frequency of 0.81 (the mention ‘London’ refers to the location in Kiribati 13 times,
but the probability of the London in Kiribati being referred to as ‘London’ is 0.81).
        </p>
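        <p>The per-entity normalisation described above can be sketched as follows. The count of 13 for the mention ‘London’ is taken from the example in the text; the second mention and its count are invented so that the arithmetic lands near the reported 0.81.</p>
        <preformat>
```python
# Sketch of per-entity normalisation of mention-to-entity counts:
# P(mention | entity) = count(mention, entity) / total counts for that entity.

def normalise(counts):
    """counts: {entity: {mention: absolute count}} -> normalised frequencies."""
    freqs = {}
    for entity, mentions in counts.items():
        total = sum(mentions.values())
        freqs[entity] = {m: c / total for m, c in mentions.items()}
    return freqs

# The 13 comes from the paper's example; the second mention and its count
# of 3 are invented, giving 13 / 16 = 0.8125, close to the reported 0.81.
counts = {"London (Kiribati)": {"London": 13, "Londres": 3}}
freqs = normalise(counts)
```
        </preformat>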
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Datasets</title>
        <p>
          We performed experiments on two different digitised historical newspaper datasets in English:
• TopRes19th (henceforth lwm): This dataset was created by the Living with Machines
project [
          <xref ref-type="bibr" rid="ref13">9</xref>
          ].9 In its latest version (v2), this dataset consists of 455 news articles in which
places were manually annotated and linked to Wikipedia (which we have mapped to
Wikidata). The news articles in this dataset were selected from local or regional
newspapers based in different locations in England (Manchester, Ashton-under-Lyne, Poole and
Dorchester), published between 1780 and 1870. In the dataset, toponyms are classified as
‘BUILDING’, ‘STREET’, ‘LOC’, ‘ALIEN’, ‘FICTION’, and ‘OTHER’, but the last three were
found to occur between zero and five times in the whole dataset, rendering them
negligible for training purposes. The dataset is split into training and test sets (343 and 112
articles respectively). We used 20% of the training set for development.
• Hipe2020 (henceforth hipe): This dataset was created by the Impresso project with data
from the Chronicling America project, and was released as part of the HIPE-2020
evaluation campaign on named entity processing on historical newspapers [1
          <xref ref-type="bibr" rid="ref5">4</xref>
          ].10 It consists
of news articles in English, French, and German. The English collection, which is the
one we use, consists of 125 articles from 14 different newspapers (based in 14 different
locations in the United States) published between 1790 and 1960. The named entities
are manually identified and linked, whenever possible, to their corresponding Wikidata
entity. While the dataset contains other entity types (such as ‘person’ or ‘organisation’), in
our experiments we consider only entities of the type ‘location’. This dataset does not
have a training set; it is instead split into a development and a test set (80 and 46 articles
respectively).
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Approaches</title>
        <sec id="sec-4-3-1">
          <title>4.3.1. Named entity recognition</title>
          <p>
            We fine-tuned a BERT model for token classification, using the lwm training set. We used
a historical BERT model as base, bert_1760_1900 [
            <xref ref-type="bibr" rid="ref25">22</xref>
            ], trained on books in English published
between 1760 and 1900.11 To fine-tune for toponym recognition, we used a learning rate of
0.00005, a batch size of 8, 10 epochs, and a weight decay of 0.001.12 We perform a series of
post-processing steps to fix obvious mistagging errors: we corrected I- labels at the beginning of a
new entity, removed inner I- tags due to nested entities, and fixed prefix assignment errors in
hyphenated entities.
8We used the 2021-10-20 English Wikipedia version and the 2022-07-28 Wikidata version. We relied on the
WikiExtractor (https://github.com/attardi/wikiextractor) tool to extract the content of each page from the
Wikipedia XML dump, used WikiMapper (https://github.com/jcklie/wikimapper) to map Wikipedia titles to
Wikidata QIDs, and used the Wikidata property P625 to filter out non-geographic entities.
9The lwm dataset is available at https://doi.org/10.23636/r7d4-kw08 (CC BY-NC-SA 4.0). Newspaper data was
provided by Findmypast Limited from the British Newspaper Archive, a partnership between the British Library
and Findmypast: https://www.britishnewspaperarchive.co.uk/.
10The hipe dataset is available at https://zenodo.org/record/6046853 (CC BY-NC-SA 4.0).
          </p>
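          <p>The first of these post-processing fixes, relabelling an I- tag that opens a new entity, can be sketched as follows. This is a minimal illustration of standard BIO repair; the exact rules applied in T-Res may differ.</p>
          <preformat>
```python
# Minimal sketch of one post-processing fix: an I- tag that opens a new
# entity (no preceding B-/I- of the same type) is relabelled as B-.

def fix_leading_inside_tags(tags):
    fixed = []
    prev = "O"
    for tag in tags:
        if tag.startswith("I-"):
            ttype = tag[2:]
            # the previous token must continue an entity of the same type
            if prev not in ("B-" + ttype, "I-" + ttype):
                tag = "B-" + ttype
        fixed.append(tag)
        prev = tag
    return fixed

tags = ["O", "I-LOC", "I-LOC", "O", "I-BUILDING"]
print(fix_leading_inside_tags(tags))
# ['O', 'B-LOC', 'I-LOC', 'O', 'B-BUILDING']
```
          </preformat>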
        </sec>
        <sec id="sec-4-3-2">
          <title>4.3.2. Candidate selection</title>
          <p>
            Our tool provides two main strategies for candidate selection:13 one is based on exact
matching, where candidates are retrieved from the KB if they are identical to the query; the
other is based on fuzzy string matching, using a deep learning approach, DeezyMatch [
            <xref ref-type="bibr" rid="ref20 ref26">23, 10</xref>
            ] in a new fashion, which we expand on in the following
paragraphs.
          </p>
          <p>DeezyMatch for candidate selection DeezyMatch learns string transformations from a
large set of positive and negative example pairs (e.g. both ‘Zuiich’ and ‘7urich’ are positive
examples of OCR variations of ‘Zurich’, whereas ‘Munich’ is not). A model trained from these
examples is then used to embed both (1) the query and (2) all name variations in the KB into
vector representations. Candidate string variations are retrieved from the KB and ranked
according to the similarity between their embedding representations and the query embedding.</p>
          <p>
            We propose a new approach for generating positive and negative example pairs when large
volumes of noisy text are available. We observed an interesting difference between static word
embeddings learnt from clean text and those learnt from OCR’d text. In the first case, as explained
by the distributional hypothesis, the top nearest neighbours of a query tend to be words that
are semantically similar. However, when word embeddings are trained on OCR’d text, many of
the top nearest neighbours are OCR variations of the query. We used this observation to build
a dataset of positive and negative matches from word2vec embeddings learnt from digitised
English newspapers, where:
• If the string similarity between the nearest neighbour and the target word is high (such
as ‘maciiine’ and ‘machine’) and the nearest neighbour is not an existing word in English,
we consider it an OCR string variation of the target word (i.e. a positive example);
• If the string similarity between the nearest neighbour and the target word is low (such
as ‘maciiine’ and ‘device’), we consider that it is not a string variation, as it is probably a
synonym or near-synonym (i.e. a negative example).
11The model is available at https://huggingface.co/Livingwithmachines/bert_1760_1900.
12We selected these values based on previous research that performed a hyperparameter search for the same task
and a different base model [
            <xref ref-type="bibr" rid="ref46">44</xref>
            ]. See more information at https://github.com/dbmdz/clef-hipe/blob/main/experiments/clef-hipe-2022/. The resulting toponym recognition model is available at https://huggingface.co/Livingwithmachines/toponym-19thC-en.
13We also provide functions for performing partial matching based on string overlap and fuzzy string matching
based on the Damerau-Levenshtein edit distance. However, both methods are highly time-consuming, and
therefore unusable for real-time scenarios.
We used openly-available word embeddings trained on digitised newspaper text from four
different decades (1800s, 1830s, 1860s, and 1890s).14 Table 1 shows examples of positive and
negative string matches generated with this approach. We expanded the resulting string-pair
dataset by appending similar variations of place names obtained from our KB mention-to-entity
mapping.15 The resulting dataset consists of 1,085,514 string pairs.
Candidate ranking and selection Given a query, the candidate selection step retrieves one
or more potential name variations from the KB. In the exact match approach, only one name
variation is retrieved (i.e. the identical match), with a similarity score of 1.0. In the DeezyMatch
approach, the user can choose the number of name variations to retrieve and set the maximum
accepted distance between the embeddings of the query and the KB mentions.16 The similarity
score for each of the retrieved name variations is obtained by reverse-normalising the
distance score against the threshold. Each name variation is then expanded to multiple Wikidata
entities (i.e. candidates), using the mention-to-entity mapping from our KB.17
          </p>
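          <p>The pair-labelling heuristic described above can be sketched as follows. Here difflib’s similarity ratio stands in for whichever string-similarity measure is actually used, and the 0.7 threshold and the three-word vocabulary are our own illustrative choices.</p>
          <preformat>
```python
from difflib import SequenceMatcher

# Sketch of the heuristic for labelling embedding nearest neighbours:
# string-similar non-words become positive (OCR-variation) examples,
# string-dissimilar neighbours become negative (synonym-like) examples.

english_vocab = {"machine", "device", "engine"}  # stands in for the GloVe vocabulary

def label_pair(target, neighbour, threshold=0.7):
    sim = SequenceMatcher(None, target, neighbour).ratio()
    if sim >= threshold and neighbour not in english_vocab:
        return "positive"  # e.g. 'maciiine' as an OCR variation of 'machine'
    if threshold > sim:
        return "negative"  # e.g. 'device', a near-synonym of 'machine'
    return None  # a string-similar existing word: ambiguous, skip
```
          </preformat>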
        </sec>
        <sec id="sec-4-3-3">
          <title>4.3.3. Entity disambiguation</title>
          <p>
            The last step consists of finding the most likely entity from the pool of selected candidates for
a given query. We provide two dummy baselines: the first baseline (mostpopular) selects the
candidate which is most likely to be referred to by a given query, using the mention-to-entity
absolute counts described in section 4.1. The second baseline (bydistance) is based on distance
from the place of publication: the candidate closest to the place of publication
is naively selected as the correct entity. Finally, our tool adopts REL’s entity disambiguation
14The word embeddings are available at https://doi.org/10.5281/zenodo.7181682 (CC BY 4.0) [39]. For example,
the nearest neighbours of ‘machine’ in word embeddings trained from digitised newspaper articles published in
the 1860s are: ‘machines’, ‘maehine’, ‘maciiine’, ‘machina’, ‘maohine’, ‘achine’, ‘miachine’, and ‘maohine’. We used
the vocabulary of the 50d GloVe embeddings to discern whether a word exists in English. Further details can be
found in our GitHub repository.
15For example, the Wikidata entry Q7268098 is referred to as both ‘Qoorlugud’ and ‘Qorilugud’: they would be added
as positive variations of each other.
16We selected one name variation per query, with an L2-norm similarity threshold set at 50.
17For example, given the query ‘Wiltshire’, the exact match approach would retrieve the mention ‘Wiltshire’ from
the KB with a similarity score of 1.0, which would be expanded to the following Wikidata entities: Q23183,
Q55448990, and Q8023421, since all of them are referred to as ‘Wiltshire’ at least once in Wikipedia anchor
texts.
implementation18 (rel), which is based on Ganea and Hofmann [17] and Le and Titov [
            <xref ref-type="bibr" rid="ref29">27</xref>
            ], and
uses a neural approach to combining local mention-to-entity compatibility and global
entity-to-entity coherence. We provide our own set of candidates (selected with either the exact or the deezy
matching approach), which we pre-rank by averaging the string matching score, the relative
mention-to-entity score, and the normalised absolute mention-to-entity score. We additionally
provide the following two alterations to the REL disambiguation approach:
• Providing information about the place of publication (+publ): Since we are aware
of the strong local emphasis of the historical press, we experiment with artificially
providing information on the place of publication (which is often available from the
newspaper’s metadata) to the disambiguation module: we do so by adding one additional
already-disambiguated entity per sentence, both in training and in testing,
corresponding to the place of publication, and by adding the publication place name as part of the
context of the sentence.
• Unlinking micro-locations (:nil): Streets and buildings have rather different
characteristics from other locations typically found in news articles: they are often highly
ambiguous and often entirely dependent on the cues provided by context. At the same
time, they have very limited coverage in Wikipedia (where only the most noteworthy
streets or buildings are usually included). In this variation of the original method, only
the LOC entities are disambiguated, whereas mentions classified as BUILDING or STREET
are linked to NIL.
          </p>
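          <p>The two baselines can be sketched as follows. The candidate fields are hypothetical; the count of 13 for the London in Kiribati comes from section 4.1, while the other count, the placeholder identifier, and the approximate coordinates are illustrative.</p>
          <preformat>
```python
from math import asin, cos, radians, sin, sqrt

# Sketch of the two dummy disambiguation baselines described above.

def most_popular(candidates):
    """mostpopular: pick the candidate with the highest absolute mention count."""
    return max(candidates, key=lambda c: c["mention_count"])

def haversine_km(p, q):
    """Great-circle distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def by_distance(candidates, place_of_publication):
    """bydistance: naively pick the candidate closest to the place of publication."""
    return min(candidates, key=lambda c: haversine_km(c["coords"], place_of_publication))

candidates = [
    {"qid": "Q84", "mention_count": 50000, "coords": (51.51, -0.13)},  # London, UK
    {"qid": "london-kiribati", "mention_count": 13, "coords": (1.98, -157.48)},  # placeholder id
]
dorchester = (50.71, -2.44)  # approximate place of publication
```
          </preformat>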
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Evaluation and Discussion</title>
      <p>In this section, we report and discuss the results assessed using the HIPE-scorer.19</p>
      <sec id="sec-5-1">
        <title>5.1. Toponym Recognition</title>
        <p>
          18We used the REL version at commit 9ca253b. We use the Wikipedia2Vec [
          <xref ref-type="bibr" rid="ref51">50</xref>
          ] word and entity embeddings shared
by the authors, mapping them to Wikidata entities instead of Wikipedia titles.
19The HIPE-scorer (https://github.com/hipe-eval/HIPE-scorer, v1.1) is a Python module developed as part of the
        </p>
        <p>CLEF-HIPE-2020 evaluation campaign on named entity recognition and linking on historical newspapers.
20REL’s default approach to recognising named entities in text uses Flair’s character-level sequence tagger [1],
which is trained and evaluated on CoNLL-2003 data. Since REL tags not only locations, but also persons and
organisations, we keep only those entities which are tagged as ‘LOC’ or whose prediction is an entity in our KB.</p>
        <p>
          REL returns a Wikipedia title, which we turn into a Wikidata QID.
21To learn more about the neural baseline and participating teams and approaches, read Ehrmann, Romanello,
Najem-Meyer, Doucet, and Clematide [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. We only provide results for the ‘LOC’ tag for hipe because this dataset
includes non-geographic entities. It is worth noting that there may be slight differences between the datasets
used by us and those used in the shared task, because they have undergone different preparation steps. These
differences are probably too small to be significant.
• Strict: exact boundary match, same entity type.
        </p>
        <p>• Type: at least one token overlap, same entity type.</p>
        <p>Note the considerable difference between the strict and type settings in all cases,22 the latter
reflecting the correct identification of a mention’s presence while not agreeing on the exact
named entity boundary. While the poorer performance of the out-of-the-box REL tool is not
per se surprising (given that it has not been optimised for digitised historical text), the difference
is substantial nonetheless. This experiment in fact highlights how, already at the recognition
stage, there is a difference of around 15%–23% in terms of exact F1 between the out-of-the-box
state-of-the-art method and a tool carefully attuned to the specific application domain. This is
significant, since errors introduced in this step will percolate through the rest of the pipeline.</p>
        <p>[Table 2: toponym recognition precision (P), recall, and F1 for T-Res, aauzh, neurbsl, l3i, and rel on the lwm dataset (all, loc, street, building) and the hipe dataset (loc).]</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Candidate Selection</title>
        <p>In table 3, we report the highest possible performance that can be achieved during linking (i.e.
the skyline) by the different candidate selection strategies. In other words, a skyline true positive
is when the correct entity has been selected as a potential candidate for a particular mention.
The skyline, therefore, can be considered a proxy for the quality of the different candidate
selection approaches. We provide two evaluation settings: end-to-end EL (where mentions are
identified using the best-performing NER approach), and EL-only (where gold standard
mentions are provided). In both cases, we report micro-scores using the ‘type’ evaluation setting,
as finding the exact boundaries of the named entity is not the goal here. The results show the
advantages of having a fuzzy string matching method (i.e. deezy), which in this case is trained
on corpus-specific OCR variations, similar to those present in both datasets. The lower
performance on hipe in the end-to-end EL setting is mostly a consequence of the worse
toponym recognition in the previous step.
22The smaller difference for our tool’s performance on lwm is expected, since it uses the lwm training set for NER.
        </p>
        <p>[Table 3 residue: skyline scores for the T-Res:exact and T-Res:deezy approaches on the lwm and hipe datasets.]</p>
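        <p>A minimal sketch of how such a skyline can be computed (the QIDs are illustrative; a mention counts as a skyline true positive if the gold entity appears anywhere in its candidate list, regardless of the final disambiguation):

```python
def skyline_recall(mentions):
    # mentions: list of (gold_qid, candidate_qids) pairs.
    hits = sum(1 for gold_qid, candidates in mentions if gold_qid in candidates)
    return hits / len(mentions)

sample = [
    ("Q23082", ["Q23082", "Q179815"]),  # gold entity retrieved by selection
    ("Q752266", ["Q1137286"]),          # gold entity missed: no linker can recover it
]
assert skyline_recall(sample) == 0.5
```
        </p>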
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Entity Disambiguation</title>
        <p>We report the results for the entity disambiguation step in Tables 4 and 5. The scores highlight
how a very simple baseline—the combination of perfect match (at the selection stage) and most
popular (at the disambiguation stage)—achieves a higher performance than the out-of-the-box
REL system on the lwm dataset (but not on hipe), emphasising again the importance of a
domain-specific module at the recognition stage.23 The distance-based baseline, on the other hand,
performs very poorly on both datasets. On the lwm dataset, the REL disambiguation approach
(used as part of our tool) beats the mostpopular baseline, but not on the hipe dataset, showing
that the most common sense continues to be a very strong baseline. In both cases, interestingly,
forcing streets and buildings to be ‘NIL’ results in a substantially higher performance. This
suggests that most of these entities must be of type ‘NIL’ in the data (i.e. either too ambiguous
to annotate or not present in the KB), but also that the disambiguation approach may not be
suitable for these entities. While adding the place of publication has a positive impact on the
hipe dataset, the impact on the lwm dataset is less clear.</p>
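        <p>The mostpopular baseline can be sketched as follows, assuming (as in the appendix output) a dictionary of prior scores per candidate; the QIDs and priors here are illustrative:

```python
def most_popular(prior_cand_score):
    # Pick the candidate with the highest prior; return 'NIL' when
    # candidate selection produced nothing.
    if not prior_cand_score:
        return "NIL"
    return max(prior_cand_score, key=prior_cand_score.get)

priors = {"Q179815": 0.881, "Q49229": 0.522, "Q23082": 0.313}
assert most_popular(priors) == "Q179815"
assert most_popular({}) == "NIL"
```
        </p>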
        <p>Finally, we inspected more closely how the performance of our approaches varies based on
several characteristics of our data. We split the lwm dataset into ten different subsets, each a
unique combination of the decade in which the texts were written and the place of publication
of the newspaper. We then performed a 10-fold validation of our results, where, in each fold,
one subset was used for testing, another one for development, and the remaining eight subsets
were used for training.24 Detailed results are shown in Table 6.</p>
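        <p>The fold construction can be sketched as follows (the assignment of the development subset in each fold is an assumption made for illustration; the text does not specify which subset plays that role):

```python
subsets = [
    "Ashton-under-Lyne 1860", "Dorchester 1820", "Dorchester 1830",
    "Dorchester 1860", "Manchester 1780", "Manchester 1800",
    "Manchester 1820", "Manchester 1830", "Manchester 1860", "Poole 1860",
]

folds = []
for i, test_subset in enumerate(subsets):
    dev_subset = subsets[(i + 1) % len(subsets)]  # assumed: next subset as dev
    train_subsets = [s for s in subsets if s not in (test_subset, dev_subset)]
    folds.append((test_subset, dev_subset, train_subsets))

assert len(folds) == 10
assert all(len(train) == 8 for _, _, train in folds)
```
        </p>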
        <p>First of all, we see a correlation between worse OCR quality (corresponding to the earlier data
splits) and lower skyline and linking performances. Second, there seems to be a correlation
23 Note that there are other factors that should also be taken into account. REL returns Wikipedia titles, which we
mapped to Wikidata IDs, keeping only those results that are tagged as ‘LOC’ or can be mapped to geographic
coordinates. However, it should be noted that REL uses its own KB, which consists not only of locations, making
the disambiguation a more difficult task, since it is not only geographical entities that compete in the
disambiguation process, but entities of any kind. It may be worth investigating, as part of future research, the
impact this has on end-to-end EL. In providing this comparison, our goal is to illustrate the impact of using a
general-purpose EL system for this task, and to stress the importance of developing tools that are targeted to the
specific task and domain.
24 The ten subsets by publication are: Ashton-under-Lyne 1860, Dorchester 1820, Dorchester 1830, Dorchester 1860,
Manchester 1780, Manchester 1800, Manchester 1820, Manchester 1830, Manchester 1860 and Poole 1860.
between the proportion of NILs and a lower median distance from the place of publication, which is not
entirely explained by a high presence of micro locations (i.e. streets and buildings), suggesting
either (1) a higher difficulty for human annotators in finding the true referents of local mentions,
or (2) the absence of local entities in the knowledge base. It is therefore not surprising to see
rel:nil significantly improving on mostpopular in these cases, since the former maps buildings and
streets to NIL. However, a closer inspection of the results also reveals the importance of the
sensitivity of rel:nil (and rel+publ:nil) to context: for example, while “Ashton” is consistently
resolved to Maryland by the mostpopular approach, it is in all but one case resolved
correctly to Ashton-under-Lyne by the REL-based approaches.</p>
      </sec>
      <sec id="sec-5-4">
        <title>5.4. Discussion and Limitations</title>
        <p>Our research has focused on the geographic aspect of entity linking. However, T-Res could
directly be used for general entity linking as well (with the exception of the bydistance linking
method), given a suitable knowledge base. We have focused on digitised historical
newspapers, but T-Res could in principle be generalised to other domains, possibly depending on
additional annotated in-domain data. Further research is needed in this direction.</p>
        <p>
          Our tool can be improved in many directions: each of its modules (named entity recognition,
candidate selection, entity disambiguation) is open to improvement; for example, we could
include more sophisticated NER fine-tuning strategies [
          <xref ref-type="bibr" rid="ref8">7</xref>
          ]. Computationally, however, the NER
step is the clear bottleneck of our tool: for example, it took about 90 minutes to recognise all
toponyms in a sample of about 1,500 articles (4.2M of plain text) on a CPU, while the
candidate selection and entity disambiguation steps (using deezy and rel:nil+publ) jointly took about
three minutes.
        </p>
        <p>
          By looking at our results from a more qualitative perspective, we realise that many of our
errors stem from the KB itself. This is not surprising: not only does our KB mainly contain
modern entities, it also uses modern relations between entities (via word and entity
embeddings) as a way to represent their historical similarity. In the same vein as recent research [
          <xref ref-type="bibr" rid="ref6 ref7">5, 6</xref>
          ], we believe further research should go into understanding the impact of using
domain-appropriate entity embeddings in the disambiguation step, for example by training embeddings
which take into account time and space.
        </p>
        <p>Finally, another source of errors is the use of DeezyMatch for fuzzy string matching: while
it allows us to efficiently discover entities which would otherwise have remained hidden, such
as ‘Ashtonnnder-Lyne’ for ‘Ashton-under-Lyne’ or ‘Horbury Junotion’ for ‘Horbury Junction’,
its precision is lower than that of a traditional edit distance approach [10], sometimes resulting
in what today’s AI jargon calls hallucinations. For example, ‘Vieillevigne’ is matched to
‘Vielle Montaigne’. We therefore suggest combining the fast discoverability power of
DeezyMatch with a more conservative edit distance approach to filter obvious mismatches.25
Lastly, linking of micro locations is another direction that clearly requires further research.</p>
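        <p>Such a post-filter can be sketched as follows; footnote 25 uses the TheFuzz library, while here the standard library’s difflib similarity ratio serves as a dependency-free stand-in, with the 0.85 threshold from that footnote:

```python
import difflib

def keep_candidate(mention, candidate, threshold=0.85):
    # Keep a DeezyMatch candidate only if its surface similarity to the
    # mention clears the threshold.
    ratio = difflib.SequenceMatcher(None, mention.lower(), candidate.lower()).ratio()
    return ratio >= threshold

# An OCR variant survives the filter...
assert keep_candidate("Ashtonnnder-Lyne", "Ashton-under-Lyne")
# ...while a hallucinated match is dropped.
assert not keep_candidate("Vieillevigne", "Vielle Montaigne")
```
        </p>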
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Historical Case Study: Geographies from Below?</title>
      <p>
        In this section, we present a case study as a form of user-testing and to assess how T-Res
supports novel historical research on the digitised press. We explore the shifting geographies in
Victorian working-class newspapers, analysing their local, national and transnational
dimensions. For this, we have used the openly available British newspapers digitised by the Living
with Machines project [
        <xref ref-type="bibr" rid="ref49">47</xref>
        ].26 Motivated by a need to counterbalance the dominance of the
liberal and conservative press with non-elite perspectives “from below” [4], the project
prioritised the digitisation of “plebeian” newspapers channelling working-class voices, and selected
exclusively ‘provincial’ papers to help research move beyond the traditional metropolitan
emphasis in periodical research [
        <xref ref-type="bibr" rid="ref24">21</xref>
        ]. The corpus is not representative of the press (or society)
as a whole, but it provides a solid starting point to explore the geographies embedded in the
popular, working-class papers.
Our case study is based on a sample of more than 2,500 complete issues published between 1880
and 1900. This resulted in a set of 2.7 million detected toponyms. In the experiments, we kept
only those georeferenced places classified as location (‘LOC’), which resulted in a collection
of 1,770,412 data points comprising 46,820 distinct georeferenced locations. Figure 1 shows
25 In our case studies, we have applied a threshold of 0.85 edit distance similarity ratio between the mention and the
returned DeezyMatch candidate, using the TheFuzz library: https://github.com/seatgeek/thefuzz.
26 https://livingwithmachines.ac.uk/over-half-of-a-million-pages-of-historical-newspapers-now-openly-available/
the global distribution of these unique toponyms. Below, we explore the places mentioned in
the news across three different levels: the local, the national and the transnational. Firstly, we
investigate to what extent the coverage of these late-Victorian newspapers was limited to the
national borders (of the United Kingdom, which at the time included today’s Republic of Ireland). Secondly,
we analyse whether these provincial titles emphasised local events or increasingly attended to
metropolitan news. Thirdly, we take a closer look at transnational aspects, more precisely the
presence of popular imperialism in these working-class newspapers.
      </p>
      <p>Turning to the first question, Figure 2 shows the proportion of British (the orange bar) versus
non-British places (the blue bar) between 1880 and 1900. While there is no dramatic change
over these two decades, it shows a decrease in foreign place names (admittedly, a small drop),
from 25% in the early 1880s to 23% around 1900. Put differently, more than 75% of all
mentions comprise British place names. Taken together, these results may be most remarkable for
their stability: while news reporting is often driven by unexpected events, on average,
attention to what is happening outside the borders of the United Kingdom remained more or less
unchanged.</p>
      <p>
        Newspapers played a critical role in shaping and upholding the nation as an imagined
community [
        <xref ref-type="bibr" rid="ref3">2</xref>
        ]. But, even though the previous analysis shows that the press was, discursively,
firmly anchored in British soil, this does not necessarily imply that it was “national” in its scope.
In his extensive study A Fleet Street in Every Town, the historian Andrew Hobbs concedes that
the Victorian reader generally preferred local news and that the press played a critical role
in forging local communities [
        <xref ref-type="bibr" rid="ref23">20</xref>
        ]. Nonetheless, while the idea of a “national press” was still
only emergent, these provincial papers were far from isolated entities. Hobbs, at the same
time, underlined the networked character of the Victorian provincial press. While newspapers
often served as chroniclers of local culture and events, they did so as local nodes in a wider,
national network of information. Put differently, the provincial press was “a ‘national’ [network]
made from many ‘local’ elements”, and London figured as a central node in this network. This
suggests that the provincial press was far from parochial.
      </p>
      <p>
        To better understand how these newspapers meandered between the local and the national level,
we scrutinise the distribution of toponyms situated in England, Wales, Scotland and Ireland.
For Figure 3, we first computed the distance between each toponym and the newspaper’s place
of publication.27 We divided these mentions into different bands based on their proximity to
the place of publication. For example, the blue band shows the proportion of place names
which were less than 25km removed from the district where the paper was produced.
Interestingly enough, the geographical coverage of these papers tended to shrink. Changes remain
small, but we do observe an increasing emphasis on more local matters (again, very
rudimentarily measured as events taking place near the place of publication). On average, the proportion
of toponyms in the blue band increases by roughly 5 percentage points over these two decades.
To assess the dominance of the metropolis in the provincial press, we calculated the distance
of each toponym to London.28 While London was very present indeed, it was not as dominant
as expected: less than 20% of all the toponyms were located in or around London. Most
surprisingly, the number of mentions seems to decline over time, which ties in with our earlier
finding that suggested a narrowing of the geographic horizon of these late-Victorian titles.
27 We used historical press directories to determine the place of publication. For more information on the press
directories, see [
        <xref ref-type="bibr" rid="ref5">4</xref>
        ].
28 We looked at places less than 25km removed from the coordinates as reported on Wikidata (https://www.wikidata.org/wiki/Q84).
      </p>
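      <p>The distance computation behind this banding can be sketched as follows (a minimal haversine implementation; the coordinates and the band edges other than 25km are illustrative assumptions, not the paper’s exact bands):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two (lat, lon) points, in kilometres.
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def distance_band(distance_km, edges=(25, 50, 100, 200)):
    # Assign a toponym to the first band whose upper edge exceeds its distance.
    for edge in edges:
        if distance_km < edge:
            return f"<{edge}km"
    return f">={edges[-1]}km"

# A toponym a few kilometres from the place of publication lands in the blue band.
d = haversine_km(53.393, -3.014, 53.408, -2.992)  # Birkenhead to a nearby point
assert distance_band(d) == "<25km"
```
      </p>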
      <p>When comparing how individual papers vary in their emphasis on local events, the
differences become more pronounced, as shown in Figure 4. This allows us to understand and
classify how these titles differed in terms of their geographical reach and coverage. Some of
these provincial periodicals had a distinctively local emphasis. For example, close to 60% of all
places in The Birkenhead News and Wirral General Advertiser are located within a radius of
25km from Birkenhead. Others are broader in their coverage, appearing less centred on
one specific locality than on a wider region. The Atherstone, Nuneaton, and Warwickshire
Times can serve as an example. Even though 50% of all toponyms are less than 50km removed
from Atherstone, just about 20% appear within a 25km radius of this town. Exploring the
distribution of these toponyms, therefore, might be a valuable way of understanding how these
working-class papers anchored themselves spatially, negotiating between local, regional and
national identities.</p>
      <p>Lastly, we scrutinised the transnational level, focusing on the imperial geographies
embedded in these digitised newspapers. In his analysis of popular imperialism, Nicholson [37]
relies on popular provincial newspapers to probe the attitudes of the working classes towards
the imperialist project. Especially concerning the Second Boer War (1899–1902), he questions
whether working-class patriotic support for this endeavour was as strong as historians
previously imagined. Looking at the results gathered from our corpus, it firstly transpires that
geographical mentions of the empire were relatively rare, consistently hovering around 5%
of all toponyms. Zooming in on Africa and Asia, the numbers are even lower—especially
compared to references to locations in Canada and Australia—except around moments of crisis,
such as the Second Boer War. Mentions of South African place names, for example, showed
a dramatic increase at the end of the 19th century. These results are preliminary and should
be complemented with additional content- and sentiment-based analyses in order to monitor
more accurately the prevalence of jingoism in the popular press.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>In this paper, we presented a comprehensive step-by-step examination of toponym linking for
historical newspapers in English. We argued that good performance on standard and highly
generic benchmarks does not necessarily extrapolate to other domains. When applied to
digitised historical newspapers, the accuracy of these state-of-the-art tools often drops significantly,
hinting at the complexity of finding a general solution to EL. We have presented and
evaluated a new and very adaptable tool, T-Res, that resulted from these investigations: T-Res
builds on top of robust NLP approaches, tailoring them to the specific task of toponym linking
in historical newspapers. We concluded with a historical case study that demonstrated how
our pipeline supports ongoing research on the local, national and transnational dimensions of
the popular press.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
      <p>The authors are grateful to the reviewers for their careful and constructive reviews. Work
for this paper was produced as part of Living with Machines. This project, funded by the UK
Research and Innovation (UKRI) Strategic Priority Fund, is a multidisciplinary collaboration
delivered by the Arts and Humanities Research Council (AHRC grant AH/S01179X/1), with
The Alan Turing Institute, the British Library and the Universities of Cambridge, East Anglia,
Exeter, and Queen Mary University of London. This work was also supported by The Alan
Turing Institute (EPSRC grant EP/N510129/1).
</p>
    </sec>
    <sec id="sec-9">
      <title>A. Appendix: T-Res</title>
        <p>The following code snippet shows how the T-Res pipeline works:
from geoparser import pipeline, recogniser, ranking, linking

myner = recogniser.Recogniser(...)  # Instantiate the Recogniser
myranker = ranking.Ranker(...)  # Instantiate the Ranker
mylinker = linking.Linker(...)  # Instantiate the Linker

geoparser = pipeline.Pipeline(myner=myner, myranker=myranker, mylinker=mylinker)
output = geoparser.run_text(
    "Inspector Liddle said: I am an inspector of police, living in the city of Durham.",
    place="Alston, Cumbria, England",
    place_wqid="Q2560190",
)
The parentheses (...) indicate an ellipsis in the code, where the user has the option to
instantiate each of the three modules (the Recogniser for named entity recognition, the Ranker for
candidate selection, and the Linker for entity disambiguation) according to their needs. For
example, they may choose to instantiate a Recogniser that uses a specific model for named
entity recognition from the HuggingFace hub, or they may choose to train their own model,
provided a base model and their own dataset (in the required format). They may instantiate a
Ranker that, given a KB, uses the exact match approach to find candidates, or choose to train
their own DeezyMatch model, given a dataset of positive and negative pairs, and use it for
candidate selection. Likewise, they may instantiate a Linker module that, given a KB, uses the
mostpopular approach, or they may train their own rel disambiguation approach.</p>
        <p>The following snippet shows the (truncated) output from the previous command, as JSON:
[{"mention": "Durham",
  "string_match_score": {"Durham": [1.0, ["Q1137286", "Q5316477", "Q752266", "..."]]},
  "prior_cand_score": {
    "Q179815": 0.881,
    "Q49229": 0.522,
    "Q5316459": 0.457,
    "Q17003433": 0.455,
    "Q23082": 0.313,
    "Q458393": 0.295,
    "Q1075483": 0.293
  },
  ...
}]
For each mention detected in the input text, our tool returns:
• mention: the mention as it appears in the text.
• ner_score: NER confidence score.
• pos: start character position of the mention in the sentence.
• sent_idx: sentence index in the text.
• end_pos: end character position of the mention in the sentence.
• tag: named entity type.
• sentence: target sentence.
• prediction: predicted Wikidata entity.
• ed_score: disambiguation confidence score.
• cross_cand_score: selected candidates and their cross-candidate confidence scores.
• string_match_score: selected candidates and their string matching confidence scores.
• prior_cand_score: selected candidates and their prior confidence scores.
• latlon: geographic coordinates of the predicted entity.</p>
        <p>• wkdt_class: most common Wikidata class of the predicted entity.</p>
        <p>The tool can also be used in a step-wise manner, or for just one module of the pipeline. We
provide full documentation in our GitHub repository: https://github.com/Living-with-machines/T-Res.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Akbik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Blythe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Vollgraf</surname>
          </string-name>
          . “
          <article-title>Contextual string embeddings for sequence labeling”</article-title>
          .
          <source>In: Proceedings of the 27th international conference on computational linguistics.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Santa</given-names>
            <surname>Fe</surname>
          </string-name>
          : Association for Computational Linguistics,
          <year>2018</year>
          , pp.
          <fpage>1638</fpage>
          -
          <lpage>1649</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Anderson</surname>
          </string-name>
          .
          <article-title>Imagined communities: Reflections on the origin and spread of nationalism</article-title>
          .
          <source>Verso books</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Beals</surname>
          </string-name>
          and
          <string-name>
            <surname>E. Bell. “</surname>
          </string-name>
          <article-title>The atlas of digitised newspapers and metadata: Reports from Oceanic Exchanges”</article-title>
          . In: Loughborough: Loughborough University (
          <year>2020</year>
          ). doi: 10.6084/m9.figshare.11560059.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Beelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lawrence</surname>
          </string-name>
          , D. C. Wilson, and
          <string-name>
            <given-names>D.</given-names>
            <surname>Beavan</surname>
          </string-name>
          . “
          <article-title>Bias and representativeness in digitized newspaper collections: Introducing the environmental scan”</article-title>
          .
          <source>In: Digital Scholarship in the Humanities 38.1</source>
          (
          <issue>2023</issue>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          . doi: 10.1093/llc/fqac037.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Boros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-E.</given-names>
            <surname>González-Gallardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Giamphy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          . “
          <article-title>Knowledge-based contexts for historical named entity recognition &amp; linking”. In: Conference and Labs of the Evaluation Forum (CLEF)</article-title>
          .
          <source>Vol. 3180. CEUR Workshop Proceedings</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Boros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Pontes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Cabrera-Diego</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sidère</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          . “
          <article-title>Robust named entity recognition and linking on historical multilingual documents”</article-title>
          .
          <source>In: Conference and Labs of the Evaluation Forum (CLEF)</source>
          . Vol.
          <volume>2696</volume>
          .
          <string-name>
            <surname>CEUR-WS Working Notes</surname>
          </string-name>
          .
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Boroş</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Pontes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-A.</given-names>
            <surname>Cabrera-Diego</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sidere</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          . “
          <article-title>Alleviating digitization errors in named entity recognition for historical documents”</article-title>
          .
          <source>In: Proceedings of the 24th conference on computational natural language learning. Acl</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>431</fpage>
          -
          <lpage>441</lpage>
          . doi: 10.18653/v1/2020.conll-1.35.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bunescu</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Paşca</surname>
          </string-name>
          . “
          <article-title>Using Encyclopedic Knowledge for Named entity Disambiguation”</article-title>
          .
          <source>In: 11th Conference of the European Chapter of the Association for Computational Linguistics. Trento: Association for Computational Linguistics</source>
          ,
          <year>2006</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Nanni</surname>
            ,
            <given-names>D. van Strien</given-names>
          </string-name>
          , and D. C. Wilson. “
          <article-title>A dataset for toponym resolution in nineteenthCentury English newspapers”</article-title>
          .
          <source>In: Journal of Open Humanities Data</source>
          <volume>8</volume>
          (
          <year>2022</year>
          ). doi: 10.5334/johd.56.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <source>In: Proceedings of the 28th International Conference on Advances in Geographic Information Systems</source>
          .
          <year>2020</year>
          , pp.
          <fpage>385</fpage>
          -
          <lpage>388</lpage>
          . doi:
          <volume>10</volume>
          .1145/3397536.3422236.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          . “BERT:
          <article-title>Pre-training of Deep Bidirectional Transformers for Language Understanding”</article-title>
          .
          <source>In:Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers).
          <source>Minneapolis: Association for Computational Linguistics</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . doi: 10.18653/v1/N19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Díez Platas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ros Muñoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>González-Blanco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ruiz Fabo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Álvarez Mellado</surname>
          </string-name>
          . “
          <article-title>Medieval Spanish (12th-15th centuries) named entity recognition and attribute annotation system based on contextual information</article-title>
          ”.
          <source>In: Journal of the Association for Information Science and Technology 72.2</source>
          (
          <year>2021</year>
          ), pp.
          <fpage>224</fpage>
          -
          <lpage>238</lpage>
          . doi: 10.1002/asi.24399.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Ehrmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Colavizza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Rochat</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          . “
          <article-title>Diachronic evaluation of NER systems on old newspapers</article-title>
          ”.
          <source>In: Proceedings of the 13th Conference on Natural Language Processing (KONVENS 2016)</source>
          . Bochumer Linguistische Arbeitsberichte,
          <year>2016</year>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>107</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Ehrmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Romanello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Flückiger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Clematide</surname>
          </string-name>
          . “
          <article-title>Extended Overview of CLEF HIPE 2020: Named Entity Processing on Historical Newspapers</article-title>
          ”.
          <source>In: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum</source>
          . Ed. by
          <string-name>
            <given-names>L.</given-names>
            <surname>Cappellato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Eickhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Névéol</surname>
          </string-name>
          . Vol.
          <volume>2696</volume>
          . Thessaloniki: CEUR-WS,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Ehrmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Romanello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Najem-Meyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Clematide</surname>
          </string-name>
          . “
          <article-title>Overview of HIPE-2022: Named Entity Recognition and Linking in Multilingual Historical Documents</article-title>
          ”.
          <source>In: Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF)</source>
          . Springer,
          <year>2022</year>
          , pp.
          <fpage>423</fpage>
          -
          <lpage>446</lpage>
          . doi: 10.1007/978-3-031-13643-6_26.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>P.</given-names>
            <surname>Ferragina</surname>
          </string-name>
          and
          <string-name>
            <given-names>U.</given-names>
            <surname>Scaiella</surname>
          </string-name>
          . “
          <article-title>TAGME: on-the-fly annotation of short text fragments (by Wikipedia entities)</article-title>
          ”.
          <source>In: Proceedings of the 19th ACM international conference on Information and knowledge management</source>
          . New York: Association for Computing Machinery,
          <year>2010</year>
          , pp.
          <fpage>1625</fpage>
          -
          <lpage>1628</lpage>
          . doi: 10.1145/1871437.1871689.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>O.-E.</given-names>
            <surname>Ganea</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Hofmann</surname>
          </string-name>
          . “
          <article-title>Deep Joint Entity Disambiguation with Local Neural Attention</article-title>
          ”.
          <source>In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing</source>
          . Copenhagen: Association for Computational Linguistics,
          <year>2017</year>
          , pp.
          <fpage>2619</fpage>
          -
          <lpage>2629</lpage>
          . doi: 10.18653/v1/D17-1277.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>C.-E.</given-names>
            <surname>González-Gallardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Boros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Girdhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Moreno</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          . “
          <article-title>Yes but... Can ChatGPT identify entities in historical documents?</article-title>
          ”.
          <source>In: arXiv preprint arXiv:2303.17322</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jean-Caurant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sidère</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Coustaty</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          . “
          <article-title>Assessing and minimizing the impact of OCR quality on named entity recognition</article-title>
          ”.
          <source>In: Digital Libraries for Open Knowledge (TPDL)</source>
          . Springer,
          <year>2020</year>
          , pp.
          <fpage>87</fpage>
          -
          <lpage>101</lpage>
          . doi: 10.1007/978-3-030-54956-5_7.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            <surname>Hobbs</surname>
          </string-name>
          .
          <source>A Fleet Street in every town: The provincial press in England, 1855-1900</source>
          . Open Book Publishers,
          <year>2018</year>
          . doi: 10.11647/obp.0152.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            <surname>Hobbs</surname>
          </string-name>
          . “
          <article-title>The deleterious dominance of The Times in nineteenth-century scholarship</article-title>
          ”.
          <source>In: Journal of Victorian Culture 18.4</source>
          (
          <year>2013</year>
          ), pp.
          <fpage>472</fpage>
          -
          <lpage>497</lpage>
          . doi: 10.1080/13555502.2013.854519.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>K.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Beelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Colavizza</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Coll Ardanuy</surname>
          </string-name>
          . “
          <article-title>Neural language models for nineteenth-century English</article-title>
          ”.
          <source>In: Journal of Open Humanities Data</source>
          (
          <year>2021</year>
          ). doi: 10.5334/johd.48.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>K.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Nanni</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Coll Ardanuy</surname>
          </string-name>
          . “
          <article-title>DeezyMatch: A flexible deep learning approach to fuzzy string matching</article-title>
          ”.
          <source>In: Proceedings of the 2020 conference on empirical methods in natural language processing: System demonstrations</source>
          . Online: Association for Computational Linguistics,
          <year>2020</year>
          , pp.
          <fpage>62</fpage>
          -
          <lpage>69</lpage>
          . doi: 10.18653/v1/2020.emnlp-demos.9.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <given-names>J. M.</given-names>
            <surname>van Hulst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hasibi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Dercksen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A. P.</given-names>
            <surname>de Vries</surname>
          </string-name>
          . “
          <article-title>REL: An Entity Linker Standing on the Shoulders of Giants</article-title>
          ”.
          <source>In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR '20</source>
          . New York: ACM,
          <year>2020</year>
          , pp.
          <fpage>2197</fpage>
          -
          <lpage>2200</lpage>
          . doi: 10.1145/3397271.3401416.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <given-names>N.</given-names>
            <surname>Kolitsas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.-E.</given-names>
            <surname>Ganea</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Hofmann</surname>
          </string-name>
          . “
          <article-title>End-to-End Neural Entity Linking</article-title>
          ”.
          <source>In: Proceedings of the 22nd Conference on Computational Natural Language Learning</source>
          . Brussels,
          <year>2018</year>
          , pp.
          <fpage>519</fpage>
          -
          <lpage>529</lpage>
          . doi: 10.18653/v1/K18-1050.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <given-names>P.</given-names>
            <surname>Le</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Titov</surname>
          </string-name>
          . “
          <article-title>Improving entity linking by modeling latent relations between mentions</article-title>
          ”.
          <source>In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          . Melbourne: Association for Computational Linguistics,
          <year>2018</year>
          , pp.
          <fpage>1595</fpage>
          -
          <lpage>1604</lpage>
          . doi: 10.18653/v1/P18-1148.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Leidner</surname>
          </string-name>
          . “
          <article-title>Toponym resolution in text: annotation, evaluation and applications of spatial grounding</article-title>
          ”.
          <source>In: ACM SIGIR Forum</source>
          . Vol.
          <volume>41</volume>
          . 2. New York: Association for Computing Machinery,
          <year>2007</year>
          , pp.
          <fpage>124</fpage>
          -
          <lpage>126</lpage>
          . doi: 10.1145/1328964.1328989.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <given-names>E.</given-names>
            <surname>Linhares Pontes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Cabrera-Diego</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Boros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sidere</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Coustaty</surname>
          </string-name>
          . “
          <article-title>MELHISSA: a multilingual entity linking architecture for historical press articles</article-title>
          ”.
          <source>In: International Journal on Digital Libraries 23.2</source>
          (
          <year>2022</year>
          ), pp.
          <fpage>133</fpage>
          -
          <lpage>160</lpage>
          . doi: 10.1007/s00799-021-00319-6.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <given-names>E.</given-names>
            <surname>Linhares Pontes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sidere</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          . “
          <article-title>Impact of OCR quality on named entity linking</article-title>
          ”.
          <source>In: Digital Libraries at the Crossroads of Digital Information for the Future: 21st International Conference on Asia-Pacific Digital Libraries</source>
          . Springer,
          <year>2019</year>
          , pp.
          <fpage>102</fpage>
          -
          <lpage>115</lpage>
          . doi: 10.1007/978-3-030-34058-2_11.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <string-name>
            <given-names>E.</given-names>
            <surname>Manjavacas</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Fonteyn</surname>
          </string-name>
          . “
          <article-title>Adapting vs. Pre-training Language Models for Historical Languages</article-title>
          ”.
          <source>In: Journal of Data Mining &amp; Digital Humanities NLP4DH</source>
          (
          <year>2022</year>
          ). doi: 10.46298/jdmdh.9152.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <string-name>
            <given-names>K.</given-names>
            <surname>McDonough</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Moncla</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>van de Camp</surname>
          </string-name>
          . “
          <article-title>Named entity recognition goes to old regime France: geographic text analysis for early modern French corpora</article-title>
          ”.
          <source>In: International Journal of Geographical Information Science 33.12</source>
          (
          <year>2019</year>
          ), pp.
          <fpage>2498</fpage>
          -
          <lpage>2522</lpage>
          . doi: 10.1080/13658816.2019.1620235.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jakob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García-Silva</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          . “
          <article-title>DBpedia spotlight: shedding light on the web of documents</article-title>
          ”.
          <source>In: Proceedings of the 7th international conference on semantic systems</source>
          .
          <year>2011</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . doi: 10.1145/2063518.2063519.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          <string-name>
            <given-names>R.</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Csomai</surname>
          </string-name>
          . “
          <article-title>Wikify! Linking documents to encyclopedic knowledge</article-title>
          ”.
          <source>In: Proceedings of the sixteenth ACM conference on Conference on information and knowledge management</source>
          . New York: Association for Computing Machinery,
          <year>2007</year>
          , pp.
          <fpage>233</fpage>
          -
          <lpage>242</lpage>
          . doi: 10.1145/1321440.1321475.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          <string-name>
            <given-names>D.</given-names>
            <surname>Milne</surname>
          </string-name>
          and
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          . “
          <article-title>Learning to link with Wikipedia</article-title>
          ”.
          <source>In: Proceedings of the 17th ACM conference on Information and knowledge management</source>
          . New York: Association for Computing Machinery,
          <year>2008</year>
          , pp.
          <fpage>509</fpage>
          -
          <lpage>518</lpage>
          . doi: 10.1145/1458082.1458150.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          <string-name>
            <given-names>G.</given-names>
            <surname>Munnelly</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Lawless</surname>
          </string-name>
          . “
          <article-title>Investigating entity linking in early English legal documents</article-title>
          ”.
          <source>In: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries</source>
          . New York: ACM,
          <year>2018</year>
          , pp.
          <fpage>59</fpage>
          -
          <lpage>68</lpage>
          . doi: 10.1145/3197026.3197055.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          <string-name>
            <given-names>J.</given-names>
            <surname>Nicholson</surname>
          </string-name>
          . “
          <article-title>Popular Imperialism and the Provincial Press: Manchester Evening and Weekly Papers, 1895-1902</article-title>
          ”.
          <source>In: Victorian Periodicals Review 13.3</source>
          (
          <year>1980</year>
          ), pp.
          <fpage>85</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            <surname>Olieman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Beelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>van Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kamps</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Marx</surname>
          </string-name>
          . “
          <article-title>Good Applications for Crummy Entity Linkers? The Case of Corpus Selection in Digital Humanities</article-title>
          ”.
          <source>In: arXiv preprint arXiv:1708.01162</source>
          .
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          <string-name>
            <given-names>N.</given-names>
            <surname>Pedrazzini</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>McGillivray</surname>
          </string-name>
          . “
          <article-title>Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers</article-title>
          ”.
          <source>In: Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities</source>
          . Taipei: Association for Computational Linguistics,
          <year>2022</year>
          , pp.
          <fpage>85</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          <string-name>
            <given-names>F.</given-names>
            <surname>Piccinno</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Ferragina</surname>
          </string-name>
          . “
          <article-title>From TagME to WAT: a new entity annotator</article-title>
          ”.
          <source>In: Proceedings of the first international workshop on Entity recognition &amp; disambiguation</source>
          . New York: Association for Computing Machinery,
          <year>2014</year>
          , pp.
          <fpage>55</fpage>
          -
          <lpage>62</lpage>
          . doi: 10.1145/2633211.2634350.
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          <string-name>
            <given-names>L.</given-names>
            <surname>Ratinov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Downey</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Anderson</surname>
          </string-name>
          . “
          <article-title>Local and global algorithms for disambiguation to Wikipedia</article-title>
          ”.
          <source>In: Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies</source>
          . Portland,
          <year>2011</year>
          , pp.
          <fpage>1375</fpage>
          -
          <lpage>1384</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Rovera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Nanni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Ponzetto</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Goy</surname>
          </string-name>
          . “
          <article-title>Domain-specific Named Entity Disambiguation in Historical Memoirs</article-title>
          ”.
          <source>In: Proceedings of the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017)</source>
          . Vol.
          <volume>2006</volume>
          . CEUR Workshop Proceedings. Rome,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          <string-name>
            <given-names>R.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Murrieta-Flores</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Calado</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Martins</surname>
          </string-name>
          . “
          <article-title>Toponym matching through deep neural networks</article-title>
          ”.
          <source>In: International Journal of Geographical Information Science 32.2</source>
          (
          <year>2018</year>
          ), pp.
          <fpage>324</fpage>
          -
          <lpage>348</lpage>
          . doi: 10.1080/13658816.2017.1390119.
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schweter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>März</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Schmid</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Çano</surname>
          </string-name>
          . “
          <article-title>hmBERT: Historical Multilingual Language Models for Named Entity Recognition”</article-title>
          .
          <source>In: Conference and Labs of the Evaluation Forum (CLEF). Vol. 3180. CEUR Workshop Proceedings</source>
          ,
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kundu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Florian</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Hamza</surname>
          </string-name>
          . “
          <article-title>Neural cross-lingual entity linking”</article-title>
          .
          <source>In: Thirty-Second AAAI Conference on Artificial Intelligence</source>
          . AAAI Press,
          <year>2018</year>
          , pp.
          <fpage>5464</fpage>
          -
          <lpage>5472</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [46]
          <article-title>“Assessing the impact of OCR quality on downstream NLP tasks”</article-title>
          .
          <source>In: Proceedings of the 12th International Conference on Agents and Artificial Intelligence (ICAART)</source>
          . Volume 1: ARTIDIGH.
          <year>2020</year>
          , pp.
          <fpage>484</fpage>
          -
          <lpage>496</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>G.</given-names>
            <surname>Tolfo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Beelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lawrence</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Beavan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>McDonough</surname>
          </string-name>
          . “
          <article-title>Hunting for Treasure: Living with Machines and the British Library Newspaper Collection”</article-title>
          . In:
          <source>Digitised Newspapers - A New Eldorado for Historians?: Reflections on Tools, Methods and Epistemology</source>
          . Ed. by
          <string-name>
            <given-names>E.</given-names>
            <surname>Bunout</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ehrmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Clavert</surname>
          </string-name>
          . De Gruyter Oldenbourg,
          <year>2023</year>
          , pp.
          <fpage>23</fpage>
          -
          <lpage>46</lpage>
          . doi: 10.1515/9783110729214-002.
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>T.</given-names>
            <surname>Wolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Debut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sanh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chaumond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Delangue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cistac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Louf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Funtowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Davison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shleifer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>von Platen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jernite</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Plu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. Le</given-names>
            <surname>Scao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gugger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Drame</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Lhoest</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rush</surname>
          </string-name>
          . “
          <article-title>Transformers: State-of-the-Art Natural Language Processing”</article-title>
          .
          <source>In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Online: Association for Computational Linguistics</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>38</fpage>
          -
          <lpage>45</lpage>
          . doi: 10.18653/v1/2020.emnlp-demos.6.
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>I.</given-names>
            <surname>Yamada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Asai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sakuma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Shindo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Takeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Takefuji</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Matsumoto</surname>
          </string-name>
          . “
          <article-title>Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia”</article-title>
          .
          <source>In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Online: Association for Computational Linguistics</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>23</fpage>
          -
          <lpage>30</lpage>
          . doi: 10.18653/v1/2020.emnlp-demos.4.
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>I.</given-names>
            <surname>Yamada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Takeda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Takefuji</surname>
          </string-name>
          . “
          <article-title>Enhancing named entity recognition in Twitter messages using entity linking”</article-title>
          .
          <source>In: Proceedings of the Workshop on Noisy User-generated Text</source>
          . Beijing: Association for Computational Linguistics,
          <year>2015</year>
          , pp.
          <fpage>136</fpage>
          -
          <lpage>140</lpage>
          . doi: 10.18653/v1/W15-4320.
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          { "ner_score": 0.999, "pos": 74, "sent_idx": 0, "end_pos": 80, "tag": "LOC", "sentence": "Inspector Liddle said: I am an inspector of police, living in the city of Durham.", "prediction": "Q179815", "ed_score": 0.039, "cross_cand_score": { "Q179815": 0.396, "Q23082": 0.327, "Q49229": 0.141, "Q5316459": 0.049, "Q458393": 0.045, "Q17003433": 0.042, "Q1075483": 0.0
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>