<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CLEF</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Named Entity Recognition and Linking on Historical Newspapers: UvA.ILPS &amp; REL at CLEF HIPE 2020</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vera Provatorova</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Svitlana Vakulenko</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Evangelos Kanoulas</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Koen Dercksen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Johannes M van Hulst</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Radboud University</institution>
          ,
          <addr-line>Nijmegen</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Amsterdam</institution>
          ,
          <addr-line>Amsterdam</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>22</volume>
      <fpage>22</fpage>
      <lpage>25</lpage>
      <abstract>
        <p>This paper describes our submission to the CLEF HIPE 2020 shared task on identifying named entities in multilingual historical newspapers in French, German and English. The subtasks we addressed in our submission include coarse-grained named entity recognition, entity mention detection and entity linking. For the task of named entity recognition we used an ensemble of fine-tuned BERT models; entity linking was approached by three different methods: (1) a simple method relying on ElasticSearch retrieval scores, (2) an approach based on contextualised text embeddings, and (3) REL, a modular entity linking system based on several state-of-the-art components.</p>
      </abstract>
      <kwd-group>
        <kwd>Named Entity Linking</kwd>
        <kwd>Named Entity Recognition</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Named entity identification is an important task in information extraction.
Detecting, classifying and linking named entities helps to enable semantic search,
which can be used for different domain applications, such as digital
humanities [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. One example is information retrieval from historical corpora. Identifying
entities in historical documents poses several important challenges due to the
nature of historical texts. These challenges include OCR errors in document scans,
historical spelling variations and semantic shifts [
        <xref ref-type="bibr" rid="ref12 ref5">12, 5</xref>
        ]. This paper describes the
submissions prepared by our joint team from the University of Amsterdam and
Radboud University for the CLEF HIPE shared task. The main focus of CLEF
HIPE is on systematic evaluation of named entity recognition and linking
methods on multilingual diachronic historical data [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The shared task consists of
several subtasks grouped into five bundles. Every team was allowed to submit
one bundle per language, with the exception of bundle 5 (named entity linking
given canonical mention spans), which was evaluated separately and could be
combined with any other bundle.
      </p>
      <p>Our submission targeted three of the subtasks in HIPE: (1) coarse-grained
named entity recognition (NERC), (2) end-to-end named entity linking (NEL)
using a modified NERC task for entity mention detection, and (3) named
entity linking using mention spans provided by the organisers (NEL-only). Entity
mention detection in this case was a supplementary task: it was not evaluated
directly within the system submissions, but served as a preparation step for NEL
in the setting of bundle 2, where entity mention boundaries were not given in the
test data. In all the subtasks, we only considered the literal sense of the entities.</p>
      <p>
        For the first phase of the shared task, we designed solutions for English,
German and French languages within bundle 2, which included identifying,
classifying and linking coarse-grained entities. For the second phase, bundle 5, we
focused on one language only (English) and compared our results to the
out-of-the-box tool, Radboud Entity Linker (REL) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], as a competitive baseline.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Bundle 2: Named Entity Recognition and Linking</title>
      <sec id="sec-2-1">
        <title>Experimental setup</title>
        <p>Datasets and resources. The dataset provided by the CLEF HIPE
organisers consists of diachronically organised digitised historical newspaper articles
in English, German and French. The data is annotated using the standard
inside–outside–beginning (IOB) format and presented as tab-separated values,
where each row corresponds to a single token.</p>
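The IOB layout described above can be turned into labelled entity spans with a few lines of code. This is a minimal illustrative sketch, not the shared-task tooling; the two-column row layout is simplified (the actual HIPE files carry additional annotation columns).

```python
# Minimal sketch: grouping IOB-annotated tokens into labelled entity spans.
# Row layout (TOKEN<TAB>LABEL) is simplified relative to the real HIPE files.
def read_iob(lines):
    """Return a list of (entity_type, tokens) spans from IOB rows."""
    tokens = [line.split("\t")[:2] for line in lines if line and not line.startswith("#")]
    spans, current = [], None
    for token, label in tokens:
        if label.startswith("B-"):          # a new entity begins
            if current:
                spans.append(current)
            current = (label[2:], [token])
        elif label.startswith("I-") and current:
            current[1].append(token)        # continuation of the open entity
        else:                               # "O" (or stray "I-") closes any open span
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return spans

rows = ["We\tO", "went\tO", "to\tO", "New\tB-loc", "York\tI-loc"]
print(read_iob(rows))  # [('loc', ['New', 'York'])]
```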
        <p>
          While validation datasets are provided for all of the three languages, training
data are only available for German and French. To provide the token
classification model with a sufficient amount of training data for English, we used
CoNLL-03 [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] as an auxiliary dataset.
        </p>
        <p>
          Approach. We consider both NERC and entity mention detection as
instances of the sequence classification task. For the NERC task, 5 entity types
(org, pers, prod, loc, and time) form 11 classes when annotated in the IOB
format: each of the types has its "B-" and "I-" labels corresponding to the tokens
at the beginning and inside of an entity (e.g., "B-pers" and "I-pers"), while the
"O" label marks the remaining tokens which are outside of named entities. For
mention detection, 3 classes are considered: "B-entity", "I-entity", and "O". To
perform sequence classification, we fine-tuned two pretrained BERT models [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]
provided by the Hugging Face Transformers library [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]: bert-base-cased for
English and bert-base-multilingual-cased for French and German. To
improve the robustness of the approach, we used a majority vote ensemble of 5 model
instances per language, fine-tuned on the training data with different numbers of
epochs as well as different random seed values, where 5 ≤ num_epochs ≤ 9
and random_seed = 42 + num_epochs.
        </p>
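The ensembling step can be sketched as a per-token vote over the label sequences produced by the individual model instances. This is an illustrative reconstruction of the majority-vote scheme, not the exact submission code, and the model outputs below are hypothetical.

```python
from collections import Counter

def majority_vote(per_model_labels):
    # Per-token majority vote over equal-length label sequences, one per model.
    return [Counter(column).most_common(1)[0][0] for column in zip(*per_model_labels)]

# Hypothetical outputs of five fine-tuned model instances for a four-token sentence:
preds = [
    ["B-pers", "I-pers", "O", "O"],
    ["B-pers", "I-pers", "O", "B-loc"],
    ["B-pers", "O",      "O", "O"],
    ["B-pers", "I-pers", "O", "O"],
    ["B-org",  "I-pers", "O", "O"],
]
print(majority_vote(preds))  # ['B-pers', 'I-pers', 'O', 'O']
```

An odd number of voters (here 5) avoids most ties; `Counter.most_common` breaks any remaining ties deterministically by insertion order.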
        <p>
          To perform entity linking, we used ElasticSearch [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] to index all Wikidata
entity labels and search for each of the entity mentions extracted from the input
data to retrieve candidate entities. All the retrieved entities were included as
candidates, without filtering on type. Candidate entity ranking was performed
based on ElasticSearch retrieval scores combined with several heuristics,
preferring precise matching and shorter entity IDs (assuming that entities with
shorter IDs, which were added to Wikidata earlier, are typically more general and
therefore more likely to be correct). We used the latest
Wikidata dump from 9th of March 2020, which contains more than 55M entities.
An important limitation of our approach is that it relied solely on the
English-language labels, which is likely to hinder its performance on named
entities that vary across languages, such as "Geneva" in English versus "Genf"
in German.
        </p>
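The ranking heuristics can be illustrated with a small sketch. The relative priority given here to exact label matching, retrieval score, and ID length is an assumption made for illustration; the submission's exact combination of these signals is not specified.

```python
def rank_candidates(mention, candidates):
    """Order candidates by retrieval score with two illustrative tie-breakers:
    exact label matches first, then shorter (older, more general) Wikidata IDs.
    candidates: list of (qid, label, retrieval_score) tuples."""
    def key(cand):
        qid, label, score = cand
        exact = label.lower() == mention.lower()
        return (exact, score, -len(qid))   # sorted descending on this tuple
    return sorted(candidates, key=key, reverse=True)

cands = [("Q23306", "Greater London", 11.2), ("Q84", "London", 10.8)]
print(rank_candidates("London", cands)[0][0])  # Q84
```

Note how the exact-match flag lets Q84 outrank Q23306 despite a slightly lower retrieval score, and the shorter ID Q84 would also win any remaining tie.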
      </sec>
      <sec id="sec-2-2">
        <title>Results and discussion</title>
        <p>The submissions were evaluated with the HIPE scorer, which is provided by
the shared task organisers and available on GitHub. The scores achieved by our
submissions on the NERC task are presented in Table 1.</p>
        <p>
          The baseline provided by the HIPE organisers for the NERC-coarse task
uses a traditional CRF sequence classification method. The top solution for
all languages was developed by the L3i team, with extra layers added on top of
several pre-trained BERT models and trained in a multi-task learning setting to
minimize the impact of OCR-generated noise, historical spelling variations and
other challenges specific to the data [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Our approach outperforms the baseline
but achieves significantly lower results in comparison with the top solution. This
shows that, while transformer-based approaches are a promising direction for
named entity recognition, using a majority vote ensemble of fine-tuned models
without any extra modifications is not likely to be sufficient for the setting of
noisy historical data.
        </p>
        <p>For the end-to-end NEL task, the HIPE baseline is AIDA-light trained on
English Wikipedia. The best solution was submitted by the L3i team using entity
embeddings trained on Wikipedia and Wikidata, combined with probabilistic
mapping. The results achieved by our submissions are presented in Table 2 and
compared with these two approaches.</p>
        <p>For English and German, our submission scores above the baseline but far
below the top solution, which is not surprising given the simplicity of our
approach. For French the recall values of our submission are below the baseline.
We assume that the main reason for this performance drop is that most of the
French entities could not be found in the English-only Wikidata index used in
our system. We conclude that the bottleneck of our approach is entity retrieval
rather than entity mention detection.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Bundle 5: Named Entity Linking with Correct Mention</title>
    </sec>
    <sec id="sec-4">
      <title>Spans</title>
      <p>3.1</p>
      <sec id="sec-4-1">
        <title>Experimental setup</title>
        <p>
          Datasets and resources. Our system runs were prepared using the same HIPE
corpora as in bundle 2, with no extra training data. The algorithm designed for
the first two runs used pre-trained contextualised Flair string embeddings [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]
provided by the task organisers.
        </p>
        <p>Methods. For the first two runs, candidate entity retrieval was done the same
way as in bundle 2. To perform candidate entity ranking, we calculated cosine
similarity between the contextual embeddings of a sentence containing the target
entity mention and a modified sentence, where the target entity mention is
replaced with the candidate entity description extracted from Wikidata. For example,
if the target sentence is "We went to London for a weekend" and a candidate
entity is Q84 with the label London and the description "capital and largest
city of the United Kingdom", then the modified sentence would be "We went to
capital and largest city of the United Kingdom for a weekend".</p>
        <p>
          The idea behind our approach rests on two basic assumptions: (1)
Wikidata entity descriptions are semantically similar to the corresponding entity
labels, and (2) contextualised string embeddings capture similarity between
entity descriptions and entity labels. After calculating the cosine similarity score,
we multiply it by the Levenshtein similarity ratio between the target and candidate
entity labels to prefer precise matching where possible. In the example above,
if one of the candidates is Q23306: Greater London, then its score would be
multiplied by sim('London', 'Greater London') = 0.6, while the score for Q84:
London would remain the same, as sim('London', 'London') = 1. The similarity
ratio was calculated using the FuzzyWuzzy string matching library [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
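The description-substitution scoring can be sketched as follows. This is an illustrative reconstruction: simple bag-of-words vectors stand in for the Flair embeddings, `difflib.SequenceMatcher` stands in for FuzzyWuzzy's ratio (both compute the same 2M/T similarity on this example), and the Greater London description is paraphrased for illustration.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

def cosine(u, v):
    # Cosine similarity between two sparse bag-of-words vectors.
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm

def embed(sentence):
    # Stand-in for the contextualised Flair string embeddings used in the paper.
    return Counter(sentence.lower().split())

def score(sentence, mention, cand_label, cand_description):
    # Similarity between the sentence and the sentence with the mention replaced
    # by the candidate's Wikidata description, scaled by a Levenshtein-style
    # ratio between the mention and the candidate label.
    substituted = sentence.replace(mention, cand_description)
    sim = cosine(embed(sentence), embed(substituted))
    ratio = SequenceMatcher(None, mention, cand_label).ratio()
    return sim * ratio

print(SequenceMatcher(None, "London", "Greater London").ratio())  # 0.6
```

With the real embeddings replaced by this toy `embed`, Q84 still outscores Q23306 for the mention "London" because its label match ratio is 1.0 rather than 0.6.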
        <p>
          After using the resulting score to rank the list of candidate entities, a NIL
value is inserted into the list before the first candidate whose score falls below a
threshold. We chose the threshold value of 0.7 after tuning this parameter on the
development set. For submission 2 only, we added historical spelling variations
to the candidate retrieval step using the Natas library, which performs historical
normalisation via neural machine translation [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
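A minimal sketch of the NIL insertion step, assuming candidates arrive in descending score order; the behaviour when every candidate clears the threshold (no NIL inserted) is our reading of the description above.

```python
def insert_nil(ranked, threshold=0.7):
    # ranked: (entity_id, score) pairs in descending score order.
    # NIL goes immediately before the first candidate scoring below the threshold.
    ids = [eid for eid, _ in ranked]
    for i, (_, s) in enumerate(ranked):
        if s < threshold:
            return ids[:i] + ["NIL"] + ids[i:]
    return ids  # no sub-threshold candidate: NIL is not inserted

print(insert_nil([("Q84", 0.92), ("Q23306", 0.55)]))  # ['Q84', 'NIL', 'Q23306']
```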
        <p>
          The third run was prepared using REL [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], a modular system that is
based on several state-of-the-art components, available as a Python library as
well as a web API. Entity linking in REL is divided into three components: (i)
mention detection, (ii) candidate selection, and (iii) entity disambiguation. For
this submission, mention detection was skipped since the mention spans were
already provided by the organisers as the ground truth. Candidate selection
consists of retrieving seven candidates for each mention. The first four candidates
are retrieved based on the co-occurrence probability of entities given a specific
mention (a so-called p(e|m) index). The remaining three are selected based on
their contextual similarity to the mention in an embedding space.
        </p>
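The 4 + 3 candidate selection scheme might be sketched as follows. The function and the toy numbers are an illustrative reconstruction, not REL's actual implementation; candidate IDs other than Q84 and Q23306 are hypothetical.

```python
def select_candidates(mention, p_e_m, context_sim, k_prior=4, k_context=3):
    # First k_prior candidates by the mention-entity prior p(e|m),
    # then k_context more by contextual similarity in an embedding space.
    priors = p_e_m.get(mention, {})
    by_prior = sorted(priors, key=priors.get, reverse=True)[:k_prior]
    rest = [e for e in context_sim if e not in by_prior]
    by_context = sorted(rest, key=context_sim.get, reverse=True)[:k_context]
    return by_prior + by_context

# Toy numbers; candidate IDs other than Q84/Q23306 are hypothetical.
p_e_m = {"London": {"Q84": 0.80, "Q23306": 0.08, "QA": 0.04, "QB": 0.03, "QC": 0.01}}
context_sim = {"Q84": 0.9, "QC": 0.5, "QD": 0.4, "QE": 0.3}
print(select_candidates("London", p_e_m, context_sim))
# ['Q84', 'Q23306', 'QA', 'QB', 'QC', 'QD', 'QE']
```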
        <p>Entity disambiguation decisions are made by combining local compatibility
(which includes prior importance and contextual similarity) and coherence with
the other entity linking decisions in a document (global context).
</p>
      </sec>
      <sec id="sec-4-2">
        <title>Results and discussion</title>
        <p>
          Run 1: Baseline. While the results @1 are below the HIPE baseline (Table 3),
the performance @3 and @5 is better (Table 4). Similar results were achieved on
the development set: while the correct entity would often make it to the top-5 or
top-3 of the ranked candidate list, it was rarely selected by the algorithm as the
most relevant answer, and the difference between candidate scores was usually
small. The algorithm was not directly optimised for top-1 candidate selection.
Another obstacle for the algorithm was NIL detection: as 30% of the mentions
were not linkable [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], simply adding the NIL value to the ranked list of candidates
based on a fixed threshold value was not a sufficient approach and resulted in
an overwhelming number of false positives.
        </p>
        <table-wrap>
          <table>
            <thead>
              <tr><th/><th colspan="3">@3</th><th colspan="3">@5</th></tr>
              <tr><th/><th>F</th><th>P</th><th>R</th><th>F</th><th>P</th><th>R</th></tr>
            </thead>
            <tbody>
              <tr><td>Run #1 Baseline</td><td>.463</td><td>.467</td><td>.465</td><td>.552</td><td>.557</td><td>.555</td></tr>
              <tr><td>Run #2 Historical</td><td>.451</td><td>.463</td><td>.457</td><td>.540</td><td>.555</td><td>.548</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>
          Run 2: Historical normalisation. Adding extra candidate entities by means
Run 2: Historical normalisation. Adding extra candidate entities by means
of historical normalisation in the second submission has resulted in more false
positives and slightly decreased overall performance in comparison to the rst
submission. A likely explanation is that the normalisation algorithm was focusing
on infrequent historical spellings [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], most of which are not likely to be present
in the HIPE dataset.
        </p>
        <p>Run 3: REL. REL performs very well and takes the second place in the scoring
table, which is rather remarkable for an out-of-the-box linking system. We showed
that REL provides a strong baseline for the NEL task on historical documents,
demonstrating the state-of-the-art performance that can be reached without
accounting for additional properties, such as OCR errors and language change.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and future work</title>
      <p>Our contributions within the CLEF HIPE shared task approached coarse-grained
named entity recognition (NERC) and two settings of entity linking: end-to-end
and NEL-only. The results for NERC show that although fine-tuning BERT
models for sequence classification is enough to outperform the baselines for all
three languages, achieving top performance requires extra modifications in
order to deal with the challenges specific to historical data. The NEL results show
that, while using an embedding-based approach that takes historical spelling
variations into account is better than relying solely on ElasticSearch retrieval
scores, this approach is clearly outperformed by REL, as well as by many other
solutions, mostly due to its poor performance on NIL prediction and an
overwhelming number of false positives in the candidate selection step. REL, in
its turn, proves very efficient in the setting of the shared task, even without
specifically addressing the challenges of the historical data.</p>
      <p>
        There are several possible directions for future work considering all the
subtasks that we approached in the context of the shared task.
Entity recognition and classification. Some ways to achieve
improvements over the state-of-the-art sequence classification methods within
the given task setup include (i) performing a more extensive parameter search
for the Transformer models; (ii) fine-tuning more advanced pre-trained models
(such as RoBERTa [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]), and (iii) reducing the impact of the noise in the training
data by using OCR correction algorithms, such as [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>Entity linking. Since the task of entity linking consists of several steps,
including candidate generation and entity disambiguation, we see further opportunities
for improvement on each of these steps. Firstly, candidate generation can be
improved to increase recall. One of the ways to achieve this goal is to use OCR
correction as a pre-processing step in the algorithm. Secondly, entity
disambiguation should be improved upon in order to increase precision by decreasing the
number of false positives. We consider graph-based disambiguation methods as
a promising research direction. Thirdly, using entity types as features instead of
only relying on mention boundaries could also improve entity disambiguation in
the end-to-end setting.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This research was supported by the NWO Innovational Research Incentives
Scheme Vidi (016.Vidi.189.039), the NWO Smart Culture - Big Data /
Digital Humanities (314-99-301), the H2020-EU.3.4. - SOCIETAL CHALLENGES -
Smart, Green And Integrated Transport (814961), and the Google Faculty Research
Awards program. All content represents the opinion of the authors, which is not
necessarily shared or endorsed by their respective employers and/or sponsors.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Akbik</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergmann</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blythe</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rasul</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schweter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vollgraf</surname>
          </string-name>
          , R.:
          <article-title>Flair: An easy-to-use framework for state-of-the-art nlp</article-title>
          .
          <source>In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)</source>
          . pp.
          <fpage>54</fpage>
          –
          <lpage>59</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Boros</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Linhares Pontes</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cabrera-Diego</surname>
            ,
            <given-names>L.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamdi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moreno</surname>
            ,
            <given-names>J.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sidere</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Doucet</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Robust Named Entity Recognition and Linking on Historical Multilingual Documents</article-title>
          . In: Cappellato,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Eickhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Neveol</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (eds.) CLEF 2020 Working Notes. Working Notes of CLEF 2020 -
          <article-title>Conference and Labs of the Evaluation Forum. CEUR-WS (</article-title>
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Bert: Pre-training of deep bidirectional transformers for language understanding</article-title>
          . arXiv preprint arXiv:1810.04805
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Divya</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goyal</surname>
            ,
            <given-names>S.K.</given-names>
          </string-name>
          :
          <article-title>Elasticsearch: An advanced and quick search technique to handle voluminous data</article-title>
          .
          <source>Compusoft</source>
          <volume>2</volume>
          (
          <issue>6</issue>
          ),
          <volume>171</volume>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ehrmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Colavizza</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rochat</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaplan</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Diachronic Evaluation of NER Systems on Old Newspapers</article-title>
          .
          <source>In: Proceedings of the 13th Conference on Natural Language Processing (KONVENS</source>
          <year>2016</year>
          )
          <article-title>)</article-title>
          . pp.
          <fpage>97</fpage>
          –
          <lpage>107</lpage>
          .
          <string-name>
            <surname>Bochumer Linguistische Arbeitsberichte</surname>
          </string-name>
          (
          <year>2016</year>
          ), https://infoscience.epfl.ch/record/221391?ln=en
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ehrmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romanello</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , Fluckiger,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Clematide</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          :
          <source>Overview of CLEF HIPE</source>
          <year>2020</year>
          :
          <article-title>Named Entity Recognition and Linking on Historical Newspapers</article-title>
          . In: Arampatzis,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Kanoulas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Tsikrika</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Vrochidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Joho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Lioma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Eickhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Neveol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Cappellato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          , N. (eds.)
          <article-title>Experimental IR Meets Multilinguality, Multimodality, and Interaction</article-title>
          .
          <source>Proceedings of the 11th International Conference of the CLEF Association (CLEF</source>
          <year>2020</year>
          ).
          <source>Lecture Notes in Computer Science (LNCS)</source>
          , vol.
          <volume>12260</volume>
          . Springer (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gonzalez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodrigues</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Fuzzywuzzy: Fuzzy string matching in python (</article-title>
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. Hamalainen,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Hengchen</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          :
          <article-title>From the paft to the iture: a fully automatic NMT and word embeddings method for OCR post-correction</article-title>
          .
          <source>In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP</source>
          <year>2019</year>
          ). pp.
          <fpage>431</fpage>
          –
          <lpage>436</lpage>
          . INCOMA Ltd.,
          <string-name>
            <surname>Varna</surname>
          </string-name>
          ,
          <source>Bulgaria (Sep</source>
          <year>2019</year>
          ), https://www.aclweb.org/anthology/R19-1051
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Hamalainen,
          <string-name>
            <surname>M.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Säily, T.,</given-names>
            <surname>Rueter</surname>
            <surname>Rueter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Tiedemann</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          , Makela, E.:
          <article-title>Revisiting NMT for normalization of early English letters</article-title>
          .
          <source>In: Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage</source>
          ,
          <source>Social Sciences, Humanities and Literature</source>
          . pp.
          <fpage>71</fpage>
          –
          <lpage>75</lpage>
          . Association for Computational Linguistics, Minneapolis, USA (Jun
          <year>2019</year>
          ), https://www.aclweb.org/anthology/W19-2509
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>van Hulst</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasibi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dercksen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balog</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Vries</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          :
          <article-title>REL: An entity linker standing on the shoulders of giants</article-title>
          .
          <source>In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR '20</source>
          . ACM (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ott</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goyal</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Du</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joshi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Levy</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zettlemoyer</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>RoBERTa: A robustly optimized BERT pretraining approach</article-title>
          . arXiv preprint arXiv:1907.11692 (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Piotrowski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Natural language processing for historical texts</article-title>
          .
          <source>Synthesis Lectures on Human Language Technologies 5(2)</source>
          ,
          <fpage>1</fpage>
          &#8211;
          <lpage>157</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Provatorova</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kanoulas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carlgren</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dupr&#233;</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hendriksen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>ArtDATIS: Improving search in multilingual corpora to support art historians</article-title>
          . Digital Humanities Benelux '19
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Tjong Kim Sang</surname>
            ,
            <given-names>E.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Meulder</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition</article-title>
          .
          <source>In: Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003</source>
          . pp.
          <fpage>142</fpage>
          &#8211;
          <lpage>147</lpage>
          (
          <year>2003</year>
          ), https://www.aclweb.org/anthology/W03-0419
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Wolf</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Debut</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanh</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaumond</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Delangue</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cistac</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rault</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Louf</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Funtowicz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.:
          <article-title>HuggingFace's Transformers: State-of-the-art natural language processing</article-title>
          . arXiv preprint arXiv:1910.03771 (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>