<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>DeLFT and entity-fishing: Tools for CLEF HIPE 2020 Shared Task</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Inria</institution>
          ,
          <addr-line>Paris</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>This article presents an overview of our approaches and results during our participation in the CLEF HIPE 2020 NERC-COARSE-LIT and EL-ONLY tasks for English and French. For these two tasks, we use two systems: 1) DeLFT, a Deep Learning framework for text processing; 2) entity-fishing, a generic named entity recognition and disambiguation service deployed within the technical framework of Inria.</p>
      </abstract>
      <kwd-group>
        <kwd>entity recognition</kwd>
        <kwd>entity linking</kwd>
        <kwd>machine learning</kwd>
        <kwd>deep learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Both DeLFT1 and entity-fishing2 are open-source systems under an Apache
2 license. Code, models, and resources are publicly available through their
GitHub repositories, allowing users and contributors, including us, to use the
existing services and models and to contribute to system and model improvements
and tests. With DeLFT, our goal is to re-build DL-based models for recognizing
mentions belonging to the Person (pers), Location (loc), Organization (org),
Product (prod), and Time (time) classes within the English and French HIPE
historical corpora. Meanwhile, with the entity-fishing service, we disambiguate
the provided mentions within the French and English HIPE data against Wikidata
entries.</p>
    </sec>
    <sec id="sec-2">
      <title>Tools</title>
      <sec id="sec-2-1">
        <title>DeLFT</title>
        <p>This section provides a general description of the two systems we use, DeLFT and
entity-fishing. For a better understanding of technical discussions, it is advisable
to refer directly to their repositories and documentation.</p>
        <p>Deep Learning Framework for Text (DeLFT) is an open-source framework for
text processing, including sequence labeling (e.g., NE tagging) and text
classification problems. This Keras and TensorFlow framework re-implements standard
state-of-the-art DL architectures for text processing.</p>
        <p>
          DeLFT supports many DL architectures (e.g., Bidirectional LSTMs and
Conditional Random Fields [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], Bidirectional LSTM and Convolutional Neural
Networks [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], Bidirectional Gated Recurrent Unit [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]) and contextualized
embeddings (e.g., ELMo, BERT). To use the desired pre-trained word embeddings,
we must provide them separately from their original sources. DeLFT then loads
and manages these embeddings, compiling them into a database at the very first
access.
        </p>
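        <p>As a rough, stdlib-only sketch of this compile-once-then-look-up idea (`EmbeddingStore` and the sqlite3 backing are our own stand-ins, not DeLFT's actual storage layer or API):</p>

```python
import sqlite3
import struct

class EmbeddingStore:
    """Toy stand-in for DeLFT's embedding database."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS vec (word TEXT PRIMARY KEY, data BLOB)")

    def compile(self, txt_lines):
        # One-time pass over the raw "word v1 v2 ..." text format.
        for line in txt_lines:
            parts = line.rstrip().split(" ")
            word, values = parts[0], [float(v) for v in parts[1:]]
            blob = struct.pack(f"{len(values)}f", *values)
            self.db.execute("INSERT OR REPLACE INTO vec VALUES (?, ?)", (word, blob))
        self.db.commit()

    def lookup(self, word, dim):
        row = self.db.execute("SELECT data FROM vec WHERE word = ?", (word,)).fetchone()
        if row is None:
            return [0.0] * dim  # unknown word -> zero vector
        return list(struct.unpack(f"{dim}f", row[0]))

store = EmbeddingStore()
store.compile(["the 0.1 0.2 0.3", "of 0.4 0.5 0.6"])
print(store.lookup("the", 3))  # close to [0.1, 0.2, 0.3] (32-bit precision)
```

        <p>Subsequent accesses then skip the costly parsing of the raw text file and read vectors directly from the database.</p>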
        <p>
          entity-fishing is a generic named entity recognition and disambiguation
(NERD) system that links mentions to Wikidata. Deployed as part
of the French national infrastructure Huma-Num3, the service provides a
standardized interface and an open, flexible architecture, allowing easy deployment,
including in digital humanities contexts. Initiated in the context of the EU FP7
Cendari project from 2013 to 2016, entity-fishing aimed at setting up a digital
research environment for historians of the medieval and WWI periods to access
archival contents and acquire information about numerous assets and entities [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <sec id="sec-2-1-1">
          <title>1 https://github.com/kermitt2/delft 2 https://github.com/kermitt2/entity-fishing 3 https://www.huma-num.fr/</title>
          <p>In general, entity-fishing has three phases: language identification, mention
recognition, and entity resolution. First, language identification is necessary for
selecting appropriate utilities for text processing (e.g., tokenizer, sentence
segmentation) and a specific Wikipedia from the knowledge base. Second, mention
recognition has the responsibility to extract entity mentions from the input.
To support the generic nature, even though prepared with a set of recognizers,
entity-fishing provides the possibility for users to extend with additional ones.</p>
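          <p>The extension point can be pictured as a small recognizer interface; the class and method names below are our own illustration (entity-fishing itself is implemented in Java), not its actual API:</p>

```python
from typing import List, Tuple

class MentionRecognizer:
    """Hypothetical base class: the contract is text in,
    (start, end, surface) spans out."""

    def recognize(self, text: str) -> List[Tuple[int, int, str]]:
        raise NotImplementedError

class DictionaryRecognizer(MentionRecognizer):
    """Toy recognizer matching a fixed vocabulary (first occurrence only)."""

    def __init__(self, vocabulary):
        self.vocabulary = vocabulary

    def recognize(self, text):
        spans = []
        for term in self.vocabulary:
            start = text.find(term)
            if start != -1:
                spans.append((start, start + len(term), term))
        return sorted(spans)

# An aggregator can run any number of such recognizers over the input.
recognizers = [DictionaryRecognizer({"Paris", "Inria"})]
text = "Inria is headquartered near Paris."
mentions = sorted(m for r in recognizers for m in r.recognize(text))
print(mentions)  # [(0, 5, 'Inria'), (28, 33, 'Paris')]
```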
          <p>entity-fishing supports three traditional mention extractors: named entity
recognition, Wikipedia lookup, and acronym extraction. For NER, entity-fishing uses
grobid-ner4, a library for processing texts, extracting named entities, and
classifying these entities into 27 classes (e.g., person, location, media,
organization, period) using a Conditional Random Field (CRF) statistical model.
Meanwhile, Wikipedia lookup is complementary to the machine learning NER
approach. The lookup attempts to find all mentions that correspond to either a
title or an anchor in Wikipedia using an N-gram-based matching approach. For
acronyms, entity-fishing treats them as mentions and uses the expanded base form
for disambiguation. The resolved entity is then propagated in the text to each
occurrence of the acronym. The result of the mention recognition step is an
aggregated list of objects containing raw values from the original text, their actual
positions, and NER classes (within the 27 classes).</p>
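          <p>The Wikipedia lookup described above can be sketched as N-gram matching against a set of known titles and anchors; `KNOWN_SURFACES` is a toy stand-in for the real Wikipedia data, and the overlap handling is simplified:</p>

```python
# Toy stand-in for the set of Wikipedia titles and anchor texts.
KNOWN_SURFACES = {"New York", "York", "United Nations"}

def ngram_lookup(text, max_n=3):
    tokens = text.split()
    matches, covered = [], set()
    for n in range(max_n, 0, -1):          # longer n-grams claim tokens first
        for i in range(len(tokens) - n + 1):
            span = range(i, i + n)
            surface = " ".join(tokens[i:i + n])
            if surface in KNOWN_SURFACES and not covered.intersection(span):
                matches.append((i, i + n, surface))
                covered.update(span)
    return sorted(matches)

print(ngram_lookup("The United Nations met in New York"))
# [(1, 3, 'United Nations'), (5, 7, 'New York')]
```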
          <p>Lastly, entity resolution is the process of matching entity mentions to their
corresponding Wikidata entries. Entity resolution has three stages: candidate
generation, candidate ranking, and candidate selection. In the candidate generation
phase, each mention receives a list of possible candidates for disambiguation.
Then, in candidate ranking, each candidate is assigned a confidence score
calculated as a regression probability over various features. Finally, candidate
selection keeps the best-ranked candidate, if any.</p>
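          <p>A minimal sketch of the three resolution stages, with a toy knowledge base and a linear stand-in for the actual regression scorer:</p>

```python
# Q90 is Paris, France; "Q_alt" is a placeholder second candidate.
# Priors and scoring are illustrative, not entity-fishing's model.
KB = {
    "Paris": [("Q90", 0.92), ("Q_alt", 0.05)],
}

def generate(mention):
    """Candidate generation: look up possible entries for a mention."""
    return KB.get(mention, [])

def rank(candidates, context_bonus=0.0):
    """Candidate ranking: stand-in for the probability regressor."""
    return sorted(((qid, prior + context_bonus) for qid, prior in candidates),
                  key=lambda c: -c[1])

def select(ranked, threshold=0.2):
    """Candidate selection: keep the best candidate above a threshold."""
    return ranked[0][0] if ranked and ranked[0][1] >= threshold else None

print(select(rank(generate("Paris"))))  # Q90
```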
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>Auxiliary Resources</title>
        <p>We use external datasets and embeddings in addition to those provided by the
HIPE organizers.</p>
        <p>Datasets The HIPE corpus consists of training, dev, and test datasets for
each task and language. However, since HIPE does not provide English training
data, we use a model pre-trained on CoNLL-2012 (based on OntoNotes 5.0) [17]
and test it on the HIPE test data.</p>
        <p>Moreover, motivated by the promising French model results published by
DeLFT, we use the annotated French TreeBank (FTB) corpus [19] and the HIPE
data to re-build and benchmark the French NER model. This French corpus of
journalistic genre from the year 1990 is the 2007 annotated version of the FTB
treebank; each mention is annotated with its span, its literal type (sometimes
completed with a subtype), and its unique Aleda identifier. Seven basic classes
are used: Person, Location, Organization, Company, Product, POI (Point of
Interest), and FictionChar (fictional character). The FTB corpus contains 11,636
manually annotated mentions, distributed as 3,761 location names, 3,357 company
names, 2,381 organization names, 2,025 person names, 67 product names, 29
fictional character names, and 15 POIs.</p>
        <sec id="sec-2-2-1">
          <title>4 https://github.com/kermitt2/grobid-ner</title>
          <table-wrap id="tab1">
            <label>Table 1</label>
            <caption>
              <p>Reported F1-scores of DeLFT pre-trained models and published systems on the CoNLL-2003, CoNLL-2012 (OntoNotes 5.0), and FTB corpora.</p>
            </caption>
            <table>
              <thead>
                <tr><th>Model</th><th>CoNLL-2003</th><th>CoNLL-2012</th><th>FTB</th></tr>
              </thead>
              <tbody>
                <tr><td colspan="4">DeLFT models [<xref ref-type="bibr" rid="ref3">3</xref>]</td></tr>
                <tr><td>CoNLL-2003-BiLSTM-CRF + GloVe</td><td>91.35</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiLSTM-CRF + GloVe + ELMo</td><td>92.71</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiLSTM-CRF + GloVe + ELMo + valid set</td><td>93.09</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiLSTM-CNN-CRF + GloVe</td><td>91.07</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiLSTM-CNN-CRF + GloVe + ELMo</td><td>92.57</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiLSTM-CNN-CRF + GloVe + ELMo + valid set</td><td>93.04</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiLSTM-CNN + GloVe</td><td>89.47</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiLSTM-CNN + GloVe + ELMo</td><td>92.00</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiLSTM-CNN + GloVe + ELMo + valid set</td><td>92.16</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiGRU-CRF + GloVe</td><td>90.72</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiGRU-CRF + GloVe + ELMo</td><td>92.44</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BiGRU-CRF + GloVe + ELMo + valid set</td><td>92.71</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BERT-base</td><td>90.90</td><td/><td/></tr>
                <tr><td>CoNLL-2003-BERT-base + CRF</td><td>91.20</td><td/><td/></tr>
                <tr><td>CoNLL-2012-BiLSTM-CRF + fastText</td><td/><td>87.01</td><td/></tr>
                <tr><td>CoNLL-2012-BiLSTM-CRF + fastText + ELMo</td><td/><td>89.01</td><td/></tr>
                <tr><td>FTB-BiLSTM-CRF + fastText</td><td/><td/><td>87.45</td></tr>
                <tr><td>FTB-BiLSTM-CRF + fastText + ELMo</td><td/><td/><td>89.23</td></tr>
                <tr><td colspan="4">Published systems, neural architectures</td></tr>
                <tr><td>Lample, et al. (2016) [<xref ref-type="bibr" rid="ref8">8</xref>]</td><td>90.94</td><td/><td/></tr>
                <tr><td>Ma and Hovy (2016) [<xref ref-type="bibr" rid="ref10">10</xref>]</td><td>91.21</td><td/><td/></tr>
                <tr><td>Chiu and Nichols (2016) [<xref ref-type="bibr" rid="ref2">2</xref>]</td><td>91.62</td><td>86.28</td><td/></tr>
                <tr><td>Peters, et al. (2018) [16]</td><td>92.22</td><td/><td/></tr>
                <tr><td>Devlin, et al. (2018) [<xref ref-type="bibr" rid="ref4">4</xref>]</td><td>92.80</td><td/><td/></tr>
                <tr><td colspan="4">Published systems, non-neural architectures</td></tr>
                <tr><td>Ratinov and Roth (2009) [18]</td><td>90.80</td><td/><td/></tr>
                <tr><td>Passos, et al. (2014) [<xref ref-type="bibr" rid="ref13">13</xref>]</td><td>90.90</td><td>82.30</td><td/></tr>
                <tr><td>Luo, et al. (2015) [<xref ref-type="bibr" rid="ref9">9</xref>]</td><td>89.90</td><td/><td/></tr>
                <tr><td>Luo, et al. (2015) + linking [<xref ref-type="bibr" rid="ref9">9</xref>]</td><td>91.20</td><td/><td/></tr>
              </tbody>
            </table>
          </table-wrap>
          <p>
            Word Embeddings We use various static word embeddings: Global Vectors
for Word Representation (GloVe) [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ], English fastText Common Crawl [
            <xref ref-type="bibr" rid="ref1 ref11">1, 11</xref>
            ],
and French Wikipedia fastText.5 We also use ELMo [16] contextualized word
representations for English6 and French7.
7 https://traces1.inria.fr/oscar/files/models/cc/fr.zip
          </p>
          <p>Models We compare several published NER systems as well as DeLFT
pre-trained models across various corpora (i.e., CoNLL-2003, CoNLL-2012, FTB)
and present them in Table 1.</p>
          <p>
The non-neural machine learning system of [18] achieves a 90.80 F1-score on
CoNLL-2003. [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ] improves on this with 90.90 on CoNLL-2003 and 82.30 on OntoNotes 5.0. [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ] exceeds the previous results with their joint NERD system, which
pushes the F1-score to 91.20; their pure NER system, however, reaches 89.90.
          </p>
          <p>
            Meanwhile, for neural architectures, [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] reaches a 90.94 F1-score on
CoNLL-2003, then [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] improves it to 91.21. [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ] reports an F1-score of 91.62 on
CoNLL-2003 and 86.28 on OntoNotes 5.0. The ELMo-enhanced bidirectional LSTM
with a CRF layer (BiLSTM-CRF) of [16] achieves an average F1-score of 92.22 over
five runs. [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ] performs competitively with state-of-the-art systems: its
BERT-LARGE fine-tuning approach tested on CoNLL-2003 reaches 92.80.
          </p>
          <p>
            DeLFT has reimplemented neural architectures for NER [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ]. Table 1 presents
the best reported F1-scores over ten runs for the English models using CoNLL-2003
and CoNLL-2012 and for the French model using the FTB corpus.
          </p>
          <p>The model trained on CoNLL-2003 with the BiLSTM-CRF architecture
and GloVe word embeddings, tested against the test set, achieves a 91.35
F1-score. The result improves when GloVe is combined with ELMo. With the BERT
architecture and a CRF activation layer for fine-tuning, the model achieves an
average F1-score of 91.20. The best F1-score on CoNLL-2003 is 93.09, obtained
when using both the train and validation datasets with a BiLSTM-CRF
architecture coupled with GloVe and ELMo embeddings. Meanwhile, the
CoNLL-2012-based model with the BiLSTM-CRF architecture and fastText
embeddings achieves an F1-score of 87.01. Adding ELMo increases the score by 2
points to 89.01.</p>
          <p>The French model trained on the FTB corpus with the BiLSTM-CRF
architecture and French Wikipedia fastText embeddings reaches an 87.45 F1-score.
Meanwhile, with French ELMo, the score improves to 89.23.</p>
          <p>From these results, we learn that DL-based systems have better performance
than conventional machine learning systems. The use of contextualized word
embeddings within the BiLSTM-CRF architecture improves the scores. The
results in the CoNLL-2003 column also show that ELMo-based models give better
F1-scores than BERT-based models.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Work Phases</title>
      <p>In general, the HIPE shared task contains two tasks:
1. Named Entity Recognition and Classification (NERC): the recognition and
classification of entity mentions with predefined high-level (i.e., pers, org,
prod, loc, time), finer-grained, or nested entity classes.
2. Named Entity Linking (NEL): the task of matching identified entity mentions
to Wikidata entries, with or without prior knowledge of mention types and
boundaries.</p>
      <sec id="sec-3-1">
        <title>Named Entity Recognition and Classification (NERC)</title>
        <p>Although the English model built with the CoNLL-2003 dataset is promising,
it does not support Time (Date) entities. Moreover, since HIPE
does not provide training data for English, we use a CoNLL-2012 pre-trained
model with the BiLSTM-CRF architecture and ELMo contextualized word
embeddings. For the French model, we enrich the French HIPE dataset (i.e., the
version 1.2 train and dev sets) with annotated FTB data.</p>
        <p>For training the models, we follow the default hyper-parameters8 applied
to the other pre-trained sequence labeling models in DeLFT, except for the batch
size and the maximum number of epochs, for which we use the values indicated in
the tagger script.9</p>
        <p>Challenges Combining data from different environments poses challenges,
particularly because of differing NE class definitions and annotation
guidelines. CoNLL-2012 defines 18 classes. The FTB corpus comes with seven NE
classes. Meanwhile, HIPE uses five classes.</p>
        <p>HIPE annotates absolute dates without months and hours, which conforms
to the CoNLL-2012 DATE class. Furthermore, the HIPE Location (loc) entities
correspond to those belonging to the CoNLL-2012 FAC (i.e., buildings, airports,
highways, bridges), GPE (i.e., countries, cities, states), and LOCATION (i.e.,
non-GPE locations, mountain ranges, bodies of water) entities. It is also
challenging to find an equivalent for the HIPE PROD entities, which we understand
as media products, since CoNLL-2012 classifies them in the ORG class.</p>
        <p>Experiments We benchmarked the French NER model trained with the HIPE
data (i.e., train and dev v-1.2) only against the model trained on HIPE plus the
additional FTB data. The HIPE-only model achieved an F1-score of 85.71 on the
dev set. Meanwhile, the enriched model performs better, with an increase of
almost 3 points to 88.46.</p>
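        <p>The class alignment discussed above can be sketched as a simple lookup table; the DATE, FAC, GPE, LOCATION, and ORG rows follow the text, while the PERSON row and the dropping of unmapped classes are our own illustrative choices:</p>

```python
# Folding CoNLL-2012 classes into HIPE's five coarse classes.
CONLL2012_TO_HIPE = {
    "DATE": "time",      # HIPE absolute dates conform to DATE
    "FAC": "loc",        # buildings, airports, highways, bridges
    "GPE": "loc",        # countries, cities, states
    "LOCATION": "loc",   # non-GPE locations
    "PERSON": "pers",    # illustrative addition
    "ORG": "org",        # note: media products also land here in CoNLL-2012
}

def to_hipe(label):
    """Map a CoNLL-2012 label to a HIPE class; unmapped classes are dropped."""
    return CONLL2012_TO_HIPE.get(label)

print(to_hipe("GPE"))    # loc
print(to_hipe("MONEY"))  # None
```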
      </sec>
      <sec id="sec-3-2">
        <title>Named Entity Linking</title>
        <p>For the NEL task, we use the entity extraction and disambiguation services
provided by entity-fishing. There are several ways to access these services; the
most straightforward is through its RESTful web API.10</p>
        <p>First, we collect the text from the HIPE data. Then, we include this text
as part of the JSON input query. The entity-fishing query processing service
takes as input a JSON structured query and returns the same query enriched
with a list of identified and, when possible, disambiguated entities. The JSON
format of the response is identical to that of the input query. The client must
respect the JSON query format, which is as follows:
{
...
},
"entities": [],
"mentions": ["ner","wikipedia"],
"nbest": 0,
"sentence": false,
"customisation": "generic",
"processSentence": [],
"structure": "grobid"
}
8 https://github.com/kermitt2/delft/blob/master/delft/sequenceLabelling/config.py
9 https://github.com/kermitt2/delft/blob/master/nerTagger.py
10 https://nerd.readthedocs.io/en/latest/restAPI.html</p>
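        <p>A minimal sketch of assembling this query with the Python standard library; the leading "text" and "language" fields follow the entity-fishing REST documentation rather than the truncated listing above, and `SERVICE_URL` is hypothetical:</p>

```python
import json

def build_query(text, lang):
    """Assemble an entity-fishing disambiguation query.

    The "text" and "language" fields follow the entity-fishing REST
    documentation; the remaining fields mirror the query shown above.
    """
    return {
        "text": text,
        "language": {"lang": lang},
        "entities": [],
        "mentions": ["ner", "wikipedia"],
        "nbest": 0,
        "sentence": False,
        "customisation": "generic",
        "processSentence": [],
        "structure": "grobid",
    }

query = build_query("Inria est situé à Paris.", "fr")
payload = json.dumps(query, ensure_ascii=False)

# The payload would then be POSTed to the /service/disambiguate endpoint of
# a running entity-fishing instance (SERVICE_URL is a hypothetical base URL):
#   urllib.request.urlopen(SERVICE_URL + "/service/disambiguate", ...)
print(json.loads(payload)["mentions"])  # ['ner', 'wikipedia']
```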
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>Table 2 lists the best system, our system, and the baseline results for the
NE-COARSE-LIT and EL-ONLY tasks.11 Our NER system performs worse than the L3i
system. However, we perform better than the provided baseline, which is a CRF
sequence classifier. The exception is the French NE-COARSE-LIT-strict result,
which is slightly below the baseline F1-score.</p>
      <p>
        It turns out that our EL system, especially for English, performs better than the
L3i team and the aidalight-baseline, which corresponds to [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Our French EL system
is better than the L3i EL system in terms of recall but considerably worse in terms of
precision.
      </p>
      <p>Table 3 and Table 4 present our English and French NER and EL performance
on the HIPE test data, with detailed false positive and false negative counts.
Strict NER, which is the more demanding task, performs worse than fuzzy NER.
Looking further at each class, we highlight that PROD entities are frequently
misclassified and thus contribute numerous false negatives.</p>
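      <p>The strict/fuzzy distinction can be illustrated with a simplified matcher; this is our own reduction of the HIPE evaluation scheme, not the official scorer:</p>

```python
# Strict matching needs exact boundaries and type; fuzzy matching credits
# an overlapping span of the right type (a simplification of HIPE's scheme).
def overlaps(a, b):
    return b[1] > a[0] and a[1] > b[0]

def match_count(gold, pred, fuzzy=False):
    hits = 0
    for g_start, g_end, g_type in gold:
        for p_start, p_end, p_type in pred:
            exact = (p_start, p_end) == (g_start, g_end)
            if p_type == g_type and (exact or (fuzzy and overlaps((g_start, g_end), (p_start, p_end)))):
                hits += 1
                break
    return hits

gold = [(0, 5, "pers"), (10, 18, "loc")]
pred = [(0, 5, "pers"), (11, 18, "loc")]   # second span boundary is off by one
print(match_count(gold, pred, fuzzy=False))  # 1
print(match_count(gold, pred, fuzzy=True))   # 2
```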
      <p>[Table: Lang / Evaluation / Label rows for the EN and FR NE-COARSE-LIT micro-fuzzy and micro-strict results.]</p>
      <p>11 https://github.com/impresso/CLEF-HIPE-2020/blob/master/evaluationresults/ranking_summary_final.md</p>
      <p>Although DeLFT and entity-fishing achieve good F1-scores, their performance
is quite sensitive to noisy data.</p>
      <p>Acknowledgements We thank the anonymous reviewers for their insightful
comments.</p>
      <p>16. Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K.,
Zettlemoyer, L.: Deep contextualized word representations. arXiv preprint
arXiv:1802.05365 (2018)
17. Pradhan, S., Moschitti, A., Xue, N., Uryupina, O., Zhang, Y.: CoNLL-2012 shared
task: Modeling multilingual unrestricted coreference in OntoNotes. In: Joint
Conference on EMNLP and CoNLL - Shared Task. pp. 1-40 (2012)
18. Ratinov, L., Roth, D.: Design challenges and misconceptions in named entity
recognition. In: Proceedings of the Thirteenth Conference on Computational Natural
Language Learning (CoNLL-2009). pp. 147-155 (2009)
19. Sagot, B., Richard, M., Stern, R.: Annotation référentielle du Corpus Arboré de
Paris 7 en entités nommées. In: Antoniadis, G., Blanchon, H., Sérasset, G. (eds.)
Traitement Automatique des Langues Naturelles (TALN). Actes de la conférence
conjointe JEP-TALN-RECITAL 2012, vol. 2 - TALN. Grenoble, France (Jun 2012),
https://hal.inria.fr/hal-00703108
20. Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task:
Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
21. Santos, C.N.d., Guimaraes, V.: Boosting named entity recognition with neural
character embeddings. arXiv preprint arXiv:1505.05008 (2015)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bojanowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joulin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Enriching word vectors with subword information</article-title>
          .
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>5</volume>
          ,
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chiu</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nichols</surname>
          </string-name>
          , E.:
          <article-title>Named entity recognition with bidirectional lstm-cnns</article-title>
          .
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>4</volume>
          ,
          <fpage>357</fpage>
          -
          <lpage>370</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. Delft. https://github.com/kermitt2/delft (2018-2020)</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Bert: Pre-training of deep bidirectional transformers for language understanding</article-title>
          . arXiv preprint arXiv:
          <year>1810</year>
          .
          <volume>04805</volume>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ehrmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romanello</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Flückiger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clematide</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <source>Overview of CLEF HIPE</source>
          <year>2020</year>
          :
          <article-title>Named Entity Recognition and Linking on Historical Newspapers</article-title>
          . In: Arampatzis,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Kanoulas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Tsikrika</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Vrochidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Joho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Lioma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Eickhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Névéol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Cappellato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          , N. (eds.)
          <article-title>Experimental IR Meets Multilinguality, Multimodality, and Interaction</article-title>
          .
          <source>Proceedings of the 11th International Conference of the CLEF Association (CLEF</source>
          <year>2020</year>
          ).
          <source>Lecture Notes in Computer Science (LNCS)</source>
          , vol.
          <volume>12260</volume>
          . Springer (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Foppiano</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romary</surname>
          </string-name>
          , L.:
          <article-title>entity-fishing: a DARIAH entity recognition and disambiguation service</article-title>
          .
          <source>In: Digital Scholarship in the Humanities</source>
          . Tokyo, Japan (Sep
          <year>2018</year>
          ), https://hal.inria.fr/hal-01812100
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Habibi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weber</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neves</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiegandt</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leser</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Deep learning with word embeddings improves biomedical named entity recognition</article-title>
          .
          <source>Bioinformatics</source>
          <volume>33</volume>
          (
          <issue>14</issue>
          ),
          <fpage>i37</fpage>
          -
          <lpage>i48</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lample</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ballesteros</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Subramanian</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kawakami</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dyer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Neural architectures for named entity recognition</article-title>
          .
          <source>arXiv preprint arXiv:1603.01360</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nie</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Joint entity recognition and disambiguation</article-title>
          .
          <source>In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <fpage>879</fpage>
          -
          <lpage>888</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hovy</surname>
          </string-name>
          , E.:
          <article-title>End-to-end sequence labeling via bi-directional lstm-cnns-crf</article-title>
          .
          <source>arXiv preprint arXiv:1603.01354</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bojanowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Puhrsch</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joulin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Advances in pretraining distributed word representations</article-title>
          .
          <source>arXiv preprint arXiv:1712.09405</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>D.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoffart</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Theobald</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weikum</surname>
          </string-name>
          , G.:
          <article-title>Aida-light: High-throughput named-entity disambiguation</article-title>
          .
          <source>LDOW</source>
          <volume>1184</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Passos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCallum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Lexicon infused phrase embeddings for named entity resolution</article-title>
          .
          <source>arXiv preprint arXiv:1404.5367</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
          </string-name>
          , C.D.: Glove:
          <article-title>Global vectors for word representation</article-title>
          .
          <source>In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</source>
          . pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>M.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ammar</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhagavatula</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Power</surname>
          </string-name>
          , R.:
          <article-title>Semi-supervised sequence tagging with bidirectional language models</article-title>
          .
          <source>arXiv preprint arXiv:1705.00108</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>