<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>LaSTUS-TALN at IberLEF 2019 eHealth-KD Challenge</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alex Bravo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pablo Accuosto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Horacio Saggion</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LaSTUS/TALN Research Group, DTIC Universitat Pompeu Fabra</institution>
          ,
          <addr-line>Spain C/Tanger 122-140, 08018 Barcelona</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>51</fpage>
      <lpage>59</lpage>
      <abstract>
<p>This paper presents the participation of the LaSTUS-TALN team in the IberLEF eHealth-KD 2019 challenge, which proposes two subtasks in the context of biomedical text processing in Spanish: i) the detection and classification of key phrases and ii) the identification of the semantic relationships between them. We propose an architecture based on a bidirectional long short-term memory (BiLSTM) network with a conditional random field (CRF) classifier as the last layer of the network to find and classify the relevant key phrases. Concerning the relation extraction problem, for each candidate relationship we describe a global and a local context, representing the supposed relationship and the context of the candidate key phrases, respectively, and divide the problem into three simpler classification tasks: i) decide whether the entities are related, ii) identify the type of relationship and iii) obtain the correct direction. In our model, these three classification tasks were trained at the same time. When key phrase extraction and relation extraction were run in sequence, our system achieved the third highest F1 score in the main evaluation.</p>
      </abstract>
      <kwd-group>
        <kwd>Information Extraction</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Biomedical Text</kwd>
        <kwd>Natural Language Processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Information Extraction (IE) is the process of finding relevant entities and their
relationships in text [
        <xref ref-type="bibr" rid="ref13 ref14">14, 13</xref>
        ]. This process is a crucial step towards structuring the
valuable knowledge locked in the biomedical literature for a variety of purposes (e.g.
information retrieval, knowledge discovery and document recommendation).
      </p>
      <p>
        Many biomedical challenges have been proposed to promote the
development of systems to extract, classify and index biomedical knowledge, such as
the SemEval¹ and CLEF² campaigns, among others [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Previously, the eHealth-KD 2018 challenge [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], held at the Workshop on
Semantic Analysis at the SEPLN (TASS)³, promoted the development and
evaluation of systems able to automatically extract a large variety of knowledge
from biomedical documents written in Spanish, including the extraction and
classification of key phrases and the semantic relations between them⁴. The challenge
was organized in two subtasks: i) the detection and classification of entities and
ii) the recognition of semantic relationships. Six teams successfully concluded
their participation with a great variety of proposed systems [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In general, the
most competitive approaches in the individual subtasks were led by state-of-the-art
machine learning methods. In particular, for the detection of semantic
relations, deep learning architectures seemed to outperform more classic techniques.
In addition, including domain-specific knowledge (e.g. UMLS) provided a
significant boost to the results. The best results in the detection and classification of
entities were 0.87 and 0.96 F-score, respectively. Regarding the detection of
semantic relations, the best scores were around 0.45 F-score. This reinforces
the belief that relation extraction is still a challenging task.
      </p>
      <p>Recently, in the IberLEF eHealth-KD 2019 challenge, the organizers also
proposed two subtasks (see Fig. 1⁵): i) the recognition of key phrases (subtask A,
covering both identification and classification) and ii) the detection of semantic
relationships between them (subtask B). In this paper, we present our approaches
and results for our participation in this challenge. From the previous challenge,
we could observe that the best systems in the recognition of key phrases do not
correlate with the best systems in relation extraction. Under this assumption,
we propose a different deep learning approach for each subtask.</p>
    </sec>
    <sec id="sec-2">
      <title>1 International Workshop on Semantic Evaluation</title>
    </sec>
    <sec id="sec-3">
      <title>2 Conference and Labs of the Evaluation Forum</title>
    </sec>
    <sec id="sec-4">
      <title>3 http://www.sepln.org/workshops/tass</title>
    </sec>
    <sec id="sec-5">
      <title>4 http://www.sepln.org/workshops/tass/2018/task-3/index.html</title>
    </sec>
    <sec id="sec-6">
      <title>5 https://knowledge-learning.github.io/ehealthkd-2019/tasks</title>
      <sec id="sec-6-1">
        <title>Methods</title>
        <p>Subtask A: Identification and classification of key phrases.
In this section, we describe our proposal for identifying and classifying key
phrases in biomedical texts. Key phrases are considered to be all the entities
(single or multiple words) that represent semantically relevant elements in
a sentence.</p>
        <p>
          There are four potential classes for key phrases, as described in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]:
Concept, Action, Predicate and Reference. The input is a tokenized text document
with one sentence per line. The output consists of a plain text file where each line
represents a key phrase with its unique ID, the positions of the starting and
ending characters of the text span, the assigned category and the full span of text
containing the key phrase.
        </p>
        <p>
          For this task we propose an architecture based on a BiLSTM with a CRF
classifier as the last layer of the network. We based our implementation on the one
made available by the Ubiquitous Knowledge Processing Lab of the Technische
Universität Darmstadt [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]⁶. We use two BiLSTM layers with 100 recurrent units,
the Adam optimizer and a naive dropout probability of 0.25.
        </p>
        <p>In order to make them compatible with the proposed architecture, we
transformed the format of the provided training, development and test sets into text
files containing one token per line. The corresponding classes are encoded in the
standard beginning-inside-outside (BIO) sequence tagging scheme.</p>
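        <p>The span-to-BIO conversion can be sketched as follows. This is a minimal illustration only: the helper name and the simple whitespace tokenization are our own assumptions, not the actual preprocessing code used in the system.</p>

```python
# Hypothetical sketch of converting character-span annotations to BIO tags.
# Assumes whitespace tokenization; the real system uses the tokenized input
# provided by the challenge organizers.

def to_bio(sentence, key_phrases):
    """Tag each whitespace token with B-/I-/O labels from (start, end, type) spans."""
    pairs, pos = [], 0
    for token in sentence.split():
        start = sentence.index(token, pos)   # character offset of this token
        end = start + len(token)
        pos = end
        label = "O"
        for kp_start, kp_end, kp_type in key_phrases:
            if start >= kp_start and end <= kp_end:
                # First token of the span gets B-, the rest get I-.
                label = ("B-" if start == kp_start else "I-") + kp_type
                break
        pairs.append((token, label))
    return pairs

# "asma" occupies characters 3-7 and is annotated as a Concept.
print(to_bio("El asma es tratable", [(3, 7, "Concept")]))
# → [('El', 'O'), ('asma', 'B-Concept'), ('es', 'O'), ('tratable', 'O')]
```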
        <p>
          The tokens are fed into the network as 1024-dimensional embeddings
obtained by averaging the three output layers of a deep contextualized ELMo
model [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. As the official Allen NLP ELMo models⁷ are only available for
English, we used the Spanish pre-trained ELMo models made available by [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]⁸. We
made the necessary modifications to the UKP sequence tagger in order to make
it compatible with these representations, as they are not directly pluggable into
the Allen NLP API used in the original implementation.
        </p>
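        <p>The layer-averaging step above can be illustrated with a small NumPy sketch. Shapes are the only meaningful part here: the random arrays stand in for the real contextualized activations produced by the pre-trained Spanish ELMo model.</p>

```python
import numpy as np

# Illustrative shapes: 3 ELMo output layers, 5 tokens, 1024 dimensions.
# Random values stand in for the real contextualized activations.
layer_outputs = np.random.rand(3, 5, 1024)

# One fixed embedding per token: the unweighted mean over the three layers.
token_embeddings = layer_outputs.mean(axis=0)
print(token_embeddings.shape)  # (5, 1024)
```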
        <p>Subtask B: Detection of semantic relations.
The goal of this subtask is to recognize the thirteen semantic relationships
between the key phrases detected and labelled in each sentence. In addition, every
semantic relation is directed, that is, the involved entities must match the correct
direction.</p>
        <p>In this subtask, we implemented a multi-task learning approach. First, we
broke down subtask B into three simpler classification tasks: i) decide whether the
entities are related, ii) identify the type of relationship and iii) decide the correct
direction. Then, in our model, these three classification tasks were trained at the
same time.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6 https://github.com/UKPLab/elmo-bilstm-cnn-crf</title>
    </sec>
    <sec id="sec-8">
      <title>7 https://allennlp.org/elmo</title>
    </sec>
    <sec id="sec-9">
      <title>8 https://github.com/HIT-SCIR/ELMoForManyLangs</title>
      <p>Before the relation extraction process, we considered as candidates all entity
pairs detected in the same sentence. For each entity pair, we designed a global
context representing the supposed relationship and a local context representing
the environment of each candidate entity.</p>
      <p>
        Global and local contexts of a relationship. Following the philosophy
of [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we have organized the information of each candidate relationship
into two scenarios: global and local contexts. In detail, the global context is based
on the assumption that an association between two entities is more likely to be
expressed within one of three sequences [
        <xref ref-type="bibr" rid="ref10">10</xref>
          ]:
– Fore-Between: the words before and between the two candidates.
– Between: the words between the two candidate entities.
– Between-After: the words between and after the two candidates.
      </p>
      <p>On the other hand, we also defined the local context of each candidate entity.
The local context can provide useful information for the detection of the type and
direction of the relationship, as well as the presence of the relation itself. This
context is also based on a sequence of words (EntityA-Context and
EntityB-Context), which contains the information from the words located to the left and
right of the candidate entities (with a window size of 2).</p>
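      <p>The five sequences described above can be sketched in terms of token indices. This is an illustrative reconstruction under stated assumptions: the function and key names are our own, and entity spans are taken as (start, end) token indices with the end exclusive.</p>

```python
# Hypothetical sketch of building the global (Fore-Between, Between,
# Between-After) and local (EntityA/EntityB) context sequences.

def build_contexts(tokens, span_a, span_b, window=2):
    """span_a and span_b are (start, end) token-index pairs, end exclusive."""
    (a_start, a_end), (b_start, b_end) = sorted([span_a, span_b])
    between = tokens[a_end:b_start]

    def local(start, end):
        # Up to `window` tokens on each side of the entity span.
        return tokens[max(0, start - window):start] + tokens[end:end + window]

    return {
        "fore_between": tokens[:a_start] + between,
        "between": between,
        "between_after": between + tokens[b_end:],
        "entity_a_context": local(a_start, a_end),
        "entity_b_context": local(b_start, b_end),
    }

tokens = "el asma afecta las vias respiratorias".split()
# Entity A = "asma" (token 1), entity B = "vias respiratorias" (tokens 4-5).
ctx = build_contexts(tokens, (1, 2), (4, 6))
print(ctx["between"])  # → ['afecta', 'las']
```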
      <p>In both contexts, each sequence is represented using the concatenation of the
following embeddings: tokens, PoS tags, entity types and dependencies.</p>
      <p>Model. The model consists of a BiLSTM with an attention layer on
top for each concatenated embedding. The model captures the most important
syntactic and semantic information from each sequence (Fore-Between, Between,
Between-After, EntityA-Context and EntityB-Context) to face the three tasks:
to detect whether the key phrases are related, the type of relationship and its
direction. A simplified schema of our model can be seen in Fig. 2. In the following
we explain how the model works.</p>
      <p>First, the sentences were processed with spaCy⁹ to obtain the tokens, PoS tags
and dependencies. From the previous subtask A, we also included the entity type
information. Then, the five sequences are generated from the tokenized sentence.</p>
      <p>Second, the embedding layers transform each token in the sequence into a
set of low-dimensional vectors related to the token itself, the PoS tag, the dependency
and the entity type.</p>
      <p>
        Specifically, in the case of the tokens, an embedding layer was randomly
initialized from a uniform distribution (between -0.8 and 0.8, with 300
dimensions) and then updated with the word vectors
computed from the sentences in Spanish on MedlinePlus by means of fastText [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Similarly, the rest of the embedding layers were also randomly initialized from
a uniform distribution, but with only 10 dimensions and without pre-trained
embeddings.
      </p>
    </sec>
    <sec id="sec-10">
      <title>9 https://spacy.io/</title>
      <p>As shown in Fig. 2, for each token of the sequence, the related
embeddings are concatenated.</p>
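      <p>The per-token concatenation can be sketched as follows, with the dimensions stated in the text (300 for tokens, 10 each for PoS tags, dependencies and entity types). The lookup tables are randomly initialized here purely for illustration; vocabulary sizes are our own placeholder values.</p>

```python
import numpy as np

# Randomly initialized lookup tables (illustrative sizes); in the real model
# the token table is updated with fastText vectors from MedlinePlus.
rng = np.random.default_rng(0)
tok_emb = rng.uniform(-0.8, 0.8, (1000, 300))  # token embeddings
pos_emb = rng.uniform(-0.8, 0.8, (20, 10))     # PoS-tag embeddings
dep_emb = rng.uniform(-0.8, 0.8, (40, 10))     # dependency embeddings
ent_emb = rng.uniform(-0.8, 0.8, (5, 10))      # entity-type embeddings

def embed(token_id, pos_id, dep_id, ent_id):
    """Concatenate the four per-token embeddings into one 330-d vector."""
    return np.concatenate([tok_emb[token_id], pos_emb[pos_id],
                           dep_emb[dep_id], ent_emb[ent_id]])

print(embed(42, 3, 7, 1).shape)  # (330,)
```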
      <p>Next, for each sequence, a BiLSTM layer extracts high-level features from its
corresponding concatenated embeddings. The BiLSTM processes the embedding
sequence in left-to-right and right-to-left order in parallel, keeping its
hidden state through time. Therefore, it gives two hidden states as output at
each step and is able to capture backward and long-range dependencies.</p>
      <p>
        A critical and apparent disadvantage of LSTM models is that they compress
all information into a fixed-length vector, causing an incapability of
remembering long sequences. The attention mechanism aims to overcome the limitation
of the fixed-length vector by keeping relevant information from long sequences.
Attention techniques have recently demonstrated success in multiple areas of
NLP, such as question answering, machine translation, speech recognition and
relation extraction [
        <xref ref-type="bibr" rid="ref1 ref16 ref5 ref8">1, 8, 5, 16</xref>
        ]. For that reason, we added an attention layer
(after each BiLSTM layer), which produces a weight vector and merges word-level
features from each time step into a sequence-level feature vector by multiplying
them by the weight vector [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Furthermore, to alleviate overfitting during training,
we applied dropout regularization, which randomly sets to zero a proportion of
the hidden units during forward propagation, creating more generalizable
representations of the data. In the model, we employ dropout on the embedding and
BiLSTM layers. The dropout rate was set to 0.5 in all cases.
      </p>
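      <p>The word-level attention step can be sketched with NumPy: one scalar weight per time step, then a weighted sum of the BiLSTM hidden states into a single sequence-level vector. The parameter vector is random here; in the real model it is learned jointly with the rest of the network, and the exact scoring function may differ.</p>

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

seq_len, hidden = 8, 200             # 100 units per direction, concatenated
H = np.random.rand(seq_len, hidden)  # BiLSTM outputs, one row per time step
w = np.random.rand(hidden)           # attention parameter vector (learned)

alpha = softmax(np.tanh(H) @ w)      # one normalized weight per time step
sequence_vector = alpha @ H          # weighted merge of word-level features
print(sequence_vector.shape)  # (200,)
```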
      <p>Then, the final relation-level feature vectors produced by the previous
BiLSTM layers feed a dense layer, which directs its output to three
parallel fully-connected layers, one to classify each task. Note that the three final
output layers are connected in cascade, that is, the output of the first classification
(are these entities related?) also feeds the second classification task (type of
semantic relationship?), and the last one (direction of the relationship?) is fed by
the outputs of the previous dense layer and of the first and second classification
tasks.</p>
      <p>Finally, we consider that a relationship has been detected when our model
predicts a positive value in the three classification tasks.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Team</th><th>F-Score</th><th>Precision</th><th>Recall</th></tr>
          </thead>
          <tbody>
            <tr><td>LASTUS-TALN</td><td>0.8167</td><td>0.7997</td><td>0.8344</td></tr>
            <tr><td>Highest Score</td><td>0.8203</td><td>0.8073</td><td>0.8336</td></tr>
            <tr><td>Average Score</td><td>0.7749</td><td>0.7746</td><td>0.7774</td></tr>
            <tr><td>Baseline</td><td>0.5466</td><td>0.5129</td><td>0.5851</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>The organizers proposed a main evaluation scenario (Scenario 1) where subtasks
A and B are performed in sequence. Additionally, two optional scenarios were
considered in order to evaluate each subtask independently of the other (Scenario
2 for subtask A and Scenario 3 for subtask B).</p>
      <p>The results of our system for each scenario are given in Tables 1, 2 and 3. In
addition to our results, we include the highest and average scores obtained by
all the participants in each scenario. Please note that, as there was a bug in our
submitted implementation for subtask B, which was subsequently fixed, Tables
1, 2 and 3 show the official results achieved in the challenge as well as the fixed
results, which we comment on below.</p>
      <p>For subtask A, our system achieved one of the best results in the challenge
(see Table 2). In contrast, for subtask B, our system obtained one of the lowest
scores (see Table 3). When the tasks are performed sequentially (see Table 1), the
errors in subtask B are mitigated by the good performance obtained for subtask
A, leaving our system in the third position of the challenge.</p>
      <sec id="sec-10-1">
        <title>Conclusions</title>
        <p>
          In this paper, we presented the participation of the LaSTUS-TALN team in the
IberLEF eHealth-KD 2019 challenge, which proposed two subtasks: the detection
and classification of key phrases and the identification of the semantic
relationships between them. For the first subtask, we proposed a BiLSTM network for
sequence tagging, based on the architecture made available by the Ubiquitous
Knowledge Processing Lab of the Technische Universität Darmstadt [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. We
adapted this architecture in order to use an alternative version of ELMo deep
contextualized embeddings in Spanish.
        </p>
        <p>On the other hand, we followed a daring philosophy to represent relationships
in multiple contexts. Although our results for subtask B were not as expected, we
think that this representation has a lot of potential. For that reason, our future
work will focus on the study of this representation and how it behaves in neural
networks. In this sense, we want to achieve a performance close to the
state-of-the-art in this challenge.</p>
      </sec>
      <sec id="sec-10-2">
        <title>Acknowledgments</title>
        <p>Funding: This work is partly supported by the Spanish Government under the
María de Maeztu Units of Excellence Programme (MDM-2015-0502).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bahdanau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.:</given-names>
          </string-name>
          <article-title>Neural machine translation by jointly learning to align and translate</article-title>
          .
          <source>arXiv preprint arXiv:1409.0473</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bojanowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joulin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Enriching word vectors with subword information</article-title>
          .
          <source>arXiv preprint arXiv:1607.04606</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bravo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Piñero, J.,
          <string-name>
            <surname>Queralt-Rosinach</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rautschka</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Furlong</surname>
            ,
            <given-names>L.I.</given-names>
          </string-name>
          :
          <article-title>Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research</article-title>
          .
          <source>BMC bioinformatics 16(1)</source>
          ,
          <volume>55</volume>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Che</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , Liu,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <surname>T.</surname>
          </string-name>
          :
          <article-title>Towards Better UD Parsing: Deep Contextualized Word Embeddings, Ensemble, and Treebank Concatenation</article-title>
          .
          <source>In: Proceedings of the CoNLL</source>
          <year>2018</year>
          <article-title>Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies</article-title>
          . pp.
          <volume>55</volume>
          –
          <fpage>64</fpage>
          . Association for Computational Linguistics, Brussels, Belgium (
          <year>October 2018</year>
          ), http://www.aclweb.org/anthology/K18-2005
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chorowski</surname>
            ,
            <given-names>J.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bahdanau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serdyuk</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Attention-based models for speech recognition</article-title>
          .
          <source>In: Advances in neural information processing systems</source>
          . pp.
          <volume>577</volume>
          –
          <issue>585</issue>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Giuliano</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lavelli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romano</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Exploiting shallow linguistic information for relation extraction from biomedical literature</article-title>
          .
          <source>In: 11th Conference of the European Chapter of the Association for Computational Linguistics</source>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gonzalez-Hernandez</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sarker</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Connor</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savova</surname>
          </string-name>
          , G.:
          <article-title>Capturing the patient's perspective: a review of advances in natural language processing of health-related text</article-title>
          .
          <source>Yearbook of medical informatics</source>
          <volume>26</volume>
          (
          <issue>01</issue>
          ),
          <volume>214</volume>
          –
          <fpage>227</fpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. Hermann,
          <string-name>
            <given-names>K.M.</given-names>
            ,
            <surname>Kocisky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Grefenstette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Espeholt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Kay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            ,
            <surname>Suleyman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Blunsom</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          :
          <article-title>Teaching machines to read and comprehend</article-title>
          .
          <source>In: Advances in Neural Information Processing Systems</source>
          . pp.
          <volume>1693</volume>
          –
          <issue>1701</issue>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Martínez Cámara</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Almeida Cruz</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , Díaz Galiano,
          <string-name>
            <given-names>M.C.</given-names>
            ,
            <surname>Estevez-Velarde</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          , García Cumbreras,
          <string-name>
            <surname>M.A.</surname>
          </string-name>
          , García Vega,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Montejo</surname>
          </string-name>
          <string-name>
            <surname>Raez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Montoyo</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          , Muñoz, R., et al.:
          <article-title>Overview of tass 2018: Opinions, health and emotions (</article-title>
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Mooney</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bunescu</surname>
            ,
            <given-names>R.C.</given-names>
          </string-name>
          :
          <article-title>Subsequence kernels for relation extraction</article-title>
          .
          <source>In: Advances in neural information processing systems</source>
          . pp.
          <volume>171</volume>
          –
          <issue>178</issue>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neumann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iyyer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gardner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zettlemoyer</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Deep Contextualized Word Representations</article-title>
          .
          <source>In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (
          <string-name>
            <given-names>Long</given-names>
            <surname>Papers</surname>
          </string-name>
          <article-title>)</article-title>
          .
          <source>vol. 1</source>
          , pp.
          <volume>2227</volume>
          –
          <issue>2237</issue>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Piad-Morffis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutierrez</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Consuegra-Ayala</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Estevez-Velarde</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Almeida-Cruz</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , Muñoz, R.,
          <string-name>
            <surname>Montoyo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
<article-title>Overview of the eHealth Knowledge Discovery Challenge at IberLEF 2019</article-title>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Piskorski</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yangarber</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Information extraction: Past, present and future</article-title>
          . In:
          <string-name>
            <surname>Poibeau</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saggion</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piskorski</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yangarber</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (eds.)
          <source>Multi-source, Multilingual Information Extraction and Summarization</source>
          , pp.
<fpage>23</fpage>
          -
          <lpage>49</lpage>
          .
          <source>Theory and Applications of Natural Language Processing</source>
          , Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Poibeau</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saggion</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piskorski</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
<string-name>
            <surname>Yangarber</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <source>Multi-source, Multilingual Information Extraction and Summarization</source>
          . Springer Publishing Company, Incorporated (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Reimers</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurevych</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
<article-title>Reporting score distributions makes a difference: Performance study of LSTM-networks for sequence tagging</article-title>
          .
          <source>In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
<fpage>338</fpage>
          -
          <lpage>348</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tian</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qi</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hao</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
<article-title>Attention-based bidirectional long short-term memory networks for relation classification</article-title>
          .
<source>In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</source>
          , vol.
          <volume>2</volume>
          , pp.
<fpage>207</fpage>
          -
          <lpage>212</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>