<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>KFU at CLEF eHealth 2017 Task 1: ICD-10 Coding of English Death Certificates with Recurrent Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zulfat Miftahutdinov</string-name>
          <email>zulfatmi@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena Tutubalina</string-name>
          <email>ElVTutubalina@kpfu.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kazan (Volga Region) Federal University</institution>
          ,
          <addr-line>Kazan</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes the participation of the KFU team in the CLEF eHealth 2017 challenge. Specifically, we participated in Task 1, "Multilingual Information Extraction - ICD-10 coding", for which we implemented recurrent neural networks to automatically assign ICD-10 codes to fragments of death certificates written in English. Our system uses a Long Short-Term Memory (LSTM) network to map the input sequence into a vector representation, and then another LSTM to decode the target sequence from that vector. We initialize the input representations with word embeddings trained on user posts in social media. The encoder-decoder model obtained an F-measure of 85.01% on the full test set, a significant improvement over the average score of 62.2% across all participants' approaches. We also obtained a significant improvement, from 26.1% to 44.33%, on an external test set as compared to the average score of the submitted runs.</p>
      </abstract>
      <kwd-group>
        <kwd>ICD-10 coding</kwd>
        <kwd>ICD-10 codes</kwd>
        <kwd>medical concept coding</kwd>
        <kwd>recurrent neural network</kwd>
        <kwd>sequence to sequence</kwd>
        <kwd>sequence-to-sequence architecture</kwd>
        <kwd>encoder-decoder model</kwd>
        <kwd>deep learning</kwd>
        <kwd>machine learning</kwd>
        <kwd>death certificates</kwd>
        <kwd>CepiDC</kwd>
        <kwd>healthcare</kwd>
        <kwd>CLEF eHealth</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Extracting and linking medical information from textual documents has
attracted extensive interest from both academia and industry. Automatic
matching of text phrases to medical concepts and corresponding classification codes
is a highly important task for many clinical applications in the fields of health
management and patient safety.</p>
      <p>The International Classification of Diseases (ICD) is the diagnostic system
used to monitor and classify causes of health problems and death and to
provide information for clinical purposes. Each medical concept is mapped onto
a unique identifier, which consists of a single-letter prefix and several digits.
The letter prefix represents a class of common diseases (e.g., "J" covers
diseases of the respiratory system, "V" covers external causes of morbidity),
and the digits represent a specific type of disease (e.g., "J20.2" covers "acute bronchitis
due to streptococcus", "V25" covers "Motorcycle rider injured in collision with
railway train or railway vehicle"). Table 1 contains examples of ICD-10 codes.</p>
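The code layout just described can be illustrated with a short, hypothetical Python helper (the function name and returned fields are ours, not part of the task):

```python
import re

def parse_icd10(code):
    """Split an ICD-10 code such as 'J20.2' into a letter prefix,
    a three-character category, and an optional subdivision."""
    m = re.fullmatch(r"([A-Z])(\d{2})(?:\.(\d+))?", code)
    if m is None:
        raise ValueError("not an ICD-10-shaped code: " + code)
    prefix, digits, subdivision = m.groups()
    return {"prefix": prefix,             # disease class, e.g. 'J'
            "category": prefix + digits,  # e.g. 'J20'
            "subdivision": subdivision}   # e.g. '2', or None

parse_icd10("J20.2")  # {'prefix': 'J', 'category': 'J20', 'subdivision': '2'}
```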
      <p>
        Machine learning methods have been widely successful in various NLP
applications including named entity recognition and relation extraction [1-3],
machine translation [4-6], opinion mining [7-9], and detection of demographic
information from health-related user posts [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ]. Recurrent Neural Networks (RNNs),
in particular Long Short-Term Memory (LSTM) and Gated Recurrent Units
(GRU), are considered among the most powerful methods for sequence
modeling [12-14, 4]. Motivated by the recent success of deep recurrent networks,
herein we explore an application of RNN-based encoder-decoder models
to the task of automated ICD coding.
      </p>
      <p>
        We describe the participation of our team in Task 1 for English death
certificates. The goal of this task is to assign one or more relevant ICD-10 codes
to sentences in the death certificates. We employ an annotated corpus named
the CepiDC Causes of Death Corpus, which contains free-text descriptions of
causes of death reported by physicians. More specifically, we employ the part of
the corpus with English texts. The CepiDC corpus of French texts was initially
provided for the task of ICD-10 coding in CLEF eHealth 2016 (task 2) [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ].
The organizers recently extended this corpus with additional data for CLEF
eHealth 2017 [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ]. Our neural network relies on two sources of information:
word representations learned from unannotated corpora and a manually curated
ICD-10 dictionary provided by the organizers of the task.
      </p>
      <p>The rest of the paper is structured as follows. Section 2 contains our system
description, and Section 3 provides evaluation results. In Section 4, we discuss some
related work from CLEF eHealth 2016. Finally, Section 5 provides concluding
remarks.</p>
    </sec>
    <sec id="sec-2">
      <title>Our Approach</title>
      <p>The basic idea of our approach is to map the input sequence to a fixed-size
vector, more precisely, some semantic representation of this input, and then
unroll this representation into the target sequence using a neural network model.
This intuition is formally captured in an encoder-decoder architecture. In the
following subsections, we provide a brief description of recurrent neural networks
(RNNs) and the encoder-decoder model.</p>
      <sec id="sec-2-1">
        <title>Recurrent Neural Networks</title>
        <p>
          RNNs are naturally suited to sequence learning, where input and output are
word and label sequences, respectively. An RNN has recurrent hidden states, which
aim to simulate memory, i.e., the activation of a hidden state at every time
step depends on the previous hidden state [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. The recurrent unit computes
a weighted sum of the input signal. Training RNNs to capture long-term
dependencies is difficult due to the effect of vanishing gradients [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ],
so the most widely used modification of the RNN unit is the Long Short-Term
Memory (LSTM) [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. The LSTM provides a "constant error carousel" and does
not preclude free gradient flow. The basic LSTM architecture contains three
gates: an input gate, a forget gate, and an output gate, together with a recurrent cell.
LSTM cells are usually organized in a chain, with outputs of previous LSTMs
connected to the inputs of subsequent LSTMs.
        </p>
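The gating computation described above can be sketched in a few lines of NumPy; this is a toy forward step under our own naming conventions, not the paper's implementation:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b stack the parameters of the four
    blocks (input gate, forget gate, output gate, cell candidate)."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b            # joint pre-activation
    i = sigmoid(z[:n])                    # input gate
    f = sigmoid(z[n:2 * n])               # forget gate
    o = sigmoid(z[2 * n:3 * n])           # output gate
    g = np.tanh(z[3 * n:])                # candidate cell state
    c = f * c_prev + i * g                # additive cell update: the
                                          # "constant error carousel"
    h = o * np.tanh(c)                    # new hidden state
    return h, c
```

Chaining lstm_step over the time steps of a sentence, and running a second chain over the reversed sentence, yields the bidirectional variant.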
        <p>
          An important modification of the basic RNN architecture is the bidirectional
RNN, where both the past and the future context are available at every time step [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
Bidirectional LSTMs, developed by Graves and Schmidhuber [
          <xref ref-type="bibr" rid="ref14 ref21">14, 21</xref>
          ], contain
two chains of LSTM cells flowing in the forward and backward directions, and
the final representation is either a linear combination or simply a concatenation
of their states.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Encoder-Decoder Model</title>
        <p>
          We introduce the sequence-to-sequence architecture, more precisely, an
encoder-decoder model proposed earlier [
          <xref ref-type="bibr" rid="ref4 ref6">4, 6</xref>
          ], for the ICD-10 coding task. As shown in
Figure 1, the model consists of two components based on RNNs: an encoder
and a decoder. The encoder processes the input sequence, while the decoder
generates the output sequence.
        </p>
        <p>
          We adopted the architecture as described in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. As the encoder RNN we used
a bidirectional LSTM; as the decoder RNN we used a left-to-right LSTM. The input
layer of our model consists of vector representations of individual words. Word embedding
models represent each word as a single real-valued vector. Such a representation
groups together words that are semantically and syntactically similar [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. The
word embeddings are trained using an unlabelled corpus of user reviews.
        </p>
        <p>In order to incorporate prior knowledge, we additionally concatenated a
vector of cosine similarities to the encoded state. CLEF participants were provided
with a manually created dictionary. This dictionary, named AmericanDictionary,
contains quadruplets (diagnosis text, Icd1, IcdC, and Icd2 codes). We only consider
pairs (diagnosis text, Icd1) for our system, since most entries in the dictionary
are associated with these codes.</p>
        <p>The vector of cosine similarities was calculated as follows. First, for each ICD-10
code present in the dictionary, a document was constructed by simply
concatenating the diagnosis texts belonging to that code. A TF-IDF transformation was
computed for the resulting document set; thus, every ICD-10 code was provided
with a vector representation. For a given input sequence, its TF-IDF vector
representation was calculated. Using the vector representations of the input sequence
and each ICD-10 code, the vector of cosine similarities was constructed so as to
have in the i-th position the cosine similarity between the input sequence
representation and the i-th ICD-10 code representation.</p>
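A minimal sketch of this feature construction, using scikit-learn and a two-entry toy dictionary in place of the real AmericanDictionary (the codes and texts below are illustrative only):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-in: one concatenated "document" of diagnosis texts per code.
code_texts = {
    "J20.2": "acute bronchitis due to streptococcus",
    "I21": "acute myocardial infarction heart attack",
}
codes = sorted(code_texts)
vectorizer = TfidfVectorizer()
code_matrix = vectorizer.fit_transform(code_texts[c] for c in codes)

def similarity_vector(text):
    """i-th entry: cosine similarity between `text` and codes[i]."""
    return cosine_similarity(vectorizer.transform([text]), code_matrix)[0]

sims = similarity_vector("acute bronchitis")  # highest entry for 'J20.2'
```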
        <p>We have made the implementation of our model available in a GitHub repository<sup>1</sup>.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>In this section, we discuss the performance of our LSTM-based encoder-decoder
model for ICD coding.</p>
      <p><sup>1</sup> https://github.com/dartrevan/clef 2017</p>
      <sec id="sec-3-1">
        <title>Evaluation Dataset</title>
        <p>The CLEF eHealth 2017 Task 1 participants were provided with data from
13,330 death certificates for training. Each certificate contains information about
the demographic attributes of the person (gender, age), other metadata (e.g.,
the location of death), and one or more codes for the primary cause of death.
Diagnostic statements with multiple codes were repeated for each code assigned by
physicians. The test set contained 14,833 certificates.</p>
        <p>The experiments were also carried out on the following sets:
1. The full version of the CepiDC test set, named the "ALL" set.
2. The part of the full test set named the "EXTERNAL" set.</p>
        <p>
          The "ALL" test set consists of texts associated with all ICD codes. The
"EXTERNAL" test set is limited to textual fragments with ICD codes linked with a
particular type of death, called "external causes" or violent deaths. The
"EXTERNAL" set was selected for two reasons: (i) there is a special interest
for public health policies that can target specific ICD codes, e.g., suicide
prevention; (ii) the semantic analysis of the context associated with these deaths
is more complex in terms of comorbidity, affected people, and the language
used to describe the event. External causes are characterized by codes V01 to
Y98. Please refer to the task overview paper [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] for more details.
        </p>
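Since ICD-10 categories are a letter followed by two digits, membership in the external-cause block can be checked lexicographically; a hypothetical one-line helper:

```python
def is_external_cause(code):
    """True for ICD-10 codes in the external-cause block V01-Y98."""
    return "V01" <= code[:3] <= "Y98"

is_external_cause("V25")    # True: transport accident
is_external_cause("J20.2")  # False: respiratory disease
```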
      </sec>
      <sec id="sec-3-2">
        <title>Experimental Setting</title>
        <p>
          Word embeddings We used word embeddings trained on 2.5 million
health-related reviews from [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Statistics of these reviews are presented in Table
2. The embeddings were trained with the Continuous Bag of Words model with
the following parameters: vector size of 200, local context length of 10,
negative sampling of 5, and vocabulary cutoff of 10.
Model tuning To find the optimal neural network configuration and word
embeddings, a 5-fold cross-validation procedure was applied to the training set.
We compared architectures with different numbers of neurons in the hidden layers of
the encoder and decoder LSTMs. The best cross-validation F-score was obtained for
the architecture with 600 neurons in the hidden layer of the encoder LSTM and
1000 neurons in the hidden layer of the decoder LSTM. We tested a bidirectional
LSTM as the decoder but did not achieve an improvement over the left-to-right
LSTM. We also established that 10 epochs are enough for stable performance
on the validation sets.
        </p>
        <p>
          We implemented the networks with the Keras library [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. The LSTM is trained
on top of the embedding layer. We use a 600-dimensional hidden layer for
the encoder RNN chain. Finally, the last hidden state of the encoder LSTM chain,
concatenated with the vector of cosine similarities, is fed into a decoding LSTM layer
with a 1000-dimensional hidden layer and softmax activation. In order to prevent
the neural networks from overfitting, we used dropout of 0.5 [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. We used categorical
cross-entropy as the objective function and the Adam optimizer [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] with a
batch size of 20.
        </p>
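Putting the pieces together, a minimal Keras sketch of this setup; the toy vocabulary, sequence length, and code-inventory sizes are made up, and the single-step decoder is a simplification, while the 600/1000 hidden sizes, dropout of 0.5, Adam, and categorical cross-entropy follow the text:

```python
from tensorflow import keras
from tensorflow.keras import layers

vocab, n_codes, max_len, emb_dim, n_sim = 50, 10, 6, 200, 10  # toy sizes

words = keras.Input(shape=(max_len,))              # token ids of one line
sims = keras.Input(shape=(n_sim,))                 # cosine-similarity features
x = layers.Embedding(vocab, emb_dim)(words)        # pre-trained in the paper
enc = layers.Bidirectional(layers.LSTM(600))(x)    # encoder state
state = layers.Dropout(0.5)(layers.Concatenate()([enc, sims]))
dec = layers.LSTM(1000)(layers.RepeatVector(1)(state))   # decoder step
out = layers.Dense(n_codes, activation="softmax")(dec)   # ICD-10 code scores

model = keras.Model([words, sims], out)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```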
        <p>
          In addition, we evaluated word embeddings trained on biomedical
literature indexed in PubMed from [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] as well as on health-related reviews from
[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. The embeddings trained on health-related reviews showed better results during
cross-validation. We also tried to exploit meta-information along with the cosine
similarities vectors, but we did not observe any significant improvement.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Results</title>
        <p>Our neural models were evaluated on texts in English using common evaluation
metrics such as precision, recall and balanced F-measure. We trained our model
for 10 epochs (Run1) and 15 epochs (Run2). The reported results are presented
in Tables 3 and 4.</p>
        <p>As shown in Tables 3 and 4, our performance results are significantly better
than the average and median scores of all submitted runs. The system obtained
F-scores of 85.01% and 44.33% on the full test set and the "EXTERNAL" set,
respectively. The difference in results between these sets is explained by the small
number of codes in the latter case. The "ALL" set includes 18,928 codes (900 unique
codes), while the "EXTERNAL" set includes only 126 codes (28 unique codes).
We note that RNNs and word embeddings can be successfully applied to medical
concept coding tasks without any task-specific feature engineering effort.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Related Work</title>
      <p>
        Different approaches have been developed for the ICD coding task, mainly falling
into two categories: (i) knowledge-based methods [27-29]; and (ii) machine
learning approaches [
        <xref ref-type="bibr" rid="ref30 ref31">30, 31</xref>
        ].
      </p>
      <p>
        In CLEF eHealth 2016, five teams participated in the shared task 2
on ICD-10 coding of death certificates in French [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Most methods
utilized dictionary-based semantic similarity and, to some extent, string matching.
Mulligen et al. [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] obtained the best results by combining a Solr tagger with
ICD-10 terminologies. The terminologies were derived from the task training set
and a manually curated ICD-10 dictionary. They achieved an F-measure of 84.8%.
Cabot et al. [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] applied an approximate string matching method and obtained
an F-measure of 68.0%. Mottin et al. [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] used a pattern matching approach and
obtained an F-measure of 55.4%. Dermouche et al. [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] applied two machine learning
methods: (i) a supervised extension of Latent Dirichlet Allocation (LDA), i.e.,
Labeled-LDA, and (ii) a Support Vector Machine (SVM) based on bag-of-words
features. For Labeled-LDA, they used ICD-10 codes from the training set as
document classes. The Labeled-LDA and SVM classifiers achieved F-measures
of 73.53% and 75.19%, respectively. This study did not focus on designing
effective features to obtain better classification performance. Zweigenbaum and
Lavergne [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] proposed a classifier with a TF-IDF transformer for tokens and used
cosine similarity for ranking classification codes. They studied the problem of
learning to accurately rank a set of candidate codes obtained as a result of
classification. The authors explored the effectiveness of several groups of features,
including meta-information and n-grams of normalized tokens. They focused
only on statements associated with a single code. The proposed
approach obtained an F-measure of 65.2% due to a low recall of 56.8%. In recent work
[
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], Zweigenbaum and Lavergne utilized a hybrid method combining simple
dictionary projection and mono-label supervised classification. They used a
linear SVM trained on the full training corpus and the 2012 dictionary provided
to CLEF participants. This hybrid method obtained an F-measure of 85.86%.
Overall, the participants of task 2 did not use word embeddings or deep neural
networks, which have proved useful in many natural language processing tasks.
      </p>
      <p>
        Besides experiments on CLEF eHealth data sets, the medical concept coding
task has also been studied by several researchers. Ontologies of medical concepts
such as the Unified Medical Language System (UMLS) [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ], SNOMED CT [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ],
and ICD-9 or ICD-10 are widely used for this task. In order to map texts to
medical concepts in the UMLS, the National Library of Medicine (NLM) developed
MetaMap [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. This system is based on a linguistic approach using variants
of terms and rules. Recent studies applied machine learning methods such as
learning-to-rank methods [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] and convolutional neural networks [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ]. Leaman
et al. introduced the DNorm system based on a pairwise learning-to-rank technique
with a predefined set of features [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. Features were based on a dictionary of
diseases derived from the UMLS Metathesaurus. Recently, Limsopatham and
Collier [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ] experimented with convolutional and recurrent neural networks with
pre-trained word embeddings for mapping social media texts to medical concepts.
The authors observed that training can be effectively achieved at 40-70 epochs
for corpora of tweets and user reviews. Experiments showed that both neural
networks outperformed the DNorm system and a multi-class logistic regression.
Word embeddings trained on a Google News corpus improved significantly over
embeddings trained on medical articles downloaded from BioMed Central. In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
using word embeddings trained on social media produced better scores than using
embeddings trained on PubMed articles for disease named entity recognition.
We also note word embeddings trained on electronic health records [38-40] as a
direction for future work.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper, we have developed RNN-based encoder-decoder models for ICD-10
coding in Task 1 of the 2017 CLEF eHealth evaluation lab. Our results show
that the neural network performs significantly better than the official median and
average computed over the participants' runs, reaching an F-measure of 85.01%
on the full test set. In further studies, we plan to implement other
encoder-decoder architectures and convolutional neural networks. We also plan to carry
out a qualitative analysis of the extracted codes. Additionally, we would like
to explore alternative distributed word representations trained on medical notes
from electronic health records.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work was supported by the Russian Science Foundation grant no.
15-1110019.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Miftahutdinov</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tutubalina</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tropsha</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Identifying Disease-related Expressions in Reviews using Conditional Random Fields</article-title>
          .
          <source>In: Proceedings of International Conference on Computational Linguistics and Intellectual Technologies Dialog</source>
          . Volume
          <volume>1</volume>
          . (
          <year>2017</year>
          )
          <fpage>155</fpage>-<lpage>167</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Zeng</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks</article-title>
          . In: EMNLP. (
          <year>2015</year>
          )
          <fpage>1753</fpage>-<lpage>1762</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Solovyev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ivanov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Knowledge-driven event extraction in Russian: corpus-based linguistic resources</article-title>
          .
          <source>Computational intelligence and neuroscience</source>
          <year>2016</year>
          (
          <year>2016</year>
          )
          <fpage>16</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Merrienboer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gulcehre</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bahdanau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bougares</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwenk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Learning phrase representations using RNN encoder-decoder for statistical machine translation</article-title>
          .
          <source>arXiv preprint arXiv:1406.1078</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , Van Merrienboer,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Bahdanau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <surname>Y.</surname>
          </string-name>
          :
          <article-title>On the properties of neural machine translation: Encoder-decoder approaches</article-title>
          .
          <source>arXiv preprint arXiv:1409.1259</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vinyals</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>Q.V.</given-names>
          </string-name>
          :
          <article-title>Sequence to sequence learning with neural networks</article-title>
          .
          <source>In: Advances in neural information processing systems</source>
          . (
          <year>2014</year>
          )
          <fpage>3104</fpage>-<lpage>3112</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Dos Santos</surname>
            ,
            <given-names>C.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gatti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts</article-title>
          . In: COLING. (
          <year>2014</year>
          )
          <fpage>69</fpage>-<lpage>78</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joty</surname>
            ,
            <given-names>S.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meng</surname>
            ,
            <given-names>H.M.</given-names>
          </string-name>
          :
          <article-title>Fine-grained Opinion Mining with Recurrent Neural Networks and Word Embeddings</article-title>
          . In: EMNLP. (
          <year>2015</year>
          )
          <fpage>1433</fpage>-<lpage>1443</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Deriu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzenbach</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uzdilli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lucchi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Luca</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaggi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>SwissCheese at SemEval-2016 Task 4: Sentiment classification using an ensemble of convolutional neural networks with distant supervision</article-title>
          .
          <source>Proceedings of SemEval</source>
          (
          <year>2016</year>
          )
          <fpage>1124</fpage>-<lpage>1128</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Tutubalina</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikolenko</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <source>Automated Prediction of Demographic Information from Medical User Reviews. In: International Conference on Mining Intelligence and Knowledge Exploration</source>
          , Springer, Cham (
          <year>2016</year>
          )
          <fpage>174</fpage>-<lpage>184</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Benton</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mitchell</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hovy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Multitask learning for mental health conditions with limited social media data</article-title>
          ,
          <source>EACL</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Elman</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          :
          <article-title>Finding structure in time</article-title>
          .
          <source>Cognitive Science</source>
          <volume>14</volume>
          (
          <issue>2</issue>
          )
          (
          <year>1990</year>
          )
          <fpage>179</fpage>
          -
          <lpage>211</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Schuster</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paliwal</surname>
            ,
            <given-names>K.K.</given-names>
          </string-name>
          :
          <article-title>Bidirectional recurrent neural networks</article-title>
          .
          <source>IEEE Transactions on Signal Processing</source>
          <volume>45</volume>
          (
          <issue>11</issue>
          ) (
          <year>1997</year>
          )
          <fpage>2673</fpage>
          -
          <lpage>2681</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Graves</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidhuber</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Bidirectional LSTM networks for improved phoneme classification and recognition</article-title>
          .
          <source>Artificial Neural Networks: Formal Models and Their Applications - ICANN 2005</source>
          (
          <year>2005</year>
          )
          <fpage>753</fpage>
          -
          <lpage>753</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Névéol</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grouin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamon</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lavergne</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rey</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robert</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tannier</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          , et al.:
          <article-title>Clinical information extraction at the CLEF eHealth evaluation lab 2016</article-title>
          . In:
          <source>Proceedings of CLEF 2016 Evaluation Labs and Workshop: Online Working Notes</source>
          . CEUR-WS (
          <year>September 2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Lavergne</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Névéol</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robert</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grouin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rey</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zweigenbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>A Dataset for ICD-10 Coding of Death Certificates: Creation and Usage</article-title>
          .
          <source>BioTxtM 2016</source>
          (
          <year>2016</year>
          )
          <fpage>60</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suominen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Névéol</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robert</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kanoulas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spijker</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palotti</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zuccon</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>CLEF 2017 eHealth Evaluation Lab Overview</article-title>
          .
          <source>Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Névéol</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>R.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>K.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grouin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lavergne</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rey</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robert</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rondet</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zweigenbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>CLEF eHealth 2017 Multilingual Information Extraction task overview: ICD10 coding of death certificates in English and French</article-title>
          . In:
          <source>Working Notes of Conference and Labs of the Evaluation (CLEF) Forum. CEUR Workshop Proceedings</source>
          . (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simard</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frasconi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Learning long-term dependencies with gradient descent is difficult</article-title>
          .
          <source>IEEE Transactions on Neural Networks</source>
          <volume>5</volume>
          (
          <issue>2</issue>
          )
          (
          <year>1994</year>
          )
          <fpage>157</fpage>
          -
          <lpage>166</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Greff</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srivastava</surname>
            ,
            <given-names>R.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koutník</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steunebrink</surname>
            ,
            <given-names>B.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidhuber</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>LSTM: A search space odyssey</article-title>
          .
          <source>IEEE transactions on neural networks and learning systems</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Graves</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidhuber</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Framewise phoneme classification with bidirectional LSTM networks</article-title>
          .
          In:
          <source>Proceedings of the 2005 IEEE International Joint Conference on Neural Networks (IJCNN '05)</source>
          , Volume
          <volume>4</volume>
          , IEEE (
          <year>2005</year>
          )
          <fpage>2047</fpage>
          -
          <lpage>2052</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <source>In: Advances in neural information processing systems</source>
          . (
          <year>2013</year>
          )
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Chollet</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , et al.: Keras. https://github.com/fchollet/keras (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Srivastava</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krizhevsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salakhutdinov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Dropout: a simple way to prevent neural networks from overfitting</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>15</volume>
          (
          <issue>1</issue>
          ) (
          <year>2014</year>
          )
          <fpage>1929</fpage>
          -
          <lpage>1958</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Kingma</surname>
            ,
            <given-names>D.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ba</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Adam: A method for stochastic optimization</article-title>
          .
          <source>In: International Conference on Learning Representations (ICLR)</source>
          .
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Moen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ananiadou</surname>
            ,
            <given-names>T.S.S.</given-names>
          </string-name>
          :
          <article-title>Distributional semantics resources for biomedical text processing</article-title>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Van Mulligen</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Afzal</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Akhondi</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kors</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          :
          <article-title>Erasmus MC at CLEF eHealth 2016: Concept recognition and coding in French texts</article-title>
          ,
          <source>CLEF</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Cabot</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soualmia</surname>
            ,
            <given-names>L.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dahamna</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Darmoni</surname>
            ,
            <given-names>S.J.</given-names>
          </string-name>
          :
          <article-title>SIBM at CLEF eHealth Evaluation Lab 2016: Extracting Concepts in French Medical Texts with ECMT and CIMIND</article-title>
          ,
          <source>CLEF</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Mottin</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gobeill</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mottaz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasche</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaudinat</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruch</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>BiTeM at CLEF eHealth Evaluation Lab 2016 Task 2: Multilingual Information Extraction</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Dermouche</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Looten</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Flicoteaux</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chevret</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Velcin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taright</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>ECSTRA-INSERM@ CLEF eHealth2016-task 2: ICD10 code extraction from death certificates</article-title>
          ,
          <source>CLEF</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Zweigenbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lavergne</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>LIMSI ICD10 coding experiments on CepiDC death certificate statements</article-title>
          ,
          <source>CLEF</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Zweigenbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lavergne</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Hybrid methods for ICD-10 coding of death certificates</article-title>
          .
          <source>EMNLP 2016</source>
          (
          <year>2016</year>
          )
          <fpage>96</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Bodenreider</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>The unified medical language system (UMLS): integrating biomedical terminology</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>32</volume>
          (
          <issue>suppl 1</issue>
          ) (
          <year>2004</year>
          )
          <fpage>D267</fpage>
          -
          <lpage>D270</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Spackman</surname>
            ,
            <given-names>K.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Campbell</surname>
            ,
            <given-names>K.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Côté</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          :
          <article-title>SNOMED RT: a reference terminology for health care</article-title>
          .
          <source>In: Proceedings of the AMIA annual fall symposium</source>
          , American Medical Informatics Association (
          <year>1997</year>
          )
          <fpage>640</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Aronson</surname>
            ,
            <given-names>A.R.</given-names>
          </string-name>
          :
          <article-title>Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program</article-title>
          .
          <source>In: Proceedings of the AMIA Symposium</source>
          , American Medical Informatics Association (
          <year>2001</year>
          )
          <fpage>17</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Leaman</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Islamaj Dogan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>DNorm: disease name normalization with pairwise learning to rank</article-title>
          .
          <source>Bioinformatics</source>
          <volume>29</volume>
          (
          <issue>22</issue>
          ) (
          <year>2013</year>
          )
          <fpage>2909</fpage>
          -
          <lpage>2917</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <surname>Limsopatham</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Collier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Normalising Medical Concepts in Social Media Texts by Learning Semantic Representation</article-title>
          . In: ACL. (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <surname>Grnarova</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidt</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hyland</surname>
            ,
            <given-names>S.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eickhoff</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Neural Document Embeddings for Intensive Care Patient Mortality Prediction</article-title>
          .
          <source>arXiv preprint arXiv:1612.00467</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <surname>Fries</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Center</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Brundlefly at SemEval-2016 Task 12: Recurrent Neural Networks vs. Joint Inference for Clinical Temporal Information Extraction</article-title>
          .
          <source>Proceedings of SemEval</source>
          (
          <year>2016</year>
          )
          <fpage>1274</fpage>
          -
          <lpage>1279</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>