<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>IxaMed at CLEF eHealth 2018 Task 1: ICD10 Coding with a Sequence-to-Sequence approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A. Atutxa</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. Casillas</string-name>
          <email>arantza.casillas@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>N. Ezeiza</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>V. Fresno</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>I. Goenaga</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>K. Gojenola</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>R. Martínez</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M. Oronoz</string-name>
          <email>maite.oronozg@ehu.eus</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>O. Perez-de-Viñaspre</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dpt. Electricity and Electronics, Fac. of Science and Technology, UPV/EHU Leioa</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dpt. Languages and Computer Systems, Faculty of Informatics, UPV/EHU Donostia-San Sebastian</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dpt. Languages and Computer Systems, School of Engineering, UPV/EHU Bilbao</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Dpt. Languages and Computer Systems. School of Computer Engineering</institution>
          ,
          <addr-line>UNED Madrid</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Hospital systems routinely assign disease codes (ICD10 codes) to medical records. The challenge lies in handling the natural, non-standard language in which doctors express their diagnoses and, additionally, in solving a large-scale classification problem, as there are thousands of possible codes. In this working notes paper, we present our system and the results of the CLEF 2018 eHealth Evaluation Task 1 on Multilingual Information Extraction - ICD10 coding. This benchmark addresses information extraction in written text with a focus on several languages, specifically Hungarian, Italian and French. The goal is to automatically assign ICD10 codes to the diagnostic terms of death certificates. The problem can be cast in different ways, for example as a multilabel classification task or as sequence-to-sequence prediction. Our proposal follows the latter approach, with promising results, well above the average results for the task. It relies only on the material provided by the task organizers, allowing the application of the same system to all datasets.</p>
      </abstract>
      <kwd-group>
        <kwd>Natural language processing</kwd>
        <kwd>Clinical texts</kwd>
        <kwd>ICD10 coding</kwd>
        <kwd>Death certificates</kwd>
        <kwd>Machine learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The aim of this paper is to explore computer-aided approaches to classify Medical
Records following the World Health Organization's International Classification
of Diseases (ICD). These records are written in different languages, and the
CLEF 2018 eHealth Evaluation Task 1 on Multilingual Information Extraction
consists of assigning the right ICD10 codes [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] according to the diagnostic
terms provided for each Medical Record. Medical Records belong to several
services (pharmacy, documentation, etc.) and coding them correctly is crucial
for exchanging and consulting medical information on a daily basis, as the ICD codes
serve as a reference to exchange information (e.g., on billing, epidemiology or
mortality) between hospitals in a country and even between countries. So far, it is
common practice in hospitals to classify the records manually, but there
is increasing interest in automatic or semi-automatic
classification due to, amongst others, economic factors. According to [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], the
approximate cost of ICD-9-CM coding of clinical records and of correcting related errors
is estimated to be about $25 billion per year in the US. ICD-10-CM coding
is more complex than the previous ICD-9-CM and the costs will presumably be
higher. For the Clinical Documentation Services, automatically classifying 1%
of the electronic health records would have an outstanding impact in terms of
person-months of work.
      </p>
      <p>However, the encoding of diagnoses with ICD codes is a difficult, time-consuming
and expensive task for health services. These records are written in a
non-standard medical language, causing problems for retrieving and
exchanging information due to elements such as misspellings or colloquial and specific
language. This lack of standardization also poses a challenge for the automatic
classification process due to:</p>
      <list list-type="bullet">
        <list-item><p>Acronyms: the adoption of non-standard contractions for the word-forms.</p></list-item>
        <list-item><p>Abbreviations.</p></list-item>
        <list-item><p>Omissions: often prepositions, articles or verbs are omitted in an attempt to write the word-form quickly.</p></list-item>
        <list-item><p>Synonyms: some technical words are typically replaced by others.</p></list-item>
        <list-item><p>Misspellings: sometimes words are incorrectly written.</p></list-item>
      </list>
      <p>The IxaMed group has approached automatic ICD10 coding for French,
Italian and Hungarian with a neural model that tries to map the input text
snippets to the output ICD10 codes. Our solution does not make assumptions
about the content of the input and output data, treating them by means of a
machine learning approach that assigns a set of labels to any input line. The
solution is language-independent, in the sense that treating a new language only
needs a set of (input, output) examples, making no use of language-specific
information apart from terminological resources such as ICD10 dictionaries, when
available.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>Computer-aided classification of medical records can be seen as a pattern
recognition task, as the aim is to recognize unknown instances of expressions and
assign them one or more elements from a set of possible labels. This problem
has been approached in several tasks and challenges using different techniques.</p>
      <p>
        The 2007 Computational Medicine Challenge [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the first shared task related
to ICD coding, was designed: (i) to facilitate advances in mining clinical free
text and (ii) to create a publicly available gold standard that could serve as
the seed for a larger, open source clinical corpus. This Challenge involved the
classification of English clinical free texts by automatically assigning ICD-9-CM
codes in a limited domain devoted to radiology reports. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] addressed this shared
task employing machine learning approaches. Their results showed that
handcrafted systems could be reproduced by replacing several laborious steps in their
construction with machine learning models, reporting an F1-measure of 0.8893.
In contrast to [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we focus on the entire scope of the ICD10 catalog. That is,
while they were dealing with 45 classes, we have to cope with thousands of
classes.
      </p>
      <p>
        Perez et al. [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ] proposed the use of Finite-State Transducers (FSTs) that
constrain the allowed input diagnostic string, synchronously producing the
output ICD class. FSTs are versatile and efficient for implementing soft-matching
operations between terms expressed in natural language and standard terms and,
hence, the final ICD code. The FSTs were built from corpora and
standard resources such as ICD-9-CM and SNOMED CT, amongst others. An
F1-measure of 0.9120 was achieved on a test set of 2,850 randomly selected
diagnostic terms. A difference with the present work is that in their system the
input diagnostic terms were correctly aligned by physicians one by one, while in
the 2018 shared task most ICD10 codes are aligned only at the document level,
which makes the task harder.
      </p>
      <p>
        Perez et al. [11] tackle diagnostic term normalization employing Weighted
Finite-State Transducers (WFSTs) that learn how to translate sequences into
standard representations given a set of samples. They are highly flexible and
easily adaptable to the terminological singularities of different hospitals and
practitioners. They also implemented a similarity metric to enhance
spontaneous-standard term matching. Looking at their results, they found that only 7.71%
of the diagnostics were written in their standard form matching the ICD. This
WFST-based system enabled matching spontaneous ICD codes with a Mean
Reciprocal Rank of 0.68, which means that, on average, the right ICD code for
each diagnosis is found between the first and second position among the
normalized set of candidates. Similarly, Almagro et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] experiment with a combination of
techniques for ICD-10 coding in Spanish.
      </p>
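      <p>For reference, the standard definition of the Mean Reciprocal Rank (MRR) makes this reading explicit. The symbols N (number of diagnoses) and rank_i (position of the correct code in the candidate list for diagnosis i) are notation introduced here for illustration, not taken from the cited work. Strictly, the inverse of the MRR is a harmonic mean of the ranks rather than an arithmetic one, so the "between first and second position" interpretation is approximate.</p>
      <preformat><![CDATA[
```latex
\mathrm{MRR} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\mathrm{rank}_i},
\qquad
\frac{1}{\mathrm{MRR}} = \frac{1}{0.68} \approx 1.47
```
]]></preformat>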
      <p>
        CLEF eHealth 2017 Task 1 is a similar challenge, but multilingual, since the
texts were in both English and French, and more extensive, because it was not limited to
a specific service and employed ICD-10-CM for coding instead of ICD-9-CM. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
implemented recurrent neural networks to automatically assign ICD10 codes to
fragments of death certificates written in English. Their system used a Long
Short-Term Memory (LSTM) network to map the input sequence into a vector representation,
and then another LSTM to decode the target sequence from the vector. They
initialized the input representations with word embeddings trained on user posts
in social media. Their encoder-decoder model obtained an F-measure of 0.8501
on a test set, a significant improvement compared to the average score of
0.6220 for all participants' approaches.
      </p>
      <p>
        Other systems presented at the CLEF 2017 shared task made use of varied
approaches. The authors of [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] composed a large-scale feature set comprising more
than 40k features based on bag of words, bag of 2-grams, bag of 3-grams, latent
Dirichlet allocation, and the ontologies of WordNet and UMLS. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] used concept
detection and normalization experiments, starting from dictionary projection
and supervised multi-class, mono-label text classification using simple features,
and extending the system in several dimensions with multi-label classification
and new features, including a combination of dictionary and classifier.
      </p>
      <p>To summarize, the problem is complex to characterize
due to multiple factors, such as non-standard language variation, spontaneous
writing, and large-scale multilabel classification. Accordingly, there are many
varied approaches to tackle it, ranging from knowledge-based solutions to
statistical and deep learning ones.</p>
    </sec>
    <sec id="sec-3">
      <title>Resources and Methods</title>
      <sec id="sec-3-1">
        <title>Corpus</title>
        <p>
          In the present challenge [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], French, Italian and Hungarian are the languages
under study. There are two sources of information:
        </p>
        <list list-type="bullet">
          <list-item><p>ICD-10 dictionaries.</p></list-item>
          <list-item><p>Different sets of documents and their corresponding (text lines, ICD10 code) pairs.</p></list-item>
        </list>
        <p>The sets of document-ICD10 codes come in two different formats: raw and
aligned, though the aligned version is only available for French. For the raw
version, the diagnostic terms as expressed in the original death certificate are stored
in one file (CausesBrutes, see Table 1), separately from the coding, which is stored
in another one (CausesCalculees, see Table 2). The two can be linked
through indexing information common to both: document identifier,
year of the death certificate, and line number within the death certificate
representing the exact location in the text. As previously stated, diagnostic terms
in the CausesBrutes files appear as originally expressed in the death certificates
and therefore they show orthographic misspellings (infacrtus vs. infarctus) and
abbreviations (HTA vs. hypertension artérielle, see Table 1).</p>
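        <p>As an illustration only, the linking of the two raw files on their shared index could be sketched as follows in Python. The field order, the function name link_lines and the toy rows are assumptions made for this example; they do not reproduce the actual file layout or any real certificate.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict

# Toy rows invented for illustration; the field order is an assumption.
# CausesBrutes-like rows: (document id, year, line number, raw text)
raw_rows = [
    ("d1", 2014, 1, "infacrtus"),
    ("d1", 2014, 2, "HTA"),
    ("d1", 2014, 3, "texte sans code"),
]
# CausesCalculees-like rows: (document id, year, line number, ICD10 code)
code_rows = [
    ("d1", 2014, 1, "I219"),
    ("d1", 2014, 2, "I10"),
]


def link_lines(raw_rows, code_rows):
    """Join raw text lines with codes on (document id, year, line number)."""
    codes_by_key = defaultdict(list)
    for doc_id, year, line_no, code in code_rows:
        codes_by_key[(doc_id, year, line_no)].append(code)

    linked = []
    for doc_id, year, line_no, text in raw_rows:
        key = (doc_id, year, line_no)
        # A raw line may end up with zero, one or several codes.
        linked.append((key, text, codes_by_key.get(key, [])))
    return linked


linked = link_lines(raw_rows, code_rows)
```
]]></preformat>
        <p>Keeping lines with an empty code list, rather than dropping them, makes the missing correspondences visible for later inspection.</p>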
        <p>It is important to notice that a one-to-one correspondence between the raw
diagnostic term and the ICD code is not assured. Mismatches occur, like the ones
shown in document 100569, where line 5 in Table 1 has no correspondence in
Table 2. The correspondence appears in line 6. A single line may also contain more
than one diagnostic term, separated by commas or coordinated by
means of complementizers or prepositions.</p>
        <p>In the aligned version, the original text is accompanied by the standard
text and the associated ICD code (see Table 3).</p>
      </sec>
      <sec id="sec-3-2">
        <title>Description of the System</title>
        <p>Preprocessing. With the aim of improving ICD assignment, we preprocessed
the raw corpora to organize the information at three levels: document level, line
level and finally ICD level. At the document level and line level, we grouped all
diagnostic terms and ICD codes by document and by line respectively, hoping
that the system could capture dependencies among the different ICD codes. It
seems logical to think that ICD codes within a document or within a line are
related to each other and, if so, recognizing them jointly might be helpful. The
preprocessing to obtain the line level information consisted mostly in trying to
overcome the alignment mistakes in the original corpus as shown in Tables 1
and 2. At the ICD level, we treated each (diagnostic term, ICD) pair separately,
aiming to simplify the assignment process but at the cost of missing any
interrelation that could exist. This level required a more refined preprocessing, since
the original information was set at the line level. Remember that certain lines
contained several diagnostic terms.</p>
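        <p>The three levels of organization can be pictured with a small Python sketch. It assumes, purely for illustration, that records are already aligned term by term (which the raw corpus does not guarantee); the helper group_pairs, the record layout and the toy rows are hypothetical and are not the preprocessing actually used.</p>
        <preformat><![CDATA[
```python
# Toy, already term-aligned records (an assumption made for illustration):
# (document id, line number, diagnostic term, ICD10 code)
records = [
    ("d1", 1, "infarctus", "I219"),
    ("d1", 2, "hta", "I10"),
    ("d1", 2, "diabete", "E149"),
]


def group_pairs(records, key_fn):
    """Build (source text, target code sequence) pairs, one per group."""
    grouped = {}  # dicts keep insertion order in Python 3.7+
    for rec in records:
        terms, codes = grouped.setdefault(key_fn(rec), ([], []))
        terms.append(rec[2])
        codes.append(rec[3])
    return [(" ".join(t), " ".join(c)) for t, c in grouped.values()]


# Document level: all terms of a document map to all of its codes.
doc_level = group_pairs(records, lambda r: r[0])
# Line level: the terms of one line map to the codes of that line.
line_level = group_pairs(records, lambda r: (r[0], r[1]))
# ICD level: each (term, code) pair is an independent example.
icd_level = [(term, code) for _, _, term, code in records]
```
]]></preformat>
        <p>In this toy example the line-level pairs keep "hta diabete" together with both of its codes, whereas the ICD-level pairs lose that co-occurrence.</p>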
        <p>As a first step in normalization, the input texts were preprocessed in the
following order: tokenization, lowercasing and substitution of accents. These are
standard operations in sequence-to-sequence learning that help to improve the
results.</p>
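        <p>A minimal sketch of such a normalization pipeline, using only the Python standard library, is given below. The regular-expression tokenizer and the function names are assumptions chosen for the example; the tools actually used for each step are not specified in the text.</p>
        <preformat><![CDATA[
```python
import re
import unicodedata


def strip_accents(text):
    """Replace accented characters by their unaccented base characters."""
    decomposed = unicodedata.normalize("NFD", text)
    # Drop combining marks (Unicode category Mn) left by the decomposition.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def normalize(line):
    """Tokenize, lowercase and remove accents, in that order."""
    # Simple tokenizer (an assumption): runs of word characters,
    # or single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", line)
    tokens = [tok.lower() for tok in tokens]
    return [strip_accents(tok) for tok in tokens]


example = normalize("HTA, Hypertension art\u00e9rielle")
```
]]></preformat>
        <p>Because the same function applies to any Unicode text, it can be reused unchanged for French, Italian and Hungarian input.</p>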
        <p>ICD10 coding. In neural sequence-to-sequence modeling, the encoder-decoder
model has been used to encode a variable-length input sequence of tokens into
a sequence of vector representations, and to then decode those representations
into a sequence of output tokens, in this case ICD10 codes.</p>
        <p>This decoding is conditioned on information from both the latent input vector
encodings as well as its own continually updated internal state, motivating the
idea that the model should be able to capture meanings and interactions beyond
those at the word level [12, 13].</p>
        <p>The supplied data was divided into three subsets. Systems trained on the first subset were iteratively
evaluated on a second, hold-out evaluation set and, finally, the best performing
system was evaluated on an independent third set. For the final submission,
the training and hold-out sets were merged, using the third subset for iterative
evaluation, and applying the best system to the unseen test set.</p>
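        <p>The splitting protocol can be sketched in Python as follows. The 80/10/10 proportions, the random seed and the function name three_way_split are assumptions for illustration only, since the actual subset sizes are not stated.</p>
        <preformat><![CDATA[
```python
import random


def three_way_split(examples, seed=0, fractions=(0.8, 0.1)):
    """Shuffle and split into training, hold-out and independent subsets.

    The fractions are hypothetical; the third subset takes the remainder.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * fractions[0])
    n_dev = int(len(items) * fractions[1])
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    held_out = items[n_train + n_dev:]
    return train, dev, held_out


# Toy (input, output) pairs standing in for annotated lines.
examples = [("text %d" % i, "code %d" % i) for i in range(20)]
train, dev, held_out = three_way_split(examples)

# Final submission setting: merge training and hold-out data and
# use the third subset for iterative evaluation.
final_train = train + dev
final_dev = held_out
```
]]></preformat>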
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results and Discussion</title>
      <p>the diagnostic codes which is kept at the line level but cannot be assumed at
document level.</p>
      <p>This work tackles the classification of medical records following the ICD10
standard. The classification problem is tough for several reasons: 1) the gap between
spontaneous written language and the standard one; and 2) it is a large-scale
classification problem, in which the number of possible classes is the number of different
diseases within the ICD10 catalogue.</p>
      <p>Our best system showed high-quality results, and this fact opens a promising
avenue for the task of automatically assigning ICD10 codes to medical
documents. Moreover, the method is language-independent and it allows efficient
training, given only a set of annotated documents, not requiring complex feature
engineering.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This work has been partially funded by:</p>
      <p>We gratefully acknowledge the support of NVIDIA Corporation with the
donation of the Titan X Pascal GPU used for this research.</p>
      <p>11. Perez, A., Atutxa, A., Casillas, A., Gojenola, K., Sellart, A.: Inferred joint
multigram models for medical term normalization according to ICD. International
Journal of Medical Informatics, 110, pp. 111–117, 2018.</p>
      <p>12. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to Sequence Learning with Neural
Networks. Advances in Neural Information Processing Systems 27: Annual
Conference on Neural Information Processing Systems (NIPS), pp. 3104–3112, 2014.</p>
      <p>13. Cho, K., van Merrienboer, B., Gulcehre, C., Bougares, F., Schwenk, H., Bengio,
Y.: Learning Phrase Representations using RNN Encoder-Decoder for Statistical
Machine Translation. Conference on Empirical Methods in Natural Language
Processing (EMNLP 2014), 2014.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Suominen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kanoulas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Azzopardi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spijker</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neveol</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramadier</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robert</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palotti</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zuccon</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <article-title>Overview of the CLEF eHealth Evaluation Lab 2018</article-title>
          .
          <source>In: CLEF 2018 - 8th Conference and Labs of the Evaluation Forum, Lecture Notes in Computer Science (LNCS)</source>
          , Springer,
          <year>September 2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Neveol</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robert</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grippo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morgand</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orsi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pelikan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramadier</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rey</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zweigenbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>CLEF eHealth 2018 Multilingual Information Extraction task Overview: ICD10 Coding of Death Certificates in French, Hungarian and Italian</article-title>
          . In:
          <source>CLEF 2018 Evaluation Labs and Workshop: Online Working Notes</source>
          , CEUR-WS,
          <year>September 2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Farkas</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Szarvas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Automatic construction of rule-based ICD-9-CM coding systems</article-title>
          .
          <source>BMC Bioinformatics</source>
          ,
          <volume>9</volume>
          (
          <issue>Suppl. 3</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>9</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Perez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casillas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gojenola</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oronoz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aguirre</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amillano</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>The aid of machine learning to overcome the classification of real health discharge reports written in Spanish</article-title>
          .
          <source>Revista de Procesamiento de Lenguaje Natural</source>
          (ISSN: 1135-5948),
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Perez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gojenola</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casillas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oronoz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Díaz de Ilarraza</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Computer aided classification of diagnostic terms in Spanish</article-title>
          .
          <source>Expert Systems with Applications</source>
          ,
          <volume>42</volume>
          (
          <issue>6</issue>
          ),
          <fpage>2949</fpage>
          –
          <lpage>2958</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Almagro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martínez</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fresno</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montalvo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Estudio preliminar de la anotación automática de códigos CIE-10 en informes de alta hospitalarios</article-title>
          .
          <source>Revista de Procesamiento de Lenguaje Natural</source>
          (ISSN: 1135-5948),
          <volume>60</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Pestian</surname>
            ,
            <given-names>John P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brew</surname>
            ,
            <given-names>Christopher</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matykiewicz</surname>
            ,
            <given-names>Pawel</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hovermale</surname>
            ,
            <given-names>D. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>Neil</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>K. Bretonnel</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duch</surname>
            ,
            <given-names>Wlodzislaw</given-names>
          </string-name>
          :
          <article-title>A Shared Task Involving Multilabel Classification of Clinical Free Text</article-title>
          .
          <source>In: Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing, BioNLP '07</source>
          , pp.
          <fpage>97</fpage>
          –
          <lpage>104</lpage>
          . Association for Computational Linguistics, Stroudsburg, PA, USA,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Miftahutdinov</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tutubalina</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>KFU at CLEF eHealth 2017 Task 1: ICD-10 Coding of English Death Certificates with Recurrent Neural Networks</article-title>
          .
          <source>In: CLEF 2017 Conference and Labs of the Evaluation Forum</source>
          , Online Working Notes, CEUR-WS,
          <year>September 2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Zweigenbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lavergne</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Multiple Methods for Multi-class, Multi-label ICD10 Coding of Multi-granularity, Multilingual Death Certificates</article-title>
          .
          <source>In: CLEF 2017 Conference and Labs of the Evaluation Forum</source>
          , Online Working Notes, CEUR-WS,
          <year>September 2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Ebersbach</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herms</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eibl</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Fusion Methods for ICD10 Code Classification of Death Certificates in Multilingual Corpora</article-title>
          .
          <source>In: CLEF 2017 Conference and Labs of the Evaluation Forum</source>
          , Online Working Notes, CEUR-WS,
          <year>September 2017</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>