<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CLEF</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
<article-title>ICD-10 coding based on semantic distance: LSI UNED at CLEF eHealth 2020 Task 1</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <email>malmagro, raquel, vfresnog@lsi.uned.es</email>
        </contrib>
        <aff>
          <institution>National University of Distance Education (UNED)</institution>
          ,
          <addr-line>Madrid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>King Juan Carlos University (URJC)</institution>
          ,
          <addr-line>28933 Madrid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University College London (UCL)</institution>
          ,
          <addr-line>London NW1 2DA</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>22</volume>
      <fpage>22</fpage>
      <lpage>25</lpage>
      <abstract>
        <p>This paper describes our contribution to the CLEF eHealth 2020 Task 1, consisting of the CIE-10-ES annotation of Spanish Electronic Health Records (EHRs). CIE-10-ES is the extended version of ICD-10 used in Spain. One of the sub-tasks is aimed at the interpretability of proposals, which is in line with the latest demands in Natural Language Processing (NLP). Moreover, ICD-10 entries generated by hospitals usually follow an extreme distribution, involving complex annotation challenges. For that reason, an unsupervised semantic similarity-based method has been explored using a representation based on the SNOMED-CT clinical terminology. Since example-based learning is able to capture complex patterns, the proposal has been combined with Gradient Boosting methods to model the codes with more instances. mAP scores of 0.517 are achieved for CIE-10-ES codes associated with diagnoses and 0.398 for CIE-10-ES procedure codes. The mixed approach improves on the strictly supervised proposals by more than 38% and 13%, respectively. Finally, the unsupervised component is used to provide code evidence in EHRs, offering greater interpretability.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic similarity</kwd>
        <kwd>Ensemble method</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Health services produce vast amounts of data every day. A significant proportion
of this information is captured by clinicians in EHRs. Although EHRs may
contain structured information such as medical test results, both the patient's
history and the clinician's judgment are often described in natural language,
requiring more flexibility and a higher level of customization to capture all the
details.</p>
      <p>The clinical domain is characterized by a very diverse language, full of acronyms,
misspellings, large specialized vocabularies, unstructured phrases, and a
substantial richness of granularities. While there is a wide range of general NLP tools,
the reduced accessibility of biomedical texts hinders the implementation of
domain techniques to deal with all these complexities. In turn, non-domain
tools do not seem to fit properly into the requested tasks because clinical
language lacks lexical standardization: syntactic rules are often relaxed with
respect to texts in the general domain.</p>
      <p>One of the tasks with the greatest impact on hospital funding is
ICD-10 coding, which aims to translate causes of morbidity and mortality written
in natural language into structured data that can be quantified for statistical
analysis. ICD-10 coding is a high-level task, which requires dealing with strong
lexical variability, language understanding for document-level decision making,
and extremely unbalanced data distributions. These restrictions are
accompanied by the General Data Protection Regulation (GDPR) applied in Europe,
which greatly reduces private access to patient data in order to preserve privacy.
Limited access to clinical data and the frequent associated biases often result in the
scarcity of data and the absence of numerous codes within available data sets.
These challenges raise questions about the viability of some data-driven models.
Thus, data augmentation, transfer learning, and unsupervised techniques become
particularly relevant.</p>
      <p>
        In this paper we present our contribution to the CLEF eHealth 2020 Task
1 [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], which aims at CIE-10-ES coding of EHRs. An unsupervised
similarity-based method that can suggest codes not present in the training data is
proposed. In addition, a boosting method is analysed to reveal the limitations of
data-driven systems. Finally, both methods are combined to exploit the provided
data while mitigating the scarcity of instances.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Work</title>
      <p>This is not the first time that the Conference and Labs of the Evaluation Forum
(CLEF) has encouraged the proposal of methods to solve the ICD coding task. Between
2016 and 2018, tasks were organized for the classification of death notes.
The documents used consist of a couple of short lines of text describing the
diagnoses. Each sentence maps to `zero-to-many' codes.
Although some sentences do not contain diagnoses, the tendency is for the
relevant information to be highly condensed.</p>
      <p>
        Unsupervised approaches have been explored in these conditions. For
example, Van Mulligen et al. [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] and Ho-Dac et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] focus on applying Information
Retrieval (IR) techniques to search for similar instances, Cabot et al. deal with
coding as a Named Entity Recognition (NER) task [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and Cossin et al. use lexical
similarity based on Levenshtein distance to assign codes [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Multiple
feature-based approaches have also been proposed, such as linear regression [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], Naive
Bayes (NB) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and random forests [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. In addition, sequence-to-sequence
approaches have been explored [
        <xref ref-type="bibr" rid="ref10 ref23 ref3">23, 10, 3</xref>
        ], in which an encoder-decoder structure is
used to transform a sequence of words into a sequence of codes.
      </p>
      <p>
        In 2019, the codification of summaries of animal experiments into ICD-10
subchapters was proposed [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The summaries provided include paragraphs where
the relevant information is more sparse. In contrast, the number of classes is
severely reduced to a few categories, significantly simplifying the problem. Two
models based on BERT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] are proposed, exploiting fewer and more frequent
labels. Amin et al. use the multilingual BERT-Base model directly on the German
reports [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and Sanger et al. ne-tune the BioBERT model [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] on the reports
translated into English [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. The latter reaches the best results. In line with
the unsupervised proposals, Ahmed et al. explore the application of k-Nearest
Neighbor (kNN) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        The current proposed task is more in line with the needs of hospitals. It
consists of coding complete EHRs, which are longer than the documents of previous
years, using all the official codes. In this paper we have followed proposals for
semantic similarity between terms and codes such as those proposed by Ning et
al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] and Chen et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], but at the document level, with all the difficulties
that this entails.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 Proposed Approach</title>
      <p>We explore a semantic similarity-based method for CIE-10-ES coding to deal
with the abundance of biases in the data.</p>
      <p>We start by using an NER method, associating SNOMED-CT concepts
with official CIE-10-ES descriptions and EHR text. Subsequently, the method
estimates the similarity between concepts using the hierarchical structure of
SNOMED-CT. Once the similarities between concepts are known, the affinity of a code to
a text is computed as the similarity between sets: the concepts defining a code and
those concepts within a piece of EHR text. Finally, the code ranking associated
with a document is established by ordering code relatedness to all pieces of EHR
text. In addition, the combination of the resulting ranking with a supervised
learning method has been explored. An overview of our approach and the overall
pipeline is shown in Figure 1. More details of each process are given below.</p>
      <sec id="sec-3-1">
        <title>3.1 SNOMED-CT association</title>
        <p>As mentioned earlier, access to Spanish clinical data is very restricted, which
complicates modeling Spanish representations based on distributional semantics
with a sufficiently diverse vocabulary. Accordingly, domain knowledge bases are
used to provide an overall representation of all the different concepts.</p>
        <p>
          One of the most widely used standardized clinical terminologies is
SNOMED-CT [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], which hierarchically encompasses more than 300,000 unique clinical
concepts and 70,000 unique clinical terms. SNOMED-CT is designed as a low-level
terminology to deal with lexical diversity rather than abstract concepts,
covering a broad scope. The assignment of SNOMED concepts to descriptions and
EHRs has been done through partial lexical matching. For this purpose, a
pre-processing step has first been carried out by stripping accents and
punctuation marks, converting text to lowercase, lemmatizing and stemming, removing
stop words, grouping pertainym words (terms "pertaining to" others, such as
pulmonary and lung), and handling term exclusions.
        </p>
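        <p>A minimal sketch of this pre-processing chain (lemmatization, stemming, and term exclusions are omitted; the stop-word and pertainym lists here are toy stand-ins for the WordNet/ConceptNet-derived resources actually used):</p>
        <p>
```python
import unicodedata

# Hypothetical mini-resources for illustration only.
STOP_WORDS = {"de", "con", "el", "la", "un", "una", "the", "of", "a"}
PERTAINYMS = {"pulmonar": "pulmon", "pulmonary": "lung"}  # pertainym -> base term

def strip_accents(text: str) -> str:
    """Remove accents/diacritics while keeping the base characters."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def preprocess(text: str) -> list:
    """Lowercase, strip accents and punctuation, drop stop words,
    and map pertainyms to their base terms."""
    text = strip_accents(text.lower())
    tokens = [t.strip(".,;:()") for t in text.split()]
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    return [PERTAINYMS.get(t, t) for t in tokens]

print(preprocess("Síndrome febril con exposición pulmonar."))
# ['sindrome', 'febril', 'exposicion', 'pulmon']
```
        </p>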
        <p>
          A Spanish lemmatizer built from WordNet [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] and ConceptNet [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] has been
used to standardize the text with greater coverage. Lexical disambiguation does
not seem to be particularly relevant to the task, so the use of this
knowledge-based lemmatizer has been preferred to other tools based on supervised models,
such as spaCy4 for the general domain and IxaMed [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] for the clinical domain. All the
words not found by the lemmatizer are subsequently stemmed. In addition, a
        <sec id="sec-3-1-1">
          <title>4 https://spacy.io</title>
          <p>list of pertainyms has also been generated using WordNet in conjunction with
machine translation techniques and human supervision.
</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Similarity computation</title>
        <p>
          SNOMED-CT organizes concepts using hierarchical `is-a' relationships, so it is
relatively easy to estimate similarities between nodes according to some of the
well-known semantic similarity measures in graphs. Multiple similarity measures
based on path or Information Content (IC) are described in [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. Although
path-based measures are simpler, they assume linearity in the hierarchy, i.e. `is-a'
relationships are equally relevant for general and specific concepts. For this reason,
IC-based measures seem to better fit this task, suggesting greater similarity for
specific nearby concepts than for general close concepts. In particular, the Lin
measure (Equation 1) has been applied in this proposal:

Slin(c1, c2) = 2 · IC(lcs(c1, c2)) / (IC(c1) + IC(c2))    (1)
        </p>
        <p>where c1 and c2 are the pair of concepts, lcs is the lowest common subsumer,
and IC is the Information Content, which is usually computed from the frequency
of concepts in a large corpus. A fixed IC value based on the depth of the node in
its branch is assumed in order to avoid corpus dependency and accessibility
constraints.</p>
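        <p>A toy illustration of the Lin measure with a depth-based IC as just described (the `is-a' hierarchy and IC values here are illustrative stand-ins, not SNOMED-CT's):</p>
        <p>
```python
# Toy 'is-a' hierarchy (child -> parent); a stand-in for SNOMED-CT.
PARENT = {
    "disease": None,
    "infectious_disease": "disease",
    "brucellosis": "infectious_disease",
    "epididymitis": "disease",
}

def ancestors(c):
    """Return the path from concept c up to the root, c included."""
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path

def ic(c):
    # Depth-based IC: deeper (more specific) nodes are more informative.
    # A simple linear-in-depth stand-in for the fixed values actually used.
    return len(ancestors(c)) - 1

def lcs(c1, c2):
    """Lowest common subsumer: the deepest shared ancestor."""
    a1 = set(ancestors(c1))
    for c in ancestors(c2):  # walks upward from c2, so the first hit is deepest
        if c in a1:
            return c
    return None

def lin(c1, c2):
    denom = ic(c1) + ic(c2)
    return 2 * ic(lcs(c1, c2)) / denom if denom else 1.0

print(lin("brucellosis", "epididymitis"))        # share only the root -> 0.0
print(lin("brucellosis", "infectious_disease"))  # 2 * 1 / (2 + 1), i.e. 2/3
```
        </p>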
        <p>
          The similarity between code sets (SG in Equation 2) can be seen as a problem
of maximizing pair assignments. These can be defined as a bipartite
graph G = (V, E), with the vertices V being the two sets of concepts and the edges
E being the similarities between them. We use the Kuhn-Munkres algorithm [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]
to solve the optimization problem.
        </p>
        <p>SG = max_B [ Σ_{i=1..Ncode} Σ_{j=1..Nehr} Mlin(i, j) B(i, j) ] / Ncode    (2)</p>
        <p>where Nehr is the number of SNOMED-CT concepts in the piece of document,
Ncode is the number of SNOMED-CT concepts in the code description, Mlin(i, j)
is a matrix with the Lin similarity values, and B(i, j) is a binary value which
is only active if concept i has been paired with concept j. There is only one
positive value of B for each i.</p>
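        <p>Equation 2 can be illustrated with a brute-force search over assignments for small concept sets (the paper solves this with the Kuhn-Munkres algorithm; brute force is used here only for clarity, and the similarity values are made up):</p>
        <p>
```python
from itertools import permutations

def sg(code_concepts, ehr_concepts, sim):
    """Similarity between a code's concept set and an EHR fragment's concept
    set (Equation 2): each code concept is paired with at most one distinct
    EHR concept so as to maximize the summed similarity, normalized by the
    number of code concepts."""
    n_code, n_ehr = len(code_concepts), len(ehr_concepts)
    k = min(n_code, n_ehr)
    best = 0.0
    # Try every injective assignment of code concepts to EHR concepts.
    for codes in permutations(range(n_code), k):
        for ehrs in permutations(range(n_ehr), k):
            total = sum(sim(code_concepts[i], ehr_concepts[j])
                        for i, j in zip(codes, ehrs))
            best = max(best, total)
    return best / n_code

# Toy similarity values as a lookup (illustrative, not Lin-derived).
M = {("fever", "febrile"): 0.9, ("fever", "pain"): 0.1,
     ("orchitis", "febrile"): 0.2, ("orchitis", "pain"): 0.3}
sim = lambda a, b: M.get((a, b), 0.0)
print(sg(["fever", "orchitis"], ["febrile", "pain"], sim))  # (0.9 + 0.3) / 2
```
        </p>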
        <p>Once the similarity values between all codes and each of the pieces of a
given document (SG) are calculated, a first ranking is created by sorting the
codes by SG at the document level. The final ranking is subsequently produced
by recalculating all code similarity values through an iterative exclusion of the
SNOMED-CT concepts already used by the codes at the top of the first ranking.
With this second computation, the number of codes associated with a single
SNOMED-CT concept subset is limited to only one.</p>
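        <p>The second-pass ranking with iterative exclusion can be sketched as follows; `greedy_match` is a simplified stand-in for the optimal assignment, and all names and inputs are illustrative:</p>
        <p>
```python
def greedy_match(code_concepts, ehr_concepts, sim):
    """Greedy stand-in for the optimal assignment: repeatedly take the best
    remaining EHR concept for each code concept.
    Returns (normalized score, consumed EHR concepts)."""
    pool = list(ehr_concepts)
    total, used = 0.0, []
    for c in code_concepts:
        if not pool:
            break
        best = max(pool, key=lambda e: sim(c, e))
        total += sim(c, best)
        used.append(best)
        pool.remove(best)
    return total / len(code_concepts), used

def rerank(codes, ehr_concepts, sim):
    """Second-pass ranking: once a code is placed, the EHR concepts it
    consumed are excluded before re-scoring the rest, so a single concept
    subset supports at most one code."""
    remaining, available, ranking = dict(codes), list(ehr_concepts), []
    while remaining and available:
        scored = {c: greedy_match(cs, available, sim)
                  for c, cs in remaining.items()}
        best = max(scored, key=lambda c: scored[c][0])
        score, used = scored[best]
        ranking.append((best, score))
        for e in used:
            available.remove(e)
        del remaining[best]
    return ranking

sim = lambda a, b: 1.0 if a == b else 0.0  # trivial exact-match similarity
print(rerank({"A": ["fever"], "B": ["fever", "pain"]}, ["fever", "pain"], sim))
# [('A', 1.0), ('B', 0.0)] - code B cannot reuse the 'fever' concept
```
        </p>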
      </sec>
      <sec id="sec-3-3">
        <title>3.3 Supervised learning</title>
        <p>A priori, one would expect worse accuracy values for sub-tasks 1 and 2, due to the lack
of learning of complex patterns, and better recall values from accessing all possible
codes and a wide variety of vocabulary. We have explored such complex decisions
by implementing a Gradient Boosting multi-label algorithm based on binary
classifiers using a `one-vs-the-rest' (OvR) strategy. These classifiers chain a series
of consecutive learning models, iteratively emphasizing the mistakes made by the
previous model. Boosting techniques seem to produce better results for these
types of problems, where significant imbalance is the main factor. Finally, we
rank the codes by prioritizing the predictions of the Gradient Boosting classifiers,
ordered according to their confidence values. These predictions are
followed by the codes suggested by the similarity-based approach.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Experiments</title>
      <p>The following subsections describe the data used, the proposal setup, and the
achieved results.</p>
      <p>(Spanish) Describimos varón de 37 años con vida previa activa que refiere
dolores osteoarticulares.</p>
      <p>Durante el ingreso para estudio del síndrome febril con
antecedentes epidemiológicos de posible exposición a Brucella
presenta un cuadro de orquiepididimitis derecha.</p>
      <p>La exploración física revela: Ta 40,2 C; T.A: 109/68 mmHg; Fc:
105 lpm.</p>
      <p>(English) We describe the case of a 37-year-old man with a previously
active life who complained of osteoarticular pain.</p>
      <p>During admission for study of the febrile syndrome with an
epidemiological history of possible exposure to Brucella, he presents a
picture of right orchiepididymitis.</p>
      <p>Physical examination revealed: Ta 40.2 C; T.A: 109/68 mmHg; Fc:
105 bpm.</p>
      <p>Fig. 2: Examples of Spanish and English sentences in the CodiEsp-SPACCC
corpus. Expressions associated with CIE-10-ES codes are shown in bold.</p>
      <sec id="sec-4-1">
        <title>4.1 Data sets</title>
        <p>The CIE-10-ES coding challenge has been evaluated on the CodiEsp-SPACCC
corpus5. It consists of 1000 EHRs, with an average of 16.5 long sentences per
document, split into training, development, and test data sets. Each document
usually contains a wide range of information, including medical history,
medical examinations, test results, and clinical judgments, all without a predefined
structure. A piece of a document is shown in Figure 2.</p>
        <p>The examples illustrate a series of evidences distributed along the text. The
CIE-10-ES codes associated with the expressions found in the example are collected
in Table 1. The terms used by clinicians in EHRs differ in granularity from
those defining code descriptions. ICD descriptions tend to be more abstract in
order to capture multiple cases, which gives the coding task a high degree of
complexity. Fortunately, SNOMED-CT handles these granularities, extending
the more general concepts in the hierarchical structure.
In terms of data distribution, ICD-10 codes tend to follow extreme
distributions, characterized by a large imbalance, a scarcity of instances for most codes,</p>
        <sec id="sec-4-1-1">
          <title>5 https://zenodo.org/record/3837305#.XwMHiBHtZH5</title>
          <p>and the presence of hospital biases. In this case, such a power-law distribution
can be seen in Table 2. The training codes are ordered according to frequency
and clustered into 8 groups, so that each group gathers a similar number of instances.</p>
          <p>Table 2 shows that the nine most frequent diagnoses in group G8 appear
about the same number of times as the 704 least frequent diagnoses in group
G1. In the case of procedures, those 13 codes that appear between 11 and 19
times (group G6) are equivalent in volume of entries to the 84 codes of group
G3, which appear 3 times each.</p>
          <p>
            Figure 3 plots a normalized histogram of these groups for the training (in
grey) and development (in black) data sets. A ninth group (G0) has been
included to represent all unseen codes in the training data set. The training and
development distributions for both diagnoses and procedures are different, with
the unseen codes reaching almost 20% and 30% of instance volume, respectively.
The lack of information on labels poses a huge problem for data-driven
approaches, as a considerable percentage of the codes are left out of the model. For
this reason, we have proposed a method that directly uses CIE-10-ES
descriptions and does not require training data.

Fig. 3: (a) Diagnosis Code Frequency Histogram; (b) Procedure Code Frequency Histogram.

The CLEF eHealth 2020 Task 1 [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ] is structured into 3 sub-tasks, related but
with different objectives: the suggestion of CIE-10-ES codes corresponding to
diagnoses (CodiEsp-D), the recommendation of CIE-10-ES codes associated with
procedures (CodiEsp-P), and the prediction of both codes providing evidence
within the report (CodiEsp-X). The first two sub-tasks consist of generating a
ranking of codes ordered by their affinity to a given EHR, while the objective
of the last task is to retrieve only those codes that are most probable, together
with the evidence. Based on these objectives, we propose different settings for
each sub-task.
          </p>
          <p>We explore a similarity-based method including the specifications described
in Section 3. The evidence for the codes in the training data set has been used
as part of the descriptions to feed the method with more specific information
about the codes. Furthermore, we exploit a digitized version of the
CIE-10-ES tabular list, including additional code descriptions in combination with the
official entries provided by the organizers. Finally, we propose two approaches
in the first two sub-tasks, one method without grouping pertainym words
(SIM-BASIC) and another using these related words (SIM-EXT). As for the third
sub-task, we implement the approach that does not group related words, using
different similarity thresholds to choose the retrieved codes per document. The
thresholds 0.7 (SIM-BASIC-7), 0.8 (SIM-BASIC-8), and 0.9 (SIM-BASIC-9)
have been applied.</p>
          <p>As for the supervised learning method, we explore a Gradient Boosting
algorithm for sub-tasks 1 and 2. The model is also trained on Spanish abstracts from
Lilacs and Ibecs annotated with CIE-10-ES codes, in addition to the training data
set. These data sets are also provided by the organizers and amount
to 355,840 abstracts, with an average of 2.5 codes per abstract. The distribution
of these codes is also extremely unbalanced, so subsampling is performed during
training to avoid excessively increasing the negative instances per code.
Regarding the representation, classic BoW features are applied due to the presence
of a large volume of codes with fewer than 20 instances. In particular, label-specific
features (GB-BNS) are used in order to focus learning and prediction on
code-relevant patterns. For this purpose, we extract the term frequency weighted by
Bi-normal separation (TF-BNS). Global features such as TF-IDF (GB-IDF) are
also used for procedures, as this data set has fewer labels and less statistical
information. None of these approaches are proposed in the last sub-task because
they lack sufficient interpretability.</p>
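          <p>As a sketch of the Bi-normal separation weight underlying TF-BNS (following Forman's formulation with the inverse normal CDF; the clipping constant and counts are assumptions for illustration):</p>
          <p>
```python
from statistics import NormalDist

def bns(tp, fn, fp, tn, eps=0.0005):
    """Bi-Normal Separation weight for one term w.r.t. one code:
    |F^-1(tpr) - F^-1(fpr)|, with F^-1 the inverse normal CDF.
    Rates are clipped away from 0 and 1 to keep F^-1 finite."""
    pos, neg = tp + fn, fp + tn
    tpr = min(max(tp / pos, eps), 1 - eps)
    fpr = min(max(fp / neg, eps), 1 - eps)
    inv = NormalDist().inv_cdf
    return abs(inv(tpr) - inv(fpr))

# A term occurring in most positive documents for a code but few negatives
# gets a large weight; TF-BNS multiplies the term frequency by this value.
print(bns(tp=90, fn=10, fp=5, tn=895))   # highly discriminative term
print(bns(tp=50, fn=50, fp=50, tn=850))  # weaker term, smaller weight
```
          </p>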
          <p>We implement a final approach (GB-SIM) for the first two sub-tasks by
combining the Gradient Boosting method that uses TF-BNS features (GB-BNS) and
the similarity-based method that groups related words together (SIM-EXT). As
a result, the codes predicted by the classifiers are ranked according to the
confidence value and merged with the ranking generated by the similarity method.
Codes suggested by classifiers are placed at the top of the new ranking, assuming
that supervised learning methods generally lead to higher precision values.</p>
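          <p>The merge step can be sketched as a simple list operation (code names and confidence values are illustrative; confidences are assumed to come from the OvR classifiers):</p>
          <p>
```python
def merge_rankings(gb_predictions, sim_ranking):
    """Combine supervised predictions with the similarity ranking:
    classifier-predicted codes go first, ordered by confidence, followed by
    the similarity-based ranking with duplicates removed."""
    top = [code for code, conf in sorted(gb_predictions, key=lambda x: -x[1])]
    seen = set(top)
    return top + [code for code in sim_ranking if code not in seen]

gb = [("A41.9", 0.8), ("R50.9", 0.95)]  # (code, classifier confidence)
sim = ["R50.9", "N45.1", "A23.9"]       # similarity-based ranking
print(merge_rankings(gb, sim))  # ['R50.9', 'A41.9', 'N45.1', 'A23.9']
```
          </p>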
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.3 Evaluation</title>
        <p>Two different evaluation metrics have been defined for the CLEF eHealth 2020
Task 1 according to the objectives of each sub-task: mAP and F1.</p>
        <p>The first two sub-tasks aim to generate a list of codes per EHR ranked by
relevance and are evaluated with mAP in order to quantify how many significant
codes are in the top positions. mAP is specified in Equations 3 and 4:</p>
        <p>APi = (1/n) Σ_{k=1..n} Precision@k · rel@k    (3)</p>
        <p>mAP = (1/N) Σ_{i=1..N} APi    (4)</p>
        <p>where n is the total number of retrieved codes, Precision@k is the
precision considering the top k codes, rel@k is a binary relevance function, APi is
the Average Precision for document i, and N is the number of documents.</p>
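        <p>A common implementation of these metrics (here AP is normalized by the number of relevant codes, the usual convention; the task's official scorer may differ in this detail):</p>
        <p>
```python
def average_precision(ranked, relevant):
    """AP for one document: mean of Precision@k over the positions k that
    hold a relevant code (Equation 3, with rel@k as a binary indicator)."""
    hits, total = 0, 0.0
    for k, code in enumerate(ranked, start=1):
        if code in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """mAP: AP averaged over all documents (Equation 4)."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

runs = [(["A", "B", "C"], {"A", "C"}),  # AP = (1/1 + 2/3) / 2
        (["B", "A"], {"A"})]            # AP = 1/2
print(round(mean_average_precision(runs), 3))  # 0.667
```
        </p>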
        <p>In contrast, the last sub-task is designed to associate CIE-10-ES codes with
EHRs in an explainable way, providing the text fragment containing the
keywords. In this case, the number of retrieved and successful codes is assessed by
computing F1.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.4 Results</title>
        <p>Metrics are calculated using all the codes described in the CIE-10-ES coding and
only those that appear in the training data set. The results of sub-tasks 1 and
2 are shown in Table 3. Approaches based on semantic similarity obtain better
results than the supervised method, considering both all codes and only those
seen during training. SIM-BASIC achieves an mAP score of 0.51, significantly
higher than the 0.37 mAP score for GB-BNS, with a difference of 0.14. This seems
to support the ability of SIM-BASIC to predict unseen or rare codes, as opposed
to the need for higher volumes of instances in GB-BNS. Indeed, the difference
between scores when evaluating only seen codes is slightly reduced, reaching
mAP scores of 0.596 and 0.475 respectively. Apparently, the Gradient Boosting
method fails to model rare codes. Moreover, better results are generally obtained
with the GB-SIM approach, which combines the characterization of codes with
little information from the similarity-based method and the learning of more
complex patterns from GB-BNS.</p>
        <p>Precision, Recall and F1 are shown in Table 4 for the third task. There are
different methods to improve the interpretability of supervised models, such as
distillation into decision trees. However, exploring these lines has not been the
purpose of this paper. Thus, only similarity-based approaches have been used
to predict codes by retrieving textual evidence. In particular, the SIM-BASIC
approach is applied with different similarity thresholds. Using a threshold of 0.9,
SIM-BASIC-9 achieves an F1 score of 0.451 with all codes and 0.494 with only
those seen in the training data set. Unrelated codes are apparently associated
when choosing the less restrictive thresholds, resulting in a decrease in F1.</p>
        <p>ICD-10 coding, and in particular the Spanish extension of this task (CIE-10-ES
coding), cannot be easily automated with the existing techniques. One of the
main problems is the biased distribution of data, which prevents the availability
of sufficient data for all codes, hindering the development of supervised learning
methods.</p>
        <p>In this work, we propose an unsupervised method that improves recall by
also suggesting rare or unseen training codes. Our method achieves the
characterization of less frequent codes through a representation based on the
SNOMED-CT clinical terminology. Finally, we found that an ensemble that introduces
supervised learning methods is able to provide a better characterization of
frequent codes.</p>
        <p>We plan to extend our work by adding other intrinsic relationships of the
SNOMED-CT concepts that provide alternative information. We also plan to
explore the generation of similarity-based features to address few-shot and
zero-shot learning.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ahmed</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ar</surname>
            <given-names>bas</given-names>
          </string-name>
          , A.,
          <string-name>
            <surname>Alpkocak</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <source>DEMIR at CLEF eHealth</source>
          <year>2019</year>
          :
          <article-title>Information Retrieval based Classification of Animal Experiment Summaries</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Amin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neumann</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dunfield</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vechkaeva</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chapman</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wixted</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>MLT-DFKI at CLEF eHealth 2019: Multi-label Classification of ICD-10 Codes with BERT</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Atutxa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casillas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ezeiza</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goenaga</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fresno</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gojenola</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martinez</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oronoz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perez-de Vinaspre</surname>
          </string-name>
          , O.:
          <article-title>IxaMed at CLEF eHealth 2018 Task 1: ICD10 coding with a sequence-to-sequence approach</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bounaama</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>El Amine Abderrahim</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Tlemcen University at CLEF eHealth 2018 team techno: Multilingual information extraction - ICD10 coding</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cabot</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soualmia</surname>
            ,
            <given-names>L. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Darmoni</surname>
            ,
            <given-names>S. J.:</given-names>
          </string-name>
          <article-title>Sibm at clef ehealth evaluation lab 2017: Multilingual information extraction with cim-ind</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Automatic icd-10 coding algorithm using an improved longest common subsequence based on semantic similarity</article-title>
          .
          <source>PloS one 12(3)</source>
          ,
          <fpage>e0173410</fpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cossin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jouhet</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mougin</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diallo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thiessard</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Iam at clef ehealth 2018: Concept annotation and coding in french death certificates</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>BERT: pre-training of deep bidirectional transformers for language understanding</article-title>
          .
          <source>CoRR abs/1810.04805</source>
          (
          <year>2018</year>
          ), http://arxiv.org/abs/1810.04805
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Donnelly</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>SNOMED-CT: The advanced terminology and coding system for eHealth</article-title>
          . In:
          <article-title>Studies in health technology and informatics</article-title>
          , vol.
          <volume>121</volume>
          , pp.
          <fpage>279</fpage>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Jeblee</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Budhkar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Milic</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinto</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pou-Prom</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vishnubhotla</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hirst</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudzicz</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Toronto cl at clef 2018 ehealth task 1: Multi-lingual icd-10 coding using an ensemble of recurrent and convolutional neural networks</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Fellbaum</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>WordNet: An Electronic Lexical Database. 2nd edn</article-title>
          . MIT press (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Gojenola</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oronoz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casillas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>IxaMed: Applying Freeling and a Perceptron Sequential Tagger at the Shared Task on Analyzing Clinical Texts</article-title>
          .
          <source>In: The 8th International Workshop on Semantic Evaluation, SemEval@ COLING</source>
          <year>2014</year>
          , vol.
          <volume>1</volume>
          , pp.
          <fpage>361</fpage>
          –
          <lpage>365</lpage>
          . Dublin, Ireland (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Ho-Dac</surname>
            ,
            <given-names>L.-M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fabre</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Birski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boudraa</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bourriot</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cassier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Delvenne</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garcia-Gonzalez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>E.-B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piccinini</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , et al.:
          <article-title>Litl at clef ehealth 2017: automatic classification of death reports</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yoon</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>So</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Biobert: a pre-trained biomedical language representation model for biomedical text mining</article-title>
          .
          <source>Bioinformatics</source>
          <volume>36</volume>
          (
          <issue>4</issue>
          ),
          <fpage>1234</fpage>
          –
          <lpage>1240</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bao</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Ecnu at 2018 ehealth task1 multilingual information extraction</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Lopez-Ubeda</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diaz-Galiano</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin-Valdivia</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Urena-Lopez</surname>
            ,
            <given-names>L. A.</given-names>
          </string-name>
          :
          <article-title>Machine learning to detect icd10 codes in causes of death</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Miranda-Escalada</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalez-Agirre</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Armengol-Estape</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krallinger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Overview of automatic clinical coding: annotations, guidelines, and solutions for non-English clinical cases at CodiEsp track of eHealth CLEF 2020</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Munkres</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Algorithms for the assignment and transportation problems</article-title>
          .
          <source>Journal of the Society for Industrial and Applied Mathematics 5(1)</source>
          ,
          <fpage>32</fpage>
          –
          <lpage>38</lpage>
          (
          <year>1957</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Neves</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butzke</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , Dorendahl,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Leich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Hummel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            , Schonfelder, G.,
            <surname>Grune</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          :
          <article-title>Overview of the CLEF eHealth 2019 Multilingual Information Extraction</article-title>
          . In:
          <string-name>
            <surname>Crestani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Braschler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savoy</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rauber</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al. (eds.)
          <article-title>Experimental IR Meets Multilinguality, Multimodality, and Interaction</article-title>
          .
          <source>Proceedings of the Tenth International Conference of the CLEF Association (CLEF</source>
          <year>2019</year>
          ).
          <source>Lecture Notes in Computer Science</source>
          . Springer, Berlin Heidelberg, Germany (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Ning</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>A hierarchical method to automatically encode chinese diagnoses through semantic similarity estimation</article-title>
          .
          <source>BMC medical informatics and decision making 16(1)</source>
          ,
          <fpage>1</fpage>
          –
          <lpage>12</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Pedersen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pakhomov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patwardhan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chute</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Measures of semantic similarity and relatedness in the biomedical domain</article-title>
          .
          <source>Journal of biomedical informatics</source>
          <volume>40</volume>
          (
          <issue>3</issue>
          ),
          <fpage>288</fpage>
          –
          <lpage>299</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22. Sanger,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Weber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Kittner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Leser</surname>
          </string-name>
          ,
          <string-name>
            <surname>U.</surname>
          </string-name>
          :
          <article-title>Classifying German Animal Experiment Summaries with Multi-lingual BERT at CLEF eHealth 2019 Task 1</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Seva</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sänger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leser</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Wbi at clef ehealth 2018 task 1: Language independent icd-10 coding using multi-lingual embeddings and recurrent neural networks</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Speer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>ConceptNet 5.5: An Open Multilingual Graph of General Knowledge</article-title>
          .
          <source>In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence</source>
          , pp.
          <fpage>4444</fpage>
          –
          <lpage>4451</lpage>
          . San Francisco, California USA (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Van Mulligen</surname>
            ,
            <given-names>E. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Afzal</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Akhondi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kors</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Erasmus mc at clef ehealth 2016: Concept recognition and coding in french texts</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>