<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>IAM at CLEF eHealth 2020: Concept Annotation in Spanish Electronic Health Records</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>CHU de Bordeaux, Pôle de santé publique, Service d'information médicale, Informatique et Archivistique Médicales (IAM)</institution>
          ,
          <addr-line>F-33000 Bordeaux</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Univ. Bordeaux, Inserm, Bordeaux Population Health Research Center</institution>
          ,
          <addr-line>team ERIAS, UMR 1219, F-33000 Bordeaux</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we describe the approach and the results of our participation in task 1 (multilingual information extraction) of the CLEF eHealth 2020 challenge. We tackled the task of automatically assigning ICD-10 diagnosis and procedure codes to Spanish electronic health records. We used a dictionary-based approach using only materials provided by the task organizers. The training set consisted of 750 clinical cases annotated by a medical expert. Our system achieved an F1-score of 0.69 for the detection of diagnoses and 0.52 for the detection of procedures on a test set of 250 clinical cases.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic annotation</kwd>
<kwd>Entity recognition</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Electronic Health Records</kwd>
        <kwd>Spanish</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        An electronic health record is a patient-centered record that contains
information about a patient’s medical history, past and current medications, lab
results, diagnoses, etc. Most of this information is provided by health care
professionals in free text format. Free text has many advantages, such as familiarity,
ease of use and the freedom to express complex ideas [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. However, unstructured
data are difficult to reuse and to query for information retrieval. Natural Language
Processing (NLP) develops methods to manage free-text data and extract the
information required by applications such as clinical decision support systems. A
frequent step in an NLP pipeline is the detection of medical entities (treatments,
diagnoses) with named entity recognition (NER) algorithms. Linking each
detected entity to a terminology or ontology is essential to leverage the power of
knowledge graphs that bring external knowledge and meaning [
        <xref ref-type="bibr" rid="ref4 ref5">4,5</xref>
        ].
      </p>
      <p>The objective of shared tasks is to foster the development of NLP tools. For
many years, CLEF eHealth has proposed challenges addressing several real-world
information extraction problems in free-text data.</p>
      <p>
        In this paper, we describe our approach and present the results of our
participation in task 1 (multilingual information extraction) of the CLEF eHealth
2020 challenge [
        <xref ref-type="bibr" rid="ref2 ref6">2,6</xref>
        ]. This task focused on diagnosis and procedure coding of
Spanish electronic health records. We addressed three subtasks:
1. ICD10-CM code assignment. In this sub-track, systems must detect
symptoms and diseases mentioned in clinical notes by predicting
ICD10-CM (International Classification of Diseases, Tenth Revision, Clinical
Modification) codes for each document.
2. ICD10-PCS code assignment. In this sub-track, systems must detect
procedures mentioned in clinical notes by predicting ICD10-PCS
(International Classification of Diseases, Tenth Revision, Procedure Coding System)
codes for each document.
3. Explainable AI. In this sub-track, systems must provide text annotations
for each ICD10-CM and ICD10-PCS code predicted in sub-tracks 1 and 2.
      </p>
      <p>We developed a biomedical semantic annotation tool for our own needs at
Bordeaux hospital. Our main motivation for participating in the challenge was to
compare our system with others and to learn from a shared task.</p>
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <sec id="sec-2-1">
        <title>Corpora</title>
        <p>In the following subsections, we describe the corpora and terminologies used in
this challenge, an exploratory analysis of the data, and our system.</p>
        <p>The dataset provided by the organizers was called the CodiEsp corpus. The
corpus comprised 1,000 Spanish clinical case studies selected by a practicing
physician. Each clinical case was a plain text file whose filename served as its
identifier. The train and development sets contained 500 and 250 clinical cases,
respectively; annotations for these sets were published.</p>
        <p>The participants had to annotate 3,001 clinical cases, of which 250
formed the test set; which cases formed the test set was known only to the
organizers, to prevent manual corrections.</p>
        <p>The annotation format was a tab-separated file with 2 fields for subtasks 1
and 2, corresponding to the clinical case identifier and an ICD10-CM or
ICD10-PCS code. Three more fields were expected for the third subtask: the start and
end offsets of each detected term and whether the code came from the ICD10-CM
or the ICD10-PCS terminology.</p>
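        <p>The expected submission format can be sketched with a small writer; the case identifiers and codes below are illustrative values, not actual CodiEsp data:

```python
import csv

def write_predictions(path, rows):
    """Write predictions as a tab-separated file.

    Each row is (case_id, code) for subtasks 1 and 2, or
    (case_id, code, start, end, terminology) for subtask 3.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter="\t").writerows(rows)
```
</p>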
      </sec>
      <sec id="sec-2-2">
        <title>Coding terminologies</title>
        <p>Spanish versions of the ICD10-CM and the ICD10-PCS terminologies were
provided by the organizers. The terminologies contained 98,288 and 75,789 different
codes, respectively. In this task, only 2,921 distinct codes (1.7%) appeared in
the train and development sets. The vast majority of codes were therefore never
used, while a few others were frequent.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Corpora exploration</title>
        <p>
          The Brat annotation tool [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] was used to visualize the annotations made by the
medical expert. To do so, a script was developed to transform the task file format
to the Brat file format. Figure 1 presents a screenshot of the Brat interface with
the first 4 lines of a clinical case from the development set.
        </p>
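        <p>The conversion script mentioned above can be sketched as follows; the Brat standoff (.ann) entity-line layout is standard, but the input tuple layout and the label values are illustrative assumptions:

```python
def to_brat_ann(annotations):
    """Render annotations as Brat standoff (.ann) entity lines:
    'Tn[TAB]label start end[TAB]surface text'.

    Each input tuple (label, start, end, text) is an assumed layout,
    not the exact task file schema.
    """
    return "\n".join(
        f"T{i}\t{label} {start} {end}\t{text}"
        for i, (label, start, end, text) in enumerate(annotations, 1)
    )
```
</p>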
        <p>Four key insights emerged from this visualization:
– All clinical diagnoses and procedures detected in a clinical note had to be
coded. This differs from medical coding for reimbursement, where only
what was done during the medical encounter is coded.
– The system did not have to detect negation: the mention of the absence of a
diagnosis or a procedure was also coded.
– The words that make up an annotation can be very far apart in
a clinical note, often without syntactic dependencies. Detecting terms
composed of nonadjacent words seemed very challenging.
– Spelling mistakes were very rare.</p>
        <p>The development and train sets were combined into a single set, referred to
hereafter as the 'training set'. The training set contained 750 clinical cases with
10,678 annotations of diagnoses and 3,018 annotations of procedures. Annotations
made up of nonadjacent words accounted for 14.6% of diagnoses and 39.7% of
procedures.</p>
      </sec>
      <sec id="sec-2-4">
        <title>Algorithm</title>
        <p>
          We reused the algorithm we developed for the multilingual information
extraction task at CLEF eHealth 2018 which consisted of automatically assigning
ICD10 codes to French death certificates [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. The algorithm was described in detail
on that occasion and the code is available1. The algorithm uses a
dictionary-based approach. It takes a normalized dictionary as input and stores it in a
tree data structure where each token corresponds to a node. A text to annotate
is tokenized after undergoing the same normalization process as the
terminology: accents (diacritical marks) and punctuation are removed, the text is
lowercased and stopwords are removed (if a stopword list is given). The
algorithm tries to match each token in the text with a token in the tree using three
different techniques: exact match, abbreviation match or a string-distance match
based on the Levenshtein distance. The abbreviation match technique uses a
dictionary of abbreviations that may be provided as input. When the last token of
a term, at a leaf of the tree, is matched, the algorithm outputs an annotation,
meaning a term was found. This algorithm cannot detect a term whose words
appear out of order or are nonadjacent, which occurred frequently in this task.</p>
        <p>Dictionaries. Interestingly, only 4% of the terms annotated by the medical
expert were found in the terminologies after normalization. By comparing the
terms from the annotations with the terms from the terminology, a list of the
most frequent stopwords was created (no especificado, no especificada...). This
list was used to normalize the terminology.
        </p>
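        <p>The normalize-then-match design described above can be sketched as follows (exact match only; the abbreviation and Levenshtein matching techniques of the actual system are omitted, and all function names are illustrative):

```python
import string
import unicodedata

def normalize(text, stopwords=frozenset()):
    """Lowercase, strip diacritics and ASCII punctuation, drop stopwords."""
    text = unicodedata.normalize("NFD", text.lower())
    text = "".join(c for c in text if unicodedata.category(c) != "Mn")
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [t for t in text.split() if t not in stopwords]

def build_tree(terms, stopwords=frozenset()):
    """Store each normalized (term, code) pair in a tree of token nodes."""
    root = {}
    for term, code in terms:
        node = root
        for token in normalize(term, stopwords):
            node = node.setdefault(token, {})
        node["$code"] = code  # leaf marker: a complete term ends here
    return root

def annotate(text, root, stopwords=frozenset()):
    """Exact-match pass: longest dictionary term starting at each token."""
    tokens = normalize(text, stopwords)
    found = []
    for i in range(len(tokens)):
        node, j, match = root, i, None
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if "$code" in node:
                match = (tokens[i:j], node["$code"])
        if match is not None:
            found.append(match)
    return found
```

Because the dictionary and the text go through the same normalization, a term such as "Infarto agudo de miocardio" is matched regardless of case, accents or the stopword "de".</p>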
        <p>Two dictionaries were constructed:
– The first dictionary (run1) contained only the terms annotated by the medical
expert in the training set, i.e. 6,316 terms.
– The second dictionary (run2) combined the first dictionary with the
normalized labels of the ICD10-CM terminology, for a total of
94,386 terms.</p>
        <p>These two dictionaries were tested on the training set to detect and remove
terms that could hurt the evaluation metrics. For each term, we calculated the
number of times it was annotated by the algorithm and by the human annotator.
If the ratio between these two numbers was greater than 2, the term was removed
from the dictionary. For example, the term "renal" was annotated 67 times by
the algorithm but only 2 times by the human expert in the development set.
Keeping this term would have decreased the precision and the F1 score, although the
recall would have increased slightly.
1 https://github.com/scossin/IAMsystem</p>
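        <p>The pruning rule described above can be sketched as follows (how to treat a term the expert never annotated at all is not specified in the text, so removing a term with zero human annotations but at least one algorithm annotation is an assumption here):

```python
def prune_dictionary(terms, algo_counts, human_counts, max_ratio=2):
    """Drop terms the algorithm annotates more than max_ratio times
    as often as the human expert did, since such over-generating
    terms lower precision more than they help recall."""
    kept = []
    for term in terms:
        algo = algo_counts.get(term, 0)
        human = human_counts.get(term, 0)
        # Remove when algo/human exceeds max_ratio; a term with
        # algo > 0 and human == 0 is removed too (assumption).
        if not (algo > 0 and algo > max_ratio * human):
            kept.append(term)
    return kept
```

With the counts from the example above, "renal" (67 algorithm annotations vs. 2 human annotations) is removed, while a term annotated equally often by both is kept.</p>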
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>We submitted two runs (one per dictionary) for the three subtasks. Annotating
the 3,001 documents for the three subtasks took less than 5 seconds on
a laptop with an Intel Core i7-5700HQ @2.70GHz (8 CPUs). We obtained our best
F1 score with dictionary 1 for detecting diagnoses and with dictionary 2 for
identifying procedures. Table 1 shows the performance of our system.</p>
      <p>In subtask 3, an error was detected after publication of the official results.
A miscalculation of the end offset of each term was fixed and the
performance (non-official) was reassessed by the organizers.</p>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>The performance was better on the diagnosis subtask than on the procedure
one. This was not surprising, since terms composed of nonadjacent words were
more frequent in the latter and our algorithm cannot detect such terms.
This missing functionality was the main limitation of our system and probably
had a strong impact on our results.</p>
      <p>In 2018, the same algorithm obtained an F1-score of 0.786 (precision: 0.794,
recall: 0.779) on the task of coding French death certificates with the ICD-10
terminology. The better results in 2018 can be explained by a larger number
of terms annotated by a medical expert, shorter texts to annotate and the absence of
long-distance dependencies between words in death certificates.</p>
      <p>Adding terms (run 2) did not improve the recall in subtask 1 and
even reduced the precision. The opposite was observed for procedures (subtask
2), where the added terms improved both recall and precision. Overall, the
labels of the ICD-10 terminologies were of little value compared to the labels
from the annotations.</p>
      <p>The main advantages of our algorithm are its simplicity and speed. All it needs
is a dictionary, a list of abbreviations (optional) and a list of stopwords (optional).
The algorithm provides an explanation by outputting the start and end positions of
each detected term. The proposed algorithm can be used as a baseline for any
named entity recognition task and could be integrated into another system to create
a more complex approach.</p>
      <p>Recently, deep learning models based on CNNs and RNNs have been shown to
achieve better performance on NER tasks in the clinical domain [8]. However, these
models are more data-hungry and their training is very costly in terms of
computational power. Their advantages diminish when the number of annotations
is low because, compared to a dictionary-based approach, it is very difficult for
them to predict unseen codes. In this task, only 1.7% of the codes were present in
the training set.</p>
      <p>Further improvement may be possible by adding a better curated terminology,
a longer list of abbreviations and a phonetic matching strategy to our
dictionary-based approach.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Cossin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jouhet</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mougin</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diallo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thiessard</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <source>IAM at CLEF eHealth</source>
          <year>2018</year>
          :
          <article-title>Concept Annotation and Coding in French Death Certificates</article-title>
          . arXiv:1807.03674 [cs] (
          <year>2018</year>
          ), http://arxiv.org/abs/1807.03674
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suominen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miranda-Escalada</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krallinger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzales Saez</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Viviani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Overview of the CLEF eHealth Evaluation Lab 2020</article-title>
          . In: Arampatzis,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (ed.)
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction: Proceedings of the Eleventh International Conference of the CLEF Association (CLEF</source>
          <year>2020</year>
          ), vol.
          <volume>12260</volume>
          .
          LNCS
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bakken</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dine</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hyun</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendonça</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morrison</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bright</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Vleck</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wrenn</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stetson</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>An Electronic Health Record Based on Structured Narrative</article-title>
          .
          <source>Journal of the American Medical Informatics Association : JAMIA</source>
          <volume>15</volume>
          (
          <issue>1</issue>
          ),
          <fpage>54</fpage>
          -
          <lpage>64</lpage>
          (
          <year>2008</year>
          ). https://doi.org/10.1197/jamia.M2131, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2274868/
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Jovanović</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bagheri</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Semantic annotation in biomedicine: the current landscape</article-title>
          .
          <source>Journal of Biomedical Semantics 8 (Sep</source>
          <year>2017</year>
          ). https://doi.org/10.1186/s13326-017-0153-x, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5610427/
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Karadeniz</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ozgur</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Linking entities through an ontology using word embeddings and syntactic re-ranking</article-title>
          .
          <source>BMC Bioinformatics 20 (Mar</source>
          <year>2019</year>
          ). https://doi.org/10.1186/s12859-019-2678-8, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6437991/
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Miranda-Escalada</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalez-Agirre</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Armengol-Estapé</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krallinger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Overview of automatic clinical coding: annotations, guidelines, and solutions for non-English clinical cases at CodiEsp track of eHealth CLEF 2020</article-title>
          .
          CEUR-WS (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Stenetorp</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pyysalo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Topić</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ohta</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ananiadou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsujii</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>BRAT: A Web-based Tool for NLP-assisted Text Annotation</article-title>
          .
          <source>In: Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics</source>
          . pp.
          <fpage>102</fpage>
          -
          <lpage>107</lpage>
          . EACL '12
          ,
          Association for Computational Linguistics, Stroudsburg, PA, USA (
          <year>2012</year>
          ), http://dl.acm.org/citation.cfm?id=2380921.2380942
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Clinical Named Entity Recognition Using Deep Learning Models</article-title>
          .
          <source>AMIA Annual Symposium Proceedings</source>
          <volume>2017</volume>
          ,
          <fpage>1812</fpage>
          -
          <lpage>1819</lpage>
          (
          <year>2018</year>
          ), https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5977567/
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>