<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic coding of death certificates to ICD-10 terminology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jitendra Jonnagaddala</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Feiyan Hu</string-name>
          <email>feiyan.hu@dcu.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Insight Centre for Data Analytics, Dublin City University</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Prince of Wales Clinical School</institution>
          ,
          <addr-line>UNSW Sydney</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Public Health and Community Medicine</institution>
          ,
          <addr-line>UNSW Sydney</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
<p>In this study, we present methods to automatically assign ICD-10 codes to short plain-text descriptions extracted from death certificates in English. We tackle the task solely using dictionary lookup, also known as dictionary matching or dictionary projection. The first step is to index a manually coded ICD-10 lexicon, followed by dictionary matching. Priority rules are then applied to retrieve the relevant entity or entities and their corresponding ICD-10 code(s) for a given free-text cause-of-death description. Because our method is dictionary based and requires no training phase, we were able to evaluate it even on the training set. The advantages of a dictionary lookup method include speed and no need for training data. We present results for three different experimental settings, each with two individual runs. Performance is evaluated by precision, recall and F-measure. We identified several major issues in the corpus contributing to the low performance of our methods, which reiterates that the quality of the lexicon plays a significant role in the performance of dictionary-lookup-based methods.</p>
      </abstract>
      <kwd-group>
        <kwd>Death certificates coding</kwd>
        <kwd>Cause of death coding</kwd>
        <kwd>ICD-10 coding</kwd>
        <kwd>ICD-10 code assignment</kwd>
        <kwd>Concept normalization</kwd>
        <kwd>String matching</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
<p>The ICD, also known as the International List of Causes of Death, was adopted by the International Statistical Institute in 1893 [1]. The ICD covers the universe of diseases, disorders, injuries and other related health conditions, listed in a comprehensive, hierarchical way to facilitate storage, retrieval, analysis and exchange of information. It is one of the most widely used international standards for reporting diseases and health conditions and for identifying health trends and statistics globally. Uses of the ICD include monitoring the incidence and prevalence of diseases, observing reimbursement and resource allocation trends, and keeping track of safety and quality guidelines. Another important use is to report deaths as well as diseases, injuries, symptoms, reasons for encounter, factors that influence health status, and external causes of disease.</p>
      <p>
        The World Health Organization (WHO) published the 6th revision, known as ICD-6, in 1948. All member states of the WHO are required to use the most current ICD revision to report mortality and morbidity statistics. The ICD has been revised and published in a series of editions to reflect advances in health and medical science over time. The current version is ICD-10, which was first used in 1990. It covers more than 20,000 codes, including diagnoses and procedures, but only a subset of these codes can be causes of death. Although delayed, ICD-11 is currently being drafted and is expected to be released in
        <xref ref-type="bibr" rid="ref11">2017</xref>
        .
      </p>
      <p>Manually assigning ICD codes to a free text description is expensive and
time-consuming due to the vast coverage and size of ICD terminology, thus automated methods
are required to assist the manual coders and public health reporting officials[2]. We can
consider the process of assigning ICD codes as a classification problem, an entity recognition problem, or an entity recognition and normalization problem, depending on the context. This allows us to leverage various techniques based on machine learning and/or natural language processing. In recent automatic object detection tasks on images, we have seen deep-learning-based neural networks outperform humans [3]. It is legitimate to hypothesize that the ICD code assignment task could in future be completely automated.</p>
      <p>Researchers have been investigating ICD code assignment in different types of medical records such as pathology reports, discharge summaries and death certificates. Recent studies proposed various methods specifically for ICD code assignment in death certificates. Supervised learning methods using Support Vector Machines (SVM) have been applied to assign ICD codes [4-6], while unsupervised methods have been used in a few other studies [7, 8]. Methods not based on classification models typically rely on dictionary lookup, also known as dictionary matching or projection. Mottin et al. used entity recognition and entity normalization to automatically categorize text, computing similarity metrics such as cosine similarity over features in order to find and rank input text [7]; the feature vector is formed by a TF-IDF weighted bag of words. Others report that a hybrid of dictionary projection and supervised learning can outperform either approach alone [4, 6]. Our method is based on dictionary lookup and priority rules. We applied exact and partial string matching to look up a manually coded ICD-10 dictionary; the result is the ICD code(s) corresponding to the matching query in the dictionary. The performance of dictionary projection is conditional on the quality of the provided lexicon. The advantage of such a method is that it is easy and cheap to compute on large-scale datasets.</p>
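<p>As a minimal illustration of the dictionary lookup idea described above, the following Python sketch matches a cause-of-death string against a toy ICD-10 lexicon, trying an exact match before falling back to partial matching. The lexicon entries here are illustrative stand-ins, not the corpus lexicon.</p>

```python
# Toy ICD-10 lexicon; entries and codes are illustrative, not the corpus lexicon.
ICD10_LEXICON = {
    "atrial fibrillation": "I48",
    "pneumonia": "J189",
    "cerebrovascular accident": "I64",
}

def lookup(text):
    """Return ICD-10 codes: exact match first, then partial (substring) match."""
    query = text.lower().strip()
    if query in ICD10_LEXICON:                 # exact match wins outright
        return [ICD10_LEXICON[query]]
    return [code for entry, code in ICD10_LEXICON.items()
            if entry in query]                 # lexicon entry inside the query
```

<p>Here, lookup("Atrial Fibrillation") resolves via the exact rule, while a combined line such as "cva cerebrovascular accident parkinsons disease" only resolves via the partial rule.</p>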
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <sec id="sec-2-1">
        <title>Corpus</title>
        <p>We used the CDC corpus, distributed as part of CLEF eHealth 2017 Task 1, to develop our methods [9, 10]. The corpus includes censored free-text descriptions of causes of death reported by clinicians in death certificates. These free-text descriptions were manually coded by experts using the ICD-10 terminology [11]. A manually curated ICD-10 lexicon was provided with the corpus. The methods employed in the construction of this corpus are the same as for the CépiDc Causes of Death French corpus [12]. The corpus comprised training and test sets.</p>
        <p>A sample (with modified content) ICD-10 coded death certificate from the corpus is
shown in Fig.1. In the sample death certificate with ID 0808, there were 3 causes of
death entities with ICD-10 codes assigned at line 1, 2 and 6 of the original full death
certificate (i.e. “pneumonia”, “atrial fibrillation”, and “CVA parkinsons disease”).
There were two ICD-10 codes assigned and ranked manually by the experts for the
cause of death statement – “CVA PARKINSONS DISEASE”. The primary cause of
death is coded as I48, which stands for “Atrial fibrillation and flutter” in ICD-10
standard terminology. In this study, we focused only on coding all the entities observed in the death certificate; identification of the primary cause of death is beyond the scope of this study. It is also important to note that the corpus did not include the full original contents of the death certificates; rather, it included only the ‘cause of death’ entities.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Concept coding using dictionary lookup and priority rules</title>
        <p>The proposed methods are based on our previous work on coding PubMed articles with MeSH terminology [13, 14]. Our methods are mainly based on dictionary lookup and priority rules. String matching is a critical technique for dictionary lookup, and can either be exact or partial (i.e. proximity and fuzzy matching). The dictionary
lookup approach has various advantages and can provide competitive results when used
with the right lexicon [14].</p>
        <p>Initially, the ICD-10 lexicon and the input free text descriptions in the corpus are
subjected to a few pre-processing steps. The pre-processing included tokenization, lemmatization and stop-word removal using the Apache Lucene library (http://lucene.apache.org/core/). This is followed
by the expansion of abbreviations identified in the free text descriptions based on the
abbreviations lexicon. This lexicon was developed by the authors in a previous
study[14]. Finally, the dictionary matching is performed between the ICD-10 lexicon
and the free text descriptions. To identify the right code, we implemented several
priority rules. Highest priority is given to the code with an exact match, followed by partial
phrase match and partial token match. In many situations, more than one code is identified by each rule, so we employed an additional rule to keep only the top retrieved code with the highest score, and only when that score exceeded 0.5. Similar methods have been employed in a previous study where dictionary lookup was used in conjunction with priority rules [7]. However, in our study the priority rules are not limited to exact matches but also cover phrase and term matches.</p>
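<p>The priority scheme above can be sketched as follows; the scoring function, lexicon and threshold handling are simplified assumptions for illustration, not the actual Lucene-based implementation.</p>

```python
# Sketch of the priority rules: exact match outranks partial phrase match,
# which outranks partial token match; only the single top-scoring code is
# kept, and a token-match score must exceed 0.5.
def token_overlap(query_tokens, entry_tokens):
    """Score = fraction of entry tokens found in the query (0..1)."""
    if not entry_tokens:
        return 0.0
    return len(entry_tokens & query_tokens) / len(entry_tokens)

def code_with_priority(query, lexicon):
    q_tokens = set(query.lower().split())
    query_lc = query.lower()
    candidates = []                  # (priority, score, code); lower priority wins
    for entry, code in lexicon.items():
        e_tokens = set(entry.split())
        if entry == query_lc:
            candidates.append((0, 1.0, code))          # exact match
        elif entry in query_lc:
            candidates.append((1, 1.0, code))          # partial phrase match
        else:
            score = token_overlap(q_tokens, e_tokens)  # partial token match
            if score > 0.5:
                candidates.append((2, score, code))
    if not candidates:
        return None
    candidates.sort(key=lambda c: (c[0], -c[1]))       # best priority, highest score
    return candidates[0][2]
```

<p>For example, with a two-entry lexicon, an exact query returns its code directly, a longer query containing a lexicon phrase falls through to the phrase rule, and a reworded query such as "atrial flutter fibrillation" is only caught by the token-overlap rule.</p>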
      </sec>
      <sec id="sec-2-3">
        <title>Experimental Setup</title>
        <p>The training set from the corpus was used to perform initial experiments. The methods
discussed in the above section were later evaluated on the test set. Three different
experiments (Exp1, Exp2, Exp3), each with two runs (Run1, Run2) were performed on
the test set. In each experiment, Run1 refers to the setup where Okapi BM25 scoring was used, while TF-IDF scoring was used in Run2, to rank the retrieved ICD-10 codes [15].</p>
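<p>To make the difference between the two run configurations concrete, the sketch below ranks toy lexicon entries for a query with Okapi BM25 (as in Run1) and TF-IDF (as in Run2). The documents, tokenization and parameters (k1 = 1.2, b = 0.75) are simplified assumptions; the actual runs used Apache Lucene's scoring implementations.</p>

```python
import math

# Toy "documents" (tokenized lexicon entries); illustrative only.
docs = [["atrial", "fibrillation"],
        ["atrial", "flutter"],
        ["pneumonia"]]
N = len(docs)
avgdl = sum(len(d) for d in docs) / N

def idf(term):
    """BM25-style smoothed inverse document frequency."""
    df = sum(term in d for d in docs)
    return math.log((N - df + 0.5) / (df + 0.5) + 1)

def bm25(query, doc, k1=1.2, b=0.75):
    """Okapi BM25 score of one document for a tokenized query (Run1 style)."""
    score = 0.0
    for t in query:
        tf = doc.count(t)
        denom = tf + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf(t) * tf * (k1 + 1) / denom
    return score

def tfidf(query, doc):
    """Plain TF-IDF score of one document for a tokenized query (Run2 style)."""
    return sum(doc.count(t) * idf(t) for t in query)

query = ["atrial", "fibrillation"]
ranked = sorted(range(N), key=lambda i: bm25(query, docs[i]), reverse=True)
```

<p>Both scorers rank the entry matching both query terms first; they differ in how term frequency and document length are weighted, which is what distinguishes the two runs.</p>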
        <p>Exp1 considered only ICD-10 codes retrieved which met the priority rule conditions
and had the highest-ranking score. No lemmatization and stop word removal steps were
employed. Exp2 considered only ICD-10 codes retrieved which met the priority rule
conditions and had highest-ranking score. However, lemmatization and stop word
removal steps were employed. Exp3 was very similar to Exp2, except for the addition of an abbreviation expansion component. We developed a separate lexicon of abbreviations and their full forms using the MEDIC vocabulary [16]. Exp1 and Exp2 were
performed on the test set solely based on our initial experiments on the training set; in other words, we did not access the ground truth of the test set while performing these experiments. Exp3 was performed after conducting an error analysis of the predicted ICD-10 codes from the previous experiments, which required access to the ground truth of the test set.</p>
      </sec>
      <sec id="sec-2-4">
        <title>Evaluation metrics</title>
        <p>The performance of the proposed methods was assessed using the standard metrics
precision (P), recall (R) and F-measure (F) by identifying the true positives (TP), false
positives (FP) and false negatives (FN).</p>
        <p>P = TP / (TP + FP) (1)</p>
        <p>R = TP / (TP + FN) (2)</p>
        <p>F = (2 × P × R) / (P + R) (3)</p>
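<p>Equations (1)-(3) translate directly into code; a minimal helper:</p>

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from TP, FP and FN counts,
    per equations (1)-(3); zero denominators yield 0.0."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * p * r) / (p + r) if p + r else 0.0
    return p, r, f
```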
        <p>By default, the metrics consider all ICD-10 codes irrespective of their type or group in the terminology. However, the metrics were also used to evaluate performance on the violent deaths type of ICD-10 codes (codes V01 to Y98). The intuition behind evaluating performance on this type is that public health professionals are generally keen to identify, analyze and intervene in these avoidable deaths. Only the Exp2 runs were evaluated for the violent deaths type.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>The above proposed automatic methods were applied to all the death certificates in the
dataset. The distribution of the training and test sets of the corpus is summarized in
Table 1. We noticed that the performance on the test set is lower than in our initial experiments on the training set. (Tokens were counted using the NLTK tokenizer, http://www.nltk.org/.)</p>
      <p>The results of our experiments described in the previous section on the test set are presented in Table 2. In Exp2 and Exp3, the BM25-scoring-based Run1 outperformed the TF-IDF-scoring-based Run2. The results specifically for the violent deaths type for the Exp2 runs were as follows: Exp2-Run1 achieved 0.1684 (P), 0.2619 (R) and 0.205 (F), while Exp2-Run2 achieved 0.043 (P), 0.3095 (R) and 0.0755 (F). Our results demonstrate that the performance of the dictionary lookup approach for ICD-10 code assignment in death certificates is inferior to supervised and/or hybrid methods [5, 6]. To identify possible reasons for the large number of FNs and FPs, a thorough error analysis was performed manually on a subset of the predicted ICD-10 codes from the Exp2 setup. Many issues were noticed, ranging from the quality of the lexicon supplied with the corpus to shortcomings in our experimental setup. One shortcoming we addressed was the lack of abbreviation expansion, added in Exp3. We identified that the test and training sets included various abbreviated ‘cause of death’ entities which were not handled in Exp1 and Exp2. HTN, CAD, COPD, CHF, CAR and CVA were some of the frequently abbreviated entities appearing in the death certificates. Our custom abbreviations lexicon had around 350 entries, and it increased our F score from 0.3746 to 0.3998.</p>
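<p>The abbreviation expansion step added in Exp3 can be sketched as a whole-token substitution before dictionary matching. The mapping below covers only the frequent abbreviations named above; the real lexicon had around 350 entries.</p>

```python
# Small illustrative subset of the ~350-entry abbreviations lexicon.
ABBREVIATIONS = {
    "htn": "hypertension",
    "cad": "coronary artery disease",
    "copd": "chronic obstructive pulmonary disease",
    "chf": "congestive heart failure",
    "cva": "cerebrovascular accident",
}

def expand_abbreviations(text):
    """Replace whole-token abbreviations before dictionary matching."""
    return " ".join(ABBREVIATIONS.get(tok.lower(), tok.lower())
                    for tok in text.split())
```

<p>For instance, "CVA PARKINSONS DISEASE" becomes "cerebrovascular accident parkinsons disease", which the lexicon can then match.</p>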
      <p>One of the key reasons for our low performance was the quality of the ICD-10 lexicon supplied. We observed many issues, including inconsistent formatting and incomplete coverage of ICD-10 codes in the lexicon. For example, we noticed over 100 instances where ICD-10 codes manually assigned by the experts are not part of the ICD-10 lexicon; W19, W75 and B334 were a few such examples observed in the corpus. There were also several issues with the coding performed by the experts, including inconsistencies between the lexicon and the manually identified codes. For example, there are instances where experts coded some entities to J101, whereas in the lexicon the corresponding code is J1010. Another similar issue is that the ‘cause of death’ entities in a death certificate do not match the expert-coded version. For example, consider the death certificate with ID 00004: there is only one entity (STROKE) according to the file that does not include ICD-10 codes, but in the expert-coded version there were two ICD-10 codes (I64 and F179).</p>
      <p>Inconsistencies in the representation of multiple entities observed in the same line of a death certificate were also frequent throughout the corpus. “CVA PARKINSONS DISEASE” is one such example, where Cerebrovascular accident (CVA) and PARKINSONS DISEASE are not clearly separated. “H/O CAD AND ELEVATED B/P”, “Respiratory Distress/arrest”, “HEMORRHAGE S/P AORTOBIFEMORAL BYPASS” and “CHF - DIASTOLIC” are similar examples where entities are separated inconsistently, with no standard guidelines or notation. There were over 2,000 instances of such inconsistencies across the training and test sets. We strongly believe that by enhancing the current ICD-10 lexicon we can improve dictionary lookup performance further. One enhancement worth exploring in future is to incorporate synonyms as well as spelling variations and corrections (for example, PNUEMONIA =&gt; PNEUMONIA; ATRAIL FIBRILLATION =&gt; ATRIAL FIBRILLATION) into the ICD-10 lexicon, in addition to addressing the issues discussed earlier.</p>
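<p>One possible realization of the spelling-correction enhancement is to map a misspelled entity to the closest lexicon entry by Levenshtein distance within a small threshold. This is an illustrative sketch of the idea, not part of the submitted system; the distance threshold is an assumption.</p>

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (dynamic programming, one row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def correct(term, lexicon_terms, max_dist=2):
    """Map a term to its nearest lexicon entry if within max_dist edits."""
    best = min(lexicon_terms, key=lambda t: levenshtein(term, t))
    return best if levenshtein(term, best) <= max_dist else term
```

<p>Under this scheme, both misspellings mentioned above ("PNUEMONIA", "ATRAIL FIBRILLATION") are two edits away from their correct lexicon entries and would be normalized before lookup.</p>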
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In conclusion, we have described our methods for automatically coding death certificates to the ICD-10 terminology. Our dictionary-lookup-based methods are simple and effective, and no training phase is required. However, their performance is not as good as that of machine-learning-based topic modeling, learning-to-rank or hybrid methods. The performance of dictionary lookup relies heavily on the quality of the lexicon used. In addition to a high-quality lexicon, enhancements such as synonyms and spelling variations need to be incorporated into the dictionary lookup approach for better performance. In future, we would like to improve our results by employing learning-to-rank algorithms in conjunction with an improved dictionary lookup approach.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>
        This study was conducted as part of the electronic Practice Based Research Network
(ePBRN) and Translational Cancer research network (TCRN) research programs.
ePBRN is funded in part by the School of Public Health &amp; Community Medicine, Ingham
Institute for Applied Medical Research, UNSW Medicine and South West Sydney
Local Health District. TCRN is funded by Cancer Institute of New South Wales and Prince
of Wales Clinical School, UNSW Medicine. We would like to thank the organizers of
        <xref ref-type="bibr" rid="ref11">CLEF eHealth 2017</xref>
        Task 1 for providing us with the ICD-10 coded text content from death certificates. The content of this publication is solely the responsibility of the authors and does not necessarily reflect the official views of the funding bodies.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>WHO. [cited 2017 June]; Available from: http://www.who.int/classifications/icd/en/HistoryOfICD.pdf.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Jonnagaddala</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al.,
          <article-title>Mining Electronic Health Records to Guide and Support Clinical Decision Support Systems, in Improving Health Management through Clinical Decision Support Systems</article-title>
          .
          <year>2016</year>
          ,
          IGI Global
          . p.
          <fpage>252</fpage>
          -
          <lpage>269</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>He</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , et al.
          <article-title>Delving deep into rectifiers: Surpassing human-level performance on imagenet classification</article-title>
          .
          <source>in Proceedings of the IEEE international conference on computer vision</source>
          .
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Boytcheva</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Automatic matching of ICD-10 codes to diagnoses in discharge letters</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <source>in Proceedings of the Workshop on Biomedical Natural Language Processing</source>
          .
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Dermouche</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.
          <article-title>ECSTRA-INSERM@ CLEF eHealth2016-task 2: ICD10 code extraction from death certificates</article-title>
          .
          <year>2016</year>
          . CLEF.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Zweigenbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and T. Lavergne,
          <article-title>Hybrid methods for ICD-10 coding of death certificates</article-title>
          .
          <source>EMNLP 2016</source>
          ,
          <year>2016</year>
          : p.
          <fpage>96</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Mottin</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , et al.,
          <source>BiTeM at CLEF eHealth Evaluation Lab 2016 Task</source>
          <volume>2</volume>
          : Multilingual Information Extraction.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Zweigenbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lavergne</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <article-title>LIMSI ICD10 coding experiments on CépiDC death certificate statements</article-title>
          .
          <year>2016</year>
          . CLEF.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , L. Kelly, H. Suominen, A. Névéol, A. Robert, E. Kanoulas, R. Spijker, J. Palotti, and G. Zuccon,
          <article-title>CLEF 2017 eHealth Evaluation Lab Overview</article-title>
          ,
          <source>in CLEF 2017 - 8th Conference and Labs of the Evaluation Forum, Lecture Notes in Computer Science (LNCS)</source>
          , Springer, September
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Névéol</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , R.N. Anderson, K.B. Cohen, C. Grouin, T. Lavergne, G. Rey, A. Robert, C. Rondet, and P. Zweigenbaum,
          <article-title>CLEF eHealth 2017 Multilingual Information Extraction task overview: ICD10 coding of death certificates in English and French</article-title>
          , in CLEF 2017 Evaluation Labs and Workshop: Online Working Notes, CEUR-WS, September
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          WHO,
          <article-title>The ICD-10 Classification of Diseases, Clinical Descriptions and Diagnostic Guidelines</article-title>
          . Geneva: WHO,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Lavergne</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , et al.,
          <article-title>A Dataset for ICD-10 Coding of Death Certificates: Creation and Usage</article-title>
          .
          <source>BioTxtM 2016</source>
          ,
          <year>2016</year>
          : p.
          <fpage>60</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Jonnagaddala</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al.
          <article-title>Recognition and normalization of disease mentions in PubMed abstracts</article-title>
          .
          <source>in Proceedings of the fifth BioCreative challenge evaluation workshop</source>
          , Sevilla, Spain, September 9-
          <issue>11</issue>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Jonnagaddala</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al.,
          <article-title>Improving the dictionary lookup approach for disease normalization using enhanced dictionary and query expansion</article-title>
          .
          <source>Database</source>
          ,
          <year>2016</year>
          : p.
          <fpage>baw112</fpage>
          -
          <lpage>baw112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          Vol.
          <volume>1</volume>
          . 2008: Cambridge university press Cambridge.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          , et al.,
          <article-title>MEDIC: a practical disease vocabulary used at the Comparative Toxicogenomics Database</article-title>
          . Database,
          <year>2012</year>
          : p.
          <fpage>bar065</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>