=Paper=
{{Paper
|id=Vol-1327/21
|storemode=property
|title=Coverage of Rare Disease Names in Clinical Coding Systems and Ontologies and Implications for Electronic Health Records-Based Research
|pdfUrl=https://ceur-ws.org/Vol-1327/icbo2014_paper_52.pdf
|volume=Vol-1327
|dblpUrl=https://dblp.org/rec/conf/icbo/RichessonFB14
}}
==Coverage of Rare Disease Names in Clinical Coding Systems and Ontologies and Implications for Electronic Health Records-Based Research==
ICBO 2014 Proceedings

Rachel Richesson, Duke University School of Nursing, Durham, NC, USA (rachel.richesson@dm.duke.edu)

Kin Wah Fung, National Library of Medicine, Bethesda, MD, USA

Olivier Bodenreider, National Library of Medicine, Bethesda, MD, USA

===Abstract===
This poster will present the completeness of coverage of rare disease names in standard coding systems, including the International Classification of Diseases (ICD) and SNOMED CT, and ontologies such as the Orphanet Rare Diseases Ontology (RDO). Using use cases and a set of 45 rare diseases for the national Patient Centered Outcomes Research Network (PCORnet), the poster will describe the current capacity and implications for electronic health records-based research on these diseases. Authors will provide suggestions on how clinical coding systems and ontologies can be used in a coordinated approach to support the use of electronic health record data for various types of research related to rare diseases.

Keywords: rare diseases; clinical classifications; ontologies; biomedical research; electronic health records

===I. Introduction===
Rare diseases are defined in the US as conditions that affect fewer than 200,000 Americans and in the European Union as those with a prevalence of 5 per 10,000 or less.[1,2] The NIH Office of Rare Diseases Research recognizes 6,485 rare diseases.[3] Although each rare disease is uncommon, collectively they constitute a significant burden to the health care system. One estimate suggests that 1 in 10 Americans is affected by a rare disease.[2] Consequently, 'rare diseases' have emerged as priority topics in public health and research. Rare disease names are included, at different levels of completeness and granularity, in a number of clinical coding systems that are embedded in electronic health record (EHR) systems, and in a number of ontologies designed to support the diagnosis of rare diseases and investigation of their causes and treatments.[4]

With increased adoption and "meaningful use" of EHRs, there is renewed effort in leveraging EHRs for research. In the U.S., the national Patient Centered Outcomes Research Network (PCORnet) was funded this year from the Affordable Care Act to examine real-world treatment decisions, and is specifically tasked to conduct observational and interventional research on the comparative effectiveness of various treatments, using distributed and heterogeneous EHR systems.[5] The PCORnet research portfolio currently includes 45 rare diseases (in addition to approximately 20 more common conditions). The objective of this poster is to determine the coverage of these rare disease names in standard coding systems and explore the current capacity and implications for EHR-based research on these and other rare diseases.

===II. Methods===
In this poster we present an inventory of various clinical coding systems and ontologies that are relevant to rare diseases, and summarize their coverage of rare disease names from previous studies. We match rare disease names and synonyms from the Office of Rare Diseases Research (ORD) and Orphanet (RDO) to the Unified Medical Language System (UMLS) Metathesaurus and identify maps to SNOMED CT and other terminologies. To characterize the coverage of rare diseases studied in PCORnet, we estimate the number of precise and equivalent matches in the three clinical classifications (ICD-9-CM, ICD-10-CM, and SNOMED CT) for a set of 45 rare diseases studied in PCORnet. Finally, we present the likely use of existing classifications, ontologies, mappings, and tools to support the research process, from the collection of data in clinical settings to their use in various types of EHR-based research.

===III. Results===
SNOMED CT has the highest coverage of rare disease names among clinical terminologies in UMLS, and covers 44% of the 6,485 diseases (19,504 terms) recognized by the Office of Rare Diseases (ORD), and 48% of the 6,750 diseases (15,585 terms) listed in the Orphanet Rare Disease Ontology. 25% (1,611) of ORD and 14% (1,592) of RDO disease names have bi-directional one-to-one maps to SNOMED CT. The rest are one-to-many or many-to-one maps. Two terminologies have higher coverage than SNOMED CT. Medical Subject Headings (MeSH) covers 75% and 70%, while Online Mendelian Inheritance in Man (OMIM) covers 49% and 57%, of ORD and RDO respectively. Overall, the UMLS covers 82% of ORD-recognized and 84% of RDO-recognized rare diseases.

All of the rare diseases studied in PCORnet were included in the UMLS and its source terminologies. Eight diseases did not have any match to SNOMED CT, ICD-9-CM or ICD-10-CM. The 45 rare diseases studied in PCORnet yielded multiple matches to terms in clinical coding systems; i.e., many PCORnet rare disease names matched to more than one term or code in a coding system, and many codes from clinical coding systems matched more than one rare disease name. Of 55 ICD-9-CM codes that matched to a PCORnet rare disease, 7 were matched to multiple rare diseases. Of 47 matched ICD-10-CM codes, 4 matched to multiple rare diseases, and of 59 matched SNOMED CT codes, one matched to multiple PCORnet rare diseases. The proportions of matched codes that were considered equivalent matches (rather than broader matches or related terms) were 25%, 45% and 94% for ICD-9-CM, ICD-10-CM and SNOMED CT respectively.

===IV. Conclusions===
The coverage and quality (i.e., precision and equivalence) of terms for rare diseases in clinical coding systems is less than ideal, but is markedly improved with SNOMED CT in comparison to the ICD-9 and ICD-10 classifications. The lack of precise and complete coverage of rare disease names in clinical coding systems will inhibit the automated identification of patients with rare diseases from EHR data for clinical trial recruitment or observational research. The coverage of rare disease names in specialized ontologies (e.g., OMIM) is higher, but these are not designed for use in clinical EHR systems.

Given the intended purpose for each classification and ontology and the completeness and coverage of rare disease names, we propose how these various clinical coding systems, ontologies, and UMLS mappings can be leveraged to support an efficient national research infrastructure and learning healthcare system. The UMLS is a vital tool to support the linkage across clinical coding systems and specialized ontologies that will be essential for a national EHR-based rare diseases research infrastructure.

Ontologies can support advances in understanding disease etiology and potential treatments. Specialized ontologies, such as OMIM, RDO, and others (such as the Human Phenotype Ontology), can provide the vocabulary for detailed clinical documentation, or "deep phenotyping", of genetic diseases (e.g., in the NIH Undiagnosed Diseases Network), and complement clinical terminologies and administrative classifications widely used in EHRs. This poster will include an illustrative representation of the collection of rare disease-specific data in dedicated ontologies to support diagnosis, and the use of mappings to standardized clinical terminologies or classifications as needed for clinical documentation, data exchange, billing and public health reporting.

===Acknowledgment===
This work was partly supported by the Intramural Research Program of the National Institutes of Health and the National Library of Medicine. This work was also supported in part by PCORnet, funded by the Patient Centered Outcomes Research Institute (PCORI).

===References===
[1] Orphanet. Orphanet Rare Disease Ontology (ORDO). 2014 [cited 2014 March 14]; Available from: http://www.orphadata.org/cgi-bin/inc/ordo_orphanet.inc.php.

[2] NORD. Rare Disease Information. 2014 [cited 2014 March 14]; Available from: http://www.rarediseases.org/rare-disease-information.

[3] NIH. Office of Rare Diseases Research (ORDR) Brochure. 2009 [cited 2010 20/08/2010]; Available from: http://rarediseases.info.nih.gov/asp/resources/ord_brochure.html.

[4] Fung, K.W., R.L. Richesson, and O. Bodenreider, Coverage of Rare Disease Names in Standard Terminologies and Implications for Patients, Providers, and Research, in Paper accepted for presentation: American Medical Informatics Association Annual Symposium. 2014: Washington, D.C.

[5] PCORI. PCORnet: The National Patient-Centered Clinical Research Network. 2014 [cited 2014 March 13]; Available from: http://www.pcori.org/funding-opportunities/pcornet-national-patient-centered-clinical-research-network/.

===Poster===
Coverage of Rare Disease Names in Clinical Coding Systems and Ontologies and Implications for Electronic Health Records-Based Research

Rachel Richesson (1), Kin Wah Fung (2), Olivier Bodenreider (2)

(1) Duke University School of Nursing, Durham, NC, USA; (2) National Library of Medicine, Bethesda, MD, USA

====Abstract====
This poster highlights clinical coding systems and ontologies relevant to rare diseases, including the International Classification of Diseases (ICD) and SNOMED CT, and ontologies such as the Human Phenotype Ontology (HPO) and the Orphanet Rare Diseases Ontology (ORDO). Using use cases and a set of rare diseases for the national Patient Centered Outcomes Research Network (PCORnet), the poster will describe the current capacity and implications for EHR-based research on these diseases. Authors will provide suggestions on where mappings across classifications and ontologies are needed to support the use of EHR data for various types of research related to rare diseases.

====Background====
Rare diseases are defined in the US as conditions that affect fewer than 200,000 Americans and in the European Union as those with a prevalence of 5 per 10,000 or less. There is no globally authoritative list of rare diseases, but there are several online disease catalogues developed by reliable sources.[1-3] Although each rare disease is uncommon, collectively they are more common, and consequently 'rare diseases' have emerged as priority topics in public health and research.

{| class="wikitable"
|+ Table 1. Sources of Rare Disease Names
! Source !! # of rare diseases
|-
| NCATS, Office of Rare Diseases Research (United States) [1] || 6,485
|-
| Orphanet Rare Diseases Ontology [3] || 6,750
|}

{| class="wikitable"
|+ Table 2. Terminologies, Coding Systems, and Ontologies with Coverage of Rare Diseases
! Terminology or Coding System !! Sponsor !! Intended Purpose !! Estimated Coverage
|-
| International Classification of Diseases version 10 (ICD-10) || World Health Organization || Disease surveillance; mortality || 12% [4]
|-
| International Classification of Diseases Clinical Modifications (ICD-CM, versions 9 and 10) || World Health Organization; national government/public health sponsors by country || Medical billing || Using UMLS-based method [5]: 13% for ICD-9-CM; 26% for ICD-10-CM
|-
| Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT) || International Health Terminology Standards Development Organisation (IHTSDO) http://www.ihtsdo.org/ || Coding the clinical content of electronic health records to support patient care and other secondary data uses. || 50-53% [4, 5]
|-
| Medical Dictionary for Regulatory Activities (MedDRA) || International Federation of Pharmaceutical Manufacturers and Associations (IFPMA); maintained and supported by MSSO || Adverse event reporting; regulatory submissions for new drugs and devices. || Unknown
|-
| Medical Subject Headings (MeSH) || U.S. National Library of Medicine || To index article topics for the published medical literature. || 67% [4]
|-
| Online Mendelian Inheritance in Man (OMIM) || Distributed by the U.S. National Center for Biotechnology Information; authored and edited at the McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine || A catalog of human genes and genetic disorders and traits, with focus on the molecular relationship between genetic variation and phenotypic expression; considered a "phenotypic companion" to the Human Genome Project. || 45% [4]
|-
| Human Phenotype Ontology (HPO) || Various translational and genetics research collaborators http://www.human-phenotype-ontology.org/ || "Deep phenotyping" for EHRs in genetics and specialty clinics; support interoperability between current major genetics databases. || Unknown; presumably complete; 54% of HPO terms are in the UMLS [6]
|-
| Orphanet Rare Disease Ontology (ORDO) || Orphanet || A research resource for computational analysis and data mining/knowledge discovery for rare diseases. Supports editorial procedures of Orphanet knowledge bases and services. || 100%
|-
| PhenX || U.S. National Human Genome Research Institute || Standard questions for clinical phenotyping data to support de novo data collection in GWAS studies. || N/A (PhenX is for risk factors and environmental exposures)
|-
| Unified Medical Language System (UMLS) || U.S. National Library of Medicine || The UMLS integrates and distributes key terminology, classification and coding standards, and associated resources to promote creation of more effective and interoperable biomedical information systems and services, including electronic health records. || Contains 8,435 rare disease names
|}

Rare disease names are included, at different levels of completeness and granularity, in a number of clinical coding systems that are embedded in electronic health record (EHR) systems, and in a number of ontologies designed to support the diagnosis of rare diseases and investigation of genetic causes and treatments.

As shown in Table 2, a range of coverage for rare disease names across coding systems has been reported using a variety of methods. In 2010, the NLM mapped 8,435 rare disease names (collected from ORDR, Orphanet, and the National Organization for Rare Disorders, a patient advocacy and voluntary health organization in the US) to the UMLS, and found different levels of coverage for Medical Subject Headings (MeSH) (5,663; 67%), Online Mendelian Inheritance in Man (OMIM) (3,802; 45%), SNOMED CT (4,192; 50%), and ICD-10 (1,029; 12%).[4] More recently, we used the UMLS and the published maps from SNOMED CT to ICD-9-CM (developed by IHTSDO) and ICD-10-CM (developed by NLM).[5]

With increased adoption and "meaningful use" of EHRs, there is renewed effort in leveraging EHRs for research. The national Patient Centered Outcomes Research Network (PCORnet) was funded from the Affordable Care Act to examine real-world treatment decisions, and is specifically tasked to conduct observational and interventional research on the comparative effectiveness of various treatments, using distributed and heterogeneous EHR systems. The PCORnet research portfolio currently includes 48 rare diseases and conditions.

Figure 1. The Patient Centered Outcomes Research Network (PCORnet) of Networks. (Figure not reproduced; recoverable labels: PPRN, CDRN, District of Columbia, Puerto Rico.)

====Methods====
* We match rare disease names and synonyms from the Office of Rare Diseases Research and the Orphanet Rare Disease Ontology (ORDO) to the Unified Medical Language System (UMLS) Metathesaurus and identify maps to SNOMED CT and other terminologies.
* We estimate the number of precise and equivalent matches in the three clinical terminologies (ICD-9-CM, ICD-10-CM, and SNOMED CT) for a set of 48 rare diseases studied in PCORnet.
* We assess the precision of mapping by looking at the number of rare disease names that map to distinct codes in each terminology or coding system, and the equivalence by characterizing the semantic nature of the maps to determine whether the mapped term was broader, narrower, or equivalent to the PCORnet rare disease name.

====Results====
* SNOMED CT has the highest coverage among clinical coding systems, and covers 44% of the 6,485 diseases recognized by the Office of Rare Diseases, and 28% of the 6,750 diseases that are listed in the Orphanet Rare Disease Ontology (ORDO).

{| class="wikitable"
|+ Table 3. Coverage of Rare Diseases from 2 Sources by Coding Systems
! Coding System !! % coverage of 6,485 diseases from US NCATS/ORDR !! % coverage of 6,750 diseases from Orphanet ORDO
|-
| UMLS || 82% || 62%
|-
| MeSH || 75% || 52%
|-
| OMIM || 52% || 41%
|-
! colspan="3" | Clinical coding systems
|-
| SNOMED CT || 44% || 36%
|-
| ICD-9-CM || 11% || 7%
|-
| ICD-10-CM || 21% || 16%
|}

* Overall, the UMLS covers 82% of ORD and 62% of ORDO-recognized rare diseases.
* Two terminologies have higher coverage than SNOMED CT: Medical Subject Headings (MeSH) covers 75% and 52%, while Online Mendelian Inheritance in Man (OMIM) covers 52% and 41%, of ORD and ORDO respectively.
* SNOMED CT covers 44% of ORD and 36% of ORDO-recognized rare diseases.
* 25% (1,611) of ORD and 14% (1,592) of ORDO disease names have bi-directional one-to-one maps to SNOMED CT.
* The rest are one-to-many or many-to-one maps.

Examples:

* Many rare diseases map to the following codes. These are not "high precision" mappings.
** 759.89 Other specified congenital anomalies (ICD-9-CM)
** Q82.8 Other specified congenital malformations of skin (ICD-10-CM)
* Pseudoneonatal adrenoleukodystrophy maps to (238069004) Acyl-CoA oxidase deficiency (disorder) in SNOMED CT and is the only rare disease that does. This is a "high precision" map.
* Joubert Syndrome maps to:
** 742.4 Other specified congenital anomalies of brain (ICD-9-CM): not semantically "equivalent"
** 742.9 Unspecified congenital anomaly of brain, spinal cord, and nervous system (ICD-9-CM): not semantically "equivalent"
** Q04.3 Other reduction deformities of brain (ICD-10-CM): not semantically "equivalent"
** 253175003 Familial aplasia of the vermis (disorder) (SNOMED CT): semantically "equivalent"

* As shown in Table 4, of the 48 rare diseases studied in PCORnet, the proportion of rare diseases with one and only one match to a coding system term (considered "high precision") was 87% for ICD-9-CM, 91% for ICD-10-CM, and 98% for SNOMED CT.
* Authors (RR, KWF) assessed the semantic nature of the maps to determine whether the mapped term was broader, narrower, or equivalent to the PCORnet rare disease name. The proportions of equivalent matches were 25%, 45% and 94% for ICD-9-CM, ICD-10-CM and SNOMED CT, respectively.

{| class="wikitable"
|+ Table 4. Precision of Coverage of PCORnet Rare Diseases in Different Clinical Coding Systems
! Coding System !! % of PCORnet rare disease codes that did not map to other rare diseases (1-1 map) (Precision) !! % of PCORnet rare disease codes considered an equivalent map (Equivalence)
|-
| ICD-9-CM || 87% || 25%
|-
| ICD-10-CM || 91% || 45%
|-
| SNOMED CT || 98% || 94%
|}

Figure 2. Use Cases and Coding Systems for Rare Diseases Care and Research. (Diagram not reproduced; its recoverable annotations are listed here.)

* HPO and ORDO for "deep phenotyping" of undiagnosed disorders in specialty or genetics clinics.
* Link to OMIM and GO for molecular diagnosis.
* UMLS mappings support linkage of SNOMED CT-encoded data to other coding systems.
* Electronic Health Record Systems: encode with SNOMED CT for documenting diagnoses or "problems".
* MeSH for linkage to the biomedical literature and clinical practice guidelines (e.g., InfoButton, CDSS).
* Linkage to patient-directed health information (e.g., Medline Plus search with MeSH synonyms).
* Query SNOMED CT for networked research networks and observational research.

{| class="wikitable"
! # !! Data Uses !! Code Systems
|-
| 5 || Reimbursement || ICD-9-CM, ICD-10-CM
|-
| 6 || Public Health Surveillance || ICD-10
|-
| 7 || Quality Measurement || SNOMED CT
|-
| 8 || Interventional Research || SNOMED CT, MedDRA; plus new data collection using PhenX and LOINC
|}

====Discussion====
The diagram in Figure 2 includes the collection of rare disease-specific data in dedicated ontologies to support diagnosis, and the use of mappings to standardized clinical terminologies or classifications as needed for clinical documentation, data exchange, billing and public health reporting.

The current coverage of rare disease names in standard coding systems can support a number of use cases, including: the identification of rare disease patients from EHR data for research (#8 and #9 on figure), and the identification of appropriate rare diseases information, including published medical literature, clinical practice guidelines for providers (#3 on figure) and authoritative consumer-directed information for patients (#4 on figure), using coded data from EHRs.

Advances in the discovery of genetic causes and possible treatments can be supported by specific ontologies to the extent that they can be used in cooperation with EHR data coded in clinical coding systems (#1 and #2 on figure).

The differences in the coverage, intended purpose, and granularity of different coding systems can impact how EHRs can support the consistent and reliable identification of rare disease patients, to enable evidence-based care and multi-site research. The lines in the figure describe where mappings and linkages across terminologies are needed to support various use cases.

====Conclusions====
* The coverage of terms for rare diseases in clinical terminologies is highest with SNOMED CT in comparison to the ICD-9 and ICD-10 classifications.
* SNOMED CT has the greatest proportion of high-precision mappings and equivalent mappings in our sample of PCORnet rare diseases.
* The coverage of rare disease names in specialized ontologies is higher, but these are not designed for use in clinical EHR systems.
* Understanding the intended purpose of each classification and ontology and the coverage of rare disease names can facilitate an efficient national research infrastructure and learning healthcare system.
* Further, ontologies can support advances in understanding disease etiology and potential treatments.
* The UMLS is a vital tool to support the linkage across clinical coding systems and specialized ontologies that will be essential for a national EHR-based rare diseases research infrastructure.

====References====
# Orphanet. The portal for rare diseases and orphan drugs. 2014 March 12, 2014 [cited 2014 March 14]; Available from: http://www.orpha.net/consor/cgi-bin/index.php.
# NIH. Office of Rare Diseases Research (ORDR) Brochure. 2009 [cited 2010 20/08/2010]; Available from: http://rarediseases.info.nih.gov/asp/resources/ord_brochure.html.
# NORD. Rare Disease Information. 2014 [cited 2014 March 14]; Available from: http://www.rarediseases.org/rare-disease-information.
# Pasceri, E. Analyzing rare diseases terms in biomedical terminologies. (LHNCB Medical Informatics Training Program Final Report; Dr. Olivier Bodenreider, Mentor). 2010 [cited 2014 July 21]; Available from: http://mor.nlm.nih.gov/pubs/alum/2010-pasceri.pdf.
# Fung, K.W., R.L. Richesson, and O. Bodenreider, Coverage of Rare Disease Names in Standard Terminologies and Implications for Patients, Providers, and Research, in American Medical Informatics Association Annual Symposium. 2014: Washington, D.C.
# Winnenburg, R. and O. Bodenreider. Coverage of phenotypes in standard terminologies. Proceedings of the Joint Bio-Ontologies and BioLINK ISMB'2014 SIG session "Phenotype Day" 2014:41-44.

====Acknowledgements====
The authors thank PCORnet collaborators and staff for their support of strategies for using electronic health data to advance rare diseases research. This paper was supported by grants from the Patient Centered Outcomes Research Institute (PCORI) (P122013-499A) and the NIH Collaboratory (5 U54 AT007748-02). This work was partly supported by the Intramural Research Program of the National Institutes of Health and the National Library of Medicine. The views expressed do not necessarily represent the views of the NLM, NIH, or PCORI.
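
The precision percentages in Table 4 can be related to the matched-code counts quoted in the proceedings text (55, 47, and 59 matched codes, of which 7, 4, and 1 matched more than one rare disease). The following is a minimal sketch, assuming that "precision" means the share of matched codes that map to exactly one rare disease; that reading is an interpretation of the Table 4 column header, not a formula stated by the authors. The counts come from the text describing 45 diseases, whereas the poster describes 48, so agreement with Table 4 after rounding is suggestive rather than a confirmed derivation. The names `MATCHED_CODES`, `precision_percent`, and `precision_table` are illustrative only.

```python
# Counts quoted in the proceedings text, per coding system:
# (codes matched to any PCORnet rare disease, codes matched to more than one).
MATCHED_CODES = {
    "ICD-9-CM": (55, 7),
    "ICD-10-CM": (47, 4),
    "SNOMED CT": (59, 1),
}


def precision_percent(total_codes: int, shared_codes: int) -> float:
    """Percentage of matched codes that map to exactly one rare disease.

    Assumed definition (hypothetical): (total - shared) / total * 100.
    """
    if total_codes <= 0 or not 0 <= shared_codes <= total_codes:
        raise ValueError("need total_codes > 0 and 0 <= shared_codes <= total_codes")
    return 100.0 * (total_codes - shared_codes) / total_codes


def precision_table(counts: dict) -> dict:
    """Round the precision of every coding system to a whole percent."""
    return {
        system: round(precision_percent(total, shared))
        for system, (total, shared) in counts.items()
    }


if __name__ == "__main__":
    for system, pct in precision_table(MATCHED_CODES).items():
        print(f"{system}: {pct}%")
```

Under this assumed definition the rounded values are 87%, 91%, and 98%, which are the figures listed in the Precision column of Table 4. The equivalence percentages cannot be recomputed the same way, because the text reports them only as proportions and gives no underlying counts.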