=Paper= {{Paper |id=Vol-1327/21 |storemode=property |title=Coverage of Rare Disease Names in Clinical Coding Systems and Ontologies and Implications for Electronic Health Records-Based Research |pdfUrl=https://ceur-ws.org/Vol-1327/icbo2014_paper_52.pdf |volume=Vol-1327 |dblpUrl=https://dblp.org/rec/conf/icbo/RichessonFB14 }} ==Coverage of Rare Disease Names in Clinical Coding Systems and Ontologies and Implications for Electronic Health Records-Based Research== https://ceur-ws.org/Vol-1327/icbo2014_paper_52.pdf
                                                     ICBO 2014 Proceedings


  Coverage of Rare Disease Names in Clinical Coding
     Systems and Ontologies and Implications for
      Electronic Health Records-Based Research
                   Rachel Richesson                                                              Kin Wah Fung
           Duke University School of Nursing                                               National Library of Medicine
                  Durham, NC USA                                                               Bethesda, MD, USA
            rachel.richesson@dm.duke.edu
                                                                                               Olivier Bodenreider
                                                                                           National Library of Medicine
                                                                                               Bethesda, MD, USA


    Abstract—This poster will present the completeness of coverage of rare
disease names in standard coding systems, including the International
Classification of Diseases (ICD) and SNOMED CT, and ontologies such as the
Orphanet Rare Diseases Ontology (RDO). Using use cases and a set of 45 rare
diseases for the national Patient Centered Outcomes Research Network
(PCORnet), the poster will describe the current capacity and implications
for electronic health records-based research on these diseases. Authors
will provide suggestions on how clinical coding systems and ontologies can
be used in a coordinated approach to support the use of electronic health
record data for various types of research related to rare diseases.

   Keywords—rare diseases; clinical classifications; ontologies;
biomedical research; electronic health records

                       I. INTRODUCTION
   Rare diseases are defined in the US as conditions that affect fewer than
200,000 Americans and in the European Union as those with a prevalence of
5 per 10,000 or less.[1,2] The NIH Office of Rare Diseases Research
recognizes 6,485 rare diseases.[3] Although each rare disease is uncommon,
collectively they constitute a significant burden to the health care
system. One estimate suggests that 1 in 10 Americans is affected by a rare
disease.[2] Consequently, ‘rare diseases’ have emerged as priority topics
in public health and research. Rare disease names are included, at
different levels of completeness and granularity, in a number of clinical
coding systems that are embedded in electronic health record (EHR)
systems, and in a number of ontologies designed to support the diagnosis
of rare diseases and investigation of their causes and treatments.[4]

With increased adoption and “meaningful use” of EHRs, there is renewed
effort in leveraging EHRs for research. In the U.S., the national Patient
Centered Outcomes Research Network (PCORnet) was funded this year from the
Affordable Care Act to examine real-world treatment decisions, and is
specifically tasked to conduct observational and interventional research
on the comparative effectiveness of various treatments, using distributed
and heterogeneous EHR systems.[5] The PCORnet research portfolio currently
includes 45 rare diseases (in addition to approximately 20 more common
conditions). The objective of this poster is to determine the coverage of
these rare disease names in standard coding systems and explore the
current capacity and implications for EHR-based research on these and
other rare diseases.

                       II. METHODS
In this poster we present an inventory of various clinical coding systems
and ontologies that are relevant to rare diseases, and summarize their
coverage of rare disease names from previous studies. We match rare
disease names and synonyms from the Office of Rare Diseases Research (ORD)
and Orphanet (RDO) to the Unified Medical Language System (UMLS)
Metathesaurus and identify maps to SNOMED CT and other terminologies. To
characterize coverage of the 45 rare diseases studied in PCORnet, we
estimate the number of precise and equivalent matches in three clinical
classifications (ICD-9-CM, ICD-10-CM, and SNOMED CT). Finally, we present
the likely use of existing classifications, ontologies, mappings, and
tools to support the research process, from the collection of data in
clinical settings to their use in various types of EHR-based research.

                       III. RESULTS
SNOMED CT has the highest coverage of rare disease names among clinical
terminologies in UMLS, and covers 44% of the 6,485 diseases (19,504 terms)
recognized by the Office of Rare Diseases (ORD), and 48% of the 6,750
diseases (15,585





terms) listed in the Orphanet Rare Disease Ontology. 25% (1,611) of ORD
and 14% (1,592) of RDO disease names have bi-directional one-to-one maps
to SNOMED CT. The rest are one-to-many or many-to-one maps. Two
terminologies have higher coverage than SNOMED CT. Medical Subject
Headings (MeSH) covers 75% and 70%, while Online Mendelian Inheritance in
Man (OMIM) covers 49% and 57%, of ORD and RDO respectively. Overall, the
UMLS covers 82% of ORD-recognized and 84% of RDO-recognized rare diseases.

All of the rare diseases studied in PCORnet were included in the UMLS and
its source terminologies. Eight diseases did not have any match to SNOMED
CT, ICD-9-CM or ICD-10-CM. The 45 rare diseases studied in PCORnet yielded
multiple matches to terms in clinical coding systems; i.e., many PCORnet
rare disease names matched to more than one (term) code in a coding
system, and many codes from clinical coding systems matched more than one
rare disease name. Of 55 ICD-9-CM codes that matched to a PCORnet rare
disease, 7 were matched to multiple rare diseases. Of 47 matched ICD-10-CM
codes, 4 matched to multiple rare diseases, and of 59 matched SNOMED CT
codes, one SNOMED CT code matched to multiple PCORnet rare diseases. The
proportions of matched codes that were considered equivalent matches
(rather than broader matches or related terms) were 25%, 45% and 94% for
ICD-9-CM, ICD-10-CM and SNOMED CT respectively.

                       IV. CONCLUSIONS
The coverage and quality (i.e., precision and equivalence) of terms for
rare diseases in clinical coding systems are less than ideal, but are
markedly better in SNOMED CT than in the ICD-9 and ICD-10 classifications.
The lack of precise and complete coverage of rare disease names in
clinical coding systems will inhibit the automated identification of
patients with rare diseases from EHR data for clinical trial recruitment
or observational research. The coverage of rare disease names in
specialized ontologies (e.g., OMIM) is higher, but these are not designed
for use in clinical EHR systems.

Given the intended purpose for each classification and ontology and the
completeness and coverage of rare disease names, we propose how these
various clinical coding systems, ontologies, and UMLS mappings can be
leveraged to support an efficient national research infrastructure and
learning healthcare system. The UMLS is a vital tool to support the
linkage across clinical coding systems and specialized ontologies that
will be essential for a national EHR-based rare diseases research
infrastructure.

Ontologies can support advances in understanding disease etiology and
potential treatments. Specialized ontologies, such as OMIM, RDO, and
others (such as the Human Phenotype Ontology) can provide the vocabulary
for detailed clinical documentation, or “deep phenotyping”, of genetic
diseases (e.g., in the NIH Undiagnosed Diseases Network), and complement
clinical terminologies and administrative classifications widely used in
EHRs. This poster will include an illustrative representation of the
collection of rare disease-specific data in dedicated ontologies to
support diagnosis, and the use of mappings to standardized clinical
terminologies or classifications as needed for clinical documentation,
data exchange, billing and public health reporting.

                       ACKNOWLEDGMENT
    This work was partly supported by the Intramural Research Program of
the National Institutes of Health and the National Library of Medicine.
This work was also supported in part by PCORnet, funded by the Patient
Centered Outcomes Research Institute (PCORI).

                       REFERENCES
[1]   Orphanet. Orphanet Rare Disease Ontology (ORDO). 2014 [cited 2014
      March 14]; Available from:
      http://www.orphadata.org/cgi-bin/inc/ordo_orphanet.inc.php.
[2]   NORD. Rare Disease Information. 2014 [cited 2014 March 14];
      Available from: http://www.rarediseases.org/rare-disease-information.
[3]   NIH. Office of Rare Diseases Research (ORDR) Brochure. 2009 [cited
      2010 August 20]; Available from:
      http://rarediseases.info.nih.gov/asp/resources/ord_brochure.html.
[4]   Fung, K.W., R.L. Richesson, and O. Bodenreider, Coverage of Rare
      Disease Names in Standard Terminologies and Implications for Patients,
      Providers, and Research, in Paper accepted for presentation: American
      Medical Informatics Association Annual Symposium 2014: Washington,
      D.C.
[5]   PCORI. PCORnet: The National Patient-Centered Clinical Research
      Network. 2014 [cited 2014 March 13]; Available from:
      http://www.pcori.org/funding-opportunities/pcornet-national-patient-centered-clinical-research-network/.
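As an illustration of the mapping categories reported in Section III
(bi-directional one-to-one maps versus one-to-many or many-to-one maps,
and codes matched to more than one rare disease), the following Python
sketch labels each name-to-code pair by its cardinality. It is a minimal
sketch under stated assumptions: the function names and the sample pairs
are hypothetical placeholders, and are not drawn from the study data or
from any UMLS tooling.

```python
from collections import defaultdict


def classify_mappings(pairs):
    """Label each (disease_name, code) pair by mapping cardinality."""
    codes_of = defaultdict(set)   # disease name -> codes it maps to
    names_of = defaultdict(set)   # code -> disease names mapped to it
    for name, code in pairs:
        codes_of[name].add(code)
        names_of[code].add(name)

    labels = {}
    for name, code in pairs:
        many_codes = len(codes_of[name]) > 1
        many_names = len(names_of[code]) > 1
        if many_codes and many_names:
            labels[(name, code)] = "many-to-many"
        elif many_codes:
            labels[(name, code)] = "one-to-many"
        elif many_names:
            labels[(name, code)] = "many-to-one"
        else:
            labels[(name, code)] = "one-to-one"
    return labels


def shared_codes(pairs):
    """Return the codes matched by more than one disease name."""
    names_of = defaultdict(set)
    for name, code in pairs:
        names_of[code].add(name)
    return {code for code, names in names_of.items() if len(names) > 1}


# Hypothetical placeholder pairs (not real disease names or codes).
EXAMPLE_PAIRS = [
    ("disease A", "C1"),
    ("disease B", "C2"),
    ("disease B", "C3"),
    ("disease C", "C4"),
    ("disease D", "C4"),
]

if __name__ == "__main__":
    for pair, label in classify_mappings(EXAMPLE_PAIRS).items():
        print(pair, label)
    print("codes shared by several diseases:", shared_codes(EXAMPLE_PAIRS))
```

In this sketch a pair counts as one-to-one only when the name has a single
code and that code has a single name, which is one reasonable reading of a
"bi-directional one-to-one" map; the study's own criteria may differ.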


Coverage of Rare Disease Names in Clinical Coding Systems and Ontologies
and Implications for Electronic Health Records-Based Research
Rachel Richesson (1), Kin Wah Fung (2), Olivier Bodenreider (2)
(1) Duke University School of Nursing, Durham, NC, USA; (2) National Library of Medicine, Bethesda, MD, USA

ABSTRACT

This poster highlights clinical coding systems and ontologies relevant to
rare diseases, including the International Classification of Diseases
(ICD) and SNOMED CT, and ontologies such as the Human Phenotype Ontology
(HPO) and the Orphanet Rare Diseases Ontology (ORDO). Using use cases and
a set of rare diseases for the national Patient Centered Outcomes Research
Network (PCORnet), the poster will describe the current capacity and
implications for EHR-based research on these diseases. Authors will
provide suggestions on where mappings across classifications and
ontologies are needed to support the use of EHR data for various types of
research related to rare diseases.

BACKGROUND

Rare diseases are defined in the US as conditions that affect fewer than
200,000 Americans and in the European Union as those with a prevalence of
5 per 10,000 or less. There is no globally authoritative list of rare
diseases, but there are several online disease catalogues developed by
reliable sources.[1-3] Although each rare disease is uncommon,
collectively they are more common, and consequently ‘rare diseases’ have
emerged as priority topics in public health and research.

Table 1. Sources of Rare Disease Names

Source                                                         # of rare diseases
NCATS, Office of Rare Diseases Research (United States) [1]    6,485
Orphanet Rare Diseases Ontology [3]                            6,750

Rare disease names are included, at different levels of completeness and
granularity, in a number of clinical coding systems that are embedded in
electronic health record (EHR) systems, and in a number of ontologies
designed to support the diagnosis of rare diseases and investigation of
genetic causes and treatments.

As shown in Table 2, a range of coverage for rare disease names across
coding systems has been reported using a variety of methods. In 2010, the
NLM mapped 8,435 rare disease names (collected from ORDR, Orphanet, and
the National Organization for Rare Disorders, a patient advocacy and
voluntary health organization in the US) to the UMLS, and found different
levels of coverage for Medical Subject Headings (MeSH) (5,663; 67%),
Online Mendelian Inheritance in Man (OMIM) (3,802; 45%), SNOMED CT
(4,192; 50%), and ICD-10 (1,029; 12%).[4] More recently, we used the UMLS
and the published maps from SNOMED CT to ICD-9-CM (developed by IHTSDO)
and ICD-10-CM (developed by NLM).[5]

With increased adoption and “meaningful use” of EHRs, there is renewed
effort in leveraging EHRs for research. The national Patient Centered
Outcomes Research Network (PCORnet) was funded from the Affordable Care
Act to examine real-world treatment decisions, and is specifically tasked
to conduct observational and interventional research on the comparative
effectiveness of various treatments, using distributed and heterogeneous
EHR systems. The PCORnet research portfolio currently includes 48 rare
diseases and conditions.

Figure 1. The Patient Centered Outcomes Research Network (PCORnet) of Networks

RESULTS

• SNOMED CT has the highest coverage among clinical coding systems, and
  covers 44% of the 6,485 diseases recognized by the Office of Rare
  Diseases, and 28% of the 6,750 diseases that are listed in the Orphanet
  Rare Disease Ontology (ORDO).

Table 3. Coverage of Rare Diseases from 2 Sources by Coding Systems

Coding System   % coverage of 6,485 diseases   % coverage of 6,750 diseases
                from US NCATS/ORDR             from Orphanet ORDO
UMLS            82%                            62%
MeSH            75%                            52%
OMIM            52%                            41%
Clinical coding systems:
SNOMED CT       44%                            36%
ICD-9-CM        11%                            7%
ICD-10-CM       21%                            16%

• Overall, the UMLS covers 82% of ORD and 62% of ORDO-recognized rare
  diseases.
• Two terminologies have higher coverage than SNOMED CT: Medical Subject
  Headings (MeSH) covers 75% and 52%, while Online Mendelian Inheritance
  in Man (OMIM) covers 52% and 41%, of ORD and ORDO respectively.
• SNOMED CT covers 44% of ORD and 36% of ORDO-recognized rare diseases.
• 25% (1,611) of ORD and 14% (1,592) of ORDO disease names have
  bi-directional one-to-one maps to SNOMED CT.
• The rest are one-to-many or many-to-one maps.

Figure 2. Use Cases and Coding Systems for Rare Diseases Care and Research

[Figure: diagram of numbered use cases around Electronic Health Record
Systems, with the following labels.]

HPO and ORDO for “deep phenotyping” of undiagnosed disorders in specialty
or genetics clinics.
Link to OMIM and GO and for molecular diagnosis.
Encode with SNOMED CT for documenting diagnoses or “problems”
UMLS-Mappings support linkage of SNOMED CT-encoded data to other coding
systems…
Query SNOMED CT for networked research networks and observational research.

     Data Uses                      Code Systems
5    Reimbursement                  ICD-9-CM, ICD-10-CM
6    Public Health Surveillance     ICD-10
7    Quality Measurement            SNOMED CT
8    Interventional Research        SNOMED CT, MedDRA; plus new data
                                    collection using PhenX and LOINC

MeSH for linkage to the biomedical

Table 2. Terminologies, Coding Systems, and Ontologies with Coverage of Rare Diseases
                                                                                                                                                                                                                                                                                                                                                                               Examples
                                                                                                                                                    14                                                                                       3                                                                                                             1
                                                                                                                                               2
                                                                                                                                                                                                                                                                                                                                                           13                                                                                                                                                       literature and clinical practice
Terminology or                                                                                                                                                                                                                               14               2    14
                                                                                                                                                                                                                                                                                                                                                                15                                                                                                                                                  guidelines (e.g., InfoButton, CDSS).       Linkage to patient-directed health
                                                                                                                                                                           13                                       12                                                                                                                   15            2
                                            Sponsor                          Intended Purpose                Estimated Coverage                                       1                                                                                                                                                              2
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               information (e.g., Medline Plus
Coding System                                                                                                                                                                                     12                                                                                 14                                                            1
                                                                                                                                                                                                                                                                                                                                                           14        13
                                                                                                                                                                                         1
                                                                                                                                                                                                                                                                                                                                                                                Many rare diseases map to the following codes:                                                                                                                                 search with MeSH synonyms).
International Classification                                                                                                                                                                                                                                                                                                       14         3                                                                                                                           These are not
of Diseases version 10         World Health Organization             Disease Surveillance; Mortality                  12% [4]                                                                                                                 1   14                                                                         2
                                                                                                                                                                                                                                                                                                                                                                                759.89 Other specified congenital anomalies (ICD-9-CM)
(ICD-10)                                                                                                                                                         13
                                                                                                                                                                                                                2        13
                                                                                                                                                                                                                                                                                                           15
                                                                                                                                                                                                                                                                                                                                              13
                                                                                                                                                                                                                                                                                                                                                                                                                                                                         “high  precision”  
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  DISCUSSION                                                                                                             CONCLUSIONS
                                                                                                                                                         1                                                                                                              2        1            4                                   2 13   2

                                                                                                                                                                                                                                                                                                                    12
                                                                                                                                                                                                                                                                                                                                             13
                                                                                                                                                                                                                                                                                                                                                            District of
                                                                                                                                                                                                                                                                                                                                                                                Q82.8 Other specified congenital malformations of skin (ICD-10-CM)                          mappings
International Classification                                                                                   Using UMLS-based                                                 13                                                                                  14       13
                               World Health Organization; national                                                                             2
                                                                                                                                                                                                           14                                                                                                                                               Columbia
of Diseases Clinical                                                                                              method [5]:                                                                     1
                                                                                                                                                                                                                                                      2                                                         2
                                                                                                                                                                                                                                                                                                                                                                                Pseudoneonatal adrenoleukodystrophy maps to:
                               government/public health sponsors     Medical Billing                                                                                                                                               13                                                                                        2      13
Modifications (ICD-CM,                                                                                          13% for ICD-9-CM               15                                                                        2

versions 9 and 10)
                               by country
                                                                                                               26% for ICD-10-CM                                                                                                                      15
                                                                                                                                                                                                                                                                                         2        13                                                             2        12
                                                                                                                                                                                                                                                                                                                                                                                (238069004) Acyl-CoA oxidase deficiency (disorder) in                                     This  is  a  “high      The diagram in Figure 2 includes the collection of rare disease-specific data in dedicated ontologies                   • The coverage of terms for rare diseases in clinical terminologies is highest with
                                                                                                                                                                                                                                                                                                                                   14
                                                                                                                                                                                                                                                                                                                                                                                SNOMED CT and is the only rare disease that does.                                         precision”  map         to support diagnosis, the use of mappings to standardized clinical terminologies or classifications as                    SNOMED CT in comparison to ICD 9 and 10 classifications.
                                                                                                                                                                                                                                                                                                                         3
Systematized                                                         Coding the clinical content of                                                                                                                                                                                      13
                               International Standards                                                                                                                                                                                                                       1
Nomenclature of Medicine
                               Development Organization
                                                                     electronic health records to support
                                                                                                                  50 - 53% [4, 5]
                                                                                                                                                                                                                              1         14                                                                                                                                                                                                                                                        needed for clinical documentation, data exchange, billing and public health reporting.
                                                                                                                                                                           12                                                                                 14
                                                                                                                                                                                                                                                                                                                                                                                Joubert Syndrome maps to:
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • SNOMED CT has the greatest proportion of high-precision mappings and
                                                                                                                                                                                                                                                  2                                                                  14
Clinical Terms                                                       patient care and other secondary                                                                                        1        12
                               http://www.ihtsdo.org/                                                                                                                                                                                                                                                                                                                                                                                                                     Not semantically
(SNOMED-CT)                                                          data uses.                                                                                                                                                                                                                        3
                                                                                                                                                                                                                                                                                                                                                                                742.4 Other specified congenital anomalies of brain (ICD-9-CM)                                                    The current coverage of rare disease names in standard coding systems can support a number of
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            equivalent mappings in our sample of PCORnet rare diseases.
                                                                                                                                                                                                                                                                                     1
                                                                                                                                                                                                                                                                        2                                                                                                                                                                                                  “equivalent”
Medical Dictionary for
                               International Federation of
                                                                     Adverse event reporting; regulatory                                                                                                                                                                         13                        15
                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Not semantically        use cases, including: the identification of rare disease patients from EHR data for research (#8 and
                               Pharmaceutical Manufacturers and
                                                                                                                                                                                                                                                                        12
                                                                                                                                                                                                                                                                                                                                                                                742.9 Unspecified congenital anomaly of brain, spinal cord, and
Regulatory Activities
                               Associations (IFPMA ); maintained
                                                                     submissions for new drugs and                   Unknown                                                                                        3         15
                                                                                                                                                                                                                                                          1
                                                                                                                                                                                                                                                                                                                                                                                                                                                                           ”equivalent”           #9 on figure), and the identification of appropriate rare diseases information, including published                     • The coverage of rare disease names in specialized ontologies is higher, but these
                                                                     devices.                                                                                                                                                                                                                                                                                                   nervous system (ICD-9-CM)                                                                                         medical literature, clinical practice guidelines for providers (#3 on figure) and authoritative
(MedDRA)
                               and supported by MSSO
                                                                                                                                                                                                                                                       13
                                                                                                                                                                                                                                                                                                                         1
                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Not semantically                                                                                                                                  are not designed for use in clinical EHR systems.
Medical Subject Headings                                             To index article topics for the                                                                                                                                                                                                                                                                            Q04.3 Other reduction deformities of brain (ICD-10-CM)                                     ”equivalent”           consumer-directed information for patients (#4 on figure), using coded data from EHRs.
(MeSH)
                               U.S. National Library of Medicine
                                                                     published medical literature.
                                                                                                                      67% [4]                                                                                                                                                                                                13
                                                                                                                                                                                                                                                                                                                                                                                                                                                                           Semantically                                                                                                                                   • Understanding the intended purpose of each classification and ontology and
                                                                                                                                                             2
                                                                                                                                                                      12                                                                                                                                                                                                        253175003 Familial aplasia of the vermis (disorder) (SNOMED CT)                            ”equivalent”           Advances in the discovery of genetic causes and possible treatments can be supported by specific
Online Mendelian Inheritance in Man (OMIM) | Distributed by the U.S. National Center for Biotechnology Information; authored and edited at the McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine | A catalog of human genes and genetic disorders and traits, with focus on the molecular relationship between genetic variation and phenotypic expression; considered a “phenotypic companion” to the Human Genome Project. | 45% [4]

Human Phenotype Ontology (HPO) | Various translational and genetics research collaborators; http://www.human-phenotype-ontology.org/ | “Deep phenotyping” for EHRs in genetics and specialty clinics; supports interoperability between current major genetics databases. | Unknown; presumably complete; 54% of HPO terms are in the UMLS [6]

Orphanet Rare Disease Ontology (ORDO) | Orphanet | A research resource for computational analysis and data mining/knowledge discovery for rare diseases. Supports editorial procedures of Orphanet knowledge bases and services. | 100%

PhenX | U.S. National Human Genome Research Institute | Standard questions for clinical phenotyping data to support de novo data collection in GWAS studies. | N/A (PhenX is for risk factors and environmental exposures)

Unified Medical Language System (UMLS) | U.S. National Library of Medicine | The UMLS integrates and distributes key terminology, classification and coding standards, and associated resources to promote creation of more effective and interoperable biomedical information systems and services, including electronic health records. | Contains 8,435 rare disease names

METHODS

• We match rare disease names and synonyms from the Office of Rare Diseases Research and the Orphanet Rare Disease Ontology (ORDO) to the Unified Medical Language System (UMLS) Metathesaurus and identify maps to SNOMED CT and other terminologies.
• We estimate the number of precise and equivalent matches in the three clinical terminologies (ICD-9-CM, ICD-10-CM, and SNOMED CT) for a set of 48 rare diseases studied in PCORnet.
• We assess the precision of mapping by looking at the number of rare disease names that map to distinct codes in each terminology or coding system, and the equivalence by characterizing the semantic nature of the maps to determine whether the mapped term was broader, narrower, or equivalent to the PCORnet rare disease name.

[Figure: legend entries PPRN and CDRN; label “Puerto Rico”]

• As shown in Table 4, of the 48 rare diseases studied in PCORnet, the proportion of rare diseases with one and only one match to a coding system term (considered “high precision”) was 87% for ICD-9-CM, 91% for ICD-10-CM, and 98% for SNOMED CT.
• Authors (RR, KWF) assessed the semantic nature of the maps to determine whether the mapped term was broader, narrower, or equivalent to the PCORnet rare disease name. The proportions of equivalent matches were 25%, 45% and 94% for ICD-9-CM, ICD-10-CM and SNOMED CT, respectively.

Table 4. Precision of Coverage of PCORnet Rare Diseases in Different Clinical Coding Systems

Coding System | % of PCORnet rare disease codes that did not map to other rare diseases (1-1 map) (Precision) | % of PCORnet rare disease codes considered an equivalent map (Equivalence)
ICD-9-CM      | 87% | 25%
ICD-10-CM     | 91% | 45%
SNOMED CT     | 98% | 94%

… ontologies to the extent that they can be used in cooperation with EHR data coded in clinical coding systems (#1 and #2 on figure).

The differences in the coverage, intended purpose, and granularity of different coding systems can impact how EHRs can support the consistent and reliable identification of rare disease patients, to enable evidence-based care and multi-site research. The lines in the figure describe where mappings and linkages across terminologies are needed to support various use cases.

… the coverage of rare disease names can facilitate an efficient national research infrastructure and learning healthcare system.
• Further, ontologies can support advances in understanding disease etiology and potential treatments.
• The UMLS is a vital tool to support the linkage across clinical coding systems and specialized ontologies that will be essential for a national EHR-based rare diseases research infrastructure.

REFERENCES

1. Orphanet. The portal for rare diseases and orphan drugs. March 12, 2014 [cited 2014 March 14]; Available from: http://www.orpha.net/consor/cgi-bin/index.php.
2. NIH. Office of Rare Diseases Research (ORDR) Brochure. 2009 [cited 2010 August 20]; Available from: http://rarediseases.info.nih.gov/asp/resources/ord_brochure.html.
3. NORD. Rare Disease Information. 2014 [cited 2014 March 14]; Available from: http://www.rarediseases.org/rare-disease-information.
4. Pasceri, E. Analyzing rare diseases terms in biomedical terminologies. (LHNCB Medical Informatics Training Program Final Report; Dr. Olivier Bodenreider, Mentor). 2010 [cited 2014 July 21]; Available from: http://mor.nlm.nih.gov/pubs/alum/2010-pasceri.pdf.
5. Fung, K.W., R.L. Richesson, and O. Bodenreider. Coverage of Rare Disease Names in Standard Terminologies and Implications for Patients, Providers, and Research, in American Medical Informatics Association Annual Symposium. 2014: Washington, D.C.
6. Winnenburg, R. and O. Bodenreider. Coverage of phenotypes in standard terminologies. Proceedings of the Joint Bio-Ontologies and BioLINK ISMB'2014 SIG session “Phenotype Day” 2014:41-44.

ACKNOWLEDGEMENTS

The authors thank PCORnet collaborators and staff for their support of strategies for using electronic health data to advance rare diseases research. This paper was supported by grants from the Patient Centered Outcomes Research Institute (PCORI) (P122013-499A) and the NIH Collaboratory (5 U54 AT007748-02). This work was partly supported by the Intramural Research Program of the National Institutes of Health and the National Library of Medicine. The views expressed do not necessarily represent the views of the NLM, NIH, or PCORI.
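The precision and equivalence measures described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the function name `coverage_metrics`, and the toy disease names and codes are all hypothetical. It also assumes one possible reading of the two measures, namely that a disease is "precise" when it maps to exactly one distinct code in a coding system and "equivalent" when at least one of its maps was judged equivalent, with the full disease set as the denominator.

```python
from collections import defaultdict

# Hypothetical toy maps: (rare disease name, coding system, target code, relation).
# The names, codes, and relations are placeholders, not data from the study.
MAPS = [
    ("disease A", "SNOMED CT", "S1", "equivalent"),
    ("disease B", "SNOMED CT", "S2", "equivalent"),
    ("disease A", "ICD-10-CM", "X1", "equivalent"),
    ("disease B", "ICD-10-CM", "X2", "broader"),
    ("disease B", "ICD-10-CM", "X3", "broader"),
]
DISEASES = ["disease A", "disease B"]


def coverage_metrics(maps, diseases):
    """Return {coding system: (precision %, equivalence %)}.

    Assumed definitions (one reading of the poster text):
      precision   = share of diseases with exactly one distinct target code
      equivalence = share of diseases with at least one map judged equivalent
    """
    targets = defaultdict(lambda: defaultdict(set))  # system -> disease -> codes
    equivalent = defaultdict(set)                    # system -> diseases
    for disease, system, code, relation in maps:
        targets[system][disease].add(code)
        if relation == "equivalent":
            equivalent[system].add(disease)

    disease_set = set(diseases)
    total = len(disease_set)
    metrics = {}
    for system, per_disease in targets.items():
        one_to_one = sum(
            1 for d in disease_set if len(per_disease.get(d, ())) == 1
        )
        n_equivalent = len(equivalent[system] & disease_set)
        metrics[system] = (100 * one_to_one / total, 100 * n_equivalent / total)
    return metrics


metrics = coverage_metrics(MAPS, DISEASES)
for system, (precision, equivalence) in sorted(metrics.items()):
    print(f"{system}: precision {precision:.0f}%, equivalence {equivalence:.0f}%")
```

In the study itself the broader, narrower, or equivalent judgment was made manually by two authors, so the `relation` field here stands in for that human review rather than an automated step.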