=Paper=
{{Paper
|id=Vol-1327/24
|storemode=property
|title=Ontology-based Normalization for Disease-Lab test Relation Extraction
|pdfUrl=https://ceur-ws.org/Vol-1327/icbo2014_paper_56.pdf
|volume=Vol-1327
|dblpUrl=https://dblp.org/rec/conf/icbo/ZhangWTX14
}}
==Ontology-based Normalization for Disease-Lab test Relation Extraction==
ICBO 2014 Proceedings

Ontology-based Normalization for Disease-Lab test Relation Extraction

Yaoyun Zhang, Jingqi Wang, Cui Tao, Hua Xu
School of Biomedical Informatics, University of Texas at Houston, Houston, USA
{Yaoyun.Zhang, Jingqi.Wang, Cui.Tao, Hua.Xu}@uth.tmc.edu

Abstract: This poster describes our preliminary work on ontology-based normalization for diseases and lab tests, as a fundamental step toward disease-lab test relation extraction. Multiple ontologies are leveraged for this aim. Specifically, diseases and lab tests are first extracted and mapped to the Concept Unique Identifier (CUI) of the Unified Medical Language System (UMLS) by MetaMap. Codes of the International Classification of Diseases, Version 9, Clinical Modification (ICD-9CM) are then employed to further normalize diseases, while the Logical Observation Identifiers Names and Codes (LOINC) are used to normalize lab tests.

Keywords: ontology-based normalization; disease normalization; lab test normalization; relation extraction

I. INTRODUCTION

Disease-lab test relation extraction plays an important role in various medical applications such as clinical decision-support systems and phenotype information extraction. However, mentions of diseases and lab tests in text contain diverse non-standard variations. Those variants need to be normalized into standard codes first to facilitate more universal computational applications. This poster describes our preliminary work on ontology-based normalization for diseases and lab tests, as a fundamental step toward disease-lab test relation extraction. Three existing standard ontologies, UMLS[1], ICD-9CM[3] and LOINC[4], are leveraged for this aim.

II. ONTOLOGY OVERVIEW

A. UMLS

UMLS[1] is a thesaurus re-organizing many controlled vocabularies in the biomedical sciences. It provides a mapping structure among various vocabularies and serves as a comprehensive ontology of biomedical concepts. For each concept in UMLS, a synonym list consisting of terms from multiple vocabularies is collected. For example, “Diabetes Mellitus” and “dm” are synonyms of the same CUI C0011849. Various ontological relations between CUIs are defined, such as “isa”, “broader”, and “sibling”.

B. ICD-9CM

The International Statistical Classification of Diseases and Related Health Problems (ICD) is the international "standard diagnostic tool for epidemiology, health management and clinical purposes"[3]. The ICD provides a hierarchical system of diagnostic codes for classifying diseases. Major categories are designed to include a set of similar diseases as sub-categories. For example, “(050.0) Variola major” is a subcategory of “(050) Smallpox”. Health conditions can be mapped to corresponding generic categories or to more specific sub-categories.

C. LOINC

The Logical Observation Identifier Names and Codes (LOINC) is the only publicly available universal standard for laboratory test codes and names[4, 5]. The current version of the LOINC code set (released in June 2014) contains 73,889 terms for lab tests, measurements and clinical observations. Lab tests are organized hierarchically into 14 top classes including “Microbiology” and “Blood Bank”.

III. NORMALIZATION METHOD

The original mention of a disease or lab test is first recognized and mapped to UMLS concepts by MetaMap[2]. If ICD-9CM/LOINC is among the sources of terms for the mapped UMLS concept, then the mention can be normalized directly to the corresponding code and name from ICD-9CM/LOINC. If not, the fallback depends on the mention type. For a disease, if SNOMED CT is one source of terms for the UMLS concept, the corresponding SNOMED CT concept can be mapped to an ICD-9CM code by the rule-based mapping provided by NIH[6]. For a lab test, the RELMA software is employed to map the lab test to a LOINC code and name[7]. Fig. 1 and Fig. 2 illustrate the workflow of disease normalization and lab test normalization, respectively.

Fig. 1. Diagram of Disease Normalization

Fig. 2. Diagram of Lab test Normalization

IV. NORMALIZATION RESULTS

Multiple variations of diseases and lab tests are normalized into standard codes following the workflow in Fig. 1 and Fig. 2. Fig. 3 and Fig. 4 show examples of disease normalization and lab test normalization, respectively. The original mentions are mapped to UMLS concepts first, and then to ICD-9CM and LOINC codes.

Fig. 3. Example of Disease Normalization

Fig. 4. Example of Lab test Normalization

REFERENCES

[1] Bodenreider O. The unified medical language system (UMLS): integrating biomedical terminology. Nucleic acids research 2004;32(suppl 1):D267-D70.
[2] Aronson AR, Lang F-M. An overview of MetaMap: historical perspective and recent advances. Journal of the American Medical Informatics Association 2010;17(3):229-36.
[3] World Health Organization. International classification of impairments, disabilities, and handicaps: a manual of classification relating to the consequences of disease, published in accordance with resolution WHA29.35 of the Twenty-ninth World Health Assembly, May 1976. 1980.
[4] McDonald CJ, Huff SM, Suico JG, et al. LOINC, a universal standard for identifying laboratory observations: a 5-year update. Clinical chemistry 2003;49(4):624-33.
[5] Khan AN, Griffith SP, Moore C, Russell D, Rosario AC, Bertolli J. Standardizing laboratory data by mapping to LOINC. Journal of the American Medical Informatics Association 2006;13(3):353-55.
[6] http://www.nlm.nih.gov/research/umls/mapping_projects/snomedct_to_icd9cm_reimburse.html.
[7] http://loinc.org/downloads/relma.

Ontology-based Normalization for Disease-Lab Test Relation Extraction

Yaoyun Zhang, PhD, Jingqi Wang, MS, Cui Tao, PhD, Hua Xu, PhD
The School of Biomedical Informatics | The University of Texas Health Science Center at Houston

Introduction

Disease-lab test relation extraction plays an important role in various medical applications such as clinical decision-support systems and phenotype information extraction. However, mentions of diseases and lab tests in text have diverse non-standard variations. Those variants need to be normalized into standard codes first to facilitate more universal computational applications. This poster describes our preliminary work on ontology-based normalization of diseases and lab tests, as a fundamental step toward disease-lab test relation extraction.

Ontology Overview

• UMLS [1]: Re-organizes many controlled vocabularies in the biomedical sciences; a comprehensive ontology of biomedical concepts. E.g., “Diabetes Mellitus” and “dm” are synonyms of the same concept.
• ICD-9CM [2]: A hierarchical system of diagnostic codes for classifying diseases. E.g., “(050.0) Variola major” is a subcategory of “(050) Smallpox”.
• LOINC [3]: The only publicly available universal standard for laboratory test codes and names. E.g., the test of “blood culture” is under the general category “Microbiology”.

Method

Figure 1. Diagrams of Disease and Lab Test Normalization

Results

Figure 2. Examples of Disease and Lab Test Normalization

• Precision: 1) General concepts of diseases and lab tests are of limited practical value. E.g., in relations between “Heart Diseases” and lab tests, “Heart Diseases” includes “coronary artery disease”, “arrhythmias” and “congenital heart defects”. 2) RELMA fails to normalize some lab tests to LOINC. E.g., “acanthocyte count” -> 565-2:COLONY [COUNT]:NUM:PT:XXX:ORD:VC.
• Coverage: Variants of diseases and lab tests are not always recognized. E.g., “blood film” refers to “blood smear”.

Conclusion

This poster presents the preliminary results of our ontology-based normalization of diseases and lab tests. In the next stage, machine learning methods will be employed for disease and lab test recognition. General concepts of diseases and lab tests need to be filtered. The precision of lab test normalization to LOINC also needs to be further improved.

References

1. Bodenreider O. The unified medical language system (UMLS): integrating biomedical terminology. Nucleic acids research 2004;32(suppl 1):D267-D70.
2. National Center for Health Statistics. ICD-9-CM: International Classification of Diseases 9th Revision Clinical Modification. US Department of Health and Human Services, Public Health Service, Health Care Financing Administration 2008.
3. McDonald CJ, Huff SM, Suico JG, et al. LOINC, a universal standard for identifying laboratory observations: a 5-year update. Clinical chemistry 2003;49(4):624-33.

Please contact the corresponding author via email: Hua.Xu@uth.tmc.edu
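
The normalization workflow described in Section III (use the ICD-9CM or LOINC code directly when that vocabulary is a source of the mapped UMLS concept, otherwise fall back to the SNOMED CT mapping for diseases or to RELMA for lab tests) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The lookup tables stand in for MetaMap, the NIH SNOMED CT to ICD-9CM map and RELMA, none of which is called here. All names (`UmlsConcept`, `normalize_disease`, `normalize_lab_test`, the `DEMO_*` tables) and every code prefixed `DEMO-` are hypothetical placeholders. Only the CUI C0011849 comes from the text above. Sending mentions that were not mapped to any concept through the RELMA stand-in is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class UmlsConcept:
    """A UMLS concept and the code it carries in each source vocabulary."""
    cui: str
    name: str
    # Keys such as "ICD9CM", "SNOMEDCT" and "LOINC" are labels chosen for this sketch.
    source_codes: Mapping[str, str] = field(default_factory=dict)


def normalize_disease(mention: str,
                      concepts: Mapping[str, UmlsConcept],
                      snomed_to_icd9: Mapping[str, str]) -> Optional[str]:
    """Return an ICD-9CM code for a disease mention, or None if none is found."""
    concept = concepts.get(mention.strip().lower())  # stand-in for MetaMap
    if concept is None:
        return None  # mention was not mapped to a UMLS concept
    if "ICD9CM" in concept.source_codes:
        return concept.source_codes["ICD9CM"]  # direct normalization
    snomed_code = concept.source_codes.get("SNOMEDCT")
    if snomed_code is None:
        return None  # no SNOMED CT source, so the fallback does not apply
    return snomed_to_icd9.get(snomed_code)  # stand-in for the rule-based map


def normalize_lab_test(mention: str,
                       concepts: Mapping[str, UmlsConcept],
                       relma_lookup: Mapping[str, str]) -> Optional[str]:
    """Return a LOINC code for a lab test mention, or None if none is found."""
    key = mention.strip().lower()
    concept = concepts.get(key)  # stand-in for MetaMap
    if concept is not None and "LOINC" in concept.source_codes:
        return concept.source_codes["LOINC"]  # direct normalization
    # Assumption: any mention without a direct LOINC code goes to the RELMA stand-in.
    return relma_lookup.get(key)


# Toy data; every DEMO-* value is a placeholder, not a real code.
_DIABETES = UmlsConcept("C0011849", "Diabetes Mellitus", {"ICD9CM": "DEMO-ICD9-A"})
DEMO_CONCEPTS = {
    "diabetes mellitus": _DIABETES,
    "dm": _DIABETES,  # synonyms share one concept
    "demo disease": UmlsConcept("DEMO-CUI-1", "Demo disease", {"SNOMEDCT": "DEMO-SCT-1"}),
    "blood culture": UmlsConcept("DEMO-CUI-2", "Blood culture", {"LOINC": "DEMO-LOINC-A"}),
    "demo lab test": UmlsConcept("DEMO-CUI-3", "Demo lab test"),
}
DEMO_SNOMED_TO_ICD9 = {"DEMO-SCT-1": "DEMO-ICD9-B"}
DEMO_RELMA = {"demo lab test": "DEMO-LOINC-B"}

if __name__ == "__main__":
    for text in ("DM", "demo disease", "unknown disease"):
        print(text, "->", normalize_disease(text, DEMO_CONCEPTS, DEMO_SNOMED_TO_ICD9))
    for text in ("Blood culture", "demo lab test"):
        print(text, "->", normalize_lab_test(text, DEMO_CONCEPTS, DEMO_RELMA))
```

Keeping the direct lookup and the fallback as separate branches mirrors the two paths in the workflow diagrams, and returning `None` makes unnormalized mentions explicit, which is the coverage issue noted in the poster results.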