=Paper= {{Paper |id=Vol-2285/ICBO_2018_paper_2 |storemode=property |title=Adapting Disease Vocabularies for Curation at the Rat Genome Database |pdfUrl=https://ceur-ws.org/Vol-2285/ICBO_2018_paper_2.pdf |volume=Vol-2285 |authors=Stan Laulederkind,G. Thomas Hayman,Shur-Jen Wang,Elizabeth Bolton,Jennifer R. Smith,Marek Tutaj,Jeff De Pons,Mary Shimoyama,Melinda Dwinell |dblpUrl=https://dblp.org/rec/conf/icbo/LaulederkindHWB18 }} ==Adapting Disease Vocabularies for Curation at the Rat Genome Database== https://ceur-ws.org/Vol-2285/ICBO_2018_paper_2.pdf
      Proceedings of the 9th International Conference on Biological Ontology (ICBO 2018), Corvallis, Oregon, USA




    Adapting Disease Vocabularies for Curation at the
                 Rat Genome Database
 Laulederkind SJ, Hayman GT, Wang SJ, Bolton ER,
    Smith JR, Tutaj M, De Pons J, Shimoyama M
         Department of Biomedical Engineering
 Medical College of Wisconsin and Marquette University
                 Milwaukee, WI, USA

                    Dwinell MR
    Genomic Sciences and Precision Medicine Center
            Medical College of Wisconsin
                 Milwaukee, WI, USA


    Abstract— The Rat Genome Database (RGD) has been
annotating genes, QTLs, and strains to disease terms for over 15
years. During that time the controlled vocabulary used for
disease curation has changed a few times. The changes were
necessary because no single vocabulary or ontology was freely
accessible and complete enough to cover all of the disease states
described in the biomedical literature.

    The first disease vocabulary used at RGD was the “C” branch
of the National Library of Medicine’s Medical Subject Headings
(MeSH). For at least a few years it was the most publicly
accessible, complete, and useful vocabulary to describe diseases
and disease processes. However, it still had many gaps in its
coverage of disease vocabulary, and an improved vocabulary was
much desired.

    By 2011 RGD had switched disease curation to the use of
MEDIC (MErged DIsease voCabulary), which was a
combination of MeSH and OMIM (Online Mendelian
Inheritance in Man) constructed by curators at the Comparative
Toxicogenomics Database (CTD). MEDIC was an improvement
over MeSH because of the added coverage of OMIM terms, but
it was not long before RGD curators saw the need for more
disease terms. So within a couple of years, RGD began to add
terms to MEDIC under the name RGD Disease Ontology
(RDO). Since RGD assigned a unique ID to every MEDIC term
imported from CTD, it was easy to add specially coded IDs to
indicate those additional terms from a separate, supplemental
file.

    Meanwhile, the Human Disease Ontology (DO) had slowly been
developing and expanding. As early as 2010, members of RGD
were contributing to the development of DO. However, five
years went by before MGD (Mouse Genome Database) and RGD
joined with DO in an organized attempt to make DO useful for
the model organism community. From that collaboration came a
large addition of OMIM-based terms, expansion of multi-
parentage of terms through axiomatic extension, and expansion
of cross-references to clinical vocabularies. Based on the promise
of those improvements, it was determined that the Alliance of
Genome Resources could use the DO as a unifying disease
vocabulary across model organism databases.

    Despite the improvements in DO, RGD still had more than
1000 custom terms and 3800 MEDIC terms with annotations to
deal with if RGD were to convert to the use of DO. Those extra
terms originated from OMIM, MeSH, and the biomedical
literature. If RGD mapped those non-DO disease terms to DO,
much granularity of meaning would be lost. For instance, the
term “osteoarthritis” in DO has no child terms, but the RGD
version of DO has 11 child terms or variations of
“osteoarthritis” (Figure 1). The extra detail in those terms would
be lost to users of DO. To avoid the loss of granularity it was
decided to extend the DO beyond the merged, axiomatized DO file.
By mapping or adding all DO terms to the RGD version
of MEDIC, RGD has achieved a broader, deeper disease
vocabulary, with more term branches throughout the
ontology and more child terms within those branches.

    Keywords— Rat Genome Database, disease, vocabularies,
online resource

Figure 1. RDO child terms of Osteoarthritis.




      ICBO 2018                                                   August 7-10, 2018                                                  1




Figure 2. Osteoarthritis with Mild Chondrodysplasia



