Representing and sharing knowledge using SNOMED
Proceedings of the 3rd international conference on Knowledge Representation in Medicine (KR-MED 2008)
R. Cornet, K.A. Spackman (Eds)

Leveraging SNOMED CT with a General Purpose Terminology Server

R. Weida, PhD, J. Bowie, ScD, R. McClure, MD, D. Sperzel, MD
Apelon, Ridgefield, CT, USA
weida@apelon.com

General purpose terminology server software facilitates coordinated use of multiple standard medical terminologies for diverse healthcare applications. SNOMED CT is an important clinical reference terminology, whose size and scope make advanced terminology server capabilities particularly useful. Moreover, capabilities tied to SNOMED CT’s special features and requirements can result in substantial further benefits. Enhancements to a general purpose terminology server have been developed to facilitate the tailored creation, validation, organization, deployment, distribution, submission and maintenance of (post-coordinated) extensions to SNOMED CT.

INTRODUCTION

Standard medical terminologies are vital to all sorts of contemporary healthcare information technology endeavors, ranging from encoding and exchanging information in electronic health record (EHR) systems to facilitating outcomes analysis and decision support. However, effective integration of terminologies into clinical applications poses substantial challenges. These applications generally require multiple terminologies since each terminology has been designed for different purposes by different healthcare constituencies, e.g., SNOMED CT for representation of clinical data; ICD-9-CM, ICD-10-CM and CPT-4 for reimbursement; LOINC for laboratory test results; and HL7 for application interfaces. Drug nomenclatures such as RxNorm and NDF-RT, device taxonomies such as UMDNS, specialty ontologies, and others are also important, as are enterprise-specific terminology enhancements. Terminologies employ different data models and they are delivered in different data formats. Finally, terminologies are constantly evolving, so they must be regularly updated in clinical and other applications. However, revision schedules and processes vary widely and are often inconsistent.

Such challenges can be effectively met with a comprehensive, general purpose terminology server, defined as a networked software component that centralizes and integrates terminology content and reasoning to provide (complete, consistent, effective) terminology services for users and other network applications. Earlier terminology servers [1,2,3,4] did not provide the modular classification, subset, template or SNOMED-specific features described here.

Terminology servers support diverse applications. For example, they are used by informaticists to create, maintain, localize and map terminologies; by clinical applications and their users to select and record standardized data; and by software integration engines to map data elements between applications. SNOMED CT is of special interest due to its broad clinical scope, extensive detail, formal structure, and international standing. [5,6] This paper describes some ways that one general purpose terminology server has been enhanced and applied to support SNOMED CT within the context of a full complement of other healthcare terminologies.

DISTRIBUTED TERMINOLOGY SYSTEM

Apelon’s Distributed Terminology System (DTS) is an open source terminology software suite whose key component is a terminology server. DTS is robust and mature, benefiting from years of production deployment in diverse healthcare industry settings. It has been used by software and content vendors, pharmaceutical companies, government agencies, universities and research institutions, healthcare delivery systems, and standards development organizations around the world.

DTS Architecture

DTS employs typical three-tier architecture, as illustrated in Figure 1. Multi-tier architectures offer many well known advantages, including the ability to support highly flexible, easily scalable, and extremely dependable deployment solutions.

Figure 1 – DTS Architecture. [Diagram labels: DTS Database; DTS Server; Tomcat (DTS Client); DTS Editor; DTS Browser; DTS Client Application.]

The DTS client tier (below the DTS Server in Figure 1), provides both Java and .Net APIs for developing custom terminology applications. DTS comes with packaged client applications such as an extensible desktop (fat client) terminology editor, the DTS Editor. There is also a web-based (thin client) terminology browser, the DTS Browser, which requires an Internet browser and an intermediary Apache Tomcat (or equivalent) web server. The middle tier of DTS consists of the DTS Server, a terminology-focused application server which supports highly concurrent, authenticated access to terminology services via the APIs. It features numerous performance optimizations, logging, tracing, remote monitoring, etc. The APIs support browsing, navigation, search, query, editing, localization, mapping, subsetting and other common terminology operations. A relational database comprises the third – or data – tier of DTS, shown at the top of Figure 1. In addition, DTS supplies various utilities for software and content management, including content subscription updates. Readers interested in DTS features outside the scope of this paper are referred to the DTS White Paper [7].

DTS Namespaces

DTS employs a unified content model for uniform access to diverse terminologies, including ones based on Description Logic (DL) such as SNOMED CT, the NCI Thesaurus and NDF-RT, as well as non-DL terminologies like CPT, ICD, and LOINC. A subscription service is available for all major medical terminologies (plus cross-terminology mappings) formatted for easy loading into DTS, ensuring that the latest versions of the terminologies are always available. A DTS namespace is the unit of management for content delivery (and access control). Thus, each standard terminology resides in a separate namespace so it can be independently updated and versioned. A mapping between (elements of) a pair of terminologies, e.g., from CPT to SNOMED CT, is also typically delivered in its own separate namespace. DTS also supports an unlimited number of local namespaces enabling users to create and maintain user- or organization-specific terminology data. These local terminologies are also housed in distinct namespaces, as are the local extensions to standard terminologies described below.

DESCRIPTION LOGIC

Description Logic (DL) is a well known field of study within the area of knowledge representation. [8] DL is a type of formal logic focused on creating definitions of concepts and reasoning about them effectively. Thus, DL is well suited for expressing precise descriptions of medical concepts, including anatomy, diseases, drugs, procedures, and so on. DL enables clear and unambiguous formal definition of a concept’s meaning, primarily in terms of its relationships with other concepts. A given concept (e.g., representing a class of drugs) can be described succinctly by naming the concepts it specializes (more general classes of drugs) and introducing distinguishing characteristics (e.g., relationships to its ingredients). The logical consistency of an entire set of concepts, such as those comprising a medical terminology, is automatically tested and enforced. Moreover, logical consequences that are implicit in the given descriptions are automatically made explicit.

A particular DL provides a language for describing concepts and a repertoire of logical inferences for reasoning about them. SNOMED CT uses the Ontylog DL [9], which is also used for the US Veterans Health Administration’s NDF-RT (National Drug File – Reference Terminology) and the National Cancer Institute’s NCI Thesaurus, all standards of the US Government’s Consolidated Health Informatics (CHI) Initiative [10]. Ontylog syntax and semantics have been published in connection with the NCI Thesaurus. [11] Among the most powerful aspects of DL are its facilities for reasoning about relationships among concepts and thus automatically managing a logically consistent taxonomy (i.e., generalization hierarchy or “is-a” hierarchy) of concepts.

The DL classification operation automatically organizes concepts into a taxonomy based on their logical descriptions. Software that implements classification is called a classifier. As a simplified expository example, a set of concepts { A, B, C, D, E, F, G, H, I, J } might be classified into the taxonomy shown in the top portion of Figure 2, where A is a generalization of B, C and D; B is a generalization of E, F and G, etc. We will use this taxonomy in subsequent examples. Extant classifiers generally create an explicit representation of a taxonomy, including explicit information corresponding to each of the lines shown between pairs of linked concepts. The Apelon classifier generates a very high performance, in-memory “classification graph” which also includes all information necessary to continue classifying additional concepts in the future.

As a result of classification, each concept in the taxonomy is guaranteed to be more specific than its parents and all other ancestors (directly or indirectly connected concepts above), as well as more general than its children and all other descendants (directly or indirectly connected concepts below). Therefore, concepts are always found in predictable locations. That makes it easier to envision relationships among concepts and to recognize unintended results. Well-organized taxonomies allow medical knowledge (e.g., advice, rules, warnings, arbitrary codes, etc.) to be associated with concepts at the most appropriate level in the taxonomy (neither too general nor too specific) and appropriately inherited by (implicitly associated with) descendant concepts.

A terminology is a collection of presumably related concepts. In DTS, a namespace is a set of concepts that are managed as a group. Thus, one can classify the set of concepts comprising a namespace into a taxonomy. Ordinarily, an entire terminology is contained – and thereby managed – in one namespace, e.g., all the concepts shown in the top portion of Figure 2 might comprise a single namespace. (For authoring purposes, some DLs allow terminologies to be composed by “importing” (the concepts of) one terminology into another, but the entire result is still classified monolithically.)

MODULAR EXTENSION

DTS terminology extension features are motivated largely by the existence of SNOMED CT and the desire of users to adapt it in diverse ways. SNOMED CT contains hundreds of thousands of concepts. New versions of SNOMED have been released twice yearly. Many different users (persons or organizations) may wish to extend SNOMED by adding their own concepts. The SNOMED data model provides for this possibility. Indeed a single user may be interested in extending SNOMED several different ways. However, it is important to clearly distinguish the authoritatively published core of SNOMED from any extensions thereof. Furthermore, it is important to classify terminology extensions, including post-coordinated expressions, as rapidly as possible. Traditional classifiers organize an entire set of concepts into a taxonomy by “starting from scratch” and classifying (processing) each and every concept in turn.

Modular Classification

DTS uniquely facilitates multiple independent extensions of a concept taxonomy based on DL. Separate classification operations determine how one or more distinct sets of additional concepts, each comprising an extension, fit in with the original taxonomy while leaving the original taxonomy intact and without copying it. Classification results are recorded so that the original taxonomy as well as every extension thereof can be independently browsed, searched, queried and retrieved on demand. As a result, DL taxonomies such as SNOMED CT can be extended easily and accurately, using the same language as the original, in multiple independent ways, to meet local and/or specialized needs in a timely manner. We call this process modular classification. Thus, DTS introduces effective means for working with multiple independent extensions of an existing taxonomy while preserving the integrity of the original. Indeed, DTS uses the same classification software used in the creation of SNOMED CT.

We will refer to an existing, self-contained namespace, e.g., a namespace containing SNOMED CT, as a base namespace. Concepts therein are referred to as base concepts. Then, an extension namespace contains one or more additional concepts to be classified, viewed, and otherwise used as if they were also part of the base namespace, but without altering and without copying the base namespace. Concepts within an extension namespace are referred to as extension concepts.

Figure 2 – Base Namespace Taxonomy (top) with Multiple Independent Extended Taxonomies (bottom). [Diagram: the base taxonomy over concepts A through J, and two extended taxonomies, one adding X1 and one adding X2.]

The modular classifier operates on DL elements of extension concepts defined in extension namespaces. These concepts are linked by SNOMED relationships to other concepts in the base namespace and/or the same extension namespace. DTS extension namespaces can also contain other local information about core SNOMED concepts. Examples include additional local synonyms; local associations connecting them to or from other concepts, e.g., to represent mappings from a local terminology; and local properties (attribute value pairs, e.g., to indicate that a procedure is performed locally, or that a certain person last edited the concept). In all cases, extensions to SNOMED CT could become problematic if a base SNOMED concept is later retired. Reports detailing any such connections are available, thus allowing for remediation.

As an example, the fictitious Podunk Hospital may wish to extend the SNOMED CT base namespace with a Podunk Hospital extension namespace. That extension namespace may include an extension concept for a disorder, Familial vertigo, with definitional relationships to several base concepts in SNOMED CT. In general, an extension concept can be defined in terms of its relationships to base concept(s) and/or fellow extension concept(s). The user’s definition of Familial vertigo is shown on the right in Figure 3. This definition was created interactively within the DTS Editor, drawing from concepts and relationships (roles) in the standard SNOMED CT Namespace. Following modular classification, the position of Familial Vertigo with respect to one branch of the SNOMED taxonomy is shown on the left. Of note, the classifier has inferred the position of Familial Vertigo directly under a concept Labyrinthine disorder not mentioned explicitly in its definition. The DTS Editor italicizes extension concepts in the context of a base SNOMED namespace for emphasis.

Figure 3 – DTS Editor with Extension Concept (right) and Extended Taxonomy (left).

In the interest of clarity and brevity (SNOMED CT has hundreds of thousands of concepts), the upper portion of Figure 2 shows a much simpler sample taxonomy for a base namespace. Beneath that are two independent extensions, one where the taxonomy is extended with a namespace consisting of the concept X1, and another where the taxonomy is extended with a namespace consisting of the concept X2. Notice that an extended taxonomy effectively contains the entire set of concepts from the base namespace augmented with additional concept(s) from the extension namespace. The dashed lines are intended to suggest that while the relationships of the extension concepts to the base taxonomy have been determined, they are not (destructively) spliced into the original base taxonomy (shown with solid lines). While these simple illustrations show only one concept per extension, an extension can of course contain an arbitrary number of concepts. We have used the DTS modular classifier with an extension namespace that (experimentally) extends SNOMED with LOINC laboratory concepts, and another extension namespace containing the US Drug Extension [12], each containing well over 15,000 concepts.

A base namespace may have multiple extensions which depend on it; extensions are mutually independent. Multiple independent namespaces extending SNOMED CT might have a variety of custodians and purposes, including a person (for learning and testing), a project (for research and development), an organization (for specific institutional needs), a specialty society (for terminology related to their practice area), national authorities, or even the creators of the base namespace themselves (e.g., to preview possible future enhancements to the base):

[Diagram: SNOMED CT with Personal Extensions, Project Extensions, Organization Extensions, Specialty Extensions and Authority Extensions.]

So far, we have focused on authoring sets of concepts covering a unified extension of interest. However, it is important to note that modular classification is equally adept at “on the fly” post-coordination of new concepts in accord with the SNOMED model, e.g., to help populate EHRs at run-time using the DTS API. Logical equivalence (hence redundancy) with a base concept or another extension concept is always detected and reported by the modular classifier.

SUBSETS

Considering the large size and broad scope of SNOMED CT and other contemporary medical terminologies, it can be extremely helpful to work with smaller, more focused subsets of terminologies when populating pick lists in EHR systems or fields in HL7 messages (HL7 value sets), constraining searches to pertinent concepts for data matching and analysis, etc. Subsets of interest can themselves be large and therefore challenging to maintain when the underlying terminologies are revised, e.g., concepts that are members of the subset may be retired and new concepts that should become members may be introduced. Enumerating each element of a large subset is tedious, opaque and often highly inefficient. Therefore, DTS takes a constructive approach to subset specification: a concise subset expression compositionally defines an arbitrary subset by specifying member concepts according to their names, synonyms, other properties, and relationships. Subset expressions can specify inclusion or exclusion of identified concepts and/or all of their descendants in the (base or extended) taxonomy. Moreover, subset expressions can be arbitrarily nested to include sub-taxonomies, exclude portions thereof, etc. Subset expressions can use various concept attributes, even those that refer to other namespaces, e.g., we can specify all SNOMED chronic diseases mapped to ICD-9-CM but (strictly for illustration) excluding chronic drug abuse and chronic drug overdose:

Visualization of subsets greatly aids review and revision. The DTS Editor (and likewise the web-based browser) can highlight subset members in the larger context of the entire SNOMED taxonomy; note the subset member concepts highlighted in gold:

The DTS Editor can also render and browse the hierarchical structure of the subset members alone, just as if all non-members were spliced out of the original taxonomy (not shown for brevity). Of course, DTS can also enumerate and export subsets, test for subset membership, search within subsets, etc. All of these features are available in the DTS Editor GUI application and also via the DTS APIs for runtime application integration.

TEMPLATES

Since DTS is a general purpose system for arbitrary terminologies, the DTS Editor enables unconstrained editing using generic terminology constructs. However, the SNOMED model carefully constrains concept definitions. Particular types of concepts (within a particular SNOMED hierarchy) are to be defined using particular SNOMED relationships to target concepts chosen from particular portions of SNOMED. The DTS Template Builder (a DTS Editor “plug-in”) has been developed to specify templates for context-dependent editing in compliance with such a model. Due to space restrictions, the following example is necessarily very abbreviated and simplified but conveys the gist. Suppose we need to extend SNOMED with more procedures. The SNOMED CT Users Guide specifies that the value of a Direct substance relationship (when present) on a Procedure concept should be a Substance or a Pharmaceutical/biological product. Thus, we create a Direct substance subset:

As we create a template for our procedures, we can require a value for the Direct substance relationship and require that it be restricted to members of our Direct substance subset:

The Direct substance relationship is one attribute of an overall template for My Procedures (which are concepts in the My SNOMED Extension namespace):

The DTS Template Editor enables creation and modification of concepts according to such templates. It reports an error if we attempt to use a concept that is not a member of the specified subset as the value for a Direct substance relationship:

The Template Editor accepts a member of the subset, as in this definition of Snake venom identification:

Notice the template-specific labels: Procedure name, Defining procedure and Direct substance. Absent any intervening extension concepts, the modular classifier will place this Snake venom identification extension concept directly under the Toxin detection (procedure) concept from the SNOMED CT base namespace.

DISTRIBUTION AND SUBMISSION

There are several ways to transfer terminology content into, out of and between DTS instances. Apelon distributes full and incremental versions of many standard (and custom) terminologies using a compact data format which closely corresponds to the DTS database schema and can therefore be loaded very efficiently. DTS enables users to distribute their own DTS terminology content in the same format. In addition, DTS includes graphical tools – the import wizard and the export wizard – to easily move ad hoc terminology content in and out of DTS using delimited text and XML formats. However, SNOMED CT has its own release format, consisting of a set of related files, tailored to the SNOMED data model, which specifically support SNOMED extensions. A SNOMED CT Identifier (SCTID) uniquely identifies all concepts, descriptions and relationships in SNOMED CT. Those who wish to extend SNOMED CT can request their own, exclusively assigned range of SNOMED CT identifiers. To facilitate creation and distribution of SNOMED extensions using DTS, we have implemented new DTS capabilities in collaboration with a national terminology authority and with a leading academic medical center. These capabilities include generation of SCTIDs for all elements of a SNOMED Extension namespace in DTS, as well as import and export of extension namespaces in SNOMED release format. Thus, SNOMED extensions can be readily shared with collaborators, and as appropriate, could be submitted for possible inclusion in the SNOMED core. The fact that these extensions have already been successfully classified together with the SNOMED core should expedite review and possible acceptance.

CONCLUSION

Apelon DTS, now available via open source licensing, has proven to be a popular tool for enterprise terminology asset management, featuring comprehensive capabilities for working with multiple standard and local terminologies, both individually and in concert (e.g., via mappings) using a unified suite of software components. Recognizing the importance of SNOMED CT, we have added significant functionality to meet SNOMED’s unique requirements and benefit from its unique capabilities.

Acknowledgments

We gratefully acknowledge the contributions of past and present Apelon colleagues to the ideas described in this paper and the DTS system. We deeply appreciate review and feedback on new SNOMED capabilities in DTS from Dr. James Campbell at the University of Nebraska Medical Center and from Australia’s National E-Health Transition Authority.

References

1. Rector AL, Solomon WD, Nowlan WA, Rush TW, Zanstra PE, Claassen WM. A terminology server for medical language and medical information systems. Meth Inform Med. 1995, 34(1-2) p. 147-57.
2. Mays E, Weida R, Dionne R, Laker M, White B, Liang C, and Oles, FJ. Scalable and expressive medical terminologies. AMIA Annual Fall Symposium, 1996. p. 259-263.
3. Burgun A, Patrick D, Bodenreider O, Botti G, Delamarre D, Poulinquen B, Oberlin P, Leveque JM, Lukacs B, Kohler F, Fieschi M, LeBeux P. A web terminology server using UMLS for the description of medical procedures. J Am Med Inform Assoc. 1997; 4:356–363.
4. Chute CG, Elkin PL, Sherertz DD, Tuttle MS. Desiderata for a clinical terminology server. Proc AMIA Symp 1999: 42-6.
5. Spackman KA, Campbell KE, and Cote RA. SNOMED RT: A reference terminology for health care. Proceedings of the AMIA Annual Fall Symposium, 1997. p. 640–644.
6. IHTSDO: SNOMED CT®. [Online]. 2007 [cited 2008 Jan 15]; Available from: http://www.ihtsdo.org/our-standards/snomed-ct/
7. Distributed Terminology System. [Online]. 2006 [cited 2008 Jan 15]; Available from: URL: http://www.apelon.com/products/white papers/DTS White Paper V34.pdf
8. Baader F, Calvanese D, McGuinness DL, Nardi D, and Patel-Schneider PF, editors. The description logic handbook: theory, implementation, and applications. Cambridge (U.K.): Cambridge University Press; 2003.
9. Spackman KA, Dionne R, Mays E, Weis J. Role grouping as an extension to the description logic of Ontylog motivated by concept modeling in SNOMED. AMIA Annual Symposium, 2002. p. 712-716.
10. Presidential Initiatives. [Online]. 2007 [cited 2008 Jan 15]; Available from: URL: http://www.hhs.gov/healthit/chiinitiative.html
11. Hartel FW, de Coronado S, Dionne R, Fragoso G, Golbeck J. Modeling a description logic vocabulary for cancer research. J Biomed Inform. 2005 Apr. p. 114-29.
12. CAP SNOMED Terminology Solutions. Pharmacy. [Online]. 2007 [cited 2008 Jan 15]; Available from: URL: http://www.cap.org/apps/docs/snomed/documents/pharmacy_nov07.pdf
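
Illustrative sketch of subset expressions

The constructive approach described under SUBSETS (inclusion or exclusion of identified concepts and/or their descendants, arbitrarily nested) can be illustrated with a minimal Python sketch. This is not the DTS API: the names PARENTS, Concept, Include and Exclude are hypothetical. The toy taxonomy follows the paper's description of Figure 2 for A through G, while the placement of H, I and J under D is an assumption made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy taxonomy (child -> parents). A..G follow the text's
# description of Figure 2; placing H, I, J under D is an assumption.
PARENTS = {
    "A": set(),
    "B": {"A"}, "C": {"A"}, "D": {"A"},
    "E": {"B"}, "F": {"B"}, "G": {"B"},
    "H": {"D"}, "I": {"D"}, "J": {"D"},
}


def descendants(concept):
    """Return every concept directly or indirectly below `concept`."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child, parents in PARENTS.items():
            if current in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return found


@dataclass(frozen=True)
class Concept:
    """Subset expression naming one concept, optionally with its descendants."""
    name: str
    with_descendants: bool = False

    def members(self):
        result = {self.name}
        if self.with_descendants:
            result |= descendants(self.name)
        return result


@dataclass(frozen=True)
class Include:
    """Union of nested subset expressions."""
    parts: tuple

    def members(self):
        result = set()
        for part in self.parts:
            result |= part.members()
        return result


@dataclass(frozen=True)
class Exclude:
    """Members of `base` minus members of `removed`; both may be nested."""
    base: object
    removed: object

    def members(self):
        return self.base.members() - self.removed.members()


# A and everything below it, except the B sub-taxonomy, but keeping F.
expression = Include((
    Exclude(Concept("A", with_descendants=True),
            Concept("B", with_descendants=True)),
    Concept("F"),
))
subset = expression.members()

# Simulate a terminology revision: a new concept K appears under D.
# Re-evaluating the same expression picks it up without editing a member list.
PARENTS["K"] = {"D"}
revised_subset = expression.members()
```

Because the expression is evaluated against the taxonomy instead of being stored as an enumerated list, the added concept K joins the subset on re-evaluation. That is the maintenance benefit the SUBSETS section attributes to compositional definitions.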