=Paper=
{{Paper
|id=Vol-410/paper-6
|storemode=property
|title=SNOMED CT: Browsing the Browsers
|pdfUrl=https://ceur-ws.org/Vol-410/Paper06.pdf
|volume=Vol-410
|dblpUrl=https://dblp.org/rec/conf/krmed/RogersB08
}}
==SNOMED CT: Browsing the Browsers==
Representing and sharing knowledge using SNOMED
Proceedings of the 3rd international conference on Knowledge Representation in Medicine (KR-MED 2008)
R. Cornet, K.A. Spackman (Eds)
SNOMED CT: Browsing the Browsers
J Rogers¹, MD, O Bodenreider², MD
¹ Technology Office, NHS Connecting for Health, Leeds UK
² National Library of Medicine, NIH, Bethesda USA
jeremy.rogers@nhs.net, olivier@nlm.nih.gov
SNOMED CT is a complex ontology; sophisticated browsers are required to make it understandable and useful. We identified 23 SNOMED CT browsers that have been developed, and inspected 17. We enumerate and provide test criteria for a ‘master list’ of 143 browsing features supported by at least one inspected browser; future work will determine which of these features are implemented by individual browsers. Only 5 features were common to all 17 browsers; 89 were found in fewer than one third of browsers. We recommend that a core set of browsing features be defined and harmonized across browsers, particularly for text-to-concept search operations.
INTRODUCTION
SNOMED CT is a biomedical ontology and an associated terminology [1]. Formerly owned by the College of American Pathologists, it has been managed since April 2007 by the International Health Terminology Standards Development Organisation (IHTSDO), a not-for-profit international standards body. As distributed, it is a large, complex and evolving knowledge artifact. Sophisticated browsers must make that complexity accessible and understandable, and suppress distracting or unwanted detail [2-3]. A number of different SNOMED CT browsers have been constructed since it was first published. Some have been evaluated for a variety of use cases, including coding of clinical data [4-8] and terminology evaluation and management [9].
In this paper, we report interim results of a systematic inspection of some of these browsers. We enumerate a superset of browsing features, outline the variability with which these features are implemented in individual browsers, and consider the possible consequences of non-standardized browsing of a standardized terminology.
MATERIALS
SNOMED CT
The core of a SNOMED CT release comprises three tables (sct_concepts, sct_descriptions and sct_relationships) collectively defining a compositional description logic ontology of the medical domain, and a lexicon of associated preferred or synonymous descriptions. The most recent international release (January 2008) contains 311,313 active concepts, 1,357,719 relationships between those concepts and 794,061 active descriptions.
Working deployments of SNOMED CT require additional or ancillary information linked to that core, usually provided by either the IHTSDO or a National Release Centre. Examples of such data include crossmaps to other clinical classifications (e.g. ICD-10), definitions of subsets of concepts and/or their descriptions for navigational or localization purposes, and a history of changes between successive releases. The January 2008 IHTSDO release therefore comprised 21 discrete table components in addition to the 3 defining the core ontology. The April 2008 UK National Release, which builds on the January 2008 IHTSDO release, comprised 122 separate tables.
In addition to this centrally provided additional content, it is also possible to link external data to the core or ancillary data sources. For example, crossmap target codes can be linked to their corresponding native rubrics or hierarchies.
SNOMED CT Browsers
The authors and their colleagues identified 23 different implementations of software [10-28] offering SNOMED CT browsing capability – either embedded in larger application environments or available as standalone browsers. 16 of these [10-23] were inspected as working software: CaTTS, CliniClue, CLIVE, EdBrowse, FDB Sphinx, HealthTerm, LexPlorer, Mycroft, NCI Terminology Browser, OntoBrowser, OpenKnoME, Protégé-OWL, SNOB, SnoFlake, the UMLS Rich Release Format (RRF) Browser and the Virginia Tech Browser. One additional feature was identified on a screen capture of the AxSys browser. AxSys, CLIVE, FDB Sphinx, HealthTerm and LexPlorer require user privileges to access; OntoBrowser and EdBrowse are unsupported in-house prototypes. The remaining ten browsers are publicly available at zero cost. Both CliniClue and OpenKnoME require proprietary additional tooling to load SNOMED CT distribution files, although prebuilt CliniClue data is widely available. OpenKnoME and OntoBrowser also require a proprietary terminology server.
The remaining 6 browsers not inspected [24-28] were: proprietary software from Informatics inc, Ocean Informatics and Visual Read; a demonstrator browser/encoder developed within the NHS Common User Interface Project; Kermanog’s CLAW product [17] based on SNOMED in ClaML (EN 14463) format; and Linköping University’s browser. These were excluded for reasons of time or lack of access.
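As an illustration of the three core tables described under MATERIALS, the following minimal sketch assembles a concept-centric view (descriptions, parents, children and defining attributes) from in-memory rows. It is not taken from any inspected browser. The field names are simplified assumptions rather than the exact distribution columns, the ConceptStatus decoding is illustrative only, and every identifier in the 900000 range is an invented toy row; only 72683003 and its term come from this paper, and 116680003 is the identifier of the ‘Is a’ relationship type.

```python
from dataclasses import dataclass

IS_A = 116680003  # relationship type used for subsumption ("Is a")

# Illustrative decoding of the numeric ConceptStatus field (an assumption);
# the authoritative list of codes is in the SNOMED CT release documentation.
CONCEPT_STATUS = {0: "current", 1: "retired"}


@dataclass(frozen=True)
class Concept:
    concept_id: int
    status: int
    name: str


@dataclass(frozen=True)
class Description:
    description_id: int
    concept_id: int
    term: str


@dataclass(frozen=True)
class Relationship:
    relationship_id: int
    source_id: int
    type_id: int
    target_id: int


def concept_view(concept_id, concepts, descriptions, relationships):
    """Build the concept-centric view that most browsers present."""
    by_id = {c.concept_id: c for c in concepts}
    focus = by_id[concept_id]
    outgoing = [r for r in relationships if r.source_id == concept_id]
    incoming = [r for r in relationships if r.target_id == concept_id]
    return {
        "concept_id": focus.concept_id,
        # Show the human-readable status, falling back to the raw code.
        "status": CONCEPT_STATUS.get(focus.status, f"code {focus.status}"),
        "descriptions": [d.term for d in descriptions if d.concept_id == concept_id],
        "parents": [by_id[r.target_id].name for r in outgoing if r.type_id == IS_A],
        "children": [by_id[r.source_id].name for r in incoming if r.type_id == IS_A],
        "attributes": [
            (by_id[r.type_id].name, by_id[r.target_id].name)
            for r in outgoing
            if r.type_id != IS_A
        ],
    }


# Toy rows: everything except 72683003 and IS_A is hypothetical.
concepts = [
    Concept(72683003, 0, "Removal of catheter from middle ear"),
    Concept(900001, 0, "Toy parent procedure"),
    Concept(900002, 0, "Toy attribute"),
    Concept(900003, 0, "Toy attribute value"),
    Concept(IS_A, 0, "Is a"),
]
descriptions = [
    Description(1, 72683003, "Removal of catheter from middle ear"),
    Description(2, 72683003, "Toy synonym"),
]
relationships = [
    Relationship(1, 72683003, IS_A, 900001),
    Relationship(2, 72683003, 900002, 900003),
]

view = concept_view(72683003, concepts, descriptions, relationships)
```

A real browser would read the distributed tables and index them rather than scanning lists; the sketch only shows how the three tables combine around one focus concept.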
METHODS
Each browser was inspected by one author against an emerging catalog of all features exhibited so far by at least one previously inspected browser. Whenever the choice was given to us, browsers were inspected using content based on the July 31, 2007 international release of SNOMED CT. A subset of SNOMED CT content converted into OWL DL was used for Protégé-OWL inspection.
The goal of each successive inspection was primarily to identify novel features implemented in the inspected browsers, for inclusion in a cumulative master catalog. The feature catalog was iteratively organized by an emerging set of themes, and this resulted in a progressive systematization of the inspection process itself, with each theme considered in detail by turn. This iterative systematization aided the process of new feature identification.
Where possible, operational definitions of new features were specified (reproduced in Tables 1-3). Subsequent inspections progressed by browsing or searching the Test Case column entry, and comparing the displayed result with the Expected Result column.
Although previously inspected browsers were subsequently re-inspected for newly discovered features, work is underway to confirm the validity and reproducibility of inspecting individual browsers against the feature catalog. Individual browser scores are therefore not presented here.
RESULTS
143 different browsing features were identified across 17 inspected browsers. 6 further features occurred to the authors during the inspection process as being potentially useful, but were not found in any inspected browser. The combined set of 149 features is presented in the accompanying tables, organised under the 8 major themes outlined below.
Our preliminary summary results, based on partially validated individual browser inspections, suggest most browser feature sets are an arbitrarily selected and small subset of all 149 features available. On average, individual browsers implement only 40 features (Range 21-107, StDev=13), but only 22 of the 149 features were found in more than two thirds of all browsers inspected, of which only 5 were implemented in all inspected browsers (Search by ConceptID or by Exact string, display of a ConceptID, its linkage to a Description, and the text of that Description). 89 features were found in fewer than a third of all browsers, but 70 of these are found in at least two browsers. Overall, these results suggest that most possible browsing features have been implemented independently by several SNOMED browser developers, but they have yet to become ‘standard’ across most browsers.
Core Data
A minimal requirement for a SNOMED CT browser is to give access to the data in the three core tables (concepts, relationships, descriptions). Table 1 lists the 22 fields from each of the three core tables that might be displayed by a browser.
Most browsers implement a concept-centric view of this core content, comprising one concept, its description(s), classification with respect to other concepts, and definition in terms of other concepts. This represents the minimum set of features required for the coding of clinical data and basic navigation.
Some fields (e.g. ConceptStatus) appear in the source release data as coded numeric values whose interpretation is given only in SNOMED release documentation; most browser implementations display only the human readable interpretation of these codes and not also (or only) the numeric values as actually distributed.
Despite their ‘core’ nature, however, only three of the 22 related features were displayed by all browsers inspected: the Concept ID, a link to (at least one) description for a concept, and display of the text of linked descriptions. Description status and Initial Capital Status, Relationship ID and Refinability were each visible in only two or three browsers.
Non-Core: Ancillary, 3rd Party and Derived Data
Advanced navigation and terminology maintenance work may require either additional data outside the core tables, or ‘derived’ views of the core data itself such as ‘reverse’ historical relationships (showing which inactive concepts point at the current browser focus concept as their replacement). Table 1 lists the ‘derived’ views found across the inspected browsers.
A complete set of SNOMED core and ancillary linked data is large and complex. Further, it changes with each biannual release. To reflect this configuration and versioning complexity, some browsers report exactly which versions of which release components are loaded, alert users when they are browsing non-current data, and support concurrent browsing of multiple release versions for direct discovery or comparison of changed content.
We found display of non-core data, and data from more than one release, to be the exception rather than the rule. Pointers from inactive concepts to their active replacement, and the set of concepts using the browser focus concept in their definition, are accessible in fewer than half of all browsers; all other ancillary, 3rd party or derived data browsing functions are present in fewer than one third of all browsers and usually only in two or three.
Visualisation and Navigation
Following from consideration of what data a browser displays is how it displays it. Additionally, the navigability of this data must be considered. Table 2 lists the visualization and navigation features encountered in the inspected browsers.
Most browsers implement some form of graphical tree browser, displaying the browser focus concept in the context of SNOMED’s multiaxial subsumption hierarchy. Some off-the-shelf tree controls, however, are unsuitable for displaying trees with very many levels and very many siblings at the same level, such as SNOMED CT’s subsumption hierarchy. Those showing the hierarchy always exploded from the root node downward (e.g. the NCI Terminology Browser and Protégé) are particularly unwieldy; those that do not detect very large sibling sets before attempting to display them can lead to very long refresh times.
Other visualization features observed include: sorting and grouping of components within concept definition or synonym sets, diacritic and superscript rendering, and typographic or colour coding of text.
Most browsers employ web browsing paradigms for navigation, with use of hyperlinks to refocus the browser on arbitrary concepts, as well as back/forward navigation. Bookmarked ‘favourites’, or a ‘home’ concept, however, were rarely observed.
Usability and Interoperability
The overall experience of working with a browser is influenced by a range of more generic user interface features, listed in Table 2. These include: the ability to transiently or persistently configure a custom view on the wealth of SNOMED related information, e.g., to occupy less of the desktop real estate; copy-and-paste or drag-and-drop of selected information either within the browser environment or into external applications; and the availability of an API allowing browser interface components to be instantiated and controlled by 3rd party software (a functionality distinct from the notion of a terminology services API per se).
Searching
Table 3 lists the range of features observed by which SNOMED CT is searched against a user-entered text string in order to identify candidate SNOMED ConceptIDs as possible entry points for subsequent visualization and navigation. These different search features observed may be further analysed into:
· lexical expansion of the original user search string in order to increase recall
· semantic or metadata filtering of the set of candidate concepts returned by a query, in order to increase precision
· collation and sorting of filtered results, so that the user may find (or be certain of not finding) the required concept
In general, SNOMED CT searching functionality in most browsers is impoverished and idiosyncratic. Although 37 different query expansion, filtering and collation features were observed across all browsers, thirteen of the browsers implemented fewer than 10 of them - and rarely the same set. 27 searching features were implemented in fewer than a third of all browsers inspected, of which 5 were unique to one browser.
Browsers differ in which features are on by default, which must be explicitly specified, and which can be, or by default are, combined in Boolean combinations. Not all strip trailing spaces; some default to an exact string match whilst others assume wildcarding unless specifically overridden. Where a search expression contains multiple words or tokens, few browsers support complex query logics such as requiring some tokens to be present and others not.
To demonstrate the effect of these differences, all browsers were used in their default configuration to search against the same string: ‘ear catheter’. Six browsers found no matches. A further six found only 72683003 Removal of catheter from middle ear, and its two descendants. SNOB returned eleven matches, including 72683003 but also 232199004 Inflation of Eustachian tube using balloon. The latter has no directly associated descriptions containing either ‘ear’ or ‘catheter’ but instead is returned because it has at least one ancestor with at least one description matching ‘ear’, and a separate ancestor with a description matching ‘catheter’. The UMLS RRF Browser returned sixty-six matches.
Postcoordination and Miscellaneous
Unlike traditional clinical terminologies, SNOMED CT can be ‘postcoordinated’ - dynamically extended by anybody, subject to certain ontological rules. Most trivially, this manifests as the option to qualify anatomical sites by a Laterality attribute and Sidedness value. Exposing SNOMED CT only as a static corpus significantly diminishes its expressivity. Further, a large part of the content – e.g. all Qualifier and Linkage Concepts - is easily misunderstood outside the context of postcoordination.
The rules governing postcoordination are complex but compliance with them is a prerequisite for dynamic classification of the expressions so built. A dedicated postcoordinated expression building and validating interface is therefore highly desirable, but we found only five browsers that implement one. Three of these additionally implement some limited part of the rules and conventions. However, although compliance with the rules has limited value outside the context of dynamic classification, no browser inspected currently provides that function.
SNOMED CT contains many content errors and omissions. Empowering end users to log and report content errors offers a ‘social computing’ route to expand SNOMED CT’s quality assurance capacity. However, only one inspected browser directly integrates content bug logging and reporting.
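The divergent results for ‘ear catheter’ reported under Searching can be reproduced in miniature. The sketch below is a hedged illustration, not the algorithm of any inspected browser: it contrasts an exact string match, an all-tokens match on a concept's own descriptions, and a match that also draws tokens from ancestor descriptions (the behaviour the text attributes to SNOB). The hierarchy is invented for the example: concepts 1, 2 and 3 and all parent links are hypothetical, and only the two identifiers and terms quoted in this paper are real.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: int
    descriptions: list
    parents: list = field(default_factory=list)


# Toy hierarchy. Concepts 1-3 and every parent link are assumptions made
# for illustration; they do not reflect actual SNOMED CT content.
CONCEPTS = {
    c.concept_id: c
    for c in [
        Concept(1, ["Procedure"]),
        Concept(2, ["Ear procedure"], parents=[1]),
        Concept(3, ["Catheter procedure"], parents=[1]),
        Concept(72683003, ["Removal of catheter from middle ear"], parents=[2, 3]),
        Concept(232199004, ["Inflation of Eustachian tube using balloon"], parents=[2, 3]),
    ]
}


def tokens(text):
    """Lower-case whitespace tokenisation; real browsers differ here too."""
    return set(text.lower().split())


def ancestors(concept_id):
    """All transitive parents of a concept."""
    seen, stack = set(), list(CONCEPTS[concept_id].parents)
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(CONCEPTS[parent].parents)
    return seen


def search_exact(query):
    """Match only descriptions identical to the trimmed query string."""
    wanted = query.strip().lower()
    return {
        c.concept_id
        for c in CONCEPTS.values()
        if any(d.lower() == wanted for d in c.descriptions)
    }


def search_all_tokens(query):
    """Match concepts with one description containing every query token."""
    wanted = tokens(query)
    return {
        c.concept_id
        for c in CONCEPTS.values()
        if any(wanted <= tokens(d) for d in c.descriptions)
    }


def search_inherited(query):
    """Match concepts whose own or ancestor descriptions cover every token."""
    wanted = tokens(query)
    hits = set()
    for c in CONCEPTS.values():
        pool = set()
        for cid in {c.concept_id} | ancestors(c.concept_id):
            for d in CONCEPTS[cid].descriptions:
                pool |= tokens(d)
        if wanted <= pool:
            hits.add(c.concept_id)
    return hits
```

On this toy data the exact match finds nothing, the all-tokens match finds only 72683003, and the inherited match additionally returns 232199004, mirroring the three patterns of behaviour described above.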
DISCUSSION
Accessing data vs. browsing. In seeking to review ‘browser’ technologies, we excluded command line or other direct SQL interfaces on the data tables. Although most browsers hide the raw data tables from the user, at least one explicitly provides a route to it. Whether ‘display’ of data by this route should pass or fail our core data theme tests is debatable.
Configurability. A minority of the features identified are orthogonal or graded values of one property. For example, whether a given hierarchy browser sorts sibling concepts randomly, alphabetically by description, or numerically by ConceptID are orthogonal values of a ‘sibling sort’ function. Although in theory it is possible to imagine a browser configurable to any one of the three, individual hierarchy display instances can only implement one at a point in time. In practice, all inspected browsers implement only one of these options throughout.
Operational test criteria. Differences between the browsers, particularly their default treatment of search strings, confounded attempts to specify tests that would work equally across all of them. Many of the tests specified in Tables 1-3 must be interpreted to take account of issues such as whether exact or wildcard string matching is assumed.
Absence of standard search features. The observed differences in text-to-concept search implementations have a striking effect on browsing experience. Further work to characterize this phenomenon is required.
Future work. We are currently validating the testing of specific browsers against the catalog of features. The quantitative results reported here are preliminary but confirm the authors’ original motivation for the experiment: currently available SNOMED CT browsers are very different and often suboptimal.
We do not propose that all SNOMED CT browsers must always implement all the features we identify; further research is required to determine which features are required for specific use cases, but the prior existence of a master feature catalog such as we present here is a prerequisite for that research. Many of the features seem likely to be common across use cases, particularly text-to-concept search operations. We recommend that a core set of searching and browsing features be defined and harmonized across tools, so that a standard terminology is not transformed into multiple different objects by virtue of idiosyncratic and limited browsing experiences.
Acknowledgments
This research was supported by the Intramural Research Program of the National Institutes of Health (NIH), National Library of Medicine (NLM) and by NHS Connecting for Health. Thanks to Drs Malcolm Duncan and Christopher Wroe for their assistance.
References
1. SNOMED CT. IHTSDO, Copenhagen 2007 www.ihtsdo.org
2. Tuttle MS, Cole WG, Sheretz DD, Nelson SJ. Navigating to knowledge. Methods Inf Med. 1995 Mar;34(1-2):214-31.
3. Patel VL, Kushniruk AW. Understanding, navigating and communicating knowledge: issues and challenges. Methods Inf Med. 1998 Nov;37(4-5):460-70.
4. Windle J, Van-Milligan G, Duffy S et al. Web-based physician order entry: an open source solution with broad physician involvement. AMIA Annu Symp Proc. 2003;:724-7.
5. Elkin PL, Brown SH, Husser CS et al. Evaluation of the content coverage of SNOMED CT: ability of SNOMED clinical terms to represent clinical problem lists. Mayo Clin Proc. 2006 Jun;81(6):741-8.
6. Sundvall E, Nyström M et al. Interactive visualization and navigation of complex terminology systems, exemplified by SNOMED CT. Stud Health Technol Inform. 2006;124:851-6.
7. Chiang MF, Hwang JC, Yu AC et al. Reliability of SNOMED-CT coding by three physicians using two terminology browsers. AMIA Annu Symp Proc. 2006;:131-5.
8. Richesson R, Syed A, Guillette H et al. A web-based SNOMED CT browser: distributed and real-time use of SNOMED CT during the clinical research process. Medinfo. 2007;12(Pt 1):631-5.
9. Cornet R, de Keizer NF, Abu-Hanna A. A framework for characterizing terminological systems. Methods Inf Med. 2006;45(3):253-66.
10. CaTTS (browsed Dec 21st 2007) www.jdet.com/
11. CliniClue (build 2006.2.30) www.cliniclue.com
12. CLIVE (UK NHS in-house terminology authoring tool)
13. HealthTerm (v 4.3.2 browsed Dec 21st 2007)
14. HLi LExPlorer (v 4.4.1P build 48 browsed Dec 21st 2007 – Athens account required) www.snomed.cfh.nhs.uk/lexplorer/
15. Mycroft (v. 2.1.0.2) www.apelon.com/
16. NCI Terminology Browser (browsed Dec 21st 2007) nciterms.nci.nih.gov/NCIBrowser/
17. OpenKnoME 5.4d and ClaW Workbench www.opengalen.org/sources/software.html
18. Protégé (v4.0 build 59) protege.stanford.edu
19. SNOB (v1.64) snob.eggbird.eu
20. SnoFlake (v 2.0 browsed Dec 21st 2007) snomed.dataline.co.uk/
21. UMLS Rich Release Format Browser (2007AC) www.nlm.nih.gov/research/umls/
22. Virginia Tech Browser (browsed Dec 21st 2007) terminology.vetmed.vt.edu/SCT/menu.cfm
23. AxSys Browser (browsed Jan 5th 2008) www.axsys.co.uk/excelicare/eprclinicalcoding.htm
24. First DataBank www.firstdatabank.com/
25. Informatics inc www.informatics.com/
26. Ocean Informatics oceaninformatics.biz
27. Visual Read www.visualread.com/
28. NHS Common User Interface nww.cui.nhs.uk/
Table 1: Core and Additional SNOMED CT table browsing features
Table 2: Visualisation, Navigation and Interoperation browsing features
Table 3: Searching, Postcoordination and Miscellaneous browsing features