=Paper= {{Paper |id=Vol-410/paper-6 |storemode=property |title=SNOMED CT: Browsing the Browsers |pdfUrl=https://ceur-ws.org/Vol-410/Paper06.pdf |volume=Vol-410 |dblpUrl=https://dblp.org/rec/conf/krmed/RogersB08 }} ==SNOMED CT: Browsing the Browsers== https://ceur-ws.org/Vol-410/Paper06.pdf
Representing and sharing knowledge using SNOMED
Proceedings of the 3rd international conference on Knowledge Representation in Medicine (KR-MED 2008)
R. Cornet, K.A. Spackman (Eds)




SNOMED CT: Browsing the Browsers
J Rogers (1), MD, O Bodenreider (2), MD
(1) Technology Office, NHS Connecting for Health, Leeds UK
(2) National Library of Medicine, NIH, Bethesda USA
jeremy.rogers@nhs.net, olivier@nlm.nih.gov

SNOMED CT is a complex ontology; sophisticated browsers are required to make it
understandable and useful. We identified 23 SNOMED CT browsers that have been developed,
and inspected 17. We enumerate and provide test criteria for a ‘master list’ of 143
browsing features supported by at least one inspected browser; future work will determine
which of these features are implemented by individual browsers. Only 5 features were
common to all 17 browsers; 89 were found in fewer than one third of browsers. We
recommend that a core set of browsing features be defined and harmonized across browsers,
particularly for text-to-concept search operations.

INTRODUCTION

SNOMED CT is a biomedical ontology and an associated terminology [1]. Formerly owned by
the College of American Pathologists, it has been managed since April 2007 by the
International Health Terminology Standards Development Organisation (IHTSDO), a
not-for-profit international standards body. As distributed, it is a large, complex and
evolving knowledge artifact. Sophisticated browsers must make that complexity accessible
and understandable, and suppress distracting or unwanted detail [2-3]. A number of
different SNOMED CT browsers have been constructed since it was first published. Some
have been evaluated for a variety of use cases, including coding of clinical data [4-8]
and terminology evaluation and management [9].

In this paper, we report interim results of a systematic inspection of some of these
browsers. We enumerate a superset of browsing features, outline the variability with
which these features are implemented in individual browsers, and consider the possible
consequences of non-standardized browsing of a standardized terminology.

MATERIALS

SNOMED CT
The core of a SNOMED CT release comprises three tables (sct_concepts, sct_descriptions
and sct_relationships) collectively defining a compositional description logic ontology
of the medical domain, and a lexicon of associated preferred or synonymous descriptions.
The most recent international release (January 2008) contains 311,313 active concepts,
1,357,719 relationships between those concepts and 794,061 active descriptions.

Working deployments of SNOMED CT require additional or ancillary information linked to
that core, usually provided by either the IHTSDO or a National Release Centre. Examples
of such data include crossmaps to other clinical classifications (e.g. ICD-10),
definitions of subsets of concepts and/or their descriptions for navigational or
localization purposes, and a history of changes between successive releases. The January
2008 IHTSDO release therefore comprised 21 discrete table components in addition to the 3
defining the core ontology. The April 2008 UK National Release, which builds on the
January 2008 IHTSDO release, comprised 122 separate tables.

In addition to this centrally provided additional content, it is also possible to link
external data to the core or ancillary data sources. For example, crossmap target codes
can be linked to their corresponding native rubrics or hierarchies.

SNOMED CT Browsers
The authors and their colleagues identified 23 different implementations of software
[10-28] offering SNOMED CT browsing capability, either embedded in larger application
environments or available as standalone browsers. 16 of these [10-23] were inspected as
working software: CaTTS, CliniClue, CLIVE, EdBrowse, FDB Sphinx, HealthTerm, LexPlorer,
Mycroft, NCI Terminology Browser, OntoBrowser, OpenKnoME, Protégé-OWL, SNOB, SnoFlake,
the UMLS Rich Release Format (RRF) Browser and the Virginia Tech Browser. A 17th, the
AxSys browser, was inspected only from a screen capture, on which one additional feature
was identified. AxSys, CLIVE, FDB Sphinx, HealthTerm and LexPlorer require user
privileges to access; OntoBrowser and EdBrowse are unsupported in-house prototypes. The
remaining ten browsers are publicly available at zero cost. Both CliniClue and OpenKnoME
require proprietary additional tooling to load SNOMED CT distribution files, although
prebuilt CliniClue data is widely available. OpenKnoME and OntoBrowser also require a
proprietary terminology server.

The remaining 6 browsers, which were not inspected [24-28], were: proprietary software
from Informatics inc, Ocean Informatics and Visual Read; a demonstrator browser/encoder
developed within the NHS Common User Interface Project; Kermanog’s CLAW product [17]
based on SNOMED in ClaML (EN 14463) format; and Linköping University’s browser. These
were excluded for reasons of time or lack of access.
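
To make the three core tables described under MATERIALS more concrete, the following is a minimal sketch of how a browser might assemble a concept-centric view (one concept, its descriptions, its parents and its defining relationships) from rows of sct_concepts, sct_descriptions and sct_relationships. It is an illustration only, not the method of any inspected browser. The field subset, the `concept_view` helper, the status label mapping and the identifiers `PARENT-1`, `ATTR-1` and `VALUE-1` are assumptions introduced for this example; only ConceptID 72683003 and its description come from the paper.

```python
from dataclasses import dataclass

# Assumption: 116680003 is used as the subsumption ("Is a") relationship type.
IS_A = "116680003"

# Illustrative decoding of a numeric status code. The authoritative list is
# given in the SNOMED CT release documentation, not here.
STATUS_LABELS = {0: "current"}


@dataclass(frozen=True)
class Concept:
    concept_id: str
    status: int  # distributed as a coded numeric value


@dataclass(frozen=True)
class Description:
    description_id: str
    concept_id: str
    term: str


@dataclass(frozen=True)
class Relationship:
    relationship_id: str
    source_id: str
    type_id: str
    target_id: str


def concept_view(concept_id, concepts, descriptions, relationships):
    """Assemble a concept-centric view from rows of the three core tables."""
    concept = next(c for c in concepts if c.concept_id == concept_id)
    outgoing = [r for r in relationships if r.source_id == concept_id]
    return {
        "concept_id": concept.concept_id,
        "status_code": concept.status,
        "status": STATUS_LABELS.get(concept.status, "unknown"),
        "descriptions": [d.term for d in descriptions
                         if d.concept_id == concept_id],
        # Classification: the concepts this one is subsumed by.
        "parents": [r.target_id for r in outgoing if r.type_id == IS_A],
        # Definition: the remaining attribute/value pairs.
        "definition": [(r.type_id, r.target_id) for r in outgoing
                       if r.type_id != IS_A],
    }


# Hypothetical rows: PARENT-1, ATTR-1 and VALUE-1 are placeholders.
concepts = [Concept("72683003", 0), Concept("PARENT-1", 0)]
descriptions = [
    Description("D-1", "72683003", "Removal of catheter from middle ear"),
]
relationships = [
    Relationship("R-1", "72683003", IS_A, "PARENT-1"),
    Relationship("R-2", "72683003", "ATTR-1", "VALUE-1"),
]

view = concept_view("72683003", concepts, descriptions, relationships)
print(view)
```

Returning both `status_code` and `status` reflects the distinction between the numeric value as distributed and its human readable interpretation.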




METHODS

Each browser was inspected by one author against an emerging catalog of all features
exhibited so far by at least one previously inspected browser. Whenever the choice was
given to us, browsers were inspected using content based on the July 31, 2007
international release of SNOMED CT. A subset of SNOMED CT content converted into OWL DL
was used for Protégé-OWL inspection.

The goal of each successive inspection was primarily to identify novel features
implemented in the inspected browsers, for inclusion in a cumulative master catalog. The
feature catalog was iteratively organized by an emerging set of themes, and this resulted
in a progressive systematization of the inspection process itself, with each theme
considered in detail in turn. This iterative systematization aided the process of new
feature identification.

Where possible, operational definitions of new features were specified (reproduced in
Tables 1-3). Subsequent inspections progressed by browsing or searching the Test Case
column entry, and comparing the displayed result with the Expected Result column.
Although previously inspected browsers were subsequently re-inspected for newly
discovered features, work is underway to confirm the validity and reproducibility of
inspecting individual browsers against the feature catalog. Individual browser scores are
therefore not presented here.

RESULTS

143 different browsing features were identified across 17 inspected browsers. 6 further
features occurred to the authors during the inspection process as being potentially
useful, but were not found in any inspected browser. The combined set of 149 features is
presented in the accompanying tables, organised under the 8 major themes outlined below.

Our preliminary summary results, based on partially validated individual browser
inspections, suggest that most browser feature sets are an arbitrarily selected and small
subset of all 149 features available. On average, individual browsers implement only 40
features (Range 21-107, StDev=13), and only 22 of the 149 features were found in more
than two thirds of all browsers inspected, of which only 5 were implemented in all
inspected browsers (Search by ConceptID or by Exact string, display of a ConceptID, its
linkage to a Description, and the text of that Description). 89 features were found in
fewer than a third of all browsers, but 70 of these were found in at least two browsers.
Overall, these results suggest that most possible browsing features have been implemented
independently by several SNOMED browser developers, but they have yet to become
‘standard’ across most browsers.

Core Data
A minimal requirement for a SNOMED CT browser is to give access to the data in the three
core tables (concepts, relationships, descriptions). Table 1 lists the 22 fields from the
three core tables that might be displayed by a browser.

Most browsers implement a concept-centric view of this core content, comprising one
concept, its description(s), classification with respect to other concepts, and
definition in terms of other concepts. This represents the minimum set of features
required for the coding of clinical data and basic navigation.

Some fields (e.g. ConceptStatus) appear in the source release data as coded numeric
values whose interpretation is given only in SNOMED release documentation; most browser
implementations display only the human readable interpretation of these codes, rather
than showing the numeric values as actually distributed, whether alongside or instead of
that interpretation.

Despite their ‘core’ nature, however, only three of the 22 related features were
displayed by all browsers inspected: the Concept ID, a link to (at least one) description
for a concept, and display of the text of linked descriptions. Description status,
Initial Capital Status, Relationship ID and Refinability were each visible in only two or
three browsers.

Non-Core: Ancillary, 3rd Party and Derived Data
Advanced navigation and terminology maintenance work may require either additional data
outside the core tables, or ‘derived’ views of the core data itself such as ‘reverse’
historical relationships (showing which inactive concepts point at the current browser
focus concept as their replacement). Table 1 lists the ‘derived’ views found across the
inspected browsers.

A complete set of SNOMED core and ancillary linked data is large and complex. Further, it
changes with each biannual release. To reflect this configuration and versioning
complexity, some browsers report exactly which versions of which release components are
loaded, alert users when they are browsing non-current data, and support concurrent
browsing of multiple release versions for direct discovery or comparison of changed
content.

We found display of non-core data, and data from more than one release, to be the
exception rather than the rule. Pointers from inactive concepts to their active
replacement, and the set of concepts using the browser focus concept in their definition,
are accessible in fewer than half of all browsers; all other ancillary, 3rd party or
derived data browsing functions are present in fewer than one third of all browsers and
usually only in two or three.

Visualisation and Navigation
Following from consideration of what data a browser displays is how it displays it.
Additionally, the




navigability of this data must be considered. Table 2 lists the visualization and
navigation features encountered in the inspected browsers.

Most browsers implement some form of graphical tree browser, displaying the browser focus
concept in the context of SNOMED’s multiaxial subsumption hierarchy. Some off-the-shelf
tree controls, however, are unsuitable for displaying trees with very many levels and
very many siblings at the same level, such as SNOMED CT’s subsumption hierarchy. Those
showing the hierarchy always exploded from the root node downward (e.g. the NCI
Terminology Browser and Protégé) are particularly unwieldy; those that do not detect very
large sibling sets before attempting to display them can lead to very long refresh times.

Other visualization features observed include: sorting and grouping of components within
concept definition or synonym sets, diacritic and superscript rendering, and typographic
or colour coding of text.

Most browsers employ web browsing paradigms for navigation, with use of hyperlinks to
refocus the browser on arbitrary concepts, as well as back/forward navigation. Bookmarked
‘favourites’, or a ‘home’ concept, however, were rarely observed.

Usability and Interoperability
The overall experience of working with a browser is influenced by a range of more generic
user interface features, listed in Table 2. These include: the ability to transiently or
persistently configure a custom view on the wealth of SNOMED related information, e.g.,
to occupy less of the desktop real estate; copy-and-paste or drag-and-drop of selected
information either within the browser environment or into external applications; and the
availability of an API allowing browser interface components to be instantiated and
controlled by 3rd party software (a functionality distinct from the notion of a
terminology services API per se).

Searching
Table 3 lists the range of features observed by which SNOMED CT is searched against a
user-entered text string in order to identify candidate SNOMED ConceptIDs as possible
entry points for subsequent visualization and navigation. The search features observed
may be further analysed into:
· lexical expansion of the original user search string in order to increase recall
· semantic or metadata filtering of the set of candidate concepts returned by a query, in
  order to increase precision
· collation and sorting of filtered results, so that the user may find (or be certain of
  not finding) the required concept

In general, SNOMED CT searching functionality in most browsers is impoverished and
idiosyncratic. Although 37 different query expansion, filtering and collation features
were observed across all browsers, thirteen of the browsers implemented fewer than 10 of
them, and rarely the same set. 27 searching features were implemented in fewer than a
third of all browsers inspected, of which 5 were unique to one browser.

Browsers differ in which features are on by default, which must be explicitly specified,
and which can be, or by default are, combined in Boolean combinations. Not all strip
trailing spaces; some default to an exact string match whilst others assume wildcarding
unless specifically overridden. Where a search expression contains multiple words or
tokens, few browsers support complex query logics such as requiring some tokens to be
present and others not.

To demonstrate the effect of these differences, all browsers were used in their default
configuration to search against the same string: ‘ear catheter’. Six browsers found no
matches. A further six found only 72683003 Removal of catheter from middle ear, and its
two descendants. SNOB returned eleven matches, including 72683003 but also 232199004
Inflation of Eustachian tube using balloon. The latter has no directly associated
descriptions containing either ‘ear’ or ‘catheter’ but instead is returned because it has
at least one ancestor with at least one description matching ‘ear’, and a separate
ancestor with a description matching ‘catheter’. The UMLS RRF Browser returned sixty-six
matches.

Postcoordination and Miscellaneous
Unlike traditional clinical terminologies, SNOMED CT can be ‘postcoordinated’, that is,
dynamically extended by anybody, subject to certain ontological rules. Most trivially,
this manifests as the option to qualify anatomical sites by a Laterality attribute and
Sidedness value. Exposing SNOMED CT only as a static corpus significantly diminishes its
expressivity. Further, a large part of the content (e.g. all Qualifier and Linkage
Concepts) is easily misunderstood outside the context of postcoordination.

The rules governing postcoordination are complex but compliance with them is a
prerequisite for dynamic classification of the expressions so built. A dedicated
postcoordinated expression building and validating interface is therefore highly
desirable, but we found only five browsers that implement one. Three of these
additionally implement some limited part of the rules and conventions. However, although
compliance with the rules has limited value outside the context of dynamic
classification, no browser inspected currently provides that function.

SNOMED CT contains many content errors and omissions. Empowering end users to log and
report content errors offers a ‘social computing’ route to expand SNOMED CT’s quality
assurance capacity. However, only one inspected browser directly integrates content bug
logging and reporting.
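
The divergent results for ‘ear catheter’ reported under Searching can be reproduced in miniature. The sketch below contrasts three plausible default behaviours: an exact string match, a match requiring every query token within a single description, and a match that also accepts tokens found in descriptions of ancestor concepts. The third only mimics the behaviour the paper describes for SNOB and is not SNOB’s actual implementation. The function names, the ancestor concepts `X-EAR` and `X-CATH` and their terms are hypothetical; the two numeric ConceptIDs and their descriptions are taken from the text.

```python
def tokens(text):
    """Lower-cased whitespace tokens of a string."""
    return set(text.lower().split())


def ancestors(concept_id, parents):
    """All concepts reachable by following parent links upward."""
    seen, stack = set(), list(parents.get(concept_id, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def search_exact(query, descriptions):
    """Concepts with a description equal to the whole query string."""
    q = query.strip().lower()
    return sorted(cid for cid, terms in descriptions.items()
                  if any(term.lower() == q for term in terms))


def search_all_tokens(query, descriptions):
    """Concepts with one description containing every query token."""
    q = tokens(query)
    return sorted(cid for cid, terms in descriptions.items()
                  if any(q <= tokens(term) for term in terms))


def search_inherited(query, descriptions, parents):
    """Concepts whose own or ancestors' descriptions cover every token."""
    q = tokens(query)
    hits = []
    for cid in descriptions:
        pool = set()
        for c in {cid} | ancestors(cid, parents):
            for term in descriptions.get(c, []):
                pool |= tokens(term)
        if q <= pool:
            hits.append(cid)
    return sorted(hits)


# X-EAR and X-CATH are hypothetical ancestors with invented terms.
descriptions = {
    "72683003": ["Removal of catheter from middle ear"],
    "232199004": ["Inflation of Eustachian tube using balloon"],
    "X-EAR": ["Procedure on ear"],
    "X-CATH": ["Catheter procedure"],
}
parents = {
    "72683003": ["X-EAR"],
    "232199004": ["X-EAR", "X-CATH"],
}

exact_hits = search_exact("ear catheter", descriptions)
token_hits = search_all_tokens("ear catheter", descriptions)
inherited_hits = search_inherited("ear catheter", descriptions, parents)
print(exact_hits, token_hits, inherited_hits)
```

On this toy data the exact match finds nothing, the token match finds only 72683003, and the inherited match additionally returns 232199004, matching the pattern observed across browsers.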




DISCUSSION

Accessing data vs. browsing. In seeking to review ‘browser’ technologies, we excluded
command line or other direct SQL interfaces on the data tables. Although most browsers
hide the raw data tables from the user, at least one explicitly provides a route to them.
Whether ‘display’ of data by this route should pass or fail our core data theme tests is
debatable.

Configurability. A minority of the features identified are orthogonal or graded values of
one property. For example, whether a given hierarchy browser sorts sibling concepts
randomly, alphabetically by description, or numerically by ConceptID are orthogonal
values of a ‘sibling sort’ function. Although in theory it is possible to imagine a
browser configurable to any one of the three, individual hierarchy display instances can
only implement one at a point in time. In practice, all inspected browsers implement only
one of these options throughout.

Operational test criteria. Differences between the browsers, particularly their default
treatment of search strings, confounded attempts to specify tests that would work equally
across all of them. Many of the tests specified in Tables 1-3 must be interpreted to take
account of issues such as whether exact or wildcard string matching is assumed.

Absence of standard search features. The observed differences in text-to-concept search
implementations have a striking effect on browsing experience. Further work to
characterize this phenomenon is required.

Future work. We are currently validating the testing of specific browsers against the
catalog of features. The quantitative results reported here are preliminary but confirm
the authors’ original motivation for the experiment: currently available SNOMED CT
browsers are very different and often suboptimal.

We do not propose that all SNOMED CT browsers must always implement all the features we
identify; further research is required to determine which features are required for
specific use cases, but the prior existence of a master feature catalog such as we
present here is a prerequisite for that research. Many of the features seem likely to be
common across use cases, particularly text-to-concept search operations.

We recommend that a core set of searching and browsing features be defined and harmonized
across tools, so that a standard terminology is not transformed into multiple different
objects by virtue of idiosyncratic and limited browsing experiences.

Acknowledgments

This research was supported by the Intramural Research Program of the National Institutes
of Health (NIH), National Library of Medicine (NLM) and by NHS Connecting for Health.
Thanks to Drs Malcolm Duncan and Christopher Wroe for their assistance.

References
1. SNOMED CT. IHTSDO, Copenhagen 2007. www.ihtsdo.org
2. Tuttle MS, Cole WG, Sherertz DD, Nelson SJ. Navigating to knowledge. Methods Inf Med.
   1995 Mar;34(1-2):214-31.
3. Patel VL, Kushniruk AW. Understanding, navigating and communicating knowledge: issues
   and challenges. Methods Inf Med. 1998 Nov;37(4-5):460-70.
4. Windle J, Van-Milligan G, Duffy S et al. Web-based physician order entry: an open
   source solution with broad physician involvement. AMIA Annu Symp Proc. 2003;:724-7.
5. Elkin PL, Brown SH, Husser CS et al. Evaluation of the content coverage of SNOMED CT:
   ability of SNOMED clinical terms to represent clinical problem lists. Mayo Clin Proc.
   2006 Jun;81(6):741-8.
6. Sundvall E, Nyström M et al. Interactive visualization and navigation of complex
   terminology systems, exemplified by SNOMED CT. Stud Health Technol Inform.
   2006;124:851-6.
7. Chiang MF, Hwang JC, Yu AC et al. Reliability of SNOMED-CT coding by three physicians
   using two terminology browsers. AMIA Annu Symp Proc. 2006;:131-5.
8. Richesson R, Syed A, Guillette H et al. A web-based SNOMED CT browser: distributed and
   real-time use of SNOMED CT during the clinical research process. Medinfo. 2007;12(Pt
   1):631-5.
9. Cornet R, de Keizer NF, Abu-Hanna A. A framework for characterizing terminological
   systems. Methods Inf Med. 2006;45(3):253-66.
10. CaTTS (browsed Dec 21st 2007) www.jdet.com/
11. CliniClue (build 2006.2.30) www.cliniclue.com
12. CLIVE (UK NHS in-house terminology authoring tool)
13. HealthTerm (v 4.3.2 browsed Dec 21st 2007)
14. HLi LExPlorer (v 4.4.1P build 48 browsed Dec 21st 2007; Athens account required)
    www.snomed.cfh.nhs.uk/lexplorer/
15. Mycroft (v. 2.1.0.2) www.apelon.com/
16. NCI Terminology Browser (browsed Dec 21st 2007) nciterms.nci.nih.gov/NCIBrowser/
17. OpenKnoME 5.4d and ClaW Workbench www.opengalen.org/sources/software.html
18. Protégé (v4.0 build 59) protege.stanford.edu
19. SNOB (v1.64) snob.eggbird.eu
20. SnoFlake (v 2.0 browsed Dec 21st 2007) snomed.dataline.co.uk/
21. UMLS Rich Release Format Browser (2007AC) www.nlm.nih.gov/research/umls/
22. Virginia Tech Browser (browsed Dec 21st 2007) terminology.vetmed.vt.edu/SCT/menu.cfm
23. AxSys Browser (browsed Jan 5th 2008)
    www.axsys.co.uk/excelicare/eprclinicalcoding.htm




24. First DataBank www.firstdatabank.com/
25. Informatics inc www.informatics.com/
26. Ocean Informatics oceaninformatics.biz
27. Visual Read www.visualread.com/
28. NHS Common User Interface nww.cui.nhs.uk/




                                   Table 1: Core and Additional SNOMED CT table browsing features




                                Table 2: Visualisation, Navigation and Interoperation browsing features




                              Table 3: Searching, Postcoordination and Miscellaneous browsing features



