<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An examination of OWL and the requirements of a large health care terminology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kent Spackman</string-name>
          <email>spackman@ohsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Medical Informatics and Clinical Epidemiology, Oregon Health &amp; Science University</institution>
          ,
          <addr-line>Portland, Oregon</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents a brief initial look at some of the possible benefits and barriers to using OWL as the language for the development, dissemination and implementation of terminological knowledge in the domain of health and health care. In particular, this assessment is made from the perspective of the author's role in the development of the Systematized Nomenclature of Medicine (SNOMED). To date, SNOMED has developed and adopted its own special-purpose syntax and formats for terminology development, exchange and distribution. Its representation language has limited expressivity yet is not expressible by any dialect of OWL 1.0. With the evolution to OWL 1.1, the barriers to using OWL for knowledge representation have been resolved. However, partly because of SNOMED's very large size, there remain barriers to adoption of OWL XML/RDF for SNOMED development, distribution or exchange purposes.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The Systematized Nomenclature of Medicine, Clinical Terms (SNOMED CT)
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is a work of clinical terminology with broad coverage of the domain of
health care, and it has been selected as a national standard for use in
electronic health applications in many countries, including the U.S., U.K., Canada,
Australia, Denmark, and others. SNOMED was originally published in 1976,
while SNOMED CT became available in 2002 as a major expansion resulting
from the merger of SNOMED RT with the U.K.'s Clinical Terms version 3. A
major feature distinguishing it from prior editions is the use of
description logic (DL) to define and organize codes and terms [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>Another major distinguishing feature of SNOMED is its size and complexity.
With over 350,000 concept codes, each representing a different class, it is an order
of magnitude larger than the next largest DL-based ontology of which we are
aware. The size of the OWL XML/RDF form of SNOMED is approximately 248
MB, and this is just the DL representation without all the synonyms, mappings,
subsets, and other special-purpose components of the terminology.
tems. To some degree these three streams can flow together, but in a more
important sense they produce conflicting design criteria.</p>
      <p>From the healthcare terminology perspective, the most useful stream of
influences on OWL was the description logic stream. Long before the conception
of the Semantic Web and the promulgation of related standards, the SNOMED
effort had adopted description logic as the representation language for its formal
definitions.</p>
      <p>SNOMED CT was developed with a description logic which includes
conjunction (⊓), existential role restrictions (∃R.C), role hierarchy axioms (r ⊑ s),
and class definitions of the form A ⊑ C with A a class name and C an arbitrary
class description. In addition, it makes use of what have been termed "right
identities", a restricted form of property chain inclusion axioms of the form r ∘ s ⊑ r,
where the circle denotes role composition. This relatively limited set of language
constructs is nevertheless sufficient to represent a very wide range of meanings.</p>
      <p>
        Another important property of this limited language, which can be called
ELH with right identities, is that its computational complexity remains
polynomial [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and there exist classifiers that can handle the task of classifying the
entire terminology [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
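      <p>As a rough illustration of why this small language is still useful, the following sketch saturates a toy ontology with completion rules loosely modeled on those described for EL-family logics [<xref ref-type="bibr" rid="ref4">4</xref>]. It is a naive fixpoint loop, not the algorithm of any particular classifier, and the class and role names (FractureOfFemurShaft, findingSite, partOf) are hypothetical stand-ins rather than actual SNOMED content. Axioms are assumed to be supplied already in normal form, with SiteIsFemur as a helper name for the restriction "findingSite some Femur".</p>
      <preformat><![CDATA[
```python
def saturate(names, sub, conj, some_rhs, some_lhs, role_sub, chains):
    """Naive fixpoint over EL-style completion rules.

    Returns S, where S[x] is the set of class names found to subsume x.
    """
    S = {a: {a} for a in names}   # every class subsumes itself
    R = set()                     # (r, x, y): x is r-related to y
    while True:
        new_S, new_R = set(), set()
        for x in names:
            for a, b in sub:              # a SubClassOf b
                if a in S[x]:
                    new_S.add((x, b))
            for a1, a2, b in conj:        # (a1 and a2) SubClassOf b
                if a1 in S[x] and a2 in S[x]:
                    new_S.add((x, b))
            for a, r, b in some_rhs:      # a SubClassOf (r some b)
                if a in S[x]:
                    new_R.add((r, x, b))
        for r, x, y in R:
            for r1, a, b in some_lhs:     # (r1 some a) SubClassOf b
                if r1 == r and a in S[y]:
                    new_S.add((x, b))
            for r1, s in role_sub:        # r1 SubPropertyOf s
                if r1 == r:
                    new_R.add((s, x, y))
            for r1, r2, t in chains:      # r1 o r2 SubPropertyOf t
                if r1 == r:
                    new_R.update((t, x, z) for q, y2, z in R
                                 if q == r2 and y2 == y)
        grew = not new_R.issubset(R)
        R |= new_R
        for x, b in new_S:
            if b not in S[x]:
                S[x].add(b)
                grew = True
        if not grew:
            return S


# Hypothetical toy ontology (illustrative names only).
names = ["Fracture", "Femur", "FemurShaft", "SiteIsFemur",
         "FractureOfFemur", "FractureOfFemurShaft"]
sub = [("FractureOfFemur", "Fracture"),
       ("FractureOfFemurShaft", "Fracture")]
# FractureOfFemur is fully defined as Fracture and (findingSite some Femur).
conj = [("Fracture", "SiteIsFemur", "FractureOfFemur")]
some_lhs = [("findingSite", "Femur", "SiteIsFemur")]
some_rhs = [("FractureOfFemur", "findingSite", "Femur"),
            ("FractureOfFemurShaft", "findingSite", "FemurShaft"),
            ("FemurShaft", "partOf", "Femur")]
role_sub = []
# Right identity: findingSite o partOf SubPropertyOf findingSite.
chains = [("findingSite", "partOf", "findingSite")]

S = saturate(names, sub, conj, some_rhs, some_lhs, role_sub, chains)
print("FractureOfFemur" in S["FractureOfFemurShaft"])
```
]]></preformat>
      <p>With the right identity in place, the sketch derives that FractureOfFemurShaft is subsumed by FractureOfFemur; if the chain axiom is removed from the input, that inference is no longer drawn.</p>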
      <p>
        Without a solid DL foundation, the Semantic Web would have remained
largely irrelevant to health care terminology standardization. Even so, the
initial version of OWL was developed without taking adequate account of features
of DL that had already been used in both the GALEN [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and SNOMED [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
efforts. The development of OWL 1.1 eliminated one of the most significant
barriers to use of OWL for SNOMED, since it permits the identification of tractable
sublanguages capable of handling the size and complexity of SNOMED [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
Although adding property chain inclusion axioms was reportedly the most difficult
step in developing OWL 1.1 [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], this was essential. Without it, adoption of OWL
by the SNOMED community would have required awkward workarounds with
their attendant complications and complexities, effectively killing movement
in that direction. With it, we have a clear path to using OWL 1.1 for further
development and integration with other biomedical ontologies.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Development and Exchange</title>
      <p>
        Beginning in 1995-96, SNOMED's developers adopted a distributed co-operative
model for modeling the DL definitions of its meanings [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. This development
model was driven by several factors, including the geographical distribution of
health care professionals available to work on curation of the terminology, as
well as a recognition of the value of distributed development.
      </p>
      <p>
        In order to successfully handle asynchronous modeling by geographically
dispersed editors, software was developed that permitted each editor to submit
"change sets" which could be imported into a central configuration management
environment and analyzed for conflicts. The format of these change sets has
evolved over time, and SNOMED has recently defined and published an
open-source specification for an XML interchange format [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
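      <p>The kind of conflict analysis described above can be sketched as follows. This is only an illustration under stated assumptions: the Change record, its fields, and the identifiers are hypothetical, and nothing here reflects the actual SNOMED change set or interchange format. Two editors are taken to conflict when they propose different values for the same field of the same concept.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One proposed edit (hypothetical structure, for illustration only)."""
    editor: str
    concept_id: str
    field: str
    new_value: str


def find_conflicts(change_sets):
    """Return {(concept_id, field): {editor: new_value}} where editors disagree."""
    proposed = defaultdict(dict)
    for changes in change_sets:
        for ch in changes:
            proposed[(ch.concept_id, ch.field)][ch.editor] = ch.new_value
    # A key is in conflict when more than one distinct value was proposed.
    return {key: by_editor for key, by_editor in proposed.items()
            if len(set(by_editor.values())) > 1}


# Two editors working asynchronously on made-up concepts C1 and C2.
alice = [Change("alice", "C1", "parent", "C10"),
         Change("alice", "C2", "parent", "C20")]
bob = [Change("bob", "C1", "parent", "C11"),
       Change("bob", "C2", "parent", "C20")]

conflicts = find_conflicts([alice, bob])
print(conflicts)
```
]]></preformat>
      <p>In this toy run only C1 is reported, because both editors agree on C2. A real configuration management environment would also need to consider ordering, retirements, and edits that interact through classification, which this sketch ignores.</p>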
      <p>As mentioned above, scale is a major issue for SNOMED. This comes into
play when considering the most common tools available for editing and
development of OWL-based work. All open-source editing tools that I have tested run
into memory problems when dealing with a terminology as large as SNOMED.
While some of these barriers will be (or perhaps by the time of this conference
have been) removed as a result of special programming effort, it is probable that
only a few open-source editing tools will receive this level of refinement and
enhancement. It is unclear whether most ontologies and terminologies for the
Semantic Web are small-scale because that is the right size, or merely because
the tools and resources to build truly useful large terminologies are not available.
In health care, at least, it is clear that we need a very large and well-integrated
terminology, such as SNOMED.</p>
      <p>To the extent that OWL assumes that Semantic Web resources will be
developed independently and that publication is effectively a static expression of these
resources on the Web, there is no need for standards that permit broad and
distributed development, critique, curation, correction, enhancement, or
modification. On the other hand, our experience with SNOMED suggests strongly
that distributed development is crucial, and therefore it would be important to
consider support of this need by OWL standards. Also based on our experience,
support for change sets, at least, would not seem to be a very difficult thing to
add to the standard. SNOMED's change set documents are available under the
Apache 2.0 license and could be used as a starting point if desired.</p>
    </sec>
    <sec id="sec-3">
      <title>Distribution</title>
      <p>In order to be useful, SNOMED has to be integrated into application programs
that are in use in the health care environment. Of course, virtually none of the
commercial applications in wide use in health care today are built to take account
of the Semantic Web. As a result, each vendor or software supplier or health care
institution has a slightly different need for terminology standards, and
incorporates those standards into their application software in different ways.</p>
      <p>
        There is an existing distribution format for SNOMED, based on a set of files
of UTF-8 characters suitable for loading into database tables. In order to
provide application developers with a stable specification, this distribution format
was submitted for American National Standards Institute (ANSI)
standardization [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. This structure has served its purpose for several years. However, as the
knowledge representation evolves and there is a need for more expressiveness, the
current format has become limiting. As a result, a special-purpose XML format
has been developed and published for comment [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Once again, it seems
worthwhile to examine the representation capabilities of OWL
1.1 as a means for representing at least the DL-based components of SNOMED.
Other components, such as change tracking, mapping, subsetting, and so forth,
may not be as readily represented in OWL. However, these issues facing
terminologies in health care are probably not unique to this one domain, and it would
be worth examining the extent to which an enhanced OWL standard might
contribute to improved use of terminologies across several industries, thereby
increasing the chances of the Semantic Web vision being realized.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>We have discussed several issues that have arisen in the process of examining
the utility of OWL and its associated products for the purposes of supporting
ongoing efforts to formalize representation of terminology for health and health
care. The move to OWL 1.1 is seen as a very positive move from a knowledge
representation standpoint. Support for large scale and distributed development
of ontologies is seen as a need that is as yet unmet.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <source>SNOMED Clinical Terms</source>
          . Northfield, IL: College of American Pathologists,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nardi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. F.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          , editors.
          <source>The Description Logic Handbook: Theory, Implementation, and Applications</source>
          . Cambridge, U.K.: Cambridge University Press,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>P. F.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          .
          <article-title>What is OWL (and why should I care)?</article-title>
          <source>Ninth International Conference on the Principles of Knowledge Representation and Reasoning</source>
          . Whistler, Canada, June
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          .
          <article-title>Pushing the EL envelope</article-title>
          .
          In
          <source>Proc. of the Nineteenth Int. Joint Conf. on Artificial Intelligence (IJCAI-05)</source>
          , Edinburgh, UK,
          <year>2005</year>
          . Morgan-Kaufmann Publishers.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Suntisrivaraporn</surname>
          </string-name>
          .
          <article-title>CEL: a polynomial-time reasoner for life science ontologies</article-title>
          . In U. Furbach and N. Shankar, editors,
          <source>Proc. of the 3rd Int. Joint Conf. on Automated Reasoning (IJCAR'06)</source>
          , volume
          <volume>4130</volume>
          of
          <source>Lecture Notes in Artificial Intelligence</source>
          , pages
          <fpage>287</fpage>
          -
          <lpage>291</lpage>
          . Springer-Verlag,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Rector</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Bechhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Goble</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. A.</given-names>
            <surname>Nolan</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. D.</given-names>
            <surname>Solomon</surname>
          </string-name>
          .
          <article-title>The GRAIL Concept Modelling Language for Medical Terminology</article-title>
          .
          <source>Artificial Intelligence in Medicine</source>
          ,
          <volume>9</volume>
          :
          <fpage>139</fpage>
          -
          <lpage>171</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Spackman</surname>
          </string-name>
          .
          <article-title>Managing clinical terminology hierarchies using algorithmic calculation of subsumption: Experience with SNOMED-RT</article-title>
          .
          <source>OHSU Technical Report</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Parsia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          and
          <string-name>
            <given-names>U.</given-names>
            <surname>Sattler</surname>
          </string-name>
          .
          <article-title>Next Steps for OWL</article-title>
          .
          <source>Proceedings of the Second OWL Experiences and Directions Workshop</source>
          (OWLED)
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>M.</given-names>
            <surname>Horridge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tsarkov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Redmond</surname>
          </string-name>
          .
          <article-title>Supporting early adoption of OWL 1.1 with Protege-OWL and FaCT++</article-title>
          .
          <source>Proceedings of the Second OWL Experiences and Directions Workshop</source>
          (OWLED)
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>K. E.</given-names>
            <surname>Campbell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Cohn</surname>
          </string-name>
          , et al.
          <article-title>Galapagos: Computer-based support for evolution of a convergent medical terminology</article-title>
          .
          <source>Proc AMIA Annu Fall Symp</source>
          <year>1996</year>
          , pp.
          <fpage>269</fpage>
          -
          <lpage>273</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <source>SNOMED Interchange Format Specification</source>
          . College of American Pathologists,
          <year>2006</year>
          . http://www.snomed.org/snomedct/interchange format.html.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <source>Healthcare Terminology Structure: ANSI Standard</source>
          . College of American Pathologists,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <source>SNOMED XML Schema Specification</source>
          . College of American Pathologists,
          <year>2006</year>
          . http://www.snomed.org/snomedct/xml schema.html.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>