<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semantic Terminology Management for Applications: Contextualized SKOS-XL</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas Thalhammer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Romacker</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joachim Rupp</string-name>
          <email>joachim.rupp@roche.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Roche Pharma Research and Early Development Informatics, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd</institution>
          ,
          <addr-line>Grenzacherstrasse 124, 4070 Basel</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Terminology management is an important aspect of ensuring data quality in large organizations. For expert applications, the use of agreed and curated terms enhances data quality and significantly reduces the long-term cost of data integration. In this abstract, we outline our solution to two problems that occur in the context of terminology management for applications.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Problem 1: Domain-specific schemes, like "Diseases" (organized via
domain-centric polyhierarchies), would be used for a drop-down field called
"Search cell line inventory - related disease". This means that an end user
would need to select a single value from more than 4000, while in the context
of the drop-down field far fewer entries are actually needed. Note that
skos:broader/skos:narrower cannot be used in this case, as the hierarchical
organization of the content is domain-dependent rather than
application-centric.</p>
      <p>Problem 2: When an application uses a preferred or alternative label in
its specific context, the label would need to be transferred to the application
(as otherwise it would be unclear which label the application wants to use).
The consuming application therefore becomes the "owner" of the term, and
potential changes (on the application side) might not be fed back to the
terminology management system.</p>
    </sec>
    <sec id="sec-2">
      <title>Contextualized SKOS-XL</title>
      <p>[Fig. 1. Labels recoverable from the figure: "Graph: domain master", "Graph: application", :Concept, skosxl:prefLabel, :Label1.]</p>
      <p>In the pREDi-Roche Terminology System (RTS) we use RDF and SKOS-XL for
specifying domain-centric terminologies. Consuming applications can rely on the
defined semantics and the stable URIs that the terminology system provides. To
address Problem 1, applications need access to subsets. In the design of RTS,
however, we decided not to maintain these subsets on the application side, for
two main reasons. 1) Semantic: Applications use the same concepts with slightly
different semantics (e.g., a disease can also be interpreted as an adverse
event, genetic background, or phenotype); an application naturally sets a
context for a term. Data curators need to be aware of such contexts, which is
enabled by maintaining the subsets on the side of the terminology system.
2) Organizational: If subsets are maintained on the application side (i.e., via
lists of URIs and their corresponding preferred terms), the technical
infrastructure for storing and retrieving terms is readily in place there. In
combination with corporate structure (i.e., shortest paths) and the long-term
orientation of central terminology management (i.e., no quick benefit), this
can lead to unnoticed disconnection of the applications.</p>
      <p>In RTS, we use named graphs to maintain application contexts/subsets.
The URIs of the concepts and labels of a domain-master graph are reused and
contextualized in application graphs. To address Problem 2, the domain master
maintains all the different types of SKOS-XL labels (preferred, alternative,
or hidden label), and applications can choose one of these labels as their
preferred label (see Fig. 1). This makes it possible to provide acronyms like
"NSCLC" in restricted, application-centric subsets (i.e., maximum flexibility).
At any point in time, data curators can see which applications consume which
terms and how they refer to them.</p>
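      <p>The contextualization described above can be illustrated with a minimal Python sketch that models named graphs as plain quads (subject, predicate, object, graph). It is not the RTS implementation: all example.org URIs, the second concept, the helper names, and the assumption that the acronym is stored as an alternative label in the domain master are hypothetical and chosen only for illustration. Only the SKOS-XL property URIs are taken from the W3C vocabulary.</p>
      <preformat>
```python
from typing import NamedTuple

# SKOS-XL property URIs (W3C vocabulary).
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
PREF_LABEL = SKOSXL + "prefLabel"
ALT_LABEL = SKOSXL + "altLabel"
LITERAL_FORM = SKOSXL + "literalForm"

# Hypothetical URIs, assumed for this sketch only.
EX = "http://example.org/"
MASTER = EX + "graph/disease-master"        # domain-master graph
APP = EX + "graph/cell-line-inventory"      # one application graph
CONCEPT = EX + "concept/C1"
OTHER = EX + "concept/C2"
LABEL_FULL = EX + "label/L1"
LABEL_ACRONYM = EX + "label/L2"
LABEL_OTHER = EX + "label/L3"


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str


QUADS = [
    # Domain master: owns every label and its literal form.
    Quad(CONCEPT, PREF_LABEL, LABEL_FULL, MASTER),
    Quad(CONCEPT, ALT_LABEL, LABEL_ACRONYM, MASTER),
    Quad(LABEL_FULL, LITERAL_FORM, "Non-small cell lung cancer", MASTER),
    Quad(LABEL_ACRONYM, LITERAL_FORM, "NSCLC", MASTER),
    Quad(OTHER, PREF_LABEL, LABEL_OTHER, MASTER),
    Quad(LABEL_OTHER, LITERAL_FORM, "Asthma", MASTER),
    # Application graph: reuses the master URIs and promotes the
    # acronym label to its own preferred label.
    Quad(CONCEPT, PREF_LABEL, LABEL_ACRONYM, APP),
]


def literal_form(label_uri: str) -> str:
    """Look up the text of a label; it lives only in the domain master."""
    for q in QUADS:
        if (q.subject, q.predicate, q.graph) == (label_uri, LITERAL_FORM, MASTER):
            return q.obj
    raise KeyError(label_uri)


def dropdown_entries(app_graph: str) -> dict:
    """Concept URI -> display text for the subset of one application."""
    return {
        q.subject: literal_form(q.obj)
        for q in QUADS
        if q.graph == app_graph and q.predicate == PREF_LABEL
    }


def consumers(concept_uri: str) -> dict:
    """Curator view: application graph -> label text used for a concept."""
    return {
        q.graph: literal_form(q.obj)
        for q in QUADS
        if q.subject == concept_uri
        and q.predicate == PREF_LABEL
        and q.graph != MASTER
    }
```
      </preformat>
      <p>In this sketch, dropdown_entries(APP) yields only the concept that the application graph mentions, displayed with the label the application chose, while the label text itself stays in the domain master. consumers(CONCEPT) gives the reverse view that a data curator would need.</p>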
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>