<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards a Standard Ontology Metadata Model</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hua Min</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stuart Turner</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sherri de Coronado</string-name>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Brian Davis</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Trish Whetzel</string-name>
          <xref ref-type="aff" rid="aff8">8</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Robert R. Freimuth</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Harold R. Solbrig</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Richard Kiefer</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Riben</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Grace A. Stafford</string-name>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lawrence Wright</string-name>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riki Ohira</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Booz Allen Hamilton</institution>
          ,
          <addr-line>Rockville, MD 20852</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Health Administration and Policy, George Mason University</institution>
          ,
          <addr-line>Fairfax, VA 22030</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Leafpath Informatics</institution>
          ,
          <addr-line>Davis, California</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>MD Anderson Cancer Center</institution>
          ,
          <addr-line>Houston, TX 77030</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Mayo Clinic</institution>
          ,
          <addr-line>Rochester, MN 55905</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>MindFull Informatics</institution>
          ,
          <addr-line>Denver, CO 80205</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>National Cancer Institute</institution>
          ,
          <addr-line>Rockville, MD 20852</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff7">
          <label>7</label>
          <institution>The Jackson Laboratory</institution>
          ,
          <addr-line>Bar Harbor, Maine 04609</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff8">
          <label>8</label>
          <institution>University of California San Diego</institution>
          ,
          <addr-line>La Jolla CA 92093</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Bio-ontologies are becoming increasingly important in semantic alignment for data integration, information exchange, and semantic interoperability. Due to the large number of emerging bio-ontologies, it is challenging for naïve ontology users to search, select, and adopt a “right” ontology for their applications. Therefore, it is important to have a consistent terminology metadata model and a resource for discovering appropriate ontologies or other resources for use in annotating data. This paper seeks a common, shareable, and comprehensive method to create, disseminate, and consume metadata about terminology resources.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology</kwd>
        <kwd>Metadata</kwd>
        <kwd>Ontology Metadata Model</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        Ontologies, thesauri, terminologies, classifications, and
coding systems can be referred to collectively as “terminology
resources.” Ontologies and other terminology resources are
becoming increasingly important in semantic alignment for
data integration, information exchange, and semantic
interoperability [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ], especially in the biomedical field. New
bio-ontologies continue to emerge. For example, the number of
terminology resources published in the National Center for
Biomedical Ontology (NCBO) BioPortal increased from 72 in
2008 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to 508 in April 2016. The explosion of available
terminology resources makes it challenging for naïve users to
search, select, and adopt the “right” terminology resource for
their applications [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Core issues in adoption include: (1) Lack
of advertisement: Users such as software developers often do
not know about the existence of standards, vocabularies, or
ontologies; (2) Lack of description: Users do not know which
standard, terminology resource, or ontology would be best for
their particular use; and (3) Lack of a rating system: Users do
not know the quality of a selected ontology.
      </p>
      <p>An ontology metadata model provides a method to richly
characterize terminology resources. It therefore has the
potential to mitigate the issue of discovery and improve
ontology adoption. It also would facilitate comparison as well
as coordination relating to standards and best practices.
Structured metadata describes key aspects of a terminology
resource such as scope, structure, provenance, availability, and
usage statistics. It helps users to identify, evaluate, select, and
deploy a resource more effectively and efficiently. The primary
goal for this paper is to seek a common, shareable, and
comprehensive method to create, disseminate, and consume
metadata about terminology resources.</p>
      <p>The NCI Center for Biomedical Informatics and
Information Technology (CBIIT), NCBO, and National Cancer
Research Institute (NCRI) in the U.K. support the fundamental
concept that adoption of ontologies by the research and
education communities is essential to data sharing,
interoperability and reuse. NCI, NCBO and NCRI had a
common interest in representing terminology resources using a
standard ontology metadata model. Although there are several
existing models and standards efforts for an ontology metadata
profile, the NCI CBIIT, NCBO, and NCRI realized that no
single metadata model could meet the requirements collected
from those three institutions in 2012. Therefore, we proposed to
contribute to a possible standard ontology metadata model by
harmonizing existing models.</p>
      <p>
        Major activities of the NCI CBIIT include developing,
coordinating, and/or deploying biomedical-informatics and
scientific-management information technology systems,
infrastructure, open-source applications, semantics, and data
resources in support of the NCI’s research agenda [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. NCI
Enterprise Vocabulary Services (EVS) publishes a number of
vocabularies needed and used by the NCI community [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. It
also provides tools and services to accurately code, analyze and
share cancer and biomedical research, clinical and public
health information. Though EVS publishes metadata with
terminologies, challenges remain as to how to make descriptive
information for those vocabularies available to the end-user
communities in a comparable way to terminologies hosted by
other organizations.
      </p>
      <sec id="sec-1-1">
        <title>B. NCBO</title>
        <p>
          The NCBO is one of the National Institutes of Health
Centers for Biomedical Computing (NCBCs). The goal of the
NCBO is to support biomedical researchers by providing
online tools, a web-based BioPortal, and programming
interfaces, enabling researchers to access, review, and integrate
disparate ontological resources in all aspects of biomedical
investigation and clinical practice [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. BioPortal enables
community participation in the evaluation and evolution of
ontology content by allowing users to submit their ontologies
to BioPortal, providing mappings, managing ontology
versions, and collecting user feedback through structured notes
and reviews [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. The metadata model of the NCBO BioPortal
is based on an extended version of the Ontology Metadata
Vocabulary (OMV) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>C. NCRI</title>
        <p>
          The U.K.’s NCRI focused on the technical and cultural
aspects of data sharing [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. The NCRI developed an online
tool called “Cancer InfoMatrix” for the visualization and
discovery of three types of standards: vocabularies, exchange
formats, and reporting guidelines. The dilemma NCRI faced
was how to promote a consistent set of “core” standards to the
community in order to produce a coherent pattern of use of
these “core” standards.
        </p>
        <p>The NCRI worked to achieve consensus with and across
NCI and NCBO in defining the way forward in promoting
standards in order to facilitate international interoperability. A
joint group, the Ontology Representation Working Group
(ORWG), was founded to serve this purpose. The major goal
for the ORWG was to seek a common, shareable and
comprehensive method to create, disseminate, and consume
metadata about ontologies.</p>
        <p>
          Besides the three institutions that were involved in this
activity, the OBO Foundry [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], BioSharing [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], the Research
Data Alliance (RDA) [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], the Monarch Initiative [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], and
Elixir in Europe [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] are also active in this area. For example,
BioSharing provides usage data about terminologies and other
resources, and obofoundry.org provides high-quality metadata
for a selective collection of terminology resources. Work
towards a common model for terminology resource metadata
would benefit all.
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>III. METHODS</title>
      <p>The development of an ontology metadata model comprised
four major steps: (A) review the existing metadata models, (B)
identify important metadata based on a user survey of the three
organizations that collaborated on the ORWG, (C) propose and
ballot new extensions and/or modifications toward a new
normative edition of an ontology metadata model, and (D)
promote adoption of the new version by the community.</p>
      <sec id="sec-2-1">
        <title>A. Review Existing Models</title>
        <p>
          We reviewed existing ontology metadata models including
ISO/IEC 19763-3 [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], Ontology Definition Metamodel [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ],
Open Provenance Model [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], Open Ontology Repository [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ],
and Ontology Metadata Vocabulary (OMV) [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. Our
evaluation showed that ISO/IEC 19763-3, Ontology Definition
Metamodel, and Open Provenance Model do not support the
breadth of profile representation required by our three
institutions. Specifically, these models have some limitations in
their support of several use-cases important to this group, for
example, the intended scope of the content of a terminology
resource, its type, language or syntax, and its fitness for use for
a given purpose. The Open Ontology Repository initiative also
realized the importance of the ontology metadata and was
willing to collaborate with the ORWG. Based on our
evaluation, OMV was the most complete and thoughtfully
developed metadata model among these models for the use
cases. One of the three institutions, NCBO, already uses OMV
as the supporting model for its BioPortal. Because OMV also
contained the best metadata for the use cases, the ORWG
chose OMV version 2.4.1 as its base model.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>B. Identify a List of Important Metadata</title>
        <p>The ORWG created and conducted a survey of the NCI, NCBO,
and NCRI. The content of the survey was
designed based on the OMV core. Authors from the three
institutions filled out the survey based on their own needs and
requirements. Each piece of metadata from the OMV was
prioritized as “Low”, “Medium”, or “High” and classified as
“Required” or “Optional”. A list of important metadata was
finalized based on group consensus. The survey results are
detailed in the Results section. Usability testing was done by
applying OMV to four relevant biomedical vocabularies:
LOINC, SNOMED CT, RxNorm, and NCI Thesaurus. The
outcome of the usability testing is also summarized
in the Results section.</p>
      </sec>
      <sec id="sec-2-3">
        <title>C. OMV Extensions or Modifications</title>
        <p>Based on the results from our survey, review, and OMV
usability tests, the ORWG suggested that further
extensions and/or modifications to OMV are an ideal approach
to enable OMV to serve as a single model designed to support
the requirements of the entities involved in this paper and
potentially the rest of the biomedical community.</p>
      </sec>
      <sec id="sec-2-4">
        <title>D. Adopt the OMV Extensions</title>
        <p>
          The OMV extensions influenced the final selection of
metadata in the CTS2 standard. A set of standardized
terminology metadata allows applications using terminology
services to build on a common infrastructure, and improve
interoperability across applications. For example, NCI’s
LexEVS CTS2 server implements a portion of the model, as
does a recent pilot project for implementing a federated
network of terminology service nodes [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. The ORWG
generated recommendations on a broad range of issues,
including publishing, distribution, implementation,
maintenance, licensing, provenance, and community input.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>IV. RESULTS</title>
      <p>The results of this study include (1) a list of important
metadata elements, (2) usability testing results, and (3) a
set of recommendations consisting of changes to existing
properties and proposed extensions in support of a newly
revised version of OMV.</p>
      <sec id="sec-3-1">
        <title>A. A list of important metadata</title>
        <p>A list of important terminology metadata elements was
extracted from the survey results. The content of this survey
was derived from the OMV Ontology class. The structure of
the OMV contains classes, elements, and relationships. The
OMV has 15 classes such as Ontology, Ontology Type,
Ontology Language, Ontology Engineering Methodology,
Party, etc. Elements are properties or characteristics that are
used to describe the class. For example, the Ontology class has
34 elements including URI, name, acronym, version, etc. The
relationships link the OMV classes together.</p>
        <p>The three institutions tagged each metadata element with an “Occurrence
Constraint” (whether the element should be Required or
Optional) and a “Ranking” (the importance of the element to utility:
High, Medium, or Low), judging the
importance of the metadata based on their own requirements
and needs. Eight out of the 34 OMV metadata elements were
identified as “Required” to describe a source (e.g., name,
description, and creationDate). Ten out of the 34 metadata
elements were identified as “High” priority (e.g., URI, name,
and description). Some metadata were classified as both
“Required” and “High” (e.g., ontology name and description).
Some were ranked as a “High” priority but were also
recommended as “Optional”. For example, not all ontologies in
the NCBO BioPortal have a URI and some ontologies have
more than one URI. Table 1 shows the group’s determination
of metadata elements identified as ‘core’.</p>
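        <p>The two tags described above can be sketched as a small data structure. This is a minimal illustration, not the ORWG's implementation: the class and function names are hypothetical, and only the three elements whose ratings are stated in the text are included (the full consensus list is in Table 1).</p>
        <preformat>
```python
from dataclasses import dataclass
from enum import Enum


class Occurrence(Enum):
    REQUIRED = "Required"
    OPTIONAL = "Optional"


class Ranking(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


@dataclass(frozen=True)
class ElementRating:
    """Consensus rating for one OMV element (hypothetical structure)."""
    element: str
    occurrence: Occurrence
    ranking: Ranking


# Only ratings stated in the text; not the complete survey result.
RATINGS = [
    ElementRating("name", Occurrence.REQUIRED, Ranking.HIGH),
    ElementRating("description", Occurrence.REQUIRED, Ranking.HIGH),
    ElementRating("URI", Occurrence.OPTIONAL, Ranking.HIGH),
]


def required_elements(ratings):
    """Names of elements tagged Required."""
    return [r.element for r in ratings if r.occurrence is Occurrence.REQUIRED]


def high_but_optional(ratings):
    """Names of elements ranked High yet recommended as Optional."""
    return [
        r.element
        for r in ratings
        if r.ranking is Ranking.HIGH and r.occurrence is Occurrence.OPTIONAL
    ]
```
</preformat>
        <p>Keeping the occurrence constraint and the ranking as separate fields lets an element such as URI be important without being mandatory.</p>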
      </sec>
      <sec id="sec-3-2">
        <title>B. Usability Test Results for the OMV</title>
        <p>
          The OMV was tested with four widely used vocabularies:
LOINC [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], SNOMED CT [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], RxNorm [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], and NCI
Thesaurus [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. These four were chosen for two reasons. (1)
The purpose and scope of these vocabularies differ.
LOINC is a coding system for laboratory and clinical
observations. SNOMED CT is a systematically organized,
computer-processable ontology of medical terms. RxNorm,
published by NLM, and included in the UMLS, provides
normalized names and a model for clinical drugs available in
the US. Finally, the NCI Thesaurus includes broad coverage
related to the cancer research domain. (2) The native
presentation and structure of these terminologies vary; LOINC
is a coding system with pre-coordinated terms constructed
from elements in six axes, while SNOMED CT and the NCI
Thesaurus are based on description logics, and RxNorm is a
terminology with a sophisticated model but not a Description
Logic (DL) ontology. A sample application of the metadata
elements to RxNorm was presented in [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ].
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>C. OMV Extensions/Modifications</title>
        <p>
          A revision to the OMV was recommended, both through
refinement of existing OMV Ontology class elements [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] and
via extensions. Modifications included (a) Rename, (b)
Refine, (c) Relocate, (d) Remove, and (e) Harmonization. The
purpose of these modifications was to make the OMV easier
to understand and use. Modifications and changes are
described below and summarized in Table 2.
        </p>
      </sec>
      <sec id="sec-3-4">
        <title>a) Rename</title>
        <p>To solve the ambiguity problem, element names were changed
to widely accepted terms. For
example, “name” and “acronym” were changed to
“fullName” and “shortName”, respectively.</p>
      </sec>
      <sec id="sec-3-5">
        <title>b) Refine definition</title>
        <p>To solve the ambiguity problem, some textual definitions for elements were modified. For
example, “fullName” was defined as “the
name by which an ontology is known”. Sample values for the
element were also included in the definition. For example, the
full name of an ontology can be “Logical Observation
Identifiers Names and Codes”.</p>
      </sec>
      <sec id="sec-3-6">
        <title>c) Relocate</title>
        <p>To better organize the
information, elements were relocated to other classes if they
did not belong to the OMV Core class. For example,
“creationDate” and “modificationDate” were removed from
the Core Ontology class and relocated into a newly created
Version class.</p>
      </sec>
      <sec id="sec-3-7">
        <title>d) Remove</title>
        <p>To simplify the model, elements that the group agreed
were unnecessary were removed from the new
model (e.g., “hasPriorVersion”).</p>
        <p>e) Harmonization (to refine and extend the model): OMV's
single “endorsedBy” element represents an important concept
that should be expanded into more atomic elements for greater
expressiveness. As used in the OMV, “endorsedBy” is a relation from an
“Ontology” to a “Party”, which can represent
either a person or an organization. It is therefore difficult to
disambiguate the types of entities supporting the ontology.
New properties such as “certifiedBy” and “mandatedBy” were
added to the OMV extensions in order to capture these details
with finer granularity. For example, “certifiedBy” can record
certification under formal review by CBIIT, and “mandatedBy” can record
requirements for use by regulatory or governmental agencies,
such as under the Meaningful Use criteria for deployment of
electronic health records in the United States.</p>
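        <p>The Rename, Relocate, and Remove modifications can be illustrated as a migration of a flat metadata record. The sketch below is an assumption-laden example rather than the ORWG's tooling: the function name, the dictionary representation, and the sample date are hypothetical, and only the element names mentioned in the text are handled.</p>
        <preformat>
```python
# Element names taken from the examples in the text.
RENAMES = {"name": "fullName", "acronym": "shortName"}
RELOCATE_TO_VERSION = {"creationDate", "modificationDate"}
REMOVED = {"hasPriorVersion"}


def migrate(record):
    """Split a flat OMV-style record into (ontology, version) dictionaries."""
    ontology, version = {}, {}
    for key, value in record.items():
        if key in REMOVED:
            continue  # Remove: dropped from the revised model
        if key in RELOCATE_TO_VERSION:
            version[key] = value  # Relocate: now lives on the Version class
        else:
            ontology[RENAMES.get(key, key)] = value  # Rename where applicable
    return ontology, version


# Placeholder values for illustration only.
old_record = {
    "name": "Logical Observation Identifiers Names and Codes",
    "acronym": "LOINC",
    "creationDate": "2000-01-01",
    "hasPriorVersion": "placeholder",
}
ontology, version = migrate(old_record)
```
</preformat>
        <p>After migration, the ontology dictionary carries “fullName” and “shortName”, while the date moves to the version dictionary.</p>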
        <p>
          The OMV extensions were developed by harmonizing
existing well-known metadata standards such as Dublin Core
[
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]. The extensions for the OMV Ontology class are presented
in Table 2. The last column in Table 2 shows the comparable
terms from Dublin Core Metadata.
        </p>
      </sec>
      <sec id="sec-3-8">
        <title>D. Recommendations/Next Steps</title>
        <p>The ORWG developed a set of recommendations relating
to issues for ontology distribution, implementation, and
maintenance as well as licensing, provenance, and community
participation. Those recommendations are summarized as
follows.</p>
        <p>a) Core Ontology Metadata: An iterative approach
should be adopted to implement consistent ontology metadata
beginning with core metadata and moving outward to more
comprehensive metadata as it proves beneficial. The scope of
the metadata should include core areas such as content,
structure, provenance, documentation, and certification. A
revised version of OMV core metadata should be evaluated or
audited against several ontologies relevant to the biomedical
community. Future updates or revisions to the core ontology
profile should be created based on the feedback from the
evaluation.</p>
      </sec>
      <sec id="sec-3-9">
        <title>b) Adoption of Prevailing Standards</title>
        <p>Reuse of portions of widely adopted models would improve the opportunities for
broader adoption, the ease of use of existing tools, and therefore
the potential for federation and semantic interoperability. To
the authors’ knowledge, OMV deployment so far has been
within stand-alone environments, although APIs (e.g., the
REST API in BioPortal) have been used to facilitate query and
retrieval. Replacing OMV elements
would affect NCBO's modeling in BioPortal. Continued
observation of emerging models important to the community,
such as updates to the CTS2, was recommended.</p>
        <p>c) Ontology Usage: “knownUsage” is one of the most
important and yet difficult metadata elements to capture,
measure, or share. “Usage” can be described by a simple
declarative example, by actual use case, case study, or by
reference to a project, activity, or data. Since it is difficult to
define a formal and detailed schema for this element, the
ORWG suggested that it should be carefully described with
best practices or guidelines to capture information from the
community. Language should be succinct, but should
emphasize relatively verbose sharing of details of how an
ontology is used including successes, failures, challenges,
innovations, activities such as mapping or merging, level of
effort and so on. The goal is to capture a wide range of
instances of how, where and why an ontology is being used,
including those outside the primary scope or intent published
by the developer. All usage examples could be helpful, even if
they are novel or represent outlying use cases.</p>
        <p>d) Intellectual Property: Licensing or Rights Expression.
The description of licensing rights in the current metadata models
(i.e., “hasLicense”) is fairly general. Additional information
about attribution, reuse, distribution, and guidance for
variances in licenses should be provided by ontology
owners/submitters. These variances include licensing of
artifacts as creative works (e.g., Creative Commons), open
source licensing of source code, hybrid or dual licensing (e.g.,
commercial open source), and complexities in licensing by
geographic region (e.g., affiliate licensing under IHTSDO for
SNOMED CT).</p>
      </sec>
      <sec id="sec-3-10">
        <title>e) Creation, Maintenance, and Distribution of Terminology Metadata Profiles</title>
        <p>Terminology owners are the
ideal source to initially populate profiles as well as to curate
subsequent revisions. Although users are valuable in providing
more atomic updates, annotations, reviews, and unique
perspectives, the level of detail in a profile and the difficulty
of assuring accuracy require dependence on a more
authoritative and familiar source. BioSharing, for instance,
recognizes this as well and attempts to get resource owners to
“claim” and update the metadata record. Source information
must also be accurate and timely. Even with initial
population by the ontology owners, continued curation by the
manager of the repository or registry remains a significant
burden, and the metadata remain at risk of obsolescence. The ORWG
recommended a common approach in which owners or
community members may maintain a verified or validated
profile locally by populating a “standard” metadata profile,
based on the revision of the OMV, for publication using a
unique identifier and namespace. A profile service would be
able to periodically query and archive updated ontology
profiles, or the resource could ping the service with the URL of
the resource at the moment it is modified or updated.</p>
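        <p>The two update paths for a profile service, periodic querying and notification on change, can be sketched as follows. This is a hypothetical in-memory illustration: the class name, its methods, and the injected fetch function are assumptions, and the notification path is read here as the resource side contacting the service. No real registry API is implied.</p>
        <preformat>
```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ProfileService:
    """In-memory archive of metadata profiles, keyed by profile URL."""
    fetch: Callable[[str], dict]  # hypothetical: retrieves a profile by URL
    archive: dict = field(default_factory=dict)

    def notify(self, url):
        """Push path: the resource pings the service when its profile changes."""
        return self._refresh(url)

    def poll(self, urls):
        """Pull path: the service re-queries known profiles periodically.

        Returns the URLs whose profiles changed since the last archived copy.
        """
        return [url for url in urls if self._refresh(url)]

    def _refresh(self, url):
        profile = self.fetch(url)
        history = self.archive.setdefault(url, [])
        if history and history[-1] == profile:
            return False  # unchanged, nothing new to archive
        history.append(profile)
        return True
```
</preformat>
        <p>Both paths share one refresh step, so a profile is archived once per distinct revision regardless of how the change was detected.</p>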
      </sec>
    </sec>
    <sec id="sec-4">
      <title>V. DISCUSSION</title>
      <p>
        The lack of a standard terminology metadata schema to
comprehensively describe biomedical ontologies is a challenge
and a barrier to users looking to identify terminology resources
appropriate for their use. Even though a tool such as the
Ontology Recommender [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] can help steer people to
resources that contain terms in specific areas of interest,
choosing an appropriate resource is also a matter of
understanding its characteristics, how widely it is used and for
what purposes, how it is supported, and so on. This study
focused on identifying and evaluating the requirements of
terminology metadata profiles for various levels of semantic
precision. It also investigated the extensible framework
(including a terminology metadata profile, an effective
federated collaboration platform, and the capability of
engaging the community) for representing important ontology
resources.
      </p>
      <p>Of all the available metadata models, the OMV Version
2.4.1 was selected as the best foundation for developing a
shared, harmonized, and standard metadata model. However,
the OMV Consortium has no active governance structure to
accept community contributions, and there has been no
published update since 2009. The OMV is currently insufficient to meet
the representational needs of a richer community-based
framework. This study identified gaps as well as
recommendations to extend and refine features of the OMV.</p>
      <sec id="sec-4-2">
        <title>A. Challenges to Populate Values for a Terminology Metadata Model</title>
        <p>It is difficult to populate values for non-ontologies since
they lack a formal, explicit ontological
representation. Some “Required” elements in the OMV have to
be left blank. For example, LOINC has a NULL value for the
required element “hasOntologyLanguage”. Therefore, the
occurrence constraints should be loosened to allow the model
to fit knowledge representation resources that fall at
various points along the semantic spectrum, from thesauri to
controlled vocabularies to taxonomies to ontologies.</p>
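        <p>One way to picture a loosened occurrence constraint is a validator that can waive selected required elements for resources that are not formal ontologies. The sketch below is illustrative only: the function name, the waiver mechanism, and the three-element required set are assumptions, not part of the OMV or its extensions.</p>
        <preformat>
```python
# Illustrative subset of required elements; not the full OMV list.
REQUIRED = {"name", "description", "hasOntologyLanguage"}


def missing_required(record, required=REQUIRED, waived=frozenset()):
    """Return required elements that are absent or empty, minus waived ones."""
    return sorted(e for e in required - set(waived) if not record.get(e))


# LOINC is a coding system, so it has no ontology language to report.
loinc = {
    "name": "LOINC",
    "description": "A coding system for laboratory and clinical observations",
    "hasOntologyLanguage": None,
}

strict = missing_required(loinc)
relaxed = missing_required(loinc, waived={"hasOntologyLanguage"})
```
</preformat>
        <p>Under the strict check the LOINC record fails on “hasOntologyLanguage”; with that element waived, the same record passes.</p>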
        <p>Another prevailing observation was that it was sometimes
difficult to find the right values to populate the metadata model
for a terminology resource. For example, it took one of the
authors a full work day to populate values from LOINC for the
whole OMV model. In some cases, the information was not
readily available (e.g., not published or difficult to find). In
other cases, although the information seemed to be reasonably
aligned with the metadata description, it was divergent enough
to lead to uncertainty about how appropriate a value was for a
given metadata element. Perhaps better data was available
elsewhere, or a specific value was different enough to suggest
the creation of a new metadata element to satisfy a perceived
gap in the model. As mentioned earlier, the ontology owner or
developer is best suited to seed and maintain the metadata
information. This mitigates much of the concern about gaps in
knowledge by a curator or the appropriateness of populated
values.</p>
      </sec>
      <sec id="sec-4-3">
        <title>B. Ontology Evolution</title>
        <p>
          Ontology evolution has been defined as the "timely
adaptation of an ontology and consistent propagation of
changes to the dependent artifacts" [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. As terminology
resources increase in size or complexity, management of
dependencies becomes increasingly challenging. Metadata
about and/or copies of a resource may occur in multiple
registries and repositories. Changes to resource semantics can
impact applications and other terminology resources that
import and extend the changed resource. The formal change
process needs to include a registry of known instances and
dependencies. Additional elements such as
“hasReferenceEntity” and “causeChange” (to identify likely
impacts) were added to the OMV extensions to address this
challenge. More work is needed to define metadata elements
for representing resource semantics and granular changes to
terminology content and enabling ‘time travel’.
        </p>
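        <p>A registry of known dependencies makes it possible to estimate which resources a change may affect. The following sketch assumes a hypothetical registry represented as a mapping from each resource to the set of resources it imports; the function name and the sample resource names are illustrative and do not come from the OMV extensions.</p>
        <preformat>
```python
from collections import deque


def impacted_by(changed, imports):
    """Return every resource that directly or transitively imports `changed`.

    `imports` maps a resource name to the set of resources it imports.
    """
    # Invert the import map so we can walk from a resource to its dependents.
    dependents = {}
    for resource, imported in imports.items():
        for target in imported:
            dependents.setdefault(target, set()).add(resource)

    # Breadth-first walk; `seen` also guards against import cycles.
    seen, queue = {changed}, deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen - {changed}


# Hypothetical registry: "app" imports "extension", which imports "base".
registry = {"app": {"extension"}, "extension": {"base"}, "other": set()}
affected = impacted_by("base", registry)
```
</preformat>
        <p>Here a change to the base resource reaches both the extension and the application that imports it, while the unrelated resource is untouched.</p>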
      </sec>
      <sec id="sec-4-4">
        <title>C. Future Work</title>
        <p>NCBO and NCI’s LexEVS CTS2 server implement a
portion of the developed model. Future work should include (1)
working with the wider community to validate the model
against other terminologies and to amend or extend it as
needed; and (2) developing a set of competency questions, e.g.,
for creating terminology search queries based on the OMV
extension. A search engine built on such queries could let users
find appropriate terminologies using a combination of the
metadata elements, perhaps together with Ontology
Recommender or a similar tool that suggests ontologies based
on sample text input. This should be done in collaboration
with the rest of the community, since resources like
Biosharing are already working on a ‘wizard’ to help users
find data standards that meet their needs.</p>
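        <p>A search over a combination of metadata elements might look like the following minimal sketch. The record keys reuse element names from the OMV extension (shortName, naturalLanguage, developmentStatus, keywords); the records, their values, and the functions matches and search are hypothetical and serve only as an illustration.</p>
        <preformat>
```python
# Hypothetical records; keys reuse element names from the OMV extension,
# values are invented for illustration.
RECORDS = [
    {"shortName": "T1", "naturalLanguage": "en",
     "developmentStatus": "production", "keywords": ["laboratory", "tests"]},
    {"shortName": "T2", "naturalLanguage": "en",
     "developmentStatus": "production", "keywords": ["drugs"]},
    {"shortName": "T3", "naturalLanguage": "fr",
     "developmentStatus": "draft", "keywords": ["drugs"]},
]


def matches(value, wanted):
    """Compare one element value; list-valued elements match on membership."""
    if isinstance(value, list):
        return wanted in value
    return value == wanted


def search(records, **criteria):
    """Return the records that satisfy every element=value criterion."""
    return [
        record for record in records
        if all(matches(record.get(element), wanted)
               for element, wanted in criteria.items())
    ]


hits = search(RECORDS, naturalLanguage="en", keywords="drugs")
print([record["shortName"] for record in hits])
```
        </preformat>
        <p>Each competency question would then correspond to one combination of criteria passed to the search function.</p>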
      </sec>
    </sec>
    <sec id="sec-5">
      <title>VI. CONCLUSION</title>
      <p>The ORWG conducted research to identify relevant
terminology metadata models that could form the foundation
for a standard ontology profile for use by NCI, NCBO, and
NCRI. The OMV version 2.4.1 was selected as the base model.
It was tested on LOINC, SNOMED-CT, RxNorm, and NCI
Thesaurus, resulting in a revision of the OMV. The OMV
extension, already partially implemented by NCBO and
LexEVS/CTS2, could serve as the starting point for a
terminology resource metadata standard in the biomedical
research community, providing a framework for further work
with other organizations active in this space such as OBO
Foundry, RDA, or Biosharing. Even a standard set of basic
metadata about terminologies would be of great value in
making terminology descriptions consistent across
resources.</p>
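      <table-wrap>
        <table>
          <thead>
            <tr><th>OMV 2.4.1</th><th>Action</th><th>OMV Extensions</th></tr>
          </thead>
          <tbody>
            <tr><td>URI</td><td>N/A</td><td>URI</td></tr>
            <tr><td>name</td><td>Rename and Refine definition</td><td>fullName</td></tr>
            <tr><td>acronym</td><td>Rename and Refine definition</td><td>shortName</td></tr>
            <tr><td>description</td><td>Refine definition</td><td>description</td></tr>
            <tr><td>documentation</td><td>Refine definition</td><td>documentation</td></tr>
            <tr><td>reference</td><td>N/A</td><td>reference</td></tr>
            <tr><td>notes</td><td>N/A</td><td>notes</td></tr>
            <tr><td>keywords</td><td>N/A</td><td>keywords</td></tr>
            <tr><td>status</td><td>Rename</td><td>developmentStatus</td></tr>
            <tr><td>creationDate</td><td>Remove and Addition (creationDate moves to Version.creationDate and versionDate is added here)</td><td>versionDate</td></tr>
            <tr><td>modificationDate</td><td>Remove (modificationDate moves to Version.modificationDate)</td><td/></tr>
            <tr><td>naturalLanguage</td><td>Refine definition</td><td>naturalLanguage</td></tr>
            <tr><td>numberOfAxioms</td><td>N/A</td><td>numberOfAxioms</td></tr>
            <tr><td>hasContributor</td><td>N/A</td><td>hasContributor</td></tr>
            <tr><td>hasCreator</td><td>Refine definition</td><td>hasCreator</td></tr>
          </tbody>
        </table>
        <table-wrap-foot>
          <p>Remaining OMV 2.4.1 elements (row alignment not recoverable): usedOntologyEngineeringTool, usedOntologyEngineeringMethodology, conformsToKnowledgeRepresentationParadigm, endorsedBy, hasDomain, isOfType, designedForOntologyTask, hasFormalityLevel, knownUsage, hasOntologyLanguage, hasOntologySyntax, resourceLocator, version, hasLicense, useImports, hasPriorVersion, isBackwardCompatibleWith, isCompatibleWith, numberOfClasses, numberOfProperties, numberOfIndividuals. The Action entries for these rows are mostly N/A, with one Remove and two Addition entries. The OMV Extensions column continues with usedOntologyEngineeringTool. Of the column "Comparable terms from Dublin Core Metadata", only dc:description is retained.</p>
        </table-wrap-foot>
      </table-wrap>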
    </sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O.</given-names>
            <surname>Bodenreider</surname>
          </string-name>
          ,
          <article-title>Biomedical ontologies in action: role in knowledge management, data integration and decision support</article-title>
          .
          <source>Yearb Med Inform</source>
          ,
          <year>2008</year>
          : p.
          <fpage>67</fpage>
          -
          <lpage>79</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.M.</given-names>
            <surname>Pisanelli</surname>
          </string-name>
          , ed.
          <source>Ontologies in Medicine</source>
          .
          <year>2004</year>
          , IOS Press.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.L.</given-names>
            <surname>Whetzel</surname>
          </string-name>
          , et al.,
          <article-title>BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.D.</given-names>
            <surname>Tenenbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.A.</given-names>
            <surname>Sansone</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Haendel</surname>
          </string-name>
          ,
          <article-title>A sea of standards for omics data: sink or swim?</article-title>
          <source>J Am Med Inform Assoc</source>
          ,
          <year>2014</year>
          .
          <volume>21</volume>
          (
          <issue>2</issue>
          ): p.
          <fpage>200</fpage>
          -
          <lpage>3</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <article-title>NCI Center for Biomedical Informatics and Information Technology</article-title>
          . Available from: https://cbiit.nci.nih.gov/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <article-title>NCI Enterprise Vocabulary Services</article-title>
          . Available from: http://evs.nci.nih.gov/.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <article-title>National Center for Biomedical Ontology</article-title>
          . Available from: http://www.bioontology.org/about-ncbo
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N.</given-names>
            <surname>Noy</surname>
          </string-name>
          ,
          <article-title>BioPortal Metadata versus OMV</article-title>
          . Available from: https://wiki.nci.nih.gov/download/attachments/24265626/VocabularyGroupMeetibg_OMV_vs_BPMetadata.pptx.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <article-title>National Cancer Research Institute</article-title>
          . Available from: http://www.ncri.org.uk/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <article-title>OBO Foundry</article-title>
          . Available from: http://www.obofoundry.org/
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <article-title>Biosharing</article-title>
          . Available from: http://www.biosharing.org/
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <article-title>Research Data Alliance</article-title>
          . Available from: https://rd-alliance.org/
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <article-title>Monarch Initiative</article-title>
          . Available from: http://monarchinitiative.org/
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <article-title>Elixir</article-title>
          . Available from: http://www.elixir-europe.org/
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <source>ISO/IEC 19763-3</source>
          . Available from: http://www.iso.org/iso/catalogue_detail.htm?csnumber=52069
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <source>Ontology Definition Metamodel</source>
          . Available from: http://www.omg.org/spec/ODM/1.1/PDF/.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L.</given-names>
            <surname>Moreau</surname>
          </string-name>
          , et al.,
          <article-title>The Open Provenance Model core specification (v1.1)</article-title>
          .
          <source>Future Gener. Comput. Syst.</source>
          ,
          <year>2011</year>
          .
          <volume>27</volume>
          (
          <issue>6</issue>
          ): p.
          <fpage>743</fpage>
          -
          <lpage>756</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <article-title>Open Ontology Repository</article-title>
          . Available from: http://ontologforum.org/index.php/OpenOntologyRepository
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <article-title>Ontology Metadata Vocabulary for the Semantic Web</article-title>
          . Available from: http://ontolog.cim3.net/file/resource/OOR/OMV/OMV-Reportv2.4.1.pdf
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>S.</given-names>
            <surname>de Coronado</surname>
          </string-name>
          , et al.,
          <article-title>Piloting a network of CTS2 terminology service nodes for value sets</article-title>
          . In
          <source>AMIA Annu Symp</source>
          .
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D.J.</given-names>
            <surname>Vreeman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.J.</given-names>
            <surname>McDonald</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.M.</given-names>
            <surname>Huff</surname>
          </string-name>
          ,
          <article-title>LOINC(R) - A Universal Catalog of Individual Clinical Observations and Uniform Representation of Enumerated Collections</article-title>
          .
          <source>Int J Funct Inform Personal Med</source>
          ,
          <year>2010</year>
          .
          <volume>3</volume>
          (
          <issue>4</issue>
          ): p.
          <fpage>273</fpage>
          -
          <lpage>291</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>K.A.</given-names>
            <surname>Spackman</surname>
          </string-name>
          ,
          <article-title>SNOMED CT milestones: endorsements are added to already-impressive standards credentials</article-title>
          .
          <source>Healthc Inform</source>
          ,
          <year>2004</year>
          .
          <volume>21</volume>
          (
          <issue>9</issue>
          ): p.
          <page-range>54, 56</page-range>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.J.</given-names>
            <surname>Nelson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kilbourne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Powell</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <article-title>Normalized names for clinical drugs: RxNorm at 6 years</article-title>
          .
          <source>J Am Med Inform Assoc</source>
          ,
          <year>2011</year>
          .
          <volume>18</volume>
          (
          <issue>4</issue>
          ): p.
          <fpage>441</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>N.</given-names>
            <surname>Sioutos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>de Coronado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.W.</given-names>
            <surname>Haber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.W.</given-names>
            <surname>Hartel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. L.</given-names>
            <surname>Shaiu</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.W.</given-names>
            <surname>Wright</surname>
          </string-name>
          ,
          <article-title>NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information</article-title>
          .
          <source>J Biomed Inform</source>
          ,
          <year>2007</year>
          .
          <volume>40</volume>
          (
          <issue>1</issue>
          ): p.
          <fpage>30</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <article-title>RxNorm metadata</article-title>
          . Available from: https://wiki.nci.nih.gov/display/VCDE/OMV+Metadata+for+RxNorm
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <article-title>Dublin Core</article-title>
          . Available from: http://dublincore.org/documents/dcmiterms/
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <article-title>Ontology Recommender</article-title>
          . Available from: https://bioportal.bioontology.org/recommender
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>H.R.</given-names>
            <surname>Solbrig</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.G.</given-names>
            <surname>Chute</surname>
          </string-name>
          ,
          <article-title>Terminology access methods leveraging LDAP resources</article-title>
          .
          <source>Stud Health Technol Inform</source>
          ,
          <year>2004</year>
          .
          <volume>107</volume>
          (
          <issue>Pt 1</issue>
          ): p.
          <fpage>545</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>