=Paper=
{{Paper
|id=Vol-2285/ICBO_2018_paper_15
|storemode=property
|title=Quality Assurance of Ontology Content Reuse
|pdfUrl=https://ceur-ws.org/Vol-2285/ICBO_2018_paper_15.pdf
|volume=Vol-2285
|authors=Michael Halper,Christopher Ochs,Yehoshua Perl,Sivaram Arabandi,Mark A. Musen
|dblpUrl=https://dblp.org/rec/conf/icbo/HalperOPAM18
}}
==Quality Assurance of Ontology Content Reuse==
Proceedings of the 9th International Conference on Biological Ontology (ICBO 2018), Corvallis, Oregon, USA 1
Michael Halper (1), Christopher Ochs (2), Yehoshua Perl (1), Sivaram Arabandi (3), Mark A. Musen (4)

(1) NJIT, Newark, NJ 07102 USA, {michael.halper, yehoshua.perl}@njit.edu

(2) Nokia Bell Labs, Murray Hill, NJ 07974 USA, christopher.ochs@nokia-bell-labs.com

(3) Ontopro LLC, Houston, TX 77025 USA, sivaram.arabandi@gmail.com

(4) Stanford University, Stanford, CA 94305 USA, musen@stanford.edu

Abstract: Building ontologies is difficult and time-consuming. As such, content reuse has been promoted as an important guiding principle in ontology development. Reusing content from other ontologies can reduce the overall effort involved in new ontology construction and provide better alignment with existing knowledge modeling. However, reuse is not a panacea, and it comes with its own attendant difficulties. In this paper, we investigate some common quality assurance issues associated with reuse, such as duplicated content and versioning problems. Some heuristic-based approaches are proposed for analyzing ontologies for these kinds of quality assurance issues. An analysis is carried out on a sample of the large collection of BioPortal-hosted ontologies, many of which employ reuse. The findings indicate that curators and authors, particularly those new to the reuse process, should be on the alert when developing an ontology with reused content to avoid introducing problems into their own ontologies.

Keywords: ontology; modeling; ontology reuse; ontology quality assurance; BioPortal

I. INTRODUCTION

Ontology reuse is a well-established design pattern. An ontology author may reuse content to save on development time and effort, promote interoperability with other ontologies, and ensure that a consistent representation of a domain is included in their ontology. Support for importing and reusing ontology content is included in the Web Ontology Language (OWL) (through the use of owl:imports axioms) [1], and the paradigm is supported by the Protégé ontology editing environment [2]. Top-level ontologies such as the Basic Formal Ontology (BFO) [3] were designed specifically to support content reuse and alignment of ontologies. Top-domain ontologies, like the Ontology for General Medical Sciences (OGMS) [4] and BioTop [5], extend the BFO and add general domain knowledge that can also be reused by an ontology author.

While there are enormous benefits to reuse, an ontology author also needs to be keenly aware of potential issues that can affect the quality of the resulting ontology. There may be unintended consequences if reused content is not incorporated correctly or not maintained properly. In previous studies [6], we investigated the issue of quality assurance (QA) in the context of the Sleep Domain Ontology (SDO) [9], the Ontology for Drug Discovery Investigations (DDI) [10], and the Cancer Chemoprevention Ontology (CanCo) [11]. These ontologies all reused content from other ontologies (e.g., BFO), and in the context of these ontologies, some of the QA problems we encountered related to content reuse.

In this paper, we focus strictly on such ontology QA problems and investigate a broader collection of ontologies that reuse content. The main purpose for this study is to alert curators and authors, especially those new to the process, of the pitfalls of reuse in terms of the errors that they are likely to encounter. This awareness will help in avoiding the errors in the first place and enhancing the content of their own ontologies. Let us note that ontology errors can come in a wide range of severity and causes, such as with unsatisfiability, incoherence, and inconsistency of concepts. Even so, we will use the term “error” throughout this paper, though one may argue in certain circumstances whether an irregular modeling issue truly warrants that designation.

Moreover, let us state at the outset that ontology development is intrinsically difficult, and the findings that we present are in no way meant as indictments of anyone’s work. In fact, some of the errors reported arose from the work of one of the co-authors (SA), who took great care in the construction of the SDO. Ontology developers have the best intentions to do a good job and take great pains to review their work. Even with that being the case, the inherent complexity of ontology design and the reuse of content makes the appearance of errors almost inescapable. It is our intention to alert ontology maintenance personnel to this fact through the results of our study. Additionally, we are not criticizing reuse in ontology design, with its numerous advantages. We just wish to caution ontology designers to be careful about the potential disadvantages and pitfalls of reuse.

Our focus is on the collection of ontologies hosted in BioPortal [12]. The specific QA issues that we wish to examine are duplicated content (including duplicated classes and properties), versioning problems with respect to source ontologies of reuse, and mechanical import errors. The heuristic methods that were used in our analyses are described, and our findings from among the BioPortal ontologies are reported.

ICBO 2018 August 7-10, 2018 1

II. BACKGROUND

A. Prior Reuse QA

Ochs et al. [6] performed a QA review of the SDO and discovered several significant issues related to the import of content from other ontologies. For example, pairs of duplicated classes (e.g., two Clinical finding classes and two Organism classes), originating from different ontologies, were found and corrected. However, on revisiting the SDO using a change analysis methodology called a diff partial-area taxonomy [13], which visually summarizes the differences between two releases of a given ontology, several additional QA issues related to the reuse of content were uncovered.

These preliminary studies, along with further discussions with ontology authors and maintainers, motivated the research described herein. The reuse design pattern, and the way it is applied, can have serious, unintended impacts on an ontology. The advantages of reusing content often come with a cost to the quality of the overall ontology.

B. Prior Analysis of Ontology Reuse

Previous studies have reviewed the existence and prevalence of ontology reuse. Kamdar et al. [14] analyzed term reuse among ontologies and noted several error patterns with ontology reuse. Ghazvinian et al. [15] reviewed the orthogonality of the OBO Library [16] ontologies. Ochs et al. [18] investigated how reused content is utilized in a sample of 355 ontologies in BioPortal.

Among the ontologies in BioPortal, reuse of the BFO, an upper-level ontology, is somewhat common. This is expected given the principle of a “commitment to collaboration” espoused by the OBO Foundry [16]. Content reuse from top-domain ontologies, like the OGMS [4], and domain-specific ontologies, like GO [19] and ChEBI [20], is also fairly common.

C. Methods of Ontology Reuse

There are several ways an author of an ontology O can reuse content from a source ontology S. Each method of reuse has several advantages and disadvantages, particularly in relation to maintaining and updating reused content. Content included from another ontology may be updated periodically at its source. Corrections of errors and inconsistencies performed during maintenance of the source ontology S will need to be propagated to O. While an ontology like the BFO may be updated only once every several years (e.g., BFO 1.1 was released in 2009 and BFO 2.0 was released in 2015), other ontologies are updated much more frequently. ChEBI and GO, which Ochs et al. [18] found to be reused by 37 and 33 ontologies, respectively, are updated quite frequently: ChEBI, almost every month, and GO, on a daily basis (though a new version may only be published monthly).

An ontology author may include content via the owl:imports mechanism defined in OWL syntax, and implemented in the OWL API [1]. This approach includes the entire contents of S into O “on the fly,” which allows updates in S to be included in O without work from the author of O. However, there may be unexpected consequences downstream, especially after classification.

Alternatively, the author of O may reuse a fixed version of S’s content (either the complete contents of the ontology or a selected subset of the ontology, extracted using, e.g., the MIREOT approach [21]). Reusing a fixed version of the content provides the author of O with greater control over when reused content is updated, at the expense of making it labor intensive to align changes from S into O.

III. METHODS

In this study, we reviewed a collection of ontologies from BioPortal, looking for errors and inconsistencies arising from the reuse of content from other ontologies. In analyzing the collection, we employed several heuristic-based methodologies to determine the prevalence of duplicate content, versioning problems, and any import issues. The collection that was examined was extracted from the 355 ontologies studied by Ochs et al. [18], which were obtained from BioPortal in April 2015. Specifically, the collection consisted of the 197 ontologies (55.5%) that were found to reuse content.

We define a source ontology as an ontology that has content included in another ontology O. As in Ochs et al. [18], we define reuse according to the URIs of the entities in an ontology. For each ontology in this study, we identified its base URI (e.g., the base URI of the BFO is http://purl.obolibrary.org/obo/bfo). Similarly, the base URI of the SDO is http://mimi.case.edu/ontologies/2009/1/SDO.owl. In general, all of the entities in an ontology have a URI that starts with the ontology’s base URI. Different versions of an ontology may have different base URIs. For this study, an entity (i.e., class or property) was considered reused if it had a different base URI from the ontology it is residing in (e.g., a BFO class in SDO will have the BFO base URI). In this study, we did not distinguish between content imported directly and content imported by transitivity.

In the following, we describe the kinds of errors that were sought and the approaches to finding them. Examples from the SDO are used to illustrate how each type of error may manifest itself during the ontology editing process. Additionally, for each kind of error, we describe the heuristic-based approach that we utilized to determine the prevalence of the error among the set of 197 BioPortal ontologies.

A. Duplicated Content

An author may reuse content from multiple ontologies. If content from two ontologies is reused, and the ontologies cover a similar domain, the potential exists for the inclusion of duplicate classes (i.e., the author could inadvertently include two classes, from two different ontologies, that represent the same concept). This kind of duplicate information is not desired. As mentioned previously, we identified several pairs of duplicated classes in the SDO [6].

Class duplication can cause significant issues. For example, the abovementioned duplicate Clinical finding classes in SDO had the same name and represented the same entity but were not set equivalent, and their restrictions were not the same.
This issue will cause problems for both users and authors alike, as they will typically not suspect a duplicate and will likely not suspect that classes representing the same entity will have different modeling within a single ontology.

Fig. 1. Hierarchical paths between classes human and organism in BioTop (left) and CPRO (right).

Beyond individual classes being duplicated, two source ontologies may have subhierarchies of duplicate (or very similar) classes, often modeled with different levels of granularity. For example, in the SDO, we found duplicated classes for organism and human, originating from BioTop and CPRO [22]. (Actually, the terms are slightly different in each: living organism vs. organism and human vs. human/person, respectively.) In BioTop, living organism is a distant ancestor of human; there are seven other ancestor classes on the ancestry path between them (e.g., great ape, primate, and mammal). In CPRO, human/person is a direct subclass of organism. See Fig. 1. The two versions of human have different relationship structures. The one on the left has a defined participates in relationship. The one on the right does not, though its two children, patient and physician, do have the relationship. The use of one human version alone may lead to deficient modeling in an application.

Duplicate classes can also be unknowingly included. The duplicate content may be imported by transitivity, i.e., an ontology was reused by another reused ontology and the author may or may not have been aware of this. Different versions of the same ontology may be reused. For example, as we report below, we identified several ontologies that appear to import content from multiple versions of the BFO.

Duplicate properties can also be introduced into an ontology via reuse. Let us point out that the presence of duplicate properties in itself is not necessarily an error. It is the inconsistent use of such properties that constitutes an error. This situation is analogous to the software engineering scenario where multiple libraries are imported; in such a case, there is a high potential for similar functions to be present.

Fig. 2. Two examples of SDO classes using the has participant property.

As with duplicate classes, the introduction of duplicate properties may be due to the fact that two ontologies cover a similar domain. For example, in the SDO, there are has participant object properties included from the Relations Ontology (RO) [23] and BioTop, both of which represent the same kind of relationship. Both properties are utilized in the modeling of the SDO. Some SDO classes have restrictions using the RO version of has participant, while other SDO classes have the BioTop version. See Fig. 2 for some examples of this. These properties were not defined as equivalent in the SDO.

To identify ontologies with duplicated classes, we can utilize two heuristic-based methods. (These methods can also be used to identify duplicate properties.) First, an ontology may have duplicate classes if it reuses classes from two source ontologies that cover a similar, or identical, domain. For example, if an ontology reuses classes from FMA [24] and Uberon [25], ontologies that model the domain of anatomy, then there is a greater chance of finding duplicate classes than in ontologies that reuse content from only one ontology.

Second, if an ontology reuses two classes with the same label, but those classes originate from different source ontologies, then they may be duplicates. One can search for all pairs (or, in general, sets) of classes where the label is the same but the URIs of the classes are different. While this method potentially returns many false positives (e.g., “cold,” as in temperature, and “cold,” as in the disease, which are expected to have different URIs and may be modeled in different domains), it provides an indicator for a potential problem.

B. Versioning Problems

There is also the potential of versioning problems when reusing properties. In general, ontologies should reuse content consistently from a single version of a source ontology. However, an ontology may inadvertently include content from multiple versions of the same source ontology. This may occur due to import by transitivity. For example, the SDO includes multiple versions of the part of property: one from an old version of RO included via FMA and another from a more recent version of RO via CPRO. Again, both of these properties represent the same relationship. See Fig. 3 for illustrations. Below, we identify several ontologies that reuse content from multiple versions of the BFO.

Fig. 3. Property part of from two versions of RO in the SDO.

To analyze inconsistent versioning, we can identify the base URI of every ontology in our data set, under the heuristic that a different base URI indicates a significantly different version of the same ontology. A number of source ontologies,
such as the BFO, GO, FMA, and others, were found to have multiple base URIs among the BioPortal ontologies. For example, in Ochs et al. [18], we identified six base URIs for FMA, and below we describe several base URIs for BFO. We mapped each source ontology to its set of base URIs. If an ontology O included entities from the same source ontology S, but the entities had different base URIs, the ontology is considered to have a reuse versioning problem.

C. owl:imports Errors

OWL’s owl:imports mechanism enables an ontology author to include external ontologies without defining the classes and properties in their own ontology. The entire external ontology will be included when the importing ontology is opened (e.g., in the OWL API). However, if the URI for the source ontology is not correct, or the ontology is no longer available at the specified URI, then the source ontology cannot be loaded.

To investigate issues related to owl:imports errors, we opened every ontology with the OWL API and logged which ontologies encountered an error related to a missing owl:imports file(s).

IV. RESULTS

Our analysis of the various kinds of errors resulting from reuse was carried out on the 197 ontologies in BioPortal that were found to reuse content by Ochs et al. [18]. See [12] for more information pertaining to the individual ontologies referred to in this section.

A. Duplicate Classes and Properties

Reuse of classes from multiple sources is common, with an average of more than five sources [18]. But we found that it is relatively uncommon for an ontology to reuse classes from two or more ontologies that cover a similar domain. However, when we investigated cases where ontologies did reuse such content, there were several potential errors. For example, the Cell Line Ontology (CLO) reuses content from the FMA and Uberon. In it, we found several potential duplicate class pairs. For example, there is Scalp from Uberon and Scalp from FMA. There were also duplicate Pelvis classes from EFO and Uberon. Many such classes are related using class equivalence axioms (e.g., Amnion, Colon, and Intestine). However, other duplicate classes are not related in this way (e.g., Scalp, Aorta, and Liver). Analyzing these examples, one can see that CLO includes the Anatomical structure subhierarchy from Uberon and the Organism part subhierarchy from EFO. In such a case, the potential exists for additional duplicate classes.

When looking for pairs of classes with the same label but different base URIs, we found that class duplication does not occur frequently. In total, 149 ontologies were found to reuse at least one class from another ontology. Among the 149 ontologies, 46 ontologies (30.1%) contain at least one potential duplicate pair based on our criteria. In general, we found very few such pairs in a given ontology. Most of the 46 ontologies either have just a single pair or between two and ten pairs. We did find several ontologies (e.g., CLO, CSEO, and SYN) that have many such pairs, and these ontologies reuse content from multiple ontologies that cover the same, or similar, domains.

For example, in the Synapse Ontology (SYN), there are many (apparently) duplicated classes reused from NCIt and CL (e.g., pairs of acinar cell classes). Within SYN, we found three separate Cell subhierarchies. One subhierarchy, from GRO, consists of two classes. The other two Cell subhierarchies, from NCIt and CL, are much larger. There are no equivalences set between the classes in these subhierarchies. For the use case of SYN, this might be an intentional design decision, but from an ontology design perspective, it is not typical compared to other ontologies that reuse NCIt, CL, etc.

The CSEO contains over 200 potential duplicate class pairs. It includes a large portion of the Disease subhierarchy from NCIt and defines its own Finding subhierarchy. In these two subhierarchies, there are many similar classes (e.g., Abscess) that represent diagnoses. Looking more closely, we found additional pairs of duplicate diagnoses. Similarly, many classes related to various kinds of anatomical structures and tissues (e.g., Tongue and Uterus) are included from NCIt and added in CSEO. In all of these cases, there are no connections (e.g., equivalences or restrictions) to indicate that these pairs of classes are related to one another. On the other hand, CSEO does define equivalences between classes reused from NCIt and classes reused from UO (e.g., Lux and Liter).

For duplicated properties, we found 31 ontologies with properties that have the same label and different base URIs. Twenty of these (64.5%) were found to contain one or more pairs of duplicated properties. For instance, ENM contains several pairs of duplicated properties from BAO, RO, and NPO (e.g., properties named derives from and has part).

B. Versioning Problems

The large majority of cases of reuse that appear to have versioning problems, based on different base URIs, were found among ontologies that reuse the BFO and RO. We identified eleven BioPortal ontologies (3.1% of all the ontologies in the BioPortal at the time) that included classes from multiple versions of the BFO. For example, the DDI uses all 39 classes from an OWL release of BFO and one class from a version of the BFO with an OBO URI. Fig. 4 shows eight examples of ontologies that include classes from multiple versions of the BFO. Two of the ontologies, CHEMINF and TEO, include all of the content from two versions of the BFO.

Fig. 4. Example ontologies reusing content from multiple versions of the BFO
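
The two base-URI heuristics described in Section III (same label with different base URIs, and one source ontology appearing under several base URIs) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in the study: it assumes each entity is available as a (URI, label) pair and that a hand-built table maps known base URIs to source-ontology names. The function names, the example SDO class, the BFO class URI, and the grouping of the two part_of URIs under RO are illustrative assumptions; the part_of URIs themselves appear in Table I.

```python
from collections import defaultdict

# Assumed lookup table: known base URI -> source ontology name (illustrative).
BASE_TO_SOURCE = {
    "http://mimi.case.edu/ontologies/2009/1/SDO.owl": "SDO",
    "http://purl.obolibrary.org/obo/bfo": "BFO",
    "http://www.obofoundry.org/ro/ro.owl": "RO",
    "http://purl.org/obo/owl/ro": "RO",
}

# Toy entity list for one ontology: (URI, label) pairs.
ENTITIES = [
    # Hypothetical class defined by the ontology itself.
    ("http://mimi.case.edu/ontologies/2009/1/SDO.owl#ExampleClass", "example class"),
    # Hypothetical reused BFO class.
    ("http://purl.obolibrary.org/obo/bfo_0000001", "entity"),
    # URIs listed in Table I.
    ("http://purl.obolibrary.org/obo/bfo_0000050", "part of"),
    ("http://www.obofoundry.org/ro/ro.owl#part_of", "part of"),
    ("http://purl.org/obo/owl/ro#part_of", "part of"),
]


def base_uri_of(entity_uri, base_to_source):
    """Return the longest known base URI that prefixes entity_uri, or None."""
    matches = [base for base in base_to_source if entity_uri.startswith(base)]
    return max(matches, key=len) if matches else None


def versioning_problems(entities, own_base, base_to_source):
    """Return {source name: set of base URIs} for sources reused under 2+ base URIs."""
    bases_per_source = defaultdict(set)
    for uri, _label in entities:
        base = base_uri_of(uri, base_to_source)
        if base is None or base == own_base:
            continue  # unknown origin, or the ontology's own content
        bases_per_source[base_to_source[base]].add(base)
    return {src: bases for src, bases in bases_per_source.items() if len(bases) > 1}


def potential_duplicates(entities, base_to_source):
    """Return {label: sorted URIs} for labels shared by entities with different base URIs."""
    by_label = defaultdict(dict)  # normalized label -> {entity URI: base URI}
    for uri, label in entities:
        base = base_uri_of(uri, base_to_source)
        if base is not None:
            by_label[label.strip().lower()][uri] = base
    return {
        label: sorted(uri_to_base)
        for label, uri_to_base in by_label.items()
        if len(set(uri_to_base.values())) > 1
    }


OWN_BASE = "http://mimi.case.edu/ontologies/2009/1/SDO.owl"
problems = versioning_problems(ENTITIES, OWN_BASE, BASE_TO_SOURCE)
duplicates = potential_duplicates(ENTITIES, BASE_TO_SOURCE)
```

On this toy input, only RO is flagged as a versioning problem (two base URIs), and only the label part of is flagged as a potential duplicate. As the paper notes for the label heuristic, such hits are candidates for manual review rather than confirmed errors.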

The reuse of classes from multiple versions of non-BFO ontologies was relatively uncommon. We identified a few ontologies that included classes from multiple versions of the same ontology. For example, COGPO and DDI include classes from multiple versions of PATO and UO. These classes are not set equivalent. In cases where multiple versions of an ontology appear, the numbers of classes reused from each are typically disproportionate. For example, the Cell Culture Ontology (CCONT) includes classes from multiple versions of EFO (one class, obsolete normal, from one version and 4,882 classes from another). Both ENM and EP include multiple versions of PATO. In the case of EP, 48 classes are included from one version and 1,570 classes from another. HUPSON includes one class from one ChEBI version, and 83 classes from another. MF includes classes from multiple versions of NBO; MIRNAO includes several classes from multiple versions of the GO.

We found many different versions of the RO, OBO REL, and BFO properties (e.g., has part and part of) reused in our data set. Along with SDO, we found several ontologies that reuse properties from multiple versions of these ontologies (often in class restrictions). Consider, for example, the has part property. We identified 14 versions of this object property in our dataset (see Table I). Reviewing the ontologies enumerated in Table I, we identified a total of 20 ontologies that include multiple versions of these (and other) RO relationships. Four ontologies, namely, AERO, ONSTR, TAO, and VSO, include object properties from three versions of the RO.

TABLE I. VARIOUS VERSIONS OF THE HAS PART PROPERTY FOUND AMONG THE BIOPORTAL ONTOLOGIES

{| class="wikitable"
! URI !! # Ontologies That Reuse has part Property
|-
| http://purl.obolibrary.org/obo/bfo_0000050 || 48
|-
| http://www.obofoundry.org/ro/ro.owl#part_of || 28
|-
| http://purl.obolibrary.org/obo/temp#part_of || 27
|-
| http://purl.obolibrary.org/obo/obo_rel#_part_of || 7
|-
| http://purl.obolibrary.org/obo/bfo_00000050 || 5
|-
| http://purl.obolibrary.org/obo/obo_rel_part_of || 4
|-
| http://purl.org/obo/owl/obo#part_of || 2
|-
| http://purl.org/obo/owl/obo_rel#part_of || 2
|-
| http://purl.org/obo/owl/ro#part_of || 2
|-
| http://purl.obolibrary.org/obo/http://www.obofoundry.org/ro/ro.owl#part_of || 1
|-
| http://purl.org/obo/owlapi/relationship#obo_rel_part_of || 1
|-
| http://www.ifomis.org/obo/ro/1.0#partof || 1
|-
| http://obofoundry.org/ro/ro.owl#part_of || 1
|-
| http://purl.obofoundry.org/ro/ro.owl#part_of || 1
|-
| Total || 130
|}

C. owl:imports Errors

A total of 44 ontologies could not be loaded by the OWL API due to errors caused by missing imported ontologies. There were several reasons for these errors; however, the large majority were caused by URIs being no longer valid web addresses. For example, the DDI ontology includes http://www.obofoundry.org/ro/ro.owl, but no ontology file exists at that location. In a similar manner, RoleO includes http://purl.obolibrary.org/obo/RoleO/external/bfo_import.owl. Similarly, SDO includes its custom-built Units Ontology via an owl:imports statement, but the ontology no longer exists at the specified location. Five of the 44 ontologies (11.4%) were previously hosted on Google Code (which is no longer available, as of January 2016).

There were relatively few errors caused by other types of invalid import statements. For example, various Psychology (APA thesaurus) ontologies on BioPortal all have owl:imports statements that reference local files. We note that not all instances of an ontology using owl:imports are instances of reuse, since the owl:imports mechanism is also frequently used to include modules from the same ontology.

V. DISCUSSION

We note that when an author is designing an ontology, it is often with the intention of supporting a specific set of use-cases, or some specific application. Thus, some of the issues we identified in this paper may not be problematic for the intended purposes. However, once it has been discovered that an ontology appears to contain inconsistencies due to reuse, the issue should be brought to the attention of the author of the ontology. These problems could have deleterious effects if the ontology is utilized beyond its original scope.

One significant complication is that, based on the metrics provided in BioPortal, hundreds of ontologies have not been updated in several years (if ever). Many of these ontologies are no longer maintained and reuse old versions of source ontologies that are long out of date. This leads to, for example, twelve versions of OBO REL/RO/BFO properties appearing throughout BioPortal’s ontologies (as illustrated in Table I). This situation can impact ontology authors who decide to reuse the contents of these “dormant” ontologies (using, e.g., the BioPortal reuse plugin [26] for Protégé). In future work, we will investigate ways of warning ontology authors about potential issues when reusing an ontology’s classes. We will also investigate semi-automated techniques for identifying and preventing issues when reusing content (which could, e.g., automatically align the different part of properties used in the SDO and other ontologies).

The errors reported on in this paper are from the year 2015. Checking on a sample of them in the current version of the BioPortal, we found that the errors mentioned here are still in existence because we did not alert the curators of the specific ontologies at the time. We can assume that many of the other errors still exist. In fact, a July 2018 scan of a sample of the ontologies reported on in the results revealed a number of ontologies whose latest BioPortal release predated 2015 (e.g., AERO, COGPO, DDI, CSEO, SYN, etc.). Moreover, all these ontologies had relatively significant numbers of visits at their BioPortal pages in the second quarter of 2018, indicating continued interest in them. Since many more ontologies have been added to the BioPortal in the interim, another review would probably uncover more errors, but we were not in a position to perform such a study. Although the examples are from 2015, they reflect the reality of some phenomena that curators and authors are liable to encounter when engaging in the practice of ontology reuse. The timeliness of the results is not critical since the purpose of the paper is to alert ontology designers and maintenance personnel, especially those new to the process of content reuse, to the kinds of problems and
errors they are likely to face when creating an ontology with the aid of reuse.

Also in future work, we plan to offer a set of guidelines for ontology reuse in order to preempt some of the troubles described herein. Major aspects of those guidelines will deal with ontological commitment and the proper consideration of the hierarchical context of reused content. We will also review some of the software tools available to complement these guidelines.

VI. CONCLUSION

The reuse of content from existing ontologies is an important design principle that can facilitate the work of curators and authors when creating new ontologies. It can also help to ensure alignment of the new ontologies with previously modeled knowledge. However, the process of reuse is not a simple one, and there are potential pitfalls. In this paper, we studied a collection of BioPortal ontologies to determine what problems may have been introduced via reuse. We focused on three kinds of errors and presented heuristic methodologies to uncover these within a collection of ontologies. The results showed that significant errors could arise from reuse. This should encourage ontology maintenance personnel to be cautious and vigilant when adopting the reuse approach.

ACKNOWLEDGMENT

Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under award number R01CA190779. The content is solely the responsibility of the authors and does not necessarily represent the views of the National Institutes of Health.

REFERENCES

[1] “OWL Web Ontology Language Reference,” available at https://www.w3.org/TR/owl-ref. Accessed May 11, 2018.

[2] “Protégé,” available at http://protege.stanford.edu. Accessed May 13, 2018.

[3] P. Grenon, B. Smith, and L. Goldberg, “Biodynamic ontology: Applying BFO in the biomedical domain,” Ontologies in Medicine, pp. 20–38, 2004.

[4] “OGMS – Ontology for General Medical Science,” available at https://code.google.com/archive/p/ogms. Accessed May 13, 2018.

[5] E. Beisswanger, S. Schulz, H. Stenzhorn, and U. Hahn, “BioTop: An upper domain ontology for the life sciences – a description of its current structure, contents, and interfaces to OBO ontologies,” Applied Ontology, vol. 3, no. 4, pp. 205–212, 2008.

[6] C. Ochs, Z. He, Y. Perl, S. Arabandi, M. Halper, and J. Geller, “Refining the granularity of abstraction networks for the Sleep Domain Ontology,” in Proc. Fourth Int’l Conference on Biomedical Ontology (ICBO 2013), Montreal, Canada, Jul. 2013, pp. 84–89.

[7] Z. He, C. Ochs, L. Soldatova, Y. Perl, S. Arabandi, and J. Geller, “Auditing redundant import in reuse of a top level ontology for the Drug Discovery Investigations ontology,” in Proc. Int’l Workshop on Vaccine and Drug Ontology Studies (VDOS-2013), Montreal, Canada, Jul. 2013

[8] Z. He, C. Ochs, A. Agrawal, Y. Perl, D. Zeginis, K. Tarabanis, G. Elhanan, M. Halper, N. Noy, and J. Geller, “A family-based framework for supporting quality assurance of biomedical ontologies in BioPortal,” in Proc. 2013 AMIA Annual Symposium, Washington, DC, Nov. 2013, pp. 581–590.

[9] S. Arabandi, C. Ogbuji, S. Redline, R. Chervin, J. Boero, R. Benca et al., “Developing a Sleep Domain Ontology,” in Proc. AMIA Clinical Research Informatics Summit, 2010.

[10] D. Qi, R. D. King, A. L. Hopkins, G. R. Bickerton, and L. N. Soldatova, “An ontology for description of drug discovery investigations,” Journal of Integrative Bioinformatics, vol. 7, no. 3, Mar. 2010.

[11] D. Zeginis, A. Hasnain, N. Loutas, H. F. Deus, R. Fox, and K. A. Tarabanis, “A collaborative methodology for developing a semantic model for interlinking Cancer Chemoprevention linked-data sources,” Semantic Web, vol. 5, no. 2, pp. 127–142, 2014.

[12] “BioPortal,” available at http://bioportal.bioontology.org/. Accessed May 7, 2018.

[13] C. Ochs, Y. Perl, J. Geller, M. Haendel, M. Brush, S. Arabandi, and S. Tu, “Summarizing and visualizing structural changes during the evolution of biomedical ontologies using a Diff abstraction network,” Journal of Biomedical Informatics, vol. 56, pp. 127–144, 2015.

[14] M. R. Kamdar, T. Tudorache, and M. A. Musen, “A systematic analysis of term reuse and term overlap across biomedical ontologies,” Semantic Web Journal, vol. 8, no. 6, pp. 853–871, 2017.

[15] A. Ghazvinian, N. F. Noy, and M. A. Musen, “How orthogonal are the OBO Foundry ontologies?” Journal of Biomedical Semantics, vol. 2(Suppl 2): S2, 2011.

[16] “The OBO Foundry,” available at http://www.obofoundry.org/. Accessed April 30, 2018.

[17] B. Smith, M. Ashburner, C. Rosse, J. Bard, W. Bug, W. Ceusters et al., “The OBO Foundry: Coordinated evolution of ontologies to support biomedical data integration,” Nature Biotechnology, vol. 25, pp. 1251–1255, 2007.

[18] C. Ochs, Y. Perl, J. Geller, S. Arabandi, T. Tudorache, and M. A. Musen, “An empirical analysis of ontology reuse in BioPortal,” Journal of Biomedical Informatics, vol. 71, pp. 165–177, 2017.

[19] “Gene Ontology Consortium,” available at http://www.geneontology.org. Accessed May 6, 2018.

[20] “Chemical Entities of Biological Interest (ChEBI),” available at [12]. Accessed April 2, 2018.

[21] M. Courtot, F. Gibson, A. L. Lister, J. Malone, D. Schober, R. R. Brinkman, and A. Ruttenberg, “MIREOT: The minimum information to reference an external ontology term,” Applied Ontology, vol. 6, no. 1, pp. 23–33, 2011.

[22] “Computer-based Patient Record Ontology,” available at https://code.google.com/archive/p/cprontology. Accessed May 13, 2018.

[23] B. Smith, W. Ceusters, B. Klagges, J. Köhler, A. Kumar, J. Lomax, C. Mungall, F. Neuhaus, A. L. Rector, and C. Rosse, “Relations in biomedical ontologies,” Genome Biology, vol. 6:R46, 2005.

[24] C. Rosse and J. L. V. Mejino, “A reference ontology for biomedical informatics: The Foundational Model of Anatomy,” Journal of Biomedical Informatics, vol. 36, no. 6, pp. 478–500, Dec. 2003.

[25] C. J. Mungall, C. Torniai, G. V. Gkoutos, S. E. Lewis, and M. A. Haendel, “Uberon, an integrative multi-species anatomy ontology,” Genome Biology, vol. 13:R5, 2012.

[26] J. Nair, T. Tudorache, T. Whetzel et al., “The BioPortal Import Plugin for Protégé,” in Int’l Conference on Biomedical Ontology (ICBO 2011), 2011, pp. 298–299.