<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards an Ontology for Representing Malignant Neoplasms</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>William D. Duncan</string-name>
          <email>william.duncan@roswellpark.org</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carmelo Gaudioso</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander D. Diehl</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Biomedical Informatics, University at Buffalo</institution>
          ,
          <addr-line>Buffalo, NY, 14203</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Biostatistics and Bioinformatics, Roswell Park Cancer Institute</institution>
          ,
          <addr-line>Buffalo, NY, 14203</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Oncology research produces data about a wide variety of entities such as tumor types, locations, pathology, and staging, patient treatments and outcomes, and experimental systems such as mouse models and cell lines. In order to conduct effective cancer research, terminologies, classification systems, and ontologies are needed that can integrate these various datasets and provide standards for consistently representing entities. In this paper, we discuss our ongoing efforts to address these difficulties by developing a realism-based ontology for representing instances of malignant neoplasms, disease progression, treatments, and outcomes. This ontology is being built using the principles of the OBO Foundry, and makes use of other OBO Foundry ontologies, such as the Ontology for General Medical Sciences, Uberon, and the Cell Ontology. As a result of our efforts, we have made worthwhile progress towards developing a robust ontological framework for representing malignant neoplasms.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>Oncology research produces data about a wide variety of
entities such as tumor types, locations, pathology, and
staging, patient treatments and outcomes, and experimental
systems such as mouse models and cell lines. In order to
conduct effective cancer research, terminologies, classification
systems, and ontologies are needed that can integrate these
various datasets and provide standards for consistently
representing entities. These standards facilitate the meaningful
linking, sharing, and analysis of disparate datasets between
researchers and across institutions. However, the incomplete
and inconsistent representation of cancer-related data makes
it difficult to perform these activities.</p>
      <p>
        In this paper, we discuss our ongoing efforts to address
these difficulties by developing a realism-based ontology for
representing instances of malignant neoplasms, disease
progression, treatments, and outcomes. This ontology is being
built using the principles of the OBO Foundry
        <xref ref-type="bibr" rid="ref12">(Smith et al.
2007)</xref>
        , and makes use of other OBO Foundry ontologies,
such as the Ontology for General Medical Sciences and the
Cell Ontology. We chose to focus on these entities because
they are key elements driving accurate cohort selection
based on diagnosis, stage, and treatment; and clinical
decision support. As a result of our efforts, we have made
worthwhile progress towards developing a robust
ontological framework for representing malignant neoplasms.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 PROJECT MOTIVATION</title>
      <p>This research developed out of a number of interests.
The first is that we recognized a need to connect cancer data
from multiple sources with differing levels of granularity.
Some important levels include: (1) diagnosis and treatment
information about the patient and how the patient responds
to treatment; (2) anatomical information about the organs in
which the cancer originates; (3) pathology information
about the tissues removed during procedures, such as tumor
tissues and lymph nodes; (4) cellular information, such as
data obtained from flow cytometry and
immunohistochemistry; and (5) molecular information, such as genomic
sequencing. A framework for tying these kinds of
data together is essential for cancer research because it provides
the basis for advanced ontology-based querying
and analytical methods that allow for data integration across
multiple sources and scales.</p>
    </sec>
    <sec id="sec-4">
      <title>3 CURRENT CLASSIFICATION SYSTEMS, TERMINOLOGIES, AND ONTOLOGIES</title>
      <p>
        A number of existing classification systems, ontologies,
and terminologies have terms for representing malignant
neoplasms. Prominent examples include the International
Statistical Classification of Diseases 10th Revision
(ICD-10), the International Classification of Diseases
for Oncology (ICD-O), the National Cancer Institute
Thesaurus (NCIT), and the Systematized Nomenclature of
Medicine Clinical Terms (SNOMED CT). However, since many
of these information organization systems do not share a
common upper-level framework, it is not easy to combine
information contained in different terminologies and
ontologies. For instance, SNOMED CT does not have terms for
checkpoint inhibitors, whereas the NCI Thesaurus does.
Ideally, we would like to use terms from each system (i.e.,
SNOMED CT and NCIT), but due to differences in their
relations and hierarchical structures, it is difficult to do so.
For example, the term ‘metastasis’ denotes a disorder in
SNOMED CT, but denotes the spread of cancer (i.e., a
process) in the NCIT. OBO Foundry ontologies, in contrast, are
generally designed using the Basic Formal Ontology (BFO)
as their upper-level framework, and this enables the creation
of domain-specific ontologies whose terms can be reused by
other OBO Foundry ontologies. For instance, the Drug
Ontology
        <xref ref-type="bibr" rid="ref3">(Hanna et al. 2013)</xref>
        uses terms from the Chemical
Entities of Biological Interest (ChEBI)
        <xref ref-type="bibr" rid="ref4">(Hastings et al.
2013)</xref>
        ontology to represent drug ingredients.
      </p>
      <p>3.1 International Statistical Classification of Diseases</p>
      <p>Due to its long history and widespread adoption, the
International Statistical Classification of Diseases (ICD)<sup>1</sup> is
perhaps the most relevant system for classifying diseases.
Maintained by the World Health Organization (WHO), ICD
is a globally recognized healthcare classification system
consisting of hierarchically structured codes that represent
diseases, disorders, and other health-related issues.<sup>2</sup> In
relation to the current topic, the International
Classification of Diseases for Oncology (ICD-O)<sup>3</sup> has codes for
representing a number of pertinent characteristics of a
malignant neoplasm, such as the anatomical site of the
neoplasm, the neoplasm’s histology (e.g., small cell, clear cell),
and behavior (e.g., if it has metastasized). For example, an
ovarian adenocarcinoma is represented using the following
combination of codes:</p>
      <list list-type="bullet">
        <list-item>
          <p>C56 – the site code for an ovary</p>
        </list-item>
        <list-item>
          <p>8140/3 – 8140 is the code for a neoplasm arising
from glandular epithelial tissue, and ‘/3’ represents
that the neoplasm is malignant</p>
        </list-item>
      </list>
      <p>The advantage of ICD’s coding system is that it allows
diseases to be easily grouped and counted for statistical and
reporting purposes. For instance, to find all patients who
have an adenocarcinoma, you only have to look for patients
whose histological code begins with ‘814’ and who have a
behavior code of 3 or greater. However, there are two related
noteworthy drawbacks to implementing ICD as an ontology.
First, ICD does not contain codes for many of the important
cancer-related entities that need to be represented, such as
treatments and molecular disorders. Second, this shortcoming is
compounded by ICD’s lack of formal relations that would
allow codes to be linked to other information. Thus, even if
we created code lists for the missing entities, we would still
be faced with the task of creating well-defined relations that
would allow this information to be linked to ICD codes.</p>
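      <p>The code-based grouping described for ICD-O can be illustrated with a minimal Python sketch. It is an illustration only and not part of ICD-O: the record layout and the function names parse_morphology and find_adenocarcinomas are hypothetical, and the sketch assumes that a behavior digit of 3 or higher marks a malignant neoplasm.</p>
      <preformat>
```python
def parse_morphology(code):
    """Split an ICD-O morphology code such as '8140/3' into (histology, behavior)."""
    histology, behavior = code.split("/")
    return histology, int(behavior)


def find_adenocarcinomas(records):
    """Return ids of patients whose histology starts with '814' and is malignant."""
    matches = []
    for patient_id, site, morphology in records:
        histology, behavior = parse_morphology(morphology)
        # Assumption of this sketch: behavior digit 3 or higher means malignant.
        if histology.startswith("814") and behavior >= 3:
            matches.append(patient_id)
    return matches


# Hypothetical registry rows: (patient id, site code, morphology code).
records = [
    ("p1", "C56", "8140/3"),  # the ovarian adenocarcinoma example from the text
    ("p2", "C56", "8140/0"),  # same histology, behavior digit below 3
    ("p3", "C56", "8500/3"),  # histology code does not start with '814'
]
print(find_adenocarcinomas(records))
```
      </preformat>
      <p>The sketch also makes the limitation discussed above concrete: the codes can be filtered as strings, but nothing in them links a patient’s neoplasm to treatments or molecular findings.</p>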
      <sec id="sec-4-1">
        <title>3.2 National Cancer Institute Thesaurus</title>
        <p>
          The National Cancer Institute Thesaurus (NCIT) is a
reference terminology developed by the National Cancer
Institute
          <xref ref-type="bibr" rid="ref9">(Sioutos et al. 2007)</xref>
          . It contains over 100,000 concepts
with textual definitions and 400,000 cross links between its
concepts.<sup>4</sup>
        </p>
        <p><sup>1</sup> For brevity, we use the general term ‘ICD’ to refer to the
different versions of ICD, such as ICD-10 and ICD-O.</p>
        <p><sup>2</sup> http://www.who.int/classifications/icd/en, accessed 2017-06-20.</p>
        <p><sup>3</sup> https://training.seer.cancer.gov/icdo3, accessed 2017-06-20.</p>
        <p><sup>4</sup> https://ncit.nci.nih.gov/ncitbrowser, accessed 2017-06-21.</p>
        <p>In our examination of the NCIT, we found that many of
the definitions in the malignant neoplasms branch were
adequate and the hierarchy was rich enough to suit
our purposes. However, when we examined other branches
of the NCIT, certain problems became apparent. In
particular, we found the definitions for cell types related to cancer
to be inadequate. Consider the following NCIT concepts and
definitions:</p>
        <list list-type="bullet">
          <list-item>
            <p>Abnormal Cell (C12913): An abnormal human cell
type which can occur in either disease states or
disease models.</p>
          </list-item>
          <list-item>
            <p>Neoplastic Cell (C12922): Cells of, or derived
from, a tumor.</p>
          </list-item>
          <list-item>
            <p>Malignant Cell (C12917): Cells of, or derived
from, a malignant tumor.</p>
          </list-item>
        </list>
        <p>The definition for Abnormal Cell suffers from its being
circular (i.e., an abnormal cell is defined as being an
abnormal cell type), and thus the definition does not provide any
new information. Furthermore, the definition specifically
states that an abnormal cell is a human cell. This prevents
the NCIT from consistently modeling data about abnormal
cells from non-human species despite the fact that the NCIT
does contain concepts for mouse diseases, such as Mouse
Carcinoma (C24010). Given the importance of mouse
models in cancer research, not being able to represent data from
mouse studies correctly is a severe limitation.</p>
        <p>The definitions for Neoplastic Cell and Malignant Cell
do not provide much clarity about how these cells relate to
neoplasms. Since a neoplasm may also contain normal cell
types, more details are needed about what it means to be a
neoplastic cell other than being derived from a tumor.
Furthermore, while a metastasis may be said in some sense to
derive from a tumor, this cannot be said of the originating
neoplastic cells that first started proliferating during the
tumor formation process. Lastly, it needs to be pointed out
that these cell types form a hierarchy. A Malignant Cell is a
type of Neoplastic Cell, and Neoplastic Cell is a type of
Abnormal Cell. This information is not contained in the
textual definitions in an Aristotelian fashion, although it is
represented in NCIT’s taxonomic relations.</p>
        <p>3.3 Systematized Nomenclature of Medicine Clinical Terms</p>
        <p>Systematized Nomenclature of Medicine Clinical Terms
(SNOMED CT) is a comprehensive health terminology that
provides a standardized way to represent clinical
information in an electronic health record.<sup>5</sup> Although SNOMED
CT has a large number of terms for clinical findings and
disorders, it does not have well-developed terms for other entities
related to neoplasms. For example, the concept Tumor cell
(SCTID 252987004) is defined as a subtype of the concept
Abnormal cell (SCTID 39266006), but this does not specify
whether the concept Tumor cell represents malignant cells.<sup>6</sup>
Furthermore, the concept Malignant tumor cells (SCTID
88400008) is defined as being a subtype of the concept
Malignant neoplasm, primary (SCTID 86049000).<sup>7</sup> This
classification is incorrect for at least two reasons. First, although
malignant tumor cells are often part of a malignant
neoplasm, they are not a kind of malignant neoplasm. A
malignant neoplasm (as stated above) will also include a number
of non-cancer cells as part of its makeup. Second, even if we
accept that a malignant tumor cell is a kind of malignant
neoplasm, this definition is incorrect because malignant
tumor cells are also found in a metastasis (a metastatic offshoot
of a primary tumor), and not only in a primary neoplasm. Finally, SNOMED CT classifies
Malignant tumor cells as a kind of Morphologic abnormality
(SCTID 4147007), and not a Disorder (SCTID 64572001).
In SNOMED CT, the distinction between a morphologic
abnormality and a disorder is that some underlying
pathological process supports a disorder.<sup>8</sup> However, the reason
that a cell becomes malignant is that underlying
pathological processes (resulting from dysregulation) are occurring
within it.</p>
        <p><sup>5</sup> http://www.snomed.org/snomed-ct/what-is-snomed-ct, accessed 2017-06-21.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.4 Disease Ontology</title>
        <p>
          The Disease Ontology (DO) is an OBO Foundry
ontology built for the purposes of providing the biomedical
community with consistent, standardized, and reusable
definitions to represent the range of human diseases
          <xref ref-type="bibr" rid="ref8">(Schriml et
al. 2015)</xref>
          . Although we found the DO to have good
coverage of cancer types, two difficulties
made it unsuitable for our purposes.
        </p>
        <p>First, the DO is not consistent in its use of the terms
‘cancer’ and ‘neoplasm’. In DO, cancer is defined as a kind
of disposition. This means that cancer is not a material thing
(i.e., does not have mass), but rather is a kind of latent
potential that is actualized when cells start proliferating out of
control. Malignant neoplasms, by contrast, are material objects that
come into being due to uncontrolled cell proliferation. In the
DO, however, there are a number of terms in the cancer
branch that reference neoplasms as material things and not
the disposition of cancer. For example, ovary
neuroendocrine neoplasm is defined as a subtype of ovarian cancer.
Because of DO’s inconsistent use of the terms ‘cancer’ and
‘neoplasm’, and in order to remain true to the OBO Foundry
principles, we decided it would be beneficial to the development
of our ontology to use the term ‘malignant neoplasm’ and
avoid using the term ‘cancer’ when possible.</p>
        <p><sup>6</sup> http://browser.ihtsdotools.org, accessed 2017-06-22.</p>
        <p><sup>7</sup> Ibid.</p>
        <p><sup>8</sup> https://confluence.ihtsdotools.org/display/DOCEG/6.1.1+Clinical++-+definition, accessed 2017-06-22.</p>
        <p>Second, the DO is missing needed formal axioms that
relate entities having the disposition of cancer to the
anatomical structures in which these entities are located. For
instance, the DO term ovary epithelial cancer does not have
axioms that formally relate the disposition to the epithelial
cells that are part of the ovary. The lack of these axioms can
make it difficult to query data modeled using the DO. For
example, it is not possible to query for the most common
anatomical structures in which malignant neoplasms are
found.</p>
        <p>3.5 On carcinomas and other pathological entities</p>
        <p>
          In Smith et al. (2005b), the Ontology for Biomedical
Reality
          <xref ref-type="bibr" rid="ref6">(Rosse et al. 2005)</xref>
          is modified to account for material
anatomical entities, material pathological entities, and
pathological formations. Material anatomical entities are
anatomical structures (e.g., organs, cells) or bodily
substances (e.g., blood) that are found in a healthy organism.
Anatomical structures are defined as being material
anatomical entities that have an inherent 3D structure generated by
the coordinated expression of the organism’s own structural genes
          <xref ref-type="bibr" rid="ref10 ref11">(Smith et al. 2005b)</xref>
          . They include both canonical and
variant anatomical structures. Canonical anatomical structures
belong to ‘idealized’ healthy human beings. Variant
anatomical structures are entities that deviate from the norm
(e.g., having extra fingers), but are not pathological in the
sense discussed below.
        </p>
        <p>
          An anatomical entity is defined as being a material
pathological entity when
          <xref ref-type="bibr" rid="ref10 ref11">(Smith et al. 2005b)</xref>
          :
        </p>
        <list list-type="bullet">
          <list-item>
            <p>It has come into being as a result of changes in
some pre-existing canonical anatomical structure
through processes other than the expression of the
normal complement of genes of an organism of the
given type.</p>
          </list-item>
          <list-item>
            <p>It is predisposed to have health-related
consequences for the organism in question manifested by
symptoms and signs.</p>
          </list-item>
        </list>
        <p>Material pathological entities include pathological structures
and pathological bodily substances. These are anatomical
structures and body substances, respectively, that host some
kind of pathological formation, a formation being
pathological when it affects an organism’s physiological processes to
the degree that they give rise to signs and symptoms. For
instance, a carcinoma is a pathological formation that arises
within an anatomical structure, such as an ovary.</p>
        <p>A high-level summary of the hierarchy of material
anatomical and material pathological entities is depicted below:</p>
        <preformat>material anatomical entity
  o anatomical structure
      § canonical anatomical structure
      § variant anatomical structure
  o portion of canonical body substance (e.g., portion of blood)
material pathological entity
  o pathological structure (e.g., neoplasm)
  o portion of pathological substance (e.g., portion of pus)</preformat>
        <p>
          Pathological formations are then related to their hosts
and the entities out of which they originate using the following
relations from the Open Biomedical Ontology
          <xref ref-type="bibr" rid="ref10 ref11 ref6">(Smith et al.
2005a)</xref>
          :<sup>9</sup>
        </p>
        <list list-type="bullet">
          <list-item>
            <p>instance of: A primitive relation that holds
between a particular individual and the universal
(type or kind) that the particular individual
instantiates at a particular time. For example, a particular
patient is an instance of a human being at a particular
time.</p>
          </list-item>
          <list-item>
            <p>part of: A primitive relation between instances of
parts and wholes at a particular time. For example,
a particular mass of malignant epithelial tissue is
part of a particular ovary at a particular time.</p>
          </list-item>
          <list-item>
            <p>is a: A is a B means that A and B are universals and
for all times t and every particular individual i, if i
instance of A at t, then i instance of B at t. For
example, a human being is a mammal.</p>
          </list-item>
          <list-item>
            <p>derived from:<sup>10</sup> A primitive relation between two
distinct instances i, j and times t, t’ such that
changes in i at t result in a new second entity j at
t’. For example, a particular blastocyst is derived
from a particular zygote.</p>
          </list-item>
          <list-item>
            <p>transformation of: A transformation of B means
that A and B are universals and for all times t, if i instance
of A at t, then there is an earlier time t’ at which i
was an instance of B.</p>
          </list-item>
        </list>
        <p><sup>9</sup> Hereafter, relations are represented in bold.</p>
        <p><sup>10</sup> In the referenced Open Biomedical Ontology relations, the
relation is named ‘derives from’. However, to avoid confusion, we use the
term as presented in the paper.</p>
        <p>As an example, suppose a patient (patient1) has a
carcinoma (carcinoma1) that originated within her ovary
(ovary1). We represent this using the axioms:</p>
        <p>ovary1 at t part of patient1 at t</p>
        <p>carcinoma1 at t part of ovary1 at t</p>
        <p>carcinoma1 at t instance of pathological structure
at t</p>
        <p>Because carcinomas arise from the epithelial tissue
lining of organs, we can assert the following about the
patient’s tumor:</p>
        <p>carcinoma1 at t derived from some epithelial cell
at some t’ prior to t</p>
        <p>And, since the part of relation is transitive, we infer
that:</p>
        <p>carcinoma1 at t part of patient1 at t</p>
        <p>Although this inference is trivial, the advantage of
representing the patient’s tumor in this manner is that we are not
required to explicitly state this within an information system
using the ontology. Rather, we let the computer system
handle this through automated inferencing.</p>
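        <p>The inference over the transitive part of relation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the reasoning machinery an ontology-based system would actually use: the time index t is dropped, facts are plain (part, whole) pairs, and the function name transitive_closure is hypothetical.</p>
        <preformat>
```python
def transitive_closure(pairs):
    """Return every (part, whole) pair implied by transitivity of part of."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # If a is part of b and b is part of d, then a is part of d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Asserted part of facts, all taken to hold at the same time t.
asserted = {("ovary1", "patient1"), ("carcinoma1", "ovary1")}

# Facts that were never stated explicitly but follow from the asserted ones.
inferred = transitive_closure(asserted) - asserted
print(inferred)
```
        </preformat>
        <p>Only the two asserted facts are stored; the fact relating carcinoma1 to patient1 is derived on demand.</p>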
        <p>The benefit of doing this becomes apparent when we
consider the multiple ways we classify malignant
neoplasms. A malignant neoplasm may be classified according
to:</p>
        <list list-type="bullet">
          <list-item>
            <p>The cell type from which the neoplasm
originates, e.g., carcinomas arise from epithelial cells,
and sarcomas arise from non-epithelial cells.</p>
          </list-item>
          <list-item>
            <p>The organ in which the neoplasm develops, e.g., an
ovarian carcinoma originates in the ovary.</p>
          </list-item>
          <list-item>
            <p>The organ system to which the organ of origin
belongs, e.g., an ovarian carcinoma is a kind of
reproductive system cancer.</p>
          </list-item>
          <list-item>
            <p>The anatomical site or region in which the organ of
origin is found, e.g., a tongue carcinoma is a kind
of head and neck cancer.</p>
          </list-item>
        </list>
        <p>When such classification information is axiomatized, we
can then query the information system along these multiple
axes without having to maintain complex data structures that
explicitly assert this information. For instance, we can now
query an information system for all carcinomas (i.e.,
malignant neoplasms that are derived from epithelial cells) that
belong to patients’ reproductive systems without having to
explicitly link each kind of carcinoma (e.g., ovarian, uterine,
testicular) to the organ and associated organ system.</p>
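        <p>A query along two classification axes at once can be sketched as follows. The sketch is a toy illustration under stated assumptions: the three dictionaries stand in for class-level axioms, the class names are illustrative labels rather than ontology identifiers, the LOCATED_IN mapping is a hypothetical stand-in for a relation between a neoplasm type and its organ of origin, and each term is assumed to have a single parent.</p>
        <preformat>
```python
# Hypothetical class-level axioms; the labels are illustrative only.
IS_A = {
    "ovarian carcinoma": "carcinoma",
    "uterine carcinoma": "carcinoma",
    "tongue carcinoma": "carcinoma",
    "carcinoma": "malignant neoplasm",
}
LOCATED_IN = {  # neoplasm type mapped to its organ of origin
    "ovarian carcinoma": "ovary",
    "uterine carcinoma": "uterus",
    "tongue carcinoma": "tongue",
}
PART_OF = {  # organ mapped to its organ system or region
    "ovary": "reproductive system",
    "uterus": "reproductive system",
    "tongue": "head and neck",
}


def ancestors(term, relation):
    """Follow a single-parent relation upward and collect every ancestor."""
    found = []
    while term in relation:
        term = relation[term]
        found.append(term)
    return found


def neoplasms_of(kind, system):
    """Neoplasm types of the given kind whose organ of origin is part of system."""
    return sorted(
        t for t, organ in LOCATED_IN.items()
        if kind in ancestors(t, IS_A) and system in ancestors(organ, PART_OF)
    )


print(neoplasms_of("carcinoma", "reproductive system"))
```
        </preformat>
        <p>No carcinoma type is linked directly to an organ system in this sketch; the link is recovered by walking the is a and part of hierarchies, which is the design choice the axiomatized approach relies on.</p>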
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 OUR PROPOSED ONTOLOGY</title>
      <p>
        While we consider the work of Smith et al. to be a
significant improvement over the aforementioned classification
systems and terminologies, a number of ontologies have
been developed after this work was published. We take
advantage of these more recent ontologies as follows. First, we
make use of the terms and relations from the Cell Ontology
(CL)
        <xref ref-type="bibr" rid="ref2">(Diehl et al. 2016)</xref>
        to represent the types of cells from
which a malignant neoplasm arises. Moreover, as a result of
our work, the CL added the terms abnormal cell, neoplastic
cell, and malignant cell in order to better represent cell types
that play integral roles in tumor formation:
      </p>
      <list list-type="bullet">
        <list-item>
          <p>abnormal cell: A cell found in an organism or
derived from an organism exhibiting a phenotype that
deviates from the expected phenotype of any native
cell type of that organism. Abnormal cells are
typically found in disease states or disease models.</p>
        </list-item>
        <list-item>
          <p>neoplastic cell: An abnormal cell exhibiting
dysregulation of cell proliferation or programmed
cell death and capable of forming a neoplasm, an
aggregate of cells in the form of a tumor mass or an
excess number of abnormal cells (liquid tumor)
within an organism.</p>
        </list-item>
        <list-item>
          <p>malignant cell: A neoplastic cell that is capable of
entering a surrounding tissue.</p>
        </list-item>
      </list>
      <p>
        Second, an important criterion in Smith et al.’s definition
of an entity being pathological is that it is predisposed to
have health-related consequences
        <xref ref-type="bibr" rid="ref10 ref11">(Smith et al. 2005b)</xref>
        . To
more precisely account for predispositions of this sort, we
adopt the Ontology for General Medical Sciences’ (OGMS)
model of disease
        <xref ref-type="bibr" rid="ref7">(Scheuermann et al. 2009)</xref>
        . In OGMS, a
disease is a type of disposition that is manifested (or realized)
during those processes that compromise an organism’s
physiological health. This permits us to represent that an
organism may have a disease even though the disease is not
currently being realized. A malignant neoplasm, for
instance, may shed malignant cells that remain dormant in the
patient until at some later time they begin to proliferate.
During this dormant period, these malignant cells possess
the disposition for undergoing uncontrolled cell
proliferation, although the disposition is not being realized.
Similarly, the genome within a native cell may have mutations in
its BRCA1 or BRCA2 genes, but the cell may behave
normally until certain cellular processes uncover the
pathological effects of these mutations. Using the dispositional
account of disease, we then incorporate the Disease
Ontology’s (DO) representation of cancer as follows:
      </p>
      <list list-type="bullet">
        <list-item>
          <p>disease: A disposition (i) to undergo pathological
processes that (ii) exists in an organism because of
one or more disorders in that organism.<sup>11</sup></p>
        </list-item>
        <list-item>
          <p>disease of cellular proliferation: A disease that is
characterized by abnormally rapid cell division.</p>
        </list-item>
        <list-item>
          <p>cancer: A disease of cellular proliferation that is
malignant and primary, characterized by
uncontrolled cellular proliferation, local cell invasion and
metastasis.</p>
        </list-item>
      </list>
      <p>Recall that above we criticized the DO for its
inconsistent usage of the term ‘neoplasm’. However, given the
need to represent the dispositional aspect of cancer, we find
DO’s hierarchy appropriate for characterizing cancer as long as we
are clear and consistent about which sense of ‘cancer’ we
are using.</p>
      <p><sup>11</sup> DO uses the OGMS term disease.</p>
      <p>Third, in order to account for Smith et al.’s distinction
between material pathological entities and material
anatomical entities, we adopt OGMS’ account of a disease (as a
disposition) being based on a disorder:</p>
      <p>disorder: A material entity which is clinically
abnormal and part of an extended organism.
Disorders are the physical basis of disease.</p>
      <p>Since OGMS uses the Basic Formal Ontology (BFO) as
its upper-level framework and a disease in OGMS is a type of
BFO disposition, a disease cannot (like all dispositions) exist on its
own. Rather, a disease must be borne by a disorder whose
structural abnormalities serve as a disease’s basis. For
example, a sprained ankle is a disorder in the sense that the
physical structures are clinically abnormal, and these
physiological abnormalities are the reason that a sprained ankle is
disposed to swell.</p>
      <p>Fourth, to relate a disease to the disorder upon which it
is based, we define the has material basis in relation as
follows:</p>
      <p>has material basis in: A primitive relation
between an instance of a disease i and an instance of
a disorder j at a particular time t in which i exists
because of the physical makeup of some part of j at
time t.</p>
      <p>In addition to relating a disease to its basis, we must also
account for the processes that realize (or make manifest) an
instance of a disease. For this we use OGMS’ term
pathological bodily process:</p>
      <p>pathological bodily process: A bodily process that
is clinically abnormal.</p>
      <p>As observed in the definition, a pathological bodily
process is a type of bodily process. However, the term bodily
process is not defined in OGMS.</p>
      <p>
        Fifth, in order to account for the temporal development
of malignant neoplasms, we make use of the Relations
Ontology’s derives from and develops from relations
        <xref ref-type="bibr" rid="ref10 ref11 ref6">(Smith
et al. 2005a)</xref>
        . The derives from relation is similar to the
aforementioned derived from relation, but adds the criteria
that the originating entity ceases to exist when the new
entity is created and the newly created entity inherits a
significant portion of its matter from the originating entity. For
example, the assertion:
      </p>
      <p>abnormal cell derives from native cell</p>
      <p>entails that a particular native cell no longer exists once the
abnormal cell derived from it comes into existence.</p>
      <p>The develops from relation also represents new entities
that arise from previously existing entities, but does not
require that the originating entity cease to exist. This allows
us to represent that an instance of a secondary neoplasm
develops from an instance of a primary neoplasm without
having to commit to the primary neoplasm’s ceasing to exist.</p>
      <p>
        Sixth, given the importance of representing the
anatomical structures in which malignant neoplasms form, we
incorporate Uberon’s anatomical structure and OGMS’
pathological anatomical structure terms, and define them as
follows
        <xref ref-type="bibr" rid="ref5">(Mungall et al. 2012)</xref>
        :
      </p>
      <p>material anatomical entity: An anatomical entity
that has mass.</p>
      <p>anatomical structure: A material anatomical entity
that is a single connected structure with inherent
3D shape generated by coordinated expression of
the organism's own genome.</p>
      <p>pathological anatomical structure: A material
entity that comes into being as a result of changes in
some pre-existing anatomical structure through
processes other than the expression of the normal
complement of genes of an organism of the given
type, and is predisposed to have health-related
consequences for the organism in question manifested
by symptoms and signs.</p>
      <p>We note here that although intuitively a pathological
anatomical structure is a type of anatomical structure, for
reasons that will be discussed below, we classify them in
separate hierarchies. Moreover, we assert that a particular
pathological anatomical structure (1) develops from an
instance of a previously existing anatomical structure, and
(2) has part an instance of a disorder. These two assertions
define both necessary and sufficient conditions for an entity
to be a pathological anatomical structure.</p>
      <p>Lastly, with the above modifications in place, we define
the following terms necessary for an ontology of
malignant neoplasms:</p>
      <p>dysregulation of cell proliferation: A pathological
bodily process during which cell proliferation
occurs at a level not normal for that cell type in its
native context.</p>
      <p>neoplasm: A disorder that results from
dysregulation of cell proliferation (uncontrolled cell
proliferation).</p>
      <p>malignant neoplasm: A neoplasm that has acquired
the disposition to invade surrounding tissues and
spread to remote anatomical sites.</p>
      <p>primary neoplasm: A malignant neoplasm that is
found in the site where the malignant cells first
began proliferating.</p>
      <p>secondary neoplasm: A malignant neoplasm that
develops from a primary neoplasm.</p>
      <p>A summary of our proposed ontology of malignant
neoplasms is depicted in Figure 1.</p>
    </sec>
    <sec id="sec-6">
      <title>5 DISCUSSION</title>
      <p>We began our work in order to build an application
ontology to assist us in analyzing data in an ovarian cancer
patient registry (work in progress). Because of our
commitment to OBO Foundry principles and ontological realism,
we began our ontology development by considering existing
ontologies, including OGMS and DO, and related
ontologies such as Uberon and CL. Our aim has been to reuse
ontology classes where possible and create new classes and
hierarchies where existing ontologies either are missing
classes or provide faulty modeling of the domain.</p>
      <p>We have found the NCIT to be a very useful source of
information about cancer-related entities, their definitions,
and their relationships to each other. Although the NCIT is
very large and has been developed over many years, it
remains a terminology rather than an ontology. For
example, the NCIT includes the term Disease or Disorder
defined as:</p>
      <p>Any abnormal condition of the body or mind that
causes discomfort, dysfunction, or distress to the
person affected or those in contact with the person.
The term is often used broadly to include injuries,
disabilities, syndromes, symptoms, deviant
behaviors, and atypical variations of structure and
function.</p>
      <p>This definition does not adequately distinguish between
the processes and material entities that result in abnormal
conditions. This distinction is important for precisely
representing the nature of a malady. If a cancer patient has
difficulty breathing due to metastatic tumors spreading
throughout the lungs, both the difficulty in breathing and the tumors
are abnormal conditions, and hence both would be classified
using the term Disease or Disorder (C2991). But, in reality,
the process of breathing is a different kind of entity from a
tumor, which is a material entity. There are past and current
efforts to redevelop NCIT, or at least sections of it, into a
proper ontology. Our hope is that these efforts will make the
NCIT more aligned with OBO Foundry principles. One
important result of our work was the addition of the abnormal
cell, neoplastic cell, and malignant cell types to CL. These
CL classes parallel the naming and relationships of the
NCIT concepts, but as discussed above, we chose to write
new definitions that better define these cell types and do not
limit their applicability unnecessarily.</p>
      <p>In considering the Disease Ontology, we found it to be a
useful catalog of cancer types, but as discussed above, we
find that there is confusion as to whether neoplasms are
dispositions or disorders. Because of our need to represent
pathological findings, we need to reflect that these findings
are about disorders (which are material entities) that are
observed by pathologists, and not about dispositions, which
are not directly observable.</p>
      <p>An important finding of our work is that
OBO Foundry ontologies have difficulty representing
abnormal or pathological entities. Two prominent examples
are pathological anatomical structures and pathological
processes. Intuitively, a pathological anatomical structure is a
kind of anatomical structure. For instance, an ovary
containing a carcinoma is still an instance of an ovary. However,
the standard definition (with some variations) for
anatomical structure found in Uberon, the Common Anatomy
Reference Ontology, the Foundational Model of Anatomy, and
the Anatomical Entity Ontology does not allow for this:<sup>12</sup></p>
      <p>Material anatomical entity that has inherent 3D
shape and is generated by coordinated expression
of the organism's own genome.</p>
      <p>
        The issue is that disorders (such as neoplasms and
fractures) that arise within anatomical structures are not
necessarily generated by the organism’s genome. Thus, the definition
is too strong. Smith et al. are aware of this and propose an
anatomical hierarchy consisting of a top-level anatomical
structure term with subtypes of canonical anatomical structure,
variant anatomical structure, and pathological anatomical
structure
        <xref ref-type="bibr" rid="ref10 ref11 ref6">(Smith et al. 2005a)</xref>
        :
      </p>
      <list list-type="bullet">
        <list-item>
          <p>Anatomical structure</p>
          <list list-type="bullet">
            <list-item>
              <p>Canonical anatomical structure</p>
            </list-item>
            <list-item>
              <p>Variant anatomical structure</p>
            </list-item>
            <list-item>
              <p>Pathological anatomical structure</p>
            </list-item>
          </list>
        </list-item>
      </list>
      <p>While we think this is a reasonable proposal, the lack of a
definition for anatomical structure makes it unclear
what canonical, variant, and pathological structures have in
common.</p>
      <p>A similar problem exists for abnormal processes (such as
dysregulation of cell proliferation). The OGMS, for its part,
does provide the term pathological bodily process. But, this
term is orphaned from other biological processes found in
other OBO ontologies. For example, the Gene Ontology
(GO) includes the term biological process:<sup>13</sup></p>
      <p>Any process specifically pertinent to the
functioning of integrated living units: cells, tissues, organs,
and organisms. A process is a collection of
molecular events with a defined beginning and end.</p>
      <p>Again, intuitively it makes sense that a pathological bodily
process should be a subtype of biological process.
However, the definition of biological process does not permit this.</p>
      <p>Although we do not have any concrete solutions, at this
point, for how to align pathological structures and processes
with these other OBO terms, we look forward to
collaborating with the OBO Foundry community on creating a
coherent structure for these upper level classes that is shared
among all OBO Foundry ontologies. Thus, we simply leave
pathological anatomical structure as a subtype of material
anatomical entity, and pathological bodily process in its
current OGMS hierarchy.</p>
      <p><sup>12</sup> Definitions retrieved from www.ontobee.org, accessed 2017-07-06.</p>
      <p><sup>13</sup> Definition retrieved from www.ontobee.org, accessed 2017-07-06.</p>
      <p>Our goal is to contribute to the oncology domain by
creating a strong and consistent ontological foundation for
providing metadata and data analysis of patient cancer data
for both research and clinical applications including clinical
decision support. The ontological framework described
herein attempts to solve some continuing issues in the
representation of cancer as a disease and the disorders
(neoplasms) in which it presents. Our framework is intended to
be useful for the description and classification of data used
in cancer diagnosis and treatment. In future work, we will
be adding classes to represent additional entities associated
with cancer such as laboratory methods and results,
treatments, and outcomes. We hope our ontology will support
other oncology researchers in exploiting the full potential of
patient data registries and other cancer-related datasets.</p>
    </sec>
    <sec id="sec-7">
      <title>ACKNOWLEDGEMENTS</title>
      <p>We gratefully acknowledge support as follows. William
Duncan and Carmelo Gaudioso received support from the
Clinical Data Network, a Roswell Park Cancer Institute
Cancer Center Support Grant shared resource funded by
NCI P30CA16056. Alexander Diehl received support from
NCATS 5UL1TR001412. All three authors received support
from NCI P50CA159981.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Arp</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Spear</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Building Ontologies With Basic Formal Ontology</article-title>
          . The MIT Press. doi: 10.7551/mitpress/9780262527811.001.0001.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Diehl</surname>
            ,
            <given-names>A. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meehan</surname>
            ,
            <given-names>T. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bradford</surname>
            ,
            <given-names>Y. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brush</surname>
            ,
            <given-names>M. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dahdul</surname>
            ,
            <given-names>W. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dougall</surname>
            ,
            <given-names>D. S.</given-names>
          </string-name>
          , …
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          ,
          <volume>7</volume>
          (
          <issue>44</issue>
          ). doi: 10.1186/s13326-016-0088-7. PMCID: PMC4932724. https://github.com/obophenotype/cellontology.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Hanna</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joseph</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brochhausen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>W. R.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Building a drug ontology based on RxNorm and other sources</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          ,
          <volume>4</volume>
          (
          <issue>44</issue>
          ). doi: 10.1186/2041-1480-4-44. PMCID: PMC3931349.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Hastings</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Matos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dekker</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ennis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harsha</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kale</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muthukrishnan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Owen</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Turner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Steinbeck</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013</article-title>
          .
          <source>Nucleic Acids Research</source>
          ,
          <volume>41</volume>
          (Database issue),
          <fpage>D456</fpage>
          -
          <lpage>D463</lpage>
          . doi: 10.1093/nar/gks1146.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Torniai</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gkoutos</surname>
            ,
            <given-names>G. V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S. E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Haendel</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Uberon, an integrative multi-species anatomy ontology</article-title>
          .
          <source>Genome Biology</source>
          ,
          <volume>13</volume>
          (
          <issue>1</issue>
          ), R5. doi: 10.1186/gb-2012-13-1-r5. PMCID: PMC3334586. http://uberon.github.io.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mejino</surname>
            ,
            <given-names>J. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cook</surname>
            ,
            <given-names>D. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Detwiler</surname>
            ,
            <given-names>L. T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>A Strategy for Improving and Integrating Biomedical Ontologies</article-title>
          .
          <source>AMIA Annual Symposium Proceedings</source>
          ,
          <year>2005</year>
          ,
          <fpage>639</fpage>
          -
          <lpage>643</lpage>
          . PMCID: PMC1560467.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Scheuermann</surname>
            ,
            <given-names>R.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <article-title>Toward an ontological treatment of disease and diagnosis</article-title>
          . San Francisco:
          <source>Proceedings of the 2009 AMIA Summit on Translational Bioinformatics</source>
          ,
          <year>2009</year>
          ,
          <fpage>116</fpage>
          -
          <lpage>120</lpage>
          . PMCID: PMC3041577. https://github.com/OGMS/ogms.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Schriml</surname>
            ,
            <given-names>L. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mitraka</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>The Disease Ontology: fostering interoperability between biological and clinical human disease-related data</article-title>
          .
          <source>Mammalian Genome</source>
          ,
          <volume>26</volume>
          (
          <issue>9-10</issue>
          ):
          <fpage>584</fpage>
          -
          <lpage>589</lpage>
          . doi: 10.1007/s00335-015-9576-9. PMCID: PMC4602048. http://disease-ontology.org.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Sioutos</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Coronado</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haber</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartel</surname>
            ,
            <given-names>F.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shaiu</surname>
            ,
            <given-names>W.L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wright</surname>
            ,
            <given-names>L.W.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>NCI Thesaurus: A semantic model integrating cancer-related clinical and molecular information</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          ,
          <volume>40</volume>
          (
          <issue>1</issue>
          ):
          <fpage>30</fpage>
          -
          <lpage>43</lpage>
          . doi: 10.1016/j.jbi.2006.02.013. PMID: 16697710.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klagges</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Köhler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lomax</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhaus</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rector</surname>
            ,
            <given-names>A.L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2005a</year>
          ).
          <article-title>Relations in biomedical ontologies</article-title>
          .
          <source>Genome Biology</source>
          ,
          <volume>6</volume>
          (
          <issue>5</issue>
          ): R46. doi: 10.1186/gb-2005-6-5-r46. PMCID: PMC1175958.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2005b</year>
          ).
          <article-title>On Carcinomas and Other Pathological Entities</article-title>
          .
          <source>Comparative and Functional Genomics</source>
          ,
          <volume>6</volume>
          (
          <issue>7-8</issue>
          ):
          <fpage>379</fpage>
          -
          <lpage>387</lpage>
          . doi: 10.1002/cfg.497. PMCID: PMC2447494.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bard</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bug</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goldberg</surname>
            ,
            <given-names>L. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eilbeck</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ireland</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <collab>The OBI Consortium</collab>
          ,
          <string-name>
            <surname>Leontis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rocca-Serra</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruttenberg</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sansone</surname>
            ,
            <given-names>S.-A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scheuermann</surname>
            ,
            <given-names>R. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Whetzel</surname>
            ,
            <given-names>P. L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration</article-title>
          .
          <source>Nat Biotechnology</source>
          ,
          <volume>25</volume>
          (
          <issue>11</issue>
          ):
          <fpage>1251</fpage>
          -
          <lpage>1255</lpage>
          . PMCID: PMC2814.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>