=Paper= {{Paper |id=Vol-2931/ICBO_2019_paper_29 |storemode=property |title=Foundations for a Realism-based Drug Repurposing Ontology |pdfUrl=https://ceur-ws.org/Vol-2931/ICBO_2019_paper_29.pdf |volume=Vol-2931 |authors=James Schuler,William Mangione,Ram Samudrala,Werner Ceusters |dblpUrl=https://dblp.org/rec/conf/icbo/SchulerMSC19 }} ==Foundations for a Realism-based Drug Repurposing Ontology== https://ceur-ws.org/Vol-2931/ICBO_2019_paper_29.pdf
                      Foundations for a Realism-based Drug Repurposing Ontology
                      James Schuler, William Mangione, Ram Samudrala*, Werner Ceusters*

Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences at the University at Buffalo, Buffalo, NY,
                                                             USA

                                                     *senior and corresponding authors

Abstract

Several ontologies represent entities pertinent to the domain of medicinal drugs. An analysis of these ontologies and the related literature shows that they primarily do so from the perspective of treatment and that the definitions for many of the core entities fall short when applied to drug discovery in general and drug repurposing in particular. We therefore redefined or created new elucidations and definitions for terms which are most important to understanding what is meant by ‘drug repurposing’ using guidelines of ontological realism, thereby making judicious use of the Basic Formal Ontology, the Ontology for Biomedical Investigations, the Ontology for General Medical Science, and the Drug Ontology. We tested the appropriateness of these modifications by describing a use case: what is involved, and what is inferred, when using the Computational Analysis of Novel Drug Opportunities (CANDO) drug repurposing platform. We found that the proposed definitions remove some of the shortcomings of other ontologies, but that more work is needed to address all issues.

Keywords:
drug repurposing, ontological realism, Basic Formal Ontology

Introduction

It is critical for all involved in any aspect of biomedicine to stay on top of advances in the state of the art of the interplay between drugs and the human body. This is true at all levels of granularity: from the level at which basic science researchers study how drug molecules interact with cellular and subcellular structures, all the way up to the level at which clinicians aim to provide optimal direct patient care by prescribing the best suited medicinal products for the diseases from which their patients are suffering.

The amount of information generated is enormous, and sifting through it is a tedious task unless it is supported by accurate and reliable automatic methods. This requires, for instance, that such automatic methods come with some form of understanding of what it means for something to be a drug and what it means for something to be a treatment. It also requires that researchers present their findings in a way that minimizes the risk that automatic methods misunderstand what is being conveyed. This requires formalization and standardization at all levels of representation, ranging from data to information to knowledge, using methods that avoid ambiguities, redundancies, and information loss. One such method is realism-based ontology.

Aspects of biomedicine that have yet to be described ontologically are drug discovery and drug repurposing. Any drug discovery pipeline involves scientists from numerous disciplines working at different levels of granularity. This leads to numerous, perhaps conflicting, understandings of terms such as ‘drug’ and ‘drug discovery’. A typical process of drug discovery begins when a biomedical researcher identifies a protein involved in some disease. A computational researcher then uses digital models of the protein and some drug, together with a molecular docking protocol, to measure the energy of binding (how strong the chemical interaction is) and find the binding pose (the spatial relationship between all atoms in the compound-protein system) of the drug to the protein. Based on these results, the next experiment undertaken may be measuring cell growth in a petri dish when cells containing the protein are treated with the drug, i.e., exposed to some preparation containing the small molecule, e.g., a liquid preparation. This is an in vitro experiment. In some in vivo work which follows, some pill or injectable solution containing the drug may be given to some animal model, e.g., an animal such as a mouse which has a disease that is assessed to be similar to a disease which occurs in humans. If these preclinical studies are successful, then clinical trials can be undertaken, going through different phases (I to IV in the United States), with different formulations of the drug and different patient populations. The Food and Drug Administration (FDA) or relevant government authority may then approve the drug for sale and distribution for the studied disease.

Some compounds hypothesized to have useful medicinal properties do not have known ‘targets’, so a pharmaceutical company or research group may perform a ‘high throughput screening’ experiment (1). In these experiments, the action of many different compounds against many different proteins is measured in a large well-plate, with promising compounds (‘hits’) moving on to more careful and specific investigations,

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
ideally culminating in clinical trials and safe, efficacious human use.

Under some circumstances clinicians may prescribe a drug for a disease other than the one it was approved for, if they believe this is medically sound. In common language, ‘drug repurposing’ and its synonym ‘drug repositioning’ mean finding a new use for an old or previously approved drug. A classic example of drug repurposing is sildenafil (Viagra) (2). It was originally developed to treat high blood pressure and chest pain, but male participants in the early clinical trials noticed peculiar side effects. Sildenafil was then studied and sold for treating erectile dysfunction; it was successfully ‘repurposed’ from one indication to another. Sildenafil has in fact been repurposed a second time, in this case to treat pulmonary hypertension (3).

In the above example, drug repurposing was driven by coincidental observations. A better approach would be to turn it into an active search process. That is the goal of the Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun drug repurposing (4–10). The platform uses large-scale molecular modeling and docking simulations to calculate drug-target interactions and from these infers similarity of drug behavior on a proteomic scale. CANDO is composed of several key components such as drug/compound and protein structural data and drug-indication associations (data on whether a particular drug is used in the treatment of a given indication). Although CANDO has already demonstrated success (4), our hypothesis is that a better ontological understanding of drug repurposing experiments and of the relationship between drugs/compounds and diseases will increase the benchmarking performance of the platform and the fidelity of our models to reality. Furthermore, we believe that the integration of realism-based ontologies in CANDO will make our work directly comparable with other drug discovery, development, and repurposing approaches.

The data sources we have used thus far in CANDO versions include non-ontologic understandings of compounds and disease. For example, in version 1 of CANDO (v1) we used a compound-indication association mapping from the Comparative Toxicogenomics Database (CTD) where the indications are labeled with a Medical Subject Headings (MeSH) identification (11). MeSH is not an ontology, and it has known issues (12). Additionally, our drug and protein structure data sets have never been curated with any ontologies. Therefore, we hypothesize that by integrating into CANDO those Open Biomedical Ontologies (OBO) Foundry ontologies which follow ontological realism, we will obtain more accurate results with an increased fidelity to reality from our models, enabling us to bring repurposed drugs to the market more quickly and more cost-efficiently.

This paper aims to lay the foundations for this effort.

Methods

We followed a three-step approach: 1) extensive search for relevant literature in drug repurposing, 2) identification of useful existing resources, and 3) ontological analysis, definition and elucidation of key entities to jumpstart a future ontology for drug repurposing.

Literature review

We used the general Google search engine, Google Scholar, and PubMed to look for research articles using combinations of the following terms anywhere in the document or all in the title: ‘drug repurposing’ (‘drug repositioning’), ‘ontology’, and ‘BFO’. The search parameters and counts were established on April 5, and the searches themselves conducted on April 9. The number of articles found is listed in Table 1, but relevant articles are scarce.

Table 1: Search for Relevant Publications

search term                          search method          Google search   Google Scholar   PubMed
drug repurposing, ontology           anywhere in document   25,300          1,530            40
                                     in title               64              4                3
drug repositioning, ontology         anywhere in document   24,000          1,610            29
                                     in title               4               1                0
drug repurposing, BFO                anywhere in document   252             9                0
                                     in title               0               0                0
drug repositioning, BFO              anywhere in document   355             7                0
                                     in title               0               0                0
drug repurposing, ontology, BFO      anywhere in document   135             8                0
                                     in title               0               0                0
drug repositioning, ontology, BFO    anywhere in document   254             7                0
                                     in title               0               0                0

Most within the scope is ‘An Ontology for Description of Drug Discovery Investigations’ which follows OBO Foundry principles, uses BFO as its upper-level ontology, and makes judicious use of definitions from the Ontology for Biomedical Investigations (OBI) and the Information Artifact Ontology (IAO). This research focuses on the use case of a robot for screening compounds and individual results as opposed to answering, ‘What is Drug Discovery/Repurposing?’ (13).

Also pertinent is ‘An Ontology for Pharmaceutical Ligands and Its Application for in Silico Screening and Library Design’ (14). The researchers sought to fill what they saw as a void in annotation schemes for pharmaceutical ligands, as at the time annotation efforts were focused on genomic sequences. The theme of this work was the development of databases. Additionally, they claimed, oddly, a function of a ‘drug’ to be at the level of an individual molecular entity.

Finally, we can mention Gómez-Pérez et al., who reviewed several important ontologies used in medicinal chemistry (15). They write short characterizations of ontologies without delving into much detail or describing strengths and weaknesses of a particular tool. The ontologies they enumerate
are grouped into the following categories: ontologies about the classification of chemical compounds, ontologies about the classification of drugs, and ontologies about drug discovery, design, and development.

We thus did not identify any attempt towards formal constructions of a drug repurposing ontology, but only work which uses ontology as part of a drug repurposing experiment. To prepare for the second step, we took a broad view in analyzing these works, thereby critically analyzing key aspects of drugs, treatment, drug discovery, and drug repurposing as documented in the literature and identifying shortcomings in these attempts.

Relevant existing ontologies

In our attempt to define drug repurposing and build a Drug Repurposing Ontology of related and important terms, we have made judicious use of established ontologies, especially those espousing ontological realism and adhering to the principles of the Open Biomedical Ontologies (OBO) Foundry (16,17). Most of the OBO Foundry ontologies have been built using Basic Formal Ontology (BFO) as a top level ontology, and we retain it for this purpose (18).

The BioAssay Ontology (BAO) was originally developed to support standardization of data generation, collection, and searching from high-throughput screening (HTS) experiments (19). It was then extensively further developed, expanding its scope to assays and screening results beyond HTS. This included many entities relevant to drug discovery and drug repurposing (20–22).

Recently, efforts have been made to work with other ontologies, such as the Ontology for Biomedical Investigations (OBI) (23,24). The GPCR Ontology is an effort to describe one specific type of ‘drug target’, G-protein coupled receptors (GPCRs), and was intended to integrate with the BAO (25). The Drug Target Ontology hopes to describe the sorts of entities with which the molecular entity of ‘drug’ may interact and cause some effect (26).

The most relevant previous work is the Drug Ontology (DrOn) (27–30), developed by practitioners of ontological realism and aligned with OBO Foundry ontologies. It turned out to be an adequate tool as a starting point for our work.

Ontological analysis

Through careful reading of the biomedical ontology literature and through analysis of definitions and elucidations found using Ontobee (31), we attempted to describe a drug repurposing experiment using available terms, but we found these terms and their definitions, insofar as available, inadequate. With this in mind we delved into redefining or creating new definitions for terms which are most important to understanding what is meant by ‘drug repurposing’ using guidelines of ontological realism, thereby making judicious use of BFO, OBI, the Ontology for General Medical Science (OGMS) (32), and with a focus on the Drug Ontology.

Finally, we applied our new understanding of the entities involved in drug repurposing to describe a use case example, namely, to ontologically describe what is involved, and what is inferred when using the Computational Analysis of Novel Drug Opportunities (CANDO) drug repurposing platform (4).

Results

Definitions

Our definitions or elucidations for all terms we have created or changed are listed in Table 2.

Ontological description of a CANDO use case

A key aspect of CANDO is modeling the interaction of compounds with proteins. We have many instances of models of ChEBI:molecules, including ChEBI:protein and ChEBI:compound. Using an instance of some molecular docking software (which is some subtype of OBI:software), e.g., ‘CANDOCK’ (33), we predict the pose of an interacting compound and protein structure, as well as the corresponding interaction score/energy. After combining individual OBI:datum instances, we can complete a process of OBI:drawing a conclusion based on data and then participate in an OBI:prediction about which scattered molecular aggregate, whose parts are individual molecular compounds from the earlier computational experiment, can be used in some DRO:treatment of a given OGMS:disease after ingestion using an appropriate DrOn:drug product.

The entire process of using CANDO is an occurrent part of some DRO:drug repurposing. Other researchers may use hypotheses generated by us to inform them of which further occurrent parts of the drug repurposing process need to occur, for example, a preclinical study using a mouse model, or a clinical trial with human participants.

Discussion

What counts as ‘drug’?

The creators of DrOn recognize different levels of granularity when discussing drugs. First and foremost are the individual molecular entities, namely the single instances of compounds. Next are collections of instances of molecular entities, i.e., the ‘portion of pure substances’, and the subtypes ‘portion of compound’ and ‘portion of element’, or ‘portion of mixture’. Finally there is the ‘drug product’, e.g., a tablet with a specific amount of some ‘scattered molecular aggregate’ which has an ‘active ingredient role’ and another scattered molecular aggregate with an ‘excipient role’. Additionally, parts of DrOn include realizable entities that inhere in molecular entities, such as the disposition of an individual molecule to bind to a protein. DrOn also reveals issues with drug-related entities in other terminologies and ontologies, including those present in NDF-RT (National Drug File - Reference Terminology) (34), SNOMED CT (Systematized Nomenclature of Medicine -- Clinical Terms) (35), ChEBI (Chemical Entities of Biological Interest) (36), OBI, and ATC (Anatomical Therapeutic Chemical Classification System) (37).
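
The levels of granularity that DrOn distinguishes (molecular entity, scattered molecular aggregate bearing a role, drug product) can be illustrated with a small data model. The following is only a sketch under our own assumptions: the class names, the `amount_mg` attribute, the helper `active_ingredients`, and the example tablet with its unnamed filler compound are hypothetical and are not taken from DrOn, OBI, or CANDO.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Role(Enum):
    """Roles a scattered molecular aggregate may bear inside a drug product."""
    ACTIVE_INGREDIENT = "active ingredient role"
    EXCIPIENT = "excipient role"


@dataclass(frozen=True)
class MolecularEntity:
    """Finest level: a single instance of a compound."""
    compound: str


@dataclass
class ScatteredMolecularAggregate:
    """Middle level: a collection of molecular entities, possibly bearing a role."""
    molecules: List[MolecularEntity]
    role: Optional[Role] = None
    amount_mg: Optional[float] = None  # hypothetical attribute, for illustration only


@dataclass
class DrugProduct:
    """Coarsest level: e.g. a tablet whose parts are aggregates."""
    name: str
    parts: List[ScatteredMolecularAggregate] = field(default_factory=list)

    def active_ingredients(self) -> List[ScatteredMolecularAggregate]:
        """Return the parts that bear the active ingredient role."""
        return [p for p in self.parts if p.role is Role.ACTIVE_INGREDIENT]


# Hypothetical tablet: one active aggregate and one excipient aggregate.
tablet = DrugProduct(
    name="example tablet",
    parts=[
        ScatteredMolecularAggregate(
            molecules=[MolecularEntity("sildenafil")],
            role=Role.ACTIVE_INGREDIENT,
            amount_mg=50.0,
        ),
        ScatteredMolecularAggregate(
            molecules=[MolecularEntity("filler compound")],
            role=Role.EXCIPIENT,
        ),
    ],
)
```

In this sketch the role is attached to the aggregate rather than to the tablet, which mirrors the wording ‘scattered molecular aggregate which has an active ingredient role’; a formal treatment would express the same structure in OWL rather than in Python classes.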
Table 2: Foundational definitions for drug repurposing applications. Proposed terms are in bold, re-used terms from existing ontologies are in italics.

Discovery: process that creates information content entities about aspects of a portion of reality which were not documented in some existing body of information content entities generally available to some community.

Drug discovery: discovery documenting the disposition of a scattered molecular aggregate to regain or maintain homeostasis.

Drug repurposing: drug discovery documenting the disposition of a scattered molecular aggregate to treat some disease, when another such disposition is already documented.

Treatment / to treat: process that influences the realization of a disease toward homeostasis.

Scattered molecular aggregate: object aggregate that consists of all molecules that are located in some bounded region.

Scattered molecular aggregate delivery: function of a drug product to enable some scattered molecular aggregate to be located in the appropriate spatiotemporal region such that the scattered molecular aggregate can participate in treatment.

Prodrug: role inhering in a scattered molecular aggregate x_i composed out of molecules which have the disposition to undergo a chemical transformation to molecules of another type resulting in x_i becoming the bearer of a disposition to participate in a treatment.

While we found the Drug Ontology to be the best and most relevant ontology for our work in describing drug repurposing, we do not commit to the existence and definition of certain entities committed to in DrOn. This precludes us from accurately describing our drug repurposing research in its terms.

Firstly, we believe there is an inconsistency between two critical terms used by DrOn. OBI defines a ‘scattered molecular aggregate’ (SMA) to be ‘a material entity that consists of all the molecules of a specific type that are located in some bounded region and which is part of a more massive material entity that has parts that are other such aggregates’¹. DrOn uses SMA in related definitions. A ‘drug product’ is defined as ‘a material entity (1) containing at least one scattered molecular aggregate as part that is the bearer of an active ingredient role and (2) that is itself the bearer of a clinical drug role’². The SMA definition as written implies that if a scattered molecular aggregate exists, then it exists necessarily as part of a larger entity with other scattered molecular aggregate parts. Nonetheless, the definition for drug product uses the phrase ‘at least one scattered molecular aggregate as part’, which implies a drug product could exist with a single scattered molecular aggregate as a part. This seems to be inconsistent.

¹ http://purl.obolibrary.org/obo/OBI_0000576
² http://purl.obolibrary.org/obo/DRON_00000005

One way to solve this inconsistency, and to better represent the reality of drugs and drug repurposing, is to use a term to signify an object aggregate consisting of molecular entities. There are related terms in DrOn, chiefly, ‘portion of pure substance’, ‘portion of mixture’ and ‘scattered molecular aggregate’. We believe that changing the definition of SMA to ‘an object aggregate that consists of all molecules that are located in some bounded region’ does two things: it removes the inconsistency, and it gives us the ability to talk about both portions of pure substances and portions of mixtures.

A drug product is not generally without use, however. Indeed, a function which inheres in a given drug product may be an instance of an entity we call ‘scattered molecular aggregate delivery’, which we define as, ‘a function of a drug product to enable some molecular aggregate to be located in the appropriate spatiotemporal region such that the molecular aggregate can participate in treatment’. It is critical that a scattered molecular aggregate is at the appropriate location at the correct time to realize its disposition.

Drug Discovery and Drug Repurposing as a process

Drug repurposing is a subtype of drug discovery, which is a subtype of discovery, which is a subtype of process. We do not claim to have proposed a general definition of ‘discovery’ as we recognize that the very notion crosses many boundaries of sciences and that the term is also used in non-scientific contexts. We do not, for instance, include uses of the word ‘discovery’ as when a child ‘discovers’ an Easter egg under some plant while hunting for Easter eggs.

Treatment

We found the term for ‘treatment’ from OGMS to be problematic, both in general usage and for our current needs. Based on version 1.0 of BFO, the OGMS definition is ‘a processual entity whose completion is hypothesized (by a healthcare provider) to alleviate the signs and symptoms associated with a disorder’³. Although present in the OWL-version of OGMS, this term was not defined in the foundational paper which is at the basis of the OGMS (32).

³ http://purl.obolibrary.org/obo/OGMS_0000090

Entities on the side of the patient should, insofar as possible, never be defined on the basis of what is known or hypothesized about them. In this case, the definition allows for a physician to say ‘I hypothesize some homeopathic regimen will decrease the size of your tumor’. As any homeopathic treatment would never be the causative agent in shrinking the size of the tumor, the hypothesis is false (38), but by the current definition, the homeopathic regimen would be a treatment.

We define ‘treatment’ as a ‘process that influences the realization of a disease toward homeostasis’. The consequence is that a ‘treatment’ that doesn’t work is not a treatment under this definition. In other words: what one in general language would call ‘an unsuccessful treatment’ is under our definition
no treatment at all. Note that when a process that we hypothesize will benefit the patient is started, we will only know whether the process is an instance of treatment after observing its desired results. This is similar to the side effect of the common definition of chronic pain as ‘a pain that is present for at least 3 months’: when presented with a patient who has had pain for one day, that pain might already be a chronic pain, but we have to wait 3 months before we are able to identify it as such. Note also that it does not matter what kind of process is carried out, or on what, as long as the disease realization is changed toward homeostasis.

A scattered molecular aggregate may have the disposition to influence the homeostasis of an organism. If this disposition is to regain or maintain homeostasis, and the scattered molecular aggregate exists in a sufficient amount, and the disposition is realized, a treatment has occurred. If the scattered molecular aggregate specifically evolved or was designed to have this disposition, then the disposition is a function.

Besides ‘homeostasis’, we also use the OGMS terms and definitions of disorder, disease, and disease course by Scheuermann et al. (32) to justify our definition of treatment. With a disorder being the physical basis of some disposition to undergo pathological processes (disease), and a disease course the totality of all processes through which a disease is realized, eliminating the disorder gets rid of the corresponding disease and any potential disease course thereof (although, of course, further disorders for which the former disease was a pre-disposition might continue to exist). For example, if there is a mutation in one’s DNA which causes a protein to misfold and perform some actions which, if left ‘untreated’, would cause problems in the heart leading to death, and if the totality of misfolded proteins is successfully inhibited using some ‘drug’, then a disorder is still present, in the form of instances of misfolded proteins. The formation of a misfolded protein is itself a pathological process, and so the disease is still being realized. However, the temporal parts of the disease course that are realized once the drug is doing its job are of different types than the parts before: the disease has been influenced toward homeostasis so that the person will not, for example, experience heart problems or death; there will just be the production of misfolded proteins.

The disposition to treat a particular disease inheres in every instance of a scattered molecular aggregate composed of the relevant particular molecules. This is not to say that any instance of an SMA has some disposition to treat a disease: only those whose parts consist of particular molecules do, i.e. those that have the disposition to interact with bodily components such as proteins that participate in the realization of some disease. The disposition exists whether it is known to science or not.

A function to treat a disease only inheres in some portion of compound if the molecular entity parts have evolved or been designed to participate in the treatment process.

If a company manufactures some portion of aspirin with only the specific intent to treat headaches, this portion has the function to treat headaches, but has no other function. The disposition to prevent or minimize the consequences of a heart attack also inheres in that particular portion of aspirin, but because the portion has not been manufactured for that purpose, this disposition is not its function. This is consistent with the Drug Ontology to some degree, but we disagree about the entity in which the function inheres. According to DrOn, it inheres in the drug product (e.g., pill). We believe this to be false, and claim that any realizable entity related to treatment inheres in some scattered molecular aggregate (a term for which we are suggesting an updated definition).

Consider a person consuming a drug product for which it is claimed that there inheres some function to treat renal cell carcinoma. If the drug product is a tablet which is meant to be chewed, and if the person chews the tablet, then the tablet is no longer in existence, but no function to treat the cancer has yet been realized. However, a portion of compound which was previously a part of the tablet is appropriately distributed throughout the body. The molecular entities which make up the SMA realize their disposition to bind to and inhibit certain disordered proteins, i.e., the disorder. In the ultimate case, the renal cell carcinoma tumor is destroyed and the treatment process is complete. In this situation, there is indeed some entity which was pivotal in the treatment, but it cannot have been the tablet, as it was not in existence during the entire temporal region during which the treatment, i.e., the elimination of the tumor, occurred. As we agree with the creators of the Drug Ontology that such a realizable entity does not inhere in individual molecules, we say it must have been some scattered molecular aggregate.

One question might be: which one precisely? The amounts of the portions of compound in which these treatment functions may inhere indeed vary widely. For example, a function to treat a bacterial infectious disease may inhere in the scattered molecular aggregate which is contained in 20 tablets of some antibiotic. In the case of a chronic illness such as essential hypertension, a function inheres in the portion of compound contained in all the tablets a person with essential hypertension ingests over the course of some treatment.

Consider another example where a portion of compound has some function to treat a disease, i.e., scientists have discovered it has such a disposition, and the portions of compound are manufactured specifically for this purpose. If we have a powder of this portion of compound which can be absorbed into the body through the buccal mucosa, enter the bloodstream, and end up in the correct location where it will be able to realize its function, then by simply placing the powder underneath the tongue, one is enabling the portion of compound to begin the process of realizing its function. In this case, no drug product is ever present, as a Drug Ontology drug product contains, by definition, at least several scattered molecular aggregates as parts. The entity which participates in the treatment which results in the beneficial amelioration of some disorder, disease, or disease course is the molecular aggregate of compound. Similarly, chewing tree bark which contains a portion of aspirin to relieve headache involves no drug product (39).

Prodrugs

The view that some treatment disposition or function inheres in a scattered molecular aggregate and not in a drug product
also lends itself to a better understanding of ‘prodrugs’ and combination therapies. A prodrug is generally described as ‘a drug for which the dosed ingredient is an inactive or only mildly efficacious entity, but once in the body it is converted to the active ingredient by either a spontaneous or an enzyme-catalysed reaction’ (40). Sofosbuvir, a drug used in the treatment of hepatitis C, is an example of a prodrug (41). The scattered molecular aggregate which is in a drug product may not have the disposition or function to engage in some treatment for a given disease. Each individual molecular entity does have the disposition to be modified in some way to a molecular entity of a different type, and the resulting molecular aggregate, composed of different molecular entities, is where any realizable entity related to treatment inheres.

Our new understanding of prodrugs can be illustrated with several cases. A particular disease treatment may consist of taking more than one drug product at a time. In one scenario, one or both of the molecular aggregates in the drug products may have the disposition to treat the disease by themselves. In another, none of the molecular aggregates has any disposition to treat the disease by itself; rather, some therapeutic effect occurs only when both are in the body at the same time. This type of interaction has been discovered through analysis of electronic health record (EHR) data by Tatonetti et al. (42). In all of these scenarios, none of the combinations of molecular aggregates may exist in any individual drug product, and yet some disposition or function to treat the disease certainly exists in the combination of molecular aggregates.

Limitations and Future Work

While we have suggested changing the definition of scattered molecular aggregate to better fit our understanding, we recognize this may be too dramatic, and perhaps we could simply create a new term, and keep SMA as a term to refer specifically to some ‘molecular aggregates’ in a drug product. However, we wish to define some entity which subsumes both ‘portion of compound’ and ‘portion of mixture’, as in the Drug Ontology they are both currently subtypes of BFO:object. We believe some new supertype, if we keep the original definition for SMA, would be a subtype of BFO:object aggregate.

There remains difficulty in creating an ontology so general that it can accurately describe every aspect of pharmaceuticals, from both the clinical and the research perspective. The entire drug discovery or drug repurposing process is complex, and a claim about how one drug is believed to ‘work’ may not be applicable to another.

Armed with our improved understanding of the drug repurposing process, we aim to incorporate a more rigorous ontological understanding in future computational experiments with CANDO to better describe the compounds, proteins, diseases, and related associations.

Conclusions

We have found what we believe are errors in the understanding and definitions of core entities in drug discovery, drug repurposing and drug treatment. Chief among them are ‘treatment’ and several entities in the Drug Ontology describing basic tenets of ‘drugs’, which made it difficult to accurately describe the reality of drug discovery and drug repurposing. The definitions proposed here remove some of the shortcomings of other ontologies. More work is however needed for ‘scattered molecular aggregate’: the revision proposed here eliminates inconsistencies but leaves further questions open.

Acknowledgements

This work was supported in part by a 2010 National Institutes of Health Director’s Pioneer Award [1DP1OD006779], a National Institutes of Health Clinical and Translational Sciences Award [UL1TR001412], a National Library of Medicine T15 Award [T15LM012495], an NCI/VA BD-STEP Fellowship in Big Data Sciences, and startup funds from the Department of Biomedical Informatics at the University at Buffalo.

We wish to thank all members of the Fall 2018 Biomedical Ontology course from the Departments of Philosophy and Biomedical Informatics at the University at Buffalo, who offered feedback on an early iteration of this work.

Address for correspondence

The corresponding authors are Ram Samudrala (ram@compbio.org) and Werner Ceusters (wceusters@gmail.com); Gateway Building, 77 Goodell Street, Suite 540, Buffalo, NY 14203.

References

1. Gupta PB, Onder TT, Jiang G, Tao K, Kuperwasser C, Weinberg RA, et al. Identification of selective inhibitors of cancer stem cells by high-throughput screening. Cell. 2009 Aug 21;138(4):645–59.
2. Ashburn TT, Thor KB. Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discov. 2004 Aug;3(8):673–83.
3. Nelson SJ, Oprea TI, Ursu O, Bologa CG, Zaveri A, Holmes J, et al. Formalizing drug indications on the road to therapeutic intent. J Am Med Inform Assoc. 2017 Nov 1;24(6):1169–72.
4. Minie M, Chopra G, Sethi G, Horst J, White G, Roy A, et al. CANDO and the infinite drug discovery frontier. Drug Discov Today. 2014 Sep;19(9):1353–63.
5. Sethi G, Chopra G, Samudrala R. Multiscale modelling of relationships between protein classes and drug behavior across all diseases using the CANDO platform. Mini Rev Med Chem. 2015;15(8):705–17.
6. Chopra G, Samudrala R. Exploring Polypharmacology in Drug Discovery and Repurposing Using the CANDO Platform. Curr Pharm Des. 2016;22(21):3109–23.
7. Chopra G, Kaushik S, Elkin PL, Samudrala R. Combating Ebola with Repurposed Therapeutics Using the CANDO Platform. Molecules [Internet]. 2016 Nov 25;21(12). Available from: http://dx.doi.org/10.3390/molecules21121537
8. Mangione W, Samudrala R. Identifying Protein Features Responsible for Improved Drug Repurposing Accuracies Using the CANDO Platform: Implications for Drug Design. Molecules [Internet]. 2019 Jan 4;24(1). Available from: http://dx.doi.org/10.3390/molecules24010167
9. Schuler J, Samudrala R. Fingerprinting CANDO: Increased Accuracy with Structure and Ligand Based Shotgun Drug Repurposing [Internet]. bioRxiv. 2019 [cited 2019 Apr 11]. p. 591123. Available from: https://www.biorxiv.org/content/10.1101/591123v1.abstract
10. Falls Z, Mangione W, Schuler J, Samudrala R. Exploration of interaction scoring criteria in the CANDO platform [Internet]. bioRxiv. 2019 [cited 2019 Apr 11]. p. 591578. Available from: https://www.biorxiv.org/content/10.1101/591578v1.abstract
11. Lipscomb CE. Medical Subject Headings (MeSH). - PubMed - NCBI [Internet]. [cited 2019 Apr 10]. Available from: https://www.ncbi.nlm.nih.gov/pubmed/10928714
12. Cowell LG, Smith B. Infectious Disease Ontology [Internet]. Infectious Disease Informatics. 2010. p. 373–95. Available from: http://dx.doi.org/10.1007/978-1-4419-1327-2_19
13. Qi D, King RD, Hopkins AL, Bickerton GRJ, Soldatova LN. An ontology for description of drug discovery investigations. J Integr Bioinform [Internet]. 2010 Mar 25;7(3). Available from: http://dx.doi.org/10.2390/biecoll-jib-2010-126
14. Schuffenhauer A, Zimmermann J, Stoop R, van der Vyver J-J, Lecchini S, Jacoby E. An ontology for pharmaceutical ligands and its application for in silico screening and library design. J Chem Inf Comput Sci. 2002 Jul;42(4):947–55.
15. Gómez-Pérez A, Martínez-Romero M, Rodríguez-González A, Vázquez G, Vázquez-Naya JM. Ontologies in medicinal chemistry: current status and future challenges. Curr Top Med Chem. 2013;13(5):576–90.
16. Smith B, Ashburner M, Rosse C, Bard J, Bug W, Ceusters W, et al. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol. 2007 Nov;25(11):1251–5.
17. Smith B, Ceusters W. Ontological realism: A methodology for coordinated evolution of scientific ontologies. Appl Ontol. 2010 Jan 1;5(3-4):139–88.
18. Arp R, Smith B, Spear AD. Building Ontologies with Basic Formal Ontology [Internet]. 2015. Available from: http://dx.doi.org/10.7551/mitpress/9780262527811.001.0001
19. Vempati U, Visser U, Abeyruwan S, Sakurai K, Przydzial M, Chung C, Smith RP, Koleti A, Mader C, Lemmon VP, Schürer SC. Bioassay Ontology to Describe High-Throughput Screening Assays and their Results. In University of Miami; 2011. p. 209–16.
20. Schürer SC, Vempati U, Smith R, Southern M, Lemmon V. BioAssay ontology annotations facilitate cross-analysis of diverse high-throughput screening data sets. J Biomol Screen. 2011 Apr;16(4):415–26.
21. Visser U, Abeyruwan S, Vempati U, Smith RP, Lemmon V, Schürer SC. BioAssay Ontology (BAO): a semantic description of bioassays and high-throughput screening results. BMC Bioinformatics. 2011 Jun 24;12(1):257.
22. Vempati UD, Przydzial MJ, Chung C, Abeyruwan S, Mir A, Sakurai K, et al. Formalization, Annotation and Analysis of Diverse Drug and Probe Screening Assay Datasets Using the BioAssay Ontology (BAO). PLoS One. 2012 Nov 14;7(11):e49198.
23. Abeyruwan S, Vempati UD, Küçük-McGinty H, Visser U, Koleti A, Mir A, et al. Evolving BioAssay Ontology (BAO): modularization, integration and applications. J Biomed Semantics. 2014 Jun 3;5(1):S5.
24. Bandrowski A, Brinkman R, Brochhausen M, Brush MH, Bug B, Chibucos MC, et al. The Ontology for Biomedical Investigations. PLoS One. 2016 Apr 29;11(4):e0154556.
25. Przydzial MJ, Bhhatarai B, Koleti A, Vempati U, Schürer SC. GPCR ontology: development and application of a G protein-coupled receptor pharmacology knowledge framework. Bioinformatics. 2013 Dec 15;29(24):3211–9.
26. Lin Y, Mehta S, Küçük-McGinty H, Turner JP, Vidovic D, Forlin M, et al. Drug target ontology to classify and integrate drug discovery data. J Biomed Semantics. 2017 Nov 9;8(1):50.
27. Hogan WR, Hanna J, Joseph E, Brochhausen M. Towards a Consistent and Scientifically Accurate Drug Ontology. CEUR Workshop Proc. 2013;1060:68–73.
28. Hanna J, Joseph E, Brochhausen M, Hogan WR. Building a drug ontology based on RxNorm and other sources. J Biomed Semantics. 2013 Dec 18;4(1):44.
29. Hanna J, Bian J, Hogan WR. An accurate and precise representation of drug ingredients. J Biomed Semantics. 2016 Apr 19;7:7.
30. Hogan WR, Hanna J, Hicks A, Amirova S, Bramblett B, Diller M, et al. Therapeutic indications and other use-case-driven updates in the drug ontology: anti-malarials, anti-hypertensives, opioid analgesics, and a large term request. J Biomed Semantics. 2017 Mar 3;8(1):10.
31. Ong E, Xiang Z, Zhao B, Liu Y, Lin Y, Zheng J, et al. Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration. Nucleic Acids Res. 2017 Jan 4;45(D1):D347–52.
32. Scheuermann RH, Ceusters W, Smith B. Toward an ontological treatment of disease and diagnosis. Summit Transl Bioinform. 2009 Mar 1;2009:116–20.
33. Fine JA, Konc J, Samudrala R, Chopra G. CANDOCK: Chemical atomic network based hierarchical flexible docking algorithm using generalized statistical potentials [Internet]. Available from: http://dx.doi.org/10.1101/442897
34. Brown SH, Elkin PL, Rosenbloom ST, Husser C, Bauer BA, Lincoln MJ, et al. VA National Drug File Reference Terminology: a cross-institutional content coverage study. Stud Health Technol Inform. 2004;107(Pt 1):477–81.
35. Donnelly K. SNOMED-CT: The advanced terminology and coding system for eHealth. Stud Health Technol Inform. 2006;121:279–90.
36. Hastings J, Owen G, Dekker A, Ennis M, Kale N, Muthukrishnan V, et al. ChEBI in 2016: Improved services and an expanding collection of metabolites. Nucleic Acids Res. 2016 Jan 4;44(D1):D1214–9.
37. Anatomical Therapeutic Chemical Classification System (WHO) [Internet]. The SAGE Encyclopedia of Pharmacology and Society. Available from: http://dx.doi.org/10.4135/9781483349985.n37
38. Angell M, Kassirer JP. Alternative medicine--the risks of untested and unregulated remedies. N Engl J Med. 1998 Sep 17;339(12):839–41.
39. Weissmann G. Aspirin. Sci Am. 1991 Jan;264(1):84–90.
40. Santos R, Ursu O, Gaulton A, Bento AP, Donadi RS, Bologa CG, et al. A comprehensive map of molecular drug targets. Nat Rev Drug Discov. 2017 Jan;16(1):19–34.
41. Sofia MJ, Bao D, Chang W, Du J, Nagarathnam D, Rachakonda S, et al. Discovery of a β-d-2'-deoxy-2'-α-fluoro-2'-β-C-methyluridine nucleotide prodrug (PSI-7977) for the treatment of hepatitis C virus. J Med Chem. 2010 Oct 14;53(19):7202–18.
42. Tatonetti NP, Ye PP, Daneshjou R, Altman RB. Data-driven prediction of drug effects and interactions. Sci Transl Med. 2012 Mar 14;4(125):125ra31.
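
As a purely illustrative aside, the distinctions argued for in the text (disposition versus function, treatment as a process that only counts once the disease realization has actually changed toward homeostasis, and the prodrug case) could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions: the class names, attribute names, and string labels are hypothetical and are not terms or identifiers from DrOn, OGMS, BFO, or CANDO, and the "active form" of sofosbuvir is left unspecified.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Disposition:
    """A realizable entity such as 'treat headache' (labels are illustrative)."""
    target: str                        # disease or condition the disposition concerns
    evolved_or_designed: bool = False  # assumption: a flag standing in for 'evolved or designed for'

    @property
    def is_function(self) -> bool:
        # Per the text: a function is a disposition the bearer evolved
        # or was designed (e.g. manufactured) to have.
        return self.evolved_or_designed


@dataclass
class ScatteredMolecularAggregate:
    """Hypothetical stand-in for an SMA as the bearer of treatment dispositions."""
    molecule_type: str
    dispositions: List[Disposition] = field(default_factory=list)
    # Prodrug case: the dosed aggregate is converted into an aggregate of another type.
    converts_to: Optional["ScatteredMolecularAggregate"] = None

    def active_form(self) -> "ScatteredMolecularAggregate":
        # Follow the conversion chain to the aggregate that exists in the body.
        sma = self
        while sma.converts_to is not None:
            sma = sma.converts_to
        return sma

    def disposition_for(self, target: str) -> Optional[Disposition]:
        for d in self.dispositions:
            if d.target == target:
                return d
        return None


def is_treatment(sma: ScatteredMolecularAggregate, disease: str,
                 sufficient_amount: bool, changed_toward_homeostasis: bool) -> bool:
    """A process counts as treatment only if the disposition exists in the
    (converted) aggregate, the amount suffices, and the disease realization
    was in fact changed toward homeostasis."""
    bearer = sma.active_form()
    return bool(bearer.disposition_for(disease) is not None
                and sufficient_amount
                and changed_toward_homeostasis)


# Aspirin manufactured only for headaches: one function, one mere disposition.
aspirin = ScatteredMolecularAggregate("aspirin", [
    Disposition("headache", evolved_or_designed=True),
    Disposition("heart attack consequences", evolved_or_designed=False),
])

# Prodrug: the dosed aggregate bears no treatment disposition itself.
active = ScatteredMolecularAggregate("active form of sofosbuvir (unspecified)",
                                     [Disposition("hepatitis C", evolved_or_designed=True)])
sofosbuvir = ScatteredMolecularAggregate("sofosbuvir", converts_to=active)
```

In this sketch an 'unsuccessful treatment' corresponds to `is_treatment(...)` returning `False` because `changed_toward_homeostasis` is false, which mirrors the claim that such a process is no treatment at all; whether a flag is the right way to represent 'evolved or designed for' is a modelling choice made only for illustration.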