=Paper= {{Paper |id=Vol-1428/BDM2I_2015_paper_10 |storemode=property |title=Formalizing Knowledge and Evidence about Potential Drug-drug Interactions |pdfUrl=https://ceur-ws.org/Vol-1428/BDM2I_2015_paper_10.pdf |volume=Vol-1428 |dblpUrl=https://dblp.org/rec/conf/semweb/SchneiderBRCHMN15 }} ==Formalizing Knowledge and Evidence about Potential Drug-drug Interactions== https://ceur-ws.org/Vol-1428/BDM2I_2015_paper_10.pdf

Formalizing knowledge and evidence about
potential drug-drug interactions

Jodi Schneider (1), Mathias Brochhausen (2), Samuel Rosko (1), Paolo Ciccarese (3, 6),
William R. Hogan (4), Daniel Malone (5), Yifan Ning (1), Tim Clark (6), and
Richard D. Boyce (1)

(1) Department of Biomedical Informatics, University of Pittsburgh,
    {jos188, scr25, yin2, rdb20}@pitt.edu
(2) Department of Biomedical Informatics, University of Arkansas for Medical Sciences,
    mbrochhausen@uams.edu
(3) Innovation Lab at PerkinElmer, paolo.ciccarese@gmail.com
(4) University of Florida, hoganwr@ufl.edu
(5) College of Pharmacy, The University of Arizona, malone@pharmacy.arizona.edu
(6) Massachusetts General Hospital and Harvard Medical School,
    tim clark@harvard.edu

Abstract. Potential drug-drug interactions (PDDIs) are a significant
source of preventable drug-related harm. One contributing factor is that
there is no standard way to represent PDDI knowledge claims and associated
evidence in a computable form. The research we present in this paper
addresses this problem by creating a new version of the Drug Interaction
Knowledge Base, with scalable, interlinkable repositories for PDDI evidence
and PDDI knowledge claims.

Keywords: Linked Data, drug-drug interactions, evidence bases,
Micropublications, Nanopublications, knowledge bases

1 Introduction
A challenging area of focus for patient safety is the management of potential
drug-drug interactions (PDDIs). These are defined as co-prescription or
co-administration of two drugs known to interact, which potentially exposes the
patient to adverse drug events [9]. PDDIs are a significant source of preventable
drug-related harm: according to a recent review, clinically important events
attributable to PDDI exposure occur in 5.3% to 14.3% of inpatients, and are
responsible for 0.02% to 0.17% of the 129 million emergency department visits that
occur each year [12] (see footnote 1). Unfortunately, most drug information sources
disagree substantially in their guidance about specific PDDIs [1, 16, 13, 2].
Addressing this is urgent as United States healthcare organizations consider PDDI
screening in their strategies to achieve the effective use of electronic health
records.

Footnote 1: http://www.cdc.gov/nchs/fastats/ervisits.htm

There are both technical and social factors underlying the disagreement that
exists across drug information sources [14]. As Figure 1 shows, evidence that
might be relevant for establishing PDDI knowledge claims is distributed across
several sources including product labeling, the scientific literature, and case
reports. Each source provides complementary evidence that editors of drug
information resources (public sources [2] or proprietary sources such as
Micromedex, Epocrates, and Medscape) must synthesize. A major social factor
underlying disagreement is that drug information editors have different criteria
for assessing evidence. Fortunately, two different conference series have brought
leading drug information editors together to discuss a standard set of methods for
assessing evidence [14, 10].

[Figure 1: diagram. Nodes: Pre-market studies, Post-market studies, Clinical
experience, Scientific literature, Product labeling, Drug compendia. Edge labels:
"Reported in", "Rarely reported in", "Source for". Annotation: Drug compendia
synthesize PDDI evidence into knowledge claims but (a) may fail to include
important evidence and (b) disagree on whether specific evidence items can support
or refute PDDI knowledge claims.]

Fig. 1. Editors of drug information resources might seek evidence for or against
potential drug-drug interactions from numerous sources. Different information is
reported in each type of source, making synthesis necessary.

A major technical factor yet to be addressed is that there is currently no
standard way to represent PDDI knowledge claims and associated evidence in a
computable form. As a result, drug information editors resort to ad hoc
information retrieval methods that can yield different sets of evidence to
assess [10]. The research we present in this paper addresses this problem by
creating scalable, interlinkable repositories for both PDDI evidence and PDDI
knowledge claims. This paper describes our new approach. In Section 2, we
outline requirements. In Section 3, we discuss the technical details. In Section 4,
we present a benchmark analysis that tests the ability of the new approach to
scale. After discussion, we conclude the paper.

2 Background and requirements

In prior work, we created the original Drug Interaction Knowledge Base
(DIKB-old) [3, 4]. The DIKB (see footnote 2) is an evidence-focused knowledge base
designed to support pharmacoepidemiology and clinical decision support. It
contains quantitative and qualitative knowledge claims about drug mechanisms and
pharmacokinetic drug-drug interactions for over 60 drugs.
Prior work on the DIKB-old focused on developing an evidential approach to
representing the evidence associated with a scientific claim. The system treats
the evidence board as a socio-technical reasoning system that manages both a
knowledge base and an evidence base. The knowledge base holds PDDI knowledge
claims, while the evidence base stores information artifacts that can be used to
support or challenge those claims. PDDI knowledge claims may be direct (e.g.,
“drug X interacts with drug Y”) or inferred from pharmacological properties
(e.g., “drug X inhibits enzyme Q which is important for the clearance of drug Y
from the body”).
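The inferred kind of PDDI knowledge claim can be illustrated with a single rule. The following is a minimal sketch, assuming one simplified rule (a drug that inhibits an enzyme potentially interacts with any other drug that is a substrate of that enzyme). The names `Assertion` and `infer_pddis`, the predicate strings, and the tuple layout are hypothetical; this is not the rule-based theory of [4], which is considerably richer.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    """A mechanistic knowledge claim: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def infer_pddis(assertions):
    """Return (precipitant, object_drug, enzyme) tuples for inferred PDDIs.

    Simplified rule (an assumption for illustration): if drug X inhibits
    enzyme Q and drug Y is a substrate of Q, then X potentially interacts
    with Y.
    """
    inhibitors = [(a.subject, a.obj) for a in assertions if a.predicate == "inhibits"]
    substrates = [(a.subject, a.obj) for a in assertions if a.predicate == "substrate_of"]
    inferred = []
    for precipitant, enzyme in inhibitors:
        for object_drug, target in substrates:
            # A drug is not reported as interacting with itself.
            if target == enzyme and object_drug != precipitant:
                inferred.append((precipitant, object_drug, enzyme))
    return inferred


claims = [
    Assertion("erythromycin", "inhibits", "CYP3A4"),
    Assertion("simvastatin", "substrate_of", "CYP3A4"),
]
print(infer_pddis(claims))
```

Note that the result is a statement about a potential interaction; nothing in the sketch asserts that an interaction has occurred in any patient.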
In prior work [3, 4], all evidence was collected and entered by an evidence
board consisting of an informaticist and a minimum of two drug experts. The
board used the following process to manage the evidence and knowledge base
components:

1. All members of the board select drugs of interest. This determines the set
of PDDI knowledge claims to be investigated.
2. The informaticist conducts a systematic search for evidence that might sup-
port or refute the pre-determined PDDI knowledge claims.
3. Retrieved items are filtered by applying study inclusion criteria.
4. Evidence items that meet inclusion criteria are entered into the system where
they are linked to specific PDDI knowledge claims and any evidence use
assumptions (knowledge claims that must be true for the evidence to hold).
5. A truth value for each knowledge claim is determined based on belief criteria.

Experience with the DIKB-old revealed a great need for improvements to the
system that would make this process more efficient. First, a substantial amount
of time was spent on reconciling and integrating information from various sources
(Figure 1). Second, decision rationales were not recorded in a computable form,
and the evidence board did not have a process in place to keep up with relevant
new evidence. Furthermore, the DIKB-old was ontologically informal, failed to
adopt common biomedical ontology terms (see footnote 3), and did not distinguish
drug and enzyme classes from individuals. This hindered automated reasoning that
integrated external knowledge sources and resulted in treating PDDIs the same as
observed drug-drug interactions. In the new system we wished to resolve these
issues. We also wished to retain the ability to compute with a logical
representation of drug mechanism knowledge claims, using a rule-based theory of
how to infer PDDIs from metabolic mechanistic knowledge of how drugs interact [4].

Footnote 2: When we do not need to distinguish between the old (‘DIKB-old’) and
new (‘new DIKB’) versions, we simply mention ‘DIKB’.
Footnote 3: For example, the DIKB-old used the predicate ‘substrate of’ to
represent the metabolic process of xenobiotic catalysis. However, this predicate
was defined without reference to the formally defined biological process (such as
that provided by the Gene Ontology).

We summarize these as three requirements for the new DIKB:

R1 Create a maintainable structure that supports evidence entry of data,
   methods, and materials from multiple sources on an ongoing basis.
R2 Create computable, logical representations of drug mechanism knowledge
   claims.
R3 Link to biological processes while also carefully distinguishing between a
   drug-drug interaction (an actual occurrence in a patient) and a potential
   drug-drug interaction (an information content entity that may exist because
   of an observation or inference).

Our approach to addressing these requirements is as follows:

Addressing R1: We adopt the emerging Micropublications (MP) [6] model for
   literature integration, using ‘argument graphs’ to represent published claims
   as formal assertions linked to primary data and resources.
Addressing R2: We extended the MP ontology to add two new properties,
   MP:formalizedAs/MP:formalizes, to enable natural language claims
   to be linked to useful logical formalizations.
Addressing R3: To stress that potential drug-drug interactions are information
   artifacts, we use a new ontology called DIDEO [5], which has several
   advantages. DIDEO:
   (a) Reuses identifiers from existing ontologies (e.g., ChEBI, PRO) that
       represent biological entities and processes;
   (b) Differentiates between the representation (statements about drugs and
       drug-drug interactions) and the represented (actual drug-drug
       interactions);
   (c) Prevents unwanted existential import (further explained in Section 3.3
       below); and
   (d) Distinguishes between the type of a drug or enzyme and portions of a
       specific drug or enzyme, by using punning.

3 Technical implementation

3.1 Create a maintainable structure that supports evidence entry
of data, methods, and materials from multiple sources

We used micropublications to create a structure that supports evidence entry of
data, methods, and materials from multiple sources [15]. We now represent PDDI
knowledge claims and supporting evidence as queryable RDF statements (see
footnote 4) constructed using the Micropublication ontology (MP) [6]. PDDI
knowledge claims and evidence were transformed from the DIKB-old model into the
new one using Python scripts. Drug identifiers were converted to ChEBI identifiers
to enable the use of DIDEO. So far, the mapping has been completed for 70% of the
drugs that had data from clinical studies or mechanistic experiments. We envision
that additional DIKB micropublications could be created by multiple parties,
including evidence boards, and potentially the original authors, as we describe
in Section 5.

Footnote 4: Queryable at
http://purl.org/net/nlprepository/swat-4-med-safety-sparql-endpoint

Figure 2 shows the generic form of a DIKB micropublication graph using
the example of the erythromycin-simvastatin interaction. MP has rigorously
defined ontology classes that support the DIKB evidence curation process
discussed above. In MP, the primary object of interest is the claim. A claim is
supported by methods, materials, and data:
– MP:Claim, a text string representing a scientific claim.
– MP:Method, representing a scientific method.
– MP:Materials, for materials, such as study participants and drugs.
– MP:Data, such as the area under the concentration curve (AUC).
These classes are used for entering evidence; later, the evidence is used to
determine truth values for the claims that it supports.
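The structure described above can be sketched as a small set of triples. This is a minimal illustration in plain Python, not the authors' conversion scripts. The `mp:` namespace URI, the `example.org` namespace for local resources, the use of `rdfs:label` for the claim text, and the direction of the support chain (source, then materials, then method, then data, then claim) are all assumptions; the property names `mp:argues`, `mp:qualifiedBy`, and `mp:supports` and the identifiers come from Figure 2.

```python
MP = "http://purl.org/mp/"               # assumed namespace of the MP ontology
OBO = "http://purl.obolibrary.org/obo/"  # OBO namespace for ChEBI and DIDEO terms
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/dikb/"          # hypothetical namespace for local resources


def build_micropublication(claim_text, qualifier_ids, source_uri):
    """Return (subject, predicate, object) triples for one micropublication."""
    mp_uri = EX + "mp1"
    claim = EX + "claim1"
    data, method, materials = EX + "data1", EX + "method1", EX + "materials1"
    triples = [
        (mp_uri, MP + "argues", claim),
        (claim, RDFS_LABEL, claim_text),  # choice of label predicate is an assumption
    ]
    # Semantic qualifiers tie the free-text claim to ontology terms.
    for term in qualifier_ids:
        triples.append((claim, MP + "qualifiedBy", OBO + term))
    # Assumed support chain: source -> materials -> method -> data -> claim.
    chain = [source_uri, materials, method, data, claim]
    for supporter, supported in zip(chain, chain[1:]):
        triples.append((supporter, MP + "supports", supported))
    return triples


graph = build_micropublication(
    "erythromycin increases the AUC of simvastatin",
    ["CHEBI_48923", "CHEBI_9150", "DIDEO_00000000"],
    "http://dx.doi.org/10.1016/S0009-9236(98)90151-5",
)
for triple in graph:
    print(triple)
```

Because the qualifiers are an unordered set attached to the claim, nothing in this graph says which drug is the precipitant and which is the object.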

[Figure 2: micropublication graph. MP 1 argues (mp:argues) Claim 1,
"erythromycin increases the AUC of simvastatin". The claim is qualified by
(mp:qualifiedBy) obo:CHEBI_48923, obo:CHEBI_9150, and obo:DIDEO_00000000.
Data 1, Method 1, Materials 1, and the source
http://dx.doi.org/10.1016/S0009-9236(98)90151-5 are connected by mp:supports
edges.]

Fig. 2. DIKB micropublication graph for the erythromycin-simvastatin interaction.

The process for managing the evidence base and knowledge base described
in Section 2 includes assessing the truth value of each PDDI knowledge claim
using belief criteria. Operationally, the evidence board uses labels from a
taxonomy of evidence types (see footnote 5) to tag each evidence item as it is
entered into the evidence base. The board then decides which evidence types are
credible for specific types of PDDI knowledge claims: this specifies a belief
criterion.

Footnote 5:
http://purl.org/net/drug-interaction-knowledge-base/evidence-types-and-inclusion-criteria

As an example, the evidence board might decide that, to support a claim that
a drug is a substrate of an enzyme, only clinical drug-drug interaction studies
are admissible. This would become a belief criterion for all ‘substrate of’ claims.
To implement a belief criterion in the new DIKB, the evidence base is queried
to find all PDDI knowledge claims that have at least one supporting evidence
item meeting the criterion. The resulting claims are assigned the value of ‘True’.
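The belief-criterion check can be sketched in a few lines. This is a minimal in-memory illustration, assuming evidence items are simple records; in the new DIKB the same selection is performed by querying the RDF evidence base. The `EvidenceItem` fields, the function name, and the evidence-type labels (`clinical_ddi_study`, `in_vitro_experiment`) are hypothetical and are not taken from the DIKB taxonomy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceItem:
    claim_id: str       # the PDDI knowledge claim this item is linked to
    claim_type: str     # e.g. "substrate_of"
    evidence_type: str  # label from the evidence-type taxonomy
    supports: bool      # True if the item supports the claim, False if it refutes it


def claims_meeting_criterion(evidence_base, belief_criteria):
    """Return ids of claims with at least one admissible supporting item."""
    accepted = set()
    for item in evidence_base:
        # Evidence types the board deems credible for this type of claim.
        admissible = belief_criteria.get(item.claim_type, set())
        if item.supports and item.evidence_type in admissible:
            accepted.add(item.claim_id)
    return accepted


# Hypothetical criterion: only clinical DDI studies count for 'substrate of' claims.
belief_criteria = {"substrate_of": {"clinical_ddi_study"}}
evidence_base = [
    EvidenceItem("simvastatin substrate_of CYP3A4", "substrate_of", "clinical_ddi_study", True),
    EvidenceItem("drugA substrate_of enzymeB", "substrate_of", "in_vitro_experiment", True),
]
truth_values = {
    claim: True for claim in claims_meeting_criterion(evidence_base, belief_criteria)
}
print(truth_values)
```

Changing a belief criterion only changes the `belief_criteria` mapping; the evidence items themselves are left untouched, so truth values can be recomputed whenever the board revises its standards.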

3.2 Create computable, logical representations
PDDI knowledge claims mention specific entities such as drugs, drug metabolites,
enzymes, and biological pathways whose relationships with each other are more
generally modeled in a rule-based theory that infers PDDIs [4]. Sources external
to the DIKB provide additional formalized knowledge about these entities. For
example, the Gene Ontology provides cellular location and molecular function for
the enzyme CYP3A4; this is relevant when the evidence board seeks information
about gene expression and about enzyme metabolization.
The spans of unstructured text in MP:Claim are not inherently computable
entities, and the semantic qualifiers (MP:qualifiedBy) cannot specify the order
of the drugs (i.e., distinguish the object drug from the precipitant drug).
Therefore, we extended the MP ontology to add two new properties,
MP:formalizedAs/MP:formalizes, that enable natural language claims represented
as MP:Claim to be linked to their logical representation.
RDF is also the language chosen for the formalization of MP:Claim resources,
so that a single query language (i.e., SPARQL) can be used to retrieve
information from the whole evidence base. We chose to represent the logical form
of knowledge claims using OWL for two reasons. First, OWL provides classes and
properties that enable the representation of logical statements in RDF. Second,
logical statements written in OWL can be checked for logical consistency, and
new inferences can be derived from them, by a reasoner such as HermiT [7].
We chose to formalize claims using the Nanopublication (NP) [8] ontology
because:
1. NP provides a class called NP:Assertion that can hold any RDF graph,
including logical statements written in OWL.
2. OWL logical statements stored as an NP:Assertion can be integrated into
full nanopublications that combine the NP:Assertion, the provenance of the
assertion, and the provenance of the nanopublication into a single publishable
and citable entity.
A nanopublication represents the logical structure of a claim as an RDF
graph. Like micropublications, nanopublications are publishable and citable
entities. Their citability and use of provenance enable us to make the evidence
review process transparent and auditable. The uptake of nanopublications by the
wider community suggests that nanopublication is a relevant publishing mechanism
for reconsumption by others (see footnote 6). Unlike micropublications,
nanopublications have no explicit evidence structure and do not support claim
conflict. They are therefore complementary to micropublications, which provide
these missing features.

Footnote 6: One measure of uptake is the variety of authors of papers using
nanopublications; see the bibliography at
http://nanopub.org/wordpress/?page_id=638. Another is the size and geographic
distribution of current nanopublication datasets: see [11] Table 1 and Figure 3,
respectively.
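The link between a micropublication claim and its formalization can be sketched as a set of quads (triples plus a named graph). This is a minimal illustration under several assumptions: the `mp:` namespace URI and the `example.org` resource names are hypothetical, the assertion is flattened to plain triples between ontology terms rather than written as OWL axioms, and storing the MP:formalizedAs/MP:formalizes links in the provenance graph is a choice made here for illustration only. The `np:` terms are those of the Nanopublication schema, and the OBO identifiers are the ones used in this paper's figures.

```python
NP = "http://www.nanopub.org/nschema#"
MP = "http://purl.org/mp/"               # assumed namespace of the MP ontology
OBO = "http://purl.obolibrary.org/obo/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/dikb/"          # hypothetical namespace for local resources


def build_nanopublication(np_name, claim_uri, assertion_triples):
    """Return (subject, predicate, object, graph) quads for one nanopublication."""
    np_uri = EX + np_name
    head = np_uri + "/head"
    assertion = np_uri + "/assertion"
    provenance = np_uri + "/provenance"
    pubinfo = np_uri + "/pubinfo"
    # The head graph names the three parts of the nanopublication.
    quads = [
        (np_uri, RDF_TYPE, NP + "Nanopublication", head),
        (np_uri, NP + "hasAssertion", assertion, head),
        (np_uri, NP + "hasProvenance", provenance, head),
        (np_uri, NP + "hasPublicationInfo", pubinfo, head),
    ]
    # The assertion graph holds the formalized claim.
    quads += [(s, p, o, assertion) for s, p, o in assertion_triples]
    # Link the natural language claim and its formalization in both directions.
    quads.append((claim_uri, MP + "formalizedAs", assertion, provenance))
    quads.append((assertion, MP + "formalizes", claim_uri, provenance))
    return quads


# Triple direction records the roles: erythromycin (precipitant) acts on CYP3A4,
# which catalyzes a reaction involving simvastatin (object drug).
formal_claim = [
    (OBO + "CHEBI_48923", OBO + "RO_0002449", OBO + "PR_000006130"),
    (OBO + "PR_000006130", OBO + "DIDEO_00000096", OBO + "CHEBI_9150"),
]
nanopub = build_nanopublication("np1", EX + "claim1", formal_claim)
for quad in nanopub:
    print(quad)
```

Unlike the unordered MP:qualifiedBy qualifiers, the direction of each triple in the assertion graph makes the roles of the two drugs explicit.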

3.3 Handling reasonable extrapolation

Reasonable extrapolation is an important way to infer a PDDI. In contrast to
drug-drug interactions (DDIs) that are based on observing an actual drug-drug
interaction in some patient, inferred PDDIs based on reasonable extrapolation
might not actually occur in reality. Since we do not know whether a PDDI occurs,
we cannot assume the existence of any instance of a drug interaction. To model
this correctly, we differentiate between actual drug interactions and statements
about PDDIs using the DIDEO ontology [5].

[Figure: erythromycin (obo:CHEBI_48923), "molecularly decreases activity"
(obo:RO_0002449), CYP3A4 (obo:PR_000006130), "catalyzes a Phase I or Phase II
enzymatic reaction involving" (obo:DIDEO_00000096), simvastatin (obo:CHEBI_9150).
The label "inhibits" also appears in the figure.]