<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Easy-to-use semantic search of pharmacological data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Guillermo Vega-Gorgojo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Slaughter</string-name>
          <email>laura.slaughter@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Informatics, University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Oslo University Hospital</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Patient safety and treatment effectiveness can be improved by introducing pharmacogenomic testing into current clinical practice. Linked Open Data (LOD) resources such as DrugBank and SIDER are readily available but are not integrated and cannot be easily exploited by clinical health workers. To overcome these limitations, we have set up a novel pharmacological search facility that combines data from multiple RDF sources and uses PepeSearch to formulate queries. We demonstrate this search system, which is currently being tested with clinicians as a means of exploring the knowledge contained in these databases, creating flexible queries and supporting decision-making.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Access to evidence-based pharmaceutical knowledge with accompanying genetic
information must be incorporated into the systems currently used in healthcare.
By introducing pharmacogenomic testing into current clinical practice both
patient safety and treatment effectiveness can be improved. However, Taber &amp;
Dickinson [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] have shown that physicians lack knowledge about the topic and
need educational as well as informational resources. Ways of making this information
available have been studied in the design of Computerized Provider
Order Entry (CPOE) systems that combine context-sensitive information with
alerts [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Essential work by Romagnoli et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] has focused on information
needs of hospital pharmacists, who have the complex job of handling difficult
patient cases and medication reconciliation tasks.
      </p>
      <p>
        Physicians and hospital pharmacists can benefit from the use of a
pharmacological knowledge base composed of Linked Open Data (LOD) resources and
additional information from FDA labelling. RDF datasets such as DrugBank and
the Side Effect Resource (SIDER) are readily available, but
unfortunately (1) the databases are not integrated, and (2) search facilities are inadequate.
To overcome these limitations we are working on providing an easy-to-use search
facility for pharmacological data. In this work, we have chosen to focus on
the information needs of physicians and pharmacists, i.e. clinical prescription. Other
potential use cases such as drug discovery (see [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) will expand this work in
future efforts. Using input from [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], the current work focuses on integrating and
searching multiple open drug Linked Data sources to meet the general information needs
reported by at least 25% of the pharmacists in that study. These information needs
are listed in Fig. 1.
      </p>
    </sec>
    <sec id="sec-2">
      <title>A Novel Pharmacological Search Facility</title>
      <p>
        We aim to create a pharmacological resource based on existing LOD drug
databases that can be used to fulfil the information needs of pharmacists and
physicians. As a foundation we have employed DrugBank, a comprehensive repository
of drug, drug-target and drug action information based on extensive literature
surveys and curated by experts [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. DrugBank is one of the most popular
resources in the pharmacological domain due to its wide coverage and to the links
to other well-known databases such as ATC, ChEBI, PubChem and KEGG.
More importantly, DrugBank provides relevant medication information to
pharmacists (see Fig. 1), and Bio2RDF already offers an RDF version of DrugBank.<sup>1</sup>
      </p>
      <p>
        In addition to DrugBank, we have employed the Side Effect Resource (SIDER)
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] to obtain information about drug safety and drug indications (see Fig. 1).
SIDER extracts indications and adverse drug reactions (ADRs) of marketed
medicines from package inserts and the biomedical literature. Bio2RDF also
exposes an RDF version of SIDER,<sup>2</sup> thus facilitating the integration with
DrugBank. However, the two datasets employ different URIs, so we matched
drugs by name. This generated 1179 mapping triples that link
the drugs in the two databases.
      </p>
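      <p>
        The name-based matching described above can be illustrated with a minimal sketch. It is not the procedure actually used to produce the 1179 mapping triples: the example URIs and records are hypothetical, the normalization (lower-casing and whitespace collapsing) is an assumption, and owl:sameAs is only one possible choice of linking predicate.
      </p>
      <preformat><![CDATA[
```python
def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def name_mappings(drugbank: dict, sider: dict) -> list:
    """Pair URIs from the two datasets whose normalized drug names are equal.

    Both arguments map a drug URI to its name.
    """
    by_name = {normalize(name): uri for uri, name in drugbank.items()}
    pairs = []
    for uri, name in sider.items():
        match = by_name.get(normalize(name))
        if match is not None:
            pairs.append((match, uri))
    return pairs


def to_ntriples(pairs, predicate="http://www.w3.org/2002/07/owl#sameAs"):
    """Serialize the pairs as N-Triples lines (predicate is an assumption)."""
    return [f"<{s}> <{predicate}> <{o}> ." for s, o in pairs]


# Hypothetical records; real Bio2RDF URIs look different.
drugbank = {
    "http://example.org/drugbank/D1": "Clopidogrel",
    "http://example.org/drugbank/D2": "Warfarin",
}
sider = {
    "http://example.org/sider/S9": "clopidogrel ",
    "http://example.org/sider/S7": "Aspirin",
}
triples = to_ntriples(name_mappings(drugbank, sider))
print("\n".join(triples))
```
]]></preformat>
      <p>
        Exact matching on normalized names is simple but misses synonyms and brand names, so a mapping of this kind should be regarded as a lower bound on the true overlap.
      </p>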
      <p>
        Finally, our main goal was to include pharmacogenomic information, given
its importance in reducing the risk of adverse events and improving the
effectiveness of treatments [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. We have employed the list of FDA-approved drugs
with pharmacogenomic information in their labeling, including specific actions
to be taken based on the biomarker information [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Since this resource is not
available as LOD, we have translated this information into RDF and linked it
to the drug URIs in DrugBank. As a result of this work, we have set up a triple
store that integrates DrugBank, SIDER and FDA's pharmacogenomic data.
      </p>
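      <p>
        As a rough illustration of this translation step, the sketch below turns rows of a drug/biomarker table into N-Triples attached to DrugBank drug URIs. The column names, the namespace, the hasBiomarker property and the lookup table are all hypothetical; the vocabulary used in the actual triple store is not described here.
      </p>
      <preformat><![CDATA[
```python
import csv
import io

# Hypothetical namespace for the pharmacogenomic vocabulary.
PGX = "http://example.org/pgx/"


def rows_to_triples(csv_text, drug_uris):
    """Convert (drug, biomarker) rows into N-Triples linked to known drug URIs.

    drug_uris maps a lower-cased drug name to its URI. Rows whose drug is not
    in the lookup are skipped. Biomarker values are assumed to contain no
    characters that need escaping in an N-Triples literal.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        drug_uri = drug_uris.get(row["drug"].strip().lower())
        if drug_uri is None:
            continue
        biomarker = row["biomarker"].strip()
        triples.append(f'<{drug_uri}> <{PGX}hasBiomarker> "{biomarker}" .')
    return triples


# Toy input; the second row is a placeholder with no match in the lookup.
table = "drug,biomarker\nClopidogrel,CYP2C19\nPlaceholderdrug,XYZ\n"
lookup = {"clopidogrel": "http://example.org/drugbank/D1"}
pgx_triples = rows_to_triples(table, lookup)
print("\n".join(pgx_triples))
```
]]></preformat>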
      <p>
        As pharmacists and physicians need a search tool that allows them to easily
express their information needs without requiring knowledge of SPARQL or
RDF, we have employed PepeSearch [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for this purpose. PepeSearch is a portable
form-based search interface for SPARQL endpoints that allows the searcher to set
multiple constraints on any of the classes in the target dataset. For an arbitrary
RDF class, PepeSearch creates a form block in which datatype properties are
mapped to widget elements, e.g. text boxes for string literals or range sliders
for integers. In this way, a searcher can easily set restrictions on a class by
manipulating the visual elements of the form block. Beyond refining a single
class, PepeSearch allows the formulation of queries that involve multiple classes;
the user interface will include new form blocks for each class connected with an
object property to the selected class.
1 http://download.bio2rdf.org/release/3/drugbank/drugbank.html
2 http://download.bio2rdf.org/release/3/sider/sider.html
      </p>
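      <p>
        The mapping from datatype properties to widgets can be pictured with the following sketch. It is not PepeSearch code: the function, the dictionary layout and the "approval year" property are hypothetical, and only the two pairings mentioned above (strings to text boxes, integers to range sliders) are taken from the text.
      </p>
      <preformat><![CDATA[
```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Datatype-to-widget table containing only the two pairings named in the text.
WIDGETS = {
    XSD + "string": "text box",
    XSD + "integer": "range slider",
}


def form_block(class_label, datatype_properties):
    """Return a plain description of a form block for one RDF class.

    datatype_properties is a list of (property label, range datatype URI) pairs.
    Unknown datatypes fall back to a text box, which is an assumption of this sketch.
    """
    fields = [(label, WIDGETS.get(datatype, "text box"))
              for label, datatype in datatype_properties]
    return {"class": class_label, "fields": fields}


block = form_block("Drug", [
    ("name", XSD + "string"),
    ("approval year", XSD + "integer"),  # hypothetical property
])
print(block)
```
]]></preformat>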
      <p>
        We have thus set up a PepeSearch instance for querying our pharmacological
triple store at http://sws.ifi.uio.no/project/drugsearch/. Note that we
have pruned some of the classes employed in DrugBank and SIDER (such as
carrier information and polarizability) for the sake of simplicity. Our criterion
has been to only cover the information needs identified in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] (see Fig. 1).
      </p>
    </sec>
    <sec id="sec-3">
      <title>Demonstration</title>
      <p>
        To illustrate the operation of the resulting search facility, we will employ the
following information need: "obtain the drugs indicated for myocardial
infarction that elicit variable responses for patients with biomarker CYP2C19". It is
inspired by the pharmacogenomic study by Taber &amp; Dickinson [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and requires
the use of the three data sources of our pharmaceutical database: drugs
(DrugBank), indications (SIDER) and pharmacogenomic biomarkers (FDA).
Specifically, the interaction with PepeSearch can proceed as follows:
1. PepeSearch presents a list of the top classes available (see Fig. 2(a)).
2. The searcher selects "Drug".
3. PepeSearch presents a form block for the "Drug" class and a list of
collapsibles corresponding to related classes: "Indication", "Biomarker", etc.
4. The searcher sets the restrictions required for this search task: she expands
the "Indication" collapsible and fills in "myocardial infarction"; then she
expands the "Biomarker" collapsible and fills in "CYP2C19" (see Fig. 2(b)).
5. The searcher pushes the "Get results" button at the top right corner.
6. PepeSearch generates a SPARQL query that is sent to the endpoint.
7. PepeSearch prepares a tabular representation of the results (see Fig. 2(c)).
8. The searcher can navigate through the results by following the links in the
table, e.g. to obtain additional information about Clopidogrel.
      </p>
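      <p>
        To give an impression of step 6, the sketch below assembles a SPARQL query from the two restrictions set in step 4. It is an illustration under stated assumptions rather than the query PepeSearch actually generates: the namespace, the Drug class, the indication and biomarker properties, and the use of rdfs:label with a case-insensitive substring filter are all hypothetical.
      </p>
      <preformat><![CDATA[
```python
# Hypothetical vocabulary; the class and property URIs of the real triple store differ.
EX = "http://example.org/drugsearch/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def build_query(constraints):
    """Build a SPARQL SELECT query for drugs.

    constraints is a list of (variable name, property URI, search text) tuples;
    each one adds a related resource whose label must contain the search text.
    """
    lines = [
        "SELECT DISTINCT ?drug ?drugLabel WHERE {",
        f"  ?drug a <{EX}Drug> ;",
        f"        <{RDFS_LABEL}> ?drugLabel .",
    ]
    for var, prop, text in constraints:
        # Escape backslashes and quotes so the text is a valid SPARQL string literal.
        literal = text.lower().replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f"  ?drug <{prop}> ?{var} .")
        lines.append(f"  ?{var} <{RDFS_LABEL}> ?{var}Label .")
        lines.append(f'  FILTER(CONTAINS(LCASE(STR(?{var}Label)), "{literal}"))')
    lines.append("}")
    return "\n".join(lines)


query = build_query([
    ("indication", EX + "indication", "myocardial infarction"),
    ("biomarker", EX + "biomarker", "CYP2C19"),
])
print(query)
```
]]></preformat>
      <p>
        Each filled-in form field contributes one triple pattern and one filter, which is why adding a restriction in the form narrows the result table.
      </p>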
    </sec>
    <sec id="sec-4">
      <title>Discussion and Future Work</title>
      <p>We have introduced a search system for integrated LOD pharmaceutical data,
a promising resource that can provide clinicians with a means of exploring
the knowledge contained in these databases, creating flexible queries and
supporting decision-making. The benefits of this search tool include cost-effective
reuse of existing datasets, a simple form-based query interface, and a means to
express information needs across multiple fields for more precise results.</p>
      <p>
        The pharmacogenomic information needs of hospital pharmacists and clinicians
differ from those of researchers, yet the majority of datasets and
resources are geared towards serving research needs. Callahan et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
discussed difficulties with searching and integrating biological data in the
open-source Bio2RDF project. We have taken the first step towards reuse of datasets
for the "average hospital clinician". Our current research involves user testing of
the clinicians' ability to express their information needs using our search facility.
Future steps include the use of PharmGKB [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] as a key resource to
implement pharmacogenomics in real-world clinical practice.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This work has been funded by the BIGMED (IKT 259055), HealthInsight (NFR
247784/O70), Optique (FP7 GA 318338), and BYTE (FP7 GA 619551) projects.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Taber</surname>
            ,
            <given-names>K.A.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dickinson</surname>
            ,
            <given-names>B.D.</given-names>
          </string-name>
          :
          <article-title>Pharmacogenomic knowledge gaps and educational resource needs among physicians in selected specialties</article-title>
          .
          <source>Pharmacogenomics and Personalized Medicine</source>
          <volume>7</volume>
          (
          <year>2014</year>
          )
          <fpage>145</fpage>
          –
          <lpage>162</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Devine</surname>
            ,
            <given-names>E.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Overby</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abernethy</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCune</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tarczy-Hornoch</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Usability evaluation of pharmacogenomics clinical decision support aids and clinical knowledge resources in a computerized provider order entry system: a mixed methods approach</article-title>
          .
          <source>International Journal of Medical Informatics</source>
          <volume>83</volume>
          (
          <issue>7</issue>
          ) (
          <year>2014</year>
          )
          <fpage>473</fpage>
          –
          <lpage>483</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Romagnoli</surname>
            ,
            <given-names>K.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyce</surname>
            ,
            <given-names>R.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Empey</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adams</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hochheiser</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Bringing clinical pharmacogenomics information to pharmacists</article-title>
          .
          <source>International Journal of Medical Informatics</source>
          <volume>86</volume>
          (
          <year>2016</year>
          )
          <fpage>54</fpage>
          –
          <lpage>61</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goossens</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoshida</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Use of big data in drug development for precision medicine</article-title>
          .
          <source>Expert Review of Precision Medicine and Drug Development</source>
          <volume>1</volume>
          (
          <issue>3</issue>
          ) (
          <year>2016</year>
          )
          <fpage>245</fpage>
          –
          <lpage>253</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Dumontier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wild</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          :
          <article-title>Linked data in drug discovery</article-title>
          .
          <source>IEEE Internet Computing</source>
          <volume>16</volume>
          (
          <issue>6</issue>
          ) (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Law</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knox</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Djoumbou</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jewison</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>A.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maciejewski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arndt</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neveu</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , et al.:
          <article-title>DrugBank 4.0: shedding new light on drug metabolism</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>42</volume>
          (
          <year>2014</year>
          )
          <fpage>1091</fpage>
          –
          <lpage>1097</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kuhn</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Letunic</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jensen</surname>
            ,
            <given-names>L.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bork</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>The SIDER database of drugs and side effects</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>44</volume>
          (
          <year>2015</year>
          )
          <fpage>1075</fpage>
          –
          <lpage>1079</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. Food and Drug Administration:
          <article-title>Table of pharmacogenomic biomarkers in drug labeling</article-title>
          (
          <year>2016</year>
          ) URL: http://www.fda.gov/Drugs/ScienceResearch/ResearchAreas/Pharmacogenetics/ucm083378.htm.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Vega-Gorgojo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giese</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heggestøyl</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soylu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Waaler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>PepeSearch: Semantic data for the masses</article-title>
          .
          <source>PLOS ONE</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Callahan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cruz-Toledo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumontier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-based querying with Bio2RDF's linked open data</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          <volume>4</volume>
          (Suppl 1)
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Whirl-Carrillo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McDonagh</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hebert</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gong</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sangkuhl</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thorn</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Altman</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>T.E.</given-names>
          </string-name>
          :
          <article-title>Pharmacogenomics knowledge for personalized medicine</article-title>
          .
          <source>Clinical Pharmacology and Therapeutics</source>
          <volume>92</volume>
          (
          <issue>4</issue>
          ) (
          <year>2012</year>
          )
          <fpage>414</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>