<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Accelerating Drug Discovery in Rare and Complex Diseases</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shima Dastgheib</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Craig Webb</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Qiaonan Duan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rowan Copley</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gini Deshpande</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Asim Siddiqui</string-name>
          <email>asim.siddiquig@numedii.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>NuMedii, Inc.</institution>
          ,
          <addr-line>San Mateo, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We report the adoption of Semantic Web technologies by NuMedii to lay the foundation for accelerating drug discovery in complex and rare diseases.</p>
      </abstract>
      <kwd-group>
        <kwd>Drug Discovery</kwd>
        <kwd>Fibrotic diseases</kwd>
        <kwd>RDF graph database</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Overview</title>
      <p>Despite remarkable advances in the prognosis, diagnosis, treatment and
management of many chronic diseases, there remain thousands of complex or rare
conditions that are still challenging to manage and are usually incurable. With
the rapid development of high-throughput technologies, enormous amounts of
biomedical data are being generated and continue to accumulate exponentially.
Today, it is even possible to extract DNA sequence information from
individual cells<sup>1</sup>, generating millions of data points from a single cell. The vast
body of data held in biomedical repositories (covering
entities such as diseases, drugs, genes, pathways, etc.) has great potential to
uncover new therapeutic options for patients. The key challenge, however, is to
make sense of these data, taking into account their volume and variety
and the complex relationships connecting them.</p>
      <p>As a computational biopharma company, NuMedii has access to millions of
high-quality data points on diseases and drugs at different stages of development. We
built a semantic Knowledge Base compliant with W3C standards<sup>2</sup> to integrate
and unify heterogeneous data from various public and proprietary data sources,
and to build meaningful relationships between them. This Knowledge Base
lets us ask questions that span multiple data sources by executing
a single SPARQL query. We use Ontotext GraphDB<sup>3</sup>, a highly scalable RDF
graph database that includes a triple store, an inference engine and a SPARQL query
engine. In addition, we developed a graphical user interface that enables
domain experts to explore the Knowledge Base by interacting with graphical
elements (Figure 1).</p>
      <p><sup>1</sup> https://www.nature.com/articles/nmeth.2769</p>
      <p><sup>2</sup> https://www.w3.org/standards/</p>
      <p><sup>3</sup> https://ontotext.com/products/graphdb/</p>
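      <p>To illustrate what a question spanning several data sources in one SPARQL query might look like, the following is a minimal sketch and not NuMedii's actual implementation. The <monospace>ex:</monospace> vocabulary, the repository name <monospace>fibrosis-kb</monospace> and the local endpoint URL are all hypothetical. The request follows the SPARQL 1.1 Protocol (URL-encoded POST); it is built but deliberately not sent.</p>
      <preformat><![CDATA[
```python
import urllib.parse
import urllib.request

# Hypothetical vocabulary: the ex: predicates below are illustrative only
# and do not come from the paper.
SPARQL_QUERY = """
PREFIX ex: <http://example.org/kb/>
SELECT DISTINCT ?drug ?gene
WHERE {
  ?disease ex:name "Idiopathic Pulmonary Fibrosis" .  # from a disease source
  ?disease ex:associatedGene ?gene .                   # from a gene-disease source
  ?drug ex:targets ?gene .                             # from a drug source
}
LIMIT 25
"""


def build_sparql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL 1.1 Protocol query request."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


# Assumed endpoint: a local repository named "fibrosis-kb" (hypothetical).
request = build_sparql_request(
    "http://localhost:7200/repositories/fibrosis-kb", SPARQL_QUERY
)
```
]]></preformat>
      <p>Passing <monospace>request</monospace> to <monospace>urllib.request.urlopen</monospace> against a running endpoint would return the bindings as SPARQL JSON results. The three triple patterns stand in for facts that originate in different sources but are joined in a single graph pattern once integrated.</p>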
    </sec>
    <sec id="sec-2">
      <title>Case Study</title>
      <p>Idiopathic Pulmonary Fibrosis (IPF) is a lung disease of unknown
origin that causes fibrosis (scarring) in the lungs. IPF is a heterogeneous
disease, i.e., a mixture of cell types is involved. Current treatments, pirfenidone and
nintedanib<sup>4</sup>, slow the progression of fibrosis at best, but there is no cure for IPF.</p>
      <p>To accelerate the identification of effective drugs for IPF, we built a
Knowledge Base (Section 1) that helps us traverse relevant information about IPF very
rapidly. Since IPF is a fibrotic disease, we also included data on other fibrotic
diseases. This has resulted in a unique and unified resource on fibrotic diseases,
which so far contains around 8 billion explicit and inferred RDF statements.
Figure 1 shows a screenshot of the user interface visualizing requested data
from the Knowledge Base. In addition, we mined more than 700,000 PubMed
abstracts on different fibrotic diseases, RDFized the results and added them to
the Knowledge Base.</p>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>As a critical step towards finding effective drugs for incurable diseases such as
IPF, NuMedii adopted Semantic Web technologies to integrate and interrogate
all relevant data in one place in a unified fashion, and to infer new knowledge.
The interactive graphical user interface allows scientists to explore the Knowledge
Base without knowing RDF or SPARQL. Moreover, to address the challenges
of large graph visualization, the interface allows users to tailor the graphs and shed
light on nodes and relationships of interest. Furthermore, the Knowledge Base
empowers NuMedii to evaluate the drugs predicted by the company's proprietary
algorithms. Both the Knowledge Base and the user interface are extensible and
have been applied by NuMedii to other complex disease types.</p>
      <p><sup>4</sup> https://www.drugbank.ca/drugs/DB04951 and https://www.drugbank.ca/drugs/DB09079</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>