=Paper= {{Paper |id=Vol-2180/paper-81 |storemode=property |title=Accelerating Drug Discovery in Rare and Complex Diseases |pdfUrl=https://ceur-ws.org/Vol-2180/paper-81.pdf |volume=Vol-2180 |authors=Shima Dastgheib,Craig Webb,Qiaonan Duan,Rowan Copley,Gini Deshpande,Asim Siddiqui |dblpUrl=https://dblp.org/rec/conf/semweb/DastgheibWDCDS18 }} ==Accelerating Drug Discovery in Rare and Complex Diseases== https://ceur-ws.org/Vol-2180/paper-81.pdf
      Accelerating Drug Discovery in Rare and
                 Complex Diseases

Shima Dastgheib, Craig Webb, Qiaonan Duan, Rowan Copley, Gini Deshpande
                           and Asim Siddiqui

                       NuMedii, Inc., San Mateo, CA, USA
                  {shima.dastgheib, asim.siddiqui}@numedii.com


      Abstract. We report the adoption of Semantic Web technologies by
      NuMedii to lay the foundation for accelerating drug discovery in complex
      and rare diseases.

      Keywords: Drug Discovery · Fibrotic diseases · RDF graph database.


1   Overview
Despite remarkable advances in the prognosis, diagnosis, treatment and
management of many chronic diseases, there remain thousands of complex or rare
conditions that are still challenging to manage and are usually incurable. With
the rapid development of high-throughput technologies, enormous amounts of
biomedical data are being generated and continue to accumulate exponentially.
Today, it is even possible to extract DNA sequence information from individual
cells¹, generating millions of data points from a single cell. This vast body
of data, held in biomedical repositories (covering entities such as diseases,
drugs, genes and pathways), has great potential to uncover new therapeutic
options for patients. The key challenge, however, is to make sense of it,
given the volume and variety of the available data and the complex
relationships connecting them.
    As a computational biopharma company, NuMedii has access to millions of
high-quality data points on diseases and drugs at different stages of
development. We built a semantic Knowledge Base compliant with W3C standards²
to integrate and unify heterogeneous data from various public and proprietary
data sources, and to build meaningful relationships between them. This
Knowledge Base empowers us to ask questions that span multiple data sources by
executing a single SPARQL query. We use Ontotext GraphDB³, a highly scalable
RDF graph database that includes a triple store, an inference engine and a
SPARQL query engine. In addition, we developed a graphical user interface that
enables domain experts to explore the Knowledge Base by interacting with
graphical elements (Figure 1).
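
The idea of answering one question across several integrated sources can be illustrated with a minimal sketch. It is not GraphDB's API and not NuMedii's schema: the `ex:` identifiers, the two toy "sources" and the placeholder gene names are all assumptions made for illustration, and the tiny pattern matcher only mimics how a SPARQL basic graph pattern joins triples.

```python
# Toy stand-in for a triple store: each statement is a
# (subject, predicate, object) tuple. Every identifier here is hypothetical.
PUBLIC_SOURCE = {
    ("ex:pirfenidone", "ex:indicatedFor", "ex:IPF"),
    ("ex:nintedanib", "ex:indicatedFor", "ex:IPF"),
}
PROPRIETARY_SOURCE = {
    ("ex:nintedanib", "ex:targetsGene", "ex:GENE_A"),  # placeholder gene
    ("ex:nintedanib", "ex:targetsGene", "ex:GENE_B"),  # placeholder gene
    ("ex:drugX", "ex:targetsGene", "ex:GENE_A"),       # placeholder drug
}


def match(triples, pattern):
    """Yield variable bindings for one triple pattern ('?x' marks a variable)."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


def query(triples, patterns):
    """Join several triple patterns, similar to a SPARQL basic graph pattern."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            # Substitute variables that earlier patterns already bound.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(triples, bound):
                extended.append({**solution, **binding})
        solutions = extended
    return solutions


# Integration step: once merged, both sources are queried as one graph.
knowledge_base = PUBLIC_SOURCE | PROPRIETARY_SOURCE

# "Which genes are targeted by drugs indicated for IPF?"
results = query(
    knowledge_base,
    [
        ("?drug", "ex:indicatedFor", "ex:IPF"),
        ("?drug", "ex:targetsGene", "?gene"),
    ],
)
pairs = sorted((r["?drug"], r["?gene"]) for r in results)
for drug, gene in pairs:
    print(drug, gene)
```

With this toy data the query returns only the two nintedanib rows: the first pattern is satisfied by the public source and the second by the proprietary one, so neither source could answer the question alone.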
¹ https://www.nature.com/articles/nmeth.2769
² https://www.w3.org/standards/
³ https://ontotext.com/products/graphdb/

2     Case Study
Idiopathic Pulmonary Fibrosis (IPF) is a lung disease of unknown origin that
causes fibrosis (scarring) in the lungs. IPF is a heterogeneous disease, i.e.
a mixture of cell types is involved. Current treatments, pirfenidone and
nintedanib⁴, slow the progression of fibrosis at best, but there is no cure
for IPF.
    To accelerate the identification of effective drugs for IPF, we built a
Knowledge Base (Section 1) that helps us traverse relevant information about
IPF very rapidly. Since IPF is a fibrotic disease, we also included data on
other fibrotic diseases. This has resulted in a unique and unified resource on
fibrotic diseases, which so far contains around 8 billion explicit and
inferred RDF statements. Figure 1 shows a screenshot of the user interface
visualizing data requested from the Knowledge Base. In addition, we mined more
than 700,000 PubMed abstracts on different fibrotic diseases, RDFized the
results and added them to the Knowledge Base.
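
The distinction between explicit and inferred statements can be made concrete with a small sketch. It assumes two standard RDFS entailment rules (subclass transitivity and type propagation along `rdfs:subClassOf`) applied by naive forward chaining; the rule set actually configured in the Knowledge Base is not stated here, and the `ex:` class and instance names are hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Explicit statements, as they might be loaded from a source (hypothetical names).
explicit = {
    ("ex:IPF", SUBCLASS, "ex:PulmonaryFibrosis"),
    ("ex:PulmonaryFibrosis", SUBCLASS, "ex:FibroticDisease"),
    ("ex:case42", TYPE, "ex:IPF"),
}


def materialize(statements):
    """Return the statements plus everything derivable by the two rules.

    Rule 1: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
    Rule 2: (x type b),       (b subClassOf c) => (x type c)
    """
    closure = set(statements)
    while True:
        derived = set()
        for (s, p, o) in closure:
            if p not in (SUBCLASS, TYPE):
                continue
            for (s2, p2, o2) in closure:
                if p2 == SUBCLASS and s2 == o:
                    # Keep the first predicate: subClassOf stays subClassOf,
                    # type stays type, but now points at the superclass.
                    derived.add((s, p, o2))
        if derived <= closure:
            return closure  # fixed point reached, nothing new to add
        closure |= derived


closure = materialize(explicit)
inferred = closure - explicit
for statement in sorted(inferred):
    print(statement)
```

Here three explicit statements yield three inferred ones, so a query for instances of `ex:FibroticDisease` would find `ex:case42` even though no source states that directly.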




Fig. 1: Screenshot of the user interface showing drugs developed (in any phase) for IPF
and another fibrotic disease, as well as the genes targeted by the selected drugs.

3     Conclusion
As a critical step towards finding effective drugs for incurable diseases such
as IPF, NuMedii adopted Semantic Web technologies to integrate and interrogate
all relevant data in one place in a unified fashion, and to infer new
knowledge. The interactive graphical user interface allows scientists to
explore the Knowledge Base without knowing RDF or SPARQL. Moreover, to address
the challenges of large-graph visualization, the interface allows users to
tailor the graphs and highlight nodes and relationships of interest.
Furthermore, the Knowledge Base empowers NuMedii to evaluate the drugs
predicted by the company’s proprietary algorithms. Both the Knowledge Base and
the user interface are expandable and have been applied by NuMedii to other
complex disease types.
⁴ https://www.drugbank.ca/drugs/DB04951 and https://www.drugbank.ca/drugs/DB09079