Knowledge Graph Solutions in Healthcare for Improved Clinical Outcomes

                                  Jans Aasman, Ph.D.1 and Parsa Mirhaji, MD, Ph.D.2

                              1 Franz Inc, 2201 Broadway, Suite 715, Oakland, California 94612

                                                   jans.aasman@franz.com
                        2 Montefiore Medical Center, Institute for Clinical and Translational Research,

                                  6 Executive Plaza, Suite 112, Yonkers, New York, 10701
                                                 pmirhaji@montefiore.org

         Abstract. Deploying patient Knowledge Graphs based on Semantic Technologies can improve patient care and
         transform care models and medical research. Knowledge Graphs make the available information far more efficient
         to search, so that patterns in the data can be found and used for clinical purposes to improve clinical
         outcomes.

         Keywords: Knowledge Graph, Graph Database, Machine Learning, RDF, Semantic Web


1. Healthcare Challenges

Located in the Bronx, Montefiore Health System serves one of the most ethnically and socioeconomically diverse
populations in the US. Like all healthcare organizations, Montefiore faces complex challenges—from government
pressures to reduce costs and stringent regulatory guidelines to diverse patient populations and disruptive
technologies. Understanding patients requires information on a complex array of factors, some of which may not
even be known during a clinical interaction, such as the home and work environment, nutrition, and genetics.

The industry has long collected data on patients—it is not uncommon for hospitals to gather thousands of data points
per patient per day. Data ranges from unstructured free text information to images and waveforms to data from
sensors and monitoring devices. Access to accurate data is vital for assessing risks that range from intubation to
drug interactions. But often this data cannot be analyzed quickly, nor can hospital data be easily combined with
external data sources such as those from pharmaceutical companies and researchers.

2. Knowledge Graph Solutions

To optimize healthcare based on advanced data analytics and make sure clinicians have the right information
available in time to impact patient outcomes, Montefiore, in partnership with Franz Inc and Intel Corp., has
deployed PALM (Patient-centered Analytic Learning Machine), a solution that brings together varied and vast
amounts of raw data for deeper analysis, in order to flag patients who are at risk and to help clinicians identify
optimal treatment plans. The PALM Knowledge Graph platform integrates both structured and unstructured data,
ranging from basic science, clinician records, and population demographics to community, environmental,
behavioral, and wellness research data. By assessing a holistic and realistic profile of patients, along with
relevant science, clinical population histories, drug information, and medical imaging, PALM has the capability to
improve care, identify at-risk patients, and personalize medicine, while reducing error and inefficiency.

3. PALM Technical Details

The PALM Knowledge Graph is a ‘semantic solution’ from the ground up. At the core of the Knowledge Graph are
more than 180 different life science and health care taxonomies and ontologies that are interlinked (see Fig. 1).
Every conceivable type of patient information, whether it comes from more static databases or from dynamic HL7
streams, is mapped onto an event-based ontology that makes it several orders of magnitude easier to do ad-hoc
queries, analytics, and feature extraction for machine learning. A frame description language (FDL) makes it
straightforward for data scientists to declaratively specify the features needed for analytics and machine learning.
The FDL is also used to facilitate the learning aspect of the Knowledge Graph: it specifies how to systematically
store the output of analytics back into the Knowledge Graph.

We expect the size of the Knowledge Graph to grow to more than 2 trillion triples in the near future, so scalability
is a very important consideration. We have implemented a uniquely scalable approach to storing the entire Knowledge
Graph in a distributed graph database. The architecture of this distributed graph database is based on a combination
of partitioning and federation: patient data is partitioned, and the partitions are federated with local
unpartitionable knowledge bases and terminology systems. A new parallel, distributed SPARQL implementation was
developed to leverage all hardware resources in executing queries and performing analytics. For queries that need to
combine data from various partitions, we developed a unique SPARQL pipeline mechanism. Part of the current effort is
deep integration with SPARK, both to facilitate this pipeline mechanism and to facilitate machine learning and
analytics.
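
The combination of partitioning and federation described above can be illustrated with a small sketch. This is not
PALM's implementation: the event tuples, the terminology mapping, the function names, and the use of Python threads
in place of a distributed SPARQL engine are all assumptions made for illustration. The sketch keeps all events of a
patient in one partition, gives every partition access to the same shared terminology, runs the same query on each
partition in parallel, and merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared terminology (the "unpartitionable" knowledge base):
# maps an illustrative diagnosis code to a broader concept.
TERMINOLOGY = {
    "dx:J96.0": "concept:RespiratoryFailure",
    "dx:J96.1": "concept:RespiratoryFailure",
    "dx:E11.9": "concept:Diabetes",
}

# Hypothetical patient events as (patient_id, event_type, code) tuples.
EVENTS = [
    ("p1", "Diagnosis", "dx:J96.0"),
    ("p2", "Diagnosis", "dx:E11.9"),
    ("p3", "Diagnosis", "dx:J96.1"),
    ("p4", "Diagnosis", "dx:J96.0"),
    ("p1", "Diagnosis", "dx:E11.9"),
]


def partition_events(events, n_partitions):
    """Place all events of a given patient in the same partition."""
    partitions = [[] for _ in range(n_partitions)]
    for event in events:
        patient_id = event[0]
        # Deterministic assignment based on the characters of the patient id.
        index = sum(ord(ch) for ch in patient_id) % n_partitions
        partitions[index].append(event)
    return partitions


def query_partition(partition, terminology, concept):
    """Return the patients in one partition with an event under `concept`."""
    return {
        patient_id
        for patient_id, _event_type, code in partition
        if terminology.get(code) == concept
    }


def federated_query(partitions, terminology, concept):
    """Run the same query on every partition in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        partial_results = list(
            pool.map(lambda part: query_partition(part, terminology, concept), partitions)
        )
    merged = set()
    for result in partial_results:
        merged |= result
    return merged


if __name__ == "__main__":
    parts = partition_events(EVENTS, 2)
    print(sorted(federated_query(parts, TERMINOLOGY, "concept:RespiratoryFailure")))
```

Because a patient's events never span partitions in this sketch, a per-patient query needs no communication between
partitions; only the final merge combines results, which is the kind of step a pipeline mechanism would handle.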


Fig. 1 - The PALM Knowledgebase

4. Summary

The use of Semantic Web technologies to generate automated predictive and preventive approaches has proven
effective in identifying patients at highest risk and in providing consistent clinical decision support to relevant
practitioners. In one use case, prediction of prolonged ventilation identifies patients with a more than 70%
likelihood of an event 48 hours in advance of a fatal episode or respiratory failure in the hospital, so that the
crisis can be avoided.

PALM gives clinicians a much more efficient way to search the information that is available to them, in order to
find patterns in the data and to use those patterns to improve patient outcomes.