<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Elsevier's Healthcare Knowledge Graph and the Case for Enterprise Level Linked Data Standards</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alex DeJong</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Radmila Bord</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Will Dowling</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rinke Hoekstra</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ryan Moquin</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Charlie O</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mevan Samarasinghe</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul Snyder</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Craig Stanley</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anna Tordai</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Trefry</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul Groth</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Radarweg 29, Amsterdam 1043 NX</institution>
          ,
          <addr-line>{a.dejong, r.bord, w.dowling, p.groth, r.hoekstra, r.moquin, charlie.o, m.samarasinghe, p.snyder, c.stanley, a.tordai, m.trefry}@elsevier.com</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Linked Data principles and standards provide a powerful means for data integration across multiple sources. However, these standards are often too open to interpretation. In this work, we describe why enterprise-wide Linked Data standards were necessary to deliver Elsevier's healthcare knowledge graph, and the lessons learned in implementing it.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>For example, knowledge about which tables can be combined, or about access
and licensing controls, is in the logic.</p>
      <p>Thus, to tackle these challenges, we looked to Linked Data, which addresses
many of these concerns, in particular the heterogeneity of technology stacks and
the distributed nature of data. Here, we describe how we adapted Linked Data to
Elsevier's context and a number of lessons learned in this internal adoption.</p>
      <p>A Linked Data Approach: As a starting point, we asked three development
teams to expose their data using APIs following the Linked Data principles. In
the course of initial development, it became clear that the breadth of
Linked Data compliant technology choices is large, which hindered interoperability.
For example, end-points could provide either RDF/XML or JSON-LD as their
serialization format. Both are valid formats for a Linked Data end-point,
but a data consumer application that wants to use these end-points has to
understand and implement both. Likewise, in other areas Linked Data is
underspecified. For example, security is typically a limited consideration in
the standards, since it is assumed that all the data is publicly accessible on the
Internet. That holds for many published end-points, but enterprise data must
account for access requirements.</p>
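      <p>To illustrate the cost of supporting two serializations, the following minimal Python sketch shows a consumer that has to keep one parser per format just to list the subject IRIs in a response. The function names, the dispatch table, and the restriction to the simplest document shapes are assumptions made for illustration; a production consumer would more likely rely on a full RDF library.</p>
      <preformat><![CDATA[
```python
import json
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def subjects_from_jsonld(body):
    """List "@id" values of top-level node objects (simplest JSON-LD shapes only)."""
    doc = json.loads(body)
    # A document may be a single node object, a list of nodes, or wrap nodes in "@graph".
    nodes = doc.get("@graph", [doc]) if isinstance(doc, dict) else doc
    return [n["@id"] for n in nodes if isinstance(n, dict) and "@id" in n]


def subjects_from_rdfxml(body):
    """List rdf:about values found anywhere in an RDF/XML document."""
    root = ET.fromstring(body)
    about = "{%s}about" % RDF_NS
    return [el.attrib[about] for el in root.iter() if about in el.attrib]


# Hypothetical dispatch table: every extra serialization adds an entry and a parser.
PARSERS = {
    "application/ld+json": subjects_from_jsonld,
    "application/rdf+xml": subjects_from_rdfxml,
}


def subjects(content_type, body):
    """Pick a parser from the response Content-Type and return the subject IRIs."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in PARSERS:
        raise ValueError("unsupported serialization: " + media_type)
    return PARSERS[media_type](body)
```
]]></preformat>
      <p>Mandating a single minimum format removes the need for the second parser and the dispatch step on the consumer side.</p>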
      <p>Thus, we took a two-pronged approach. The first prong was to work with
the development teams to define a set of enterprise-wide Linked Data standards.
Examples of constraints in these standards are as follows: All URIs need to be
dereferenceable, must support content negotiation, and at a minimum must return
JSON-LD. A common set of namespace prefixes is specified. Class and property
definitions need to follow a common set of naming conventions. Basic provenance
information with a common set of provenance relations must be provided.</p>
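      <p>The content-negotiation constraint can be sketched as follows. This is a simplified illustration in Python, not the implementation used at Elsevier: the SUPPORTED tuple, the function names, and the choice of Turtle as a second format are hypothetical, and the matching ignores the media-range specificity rules of full HTTP content negotiation.</p>
      <preformat><![CDATA[
```python
# Hypothetical list of serializations an end-point offers; JSON-LD is the
# mandated minimum, so it is listed first and wins wildcard requests.
SUPPORTED = ("application/ld+json", "text/turtle")


def parse_accept(header):
    """Return (media_range, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def matches(media_range, media_type):
    """True when a media range such as text/* or */* covers media_type."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(accept_header, supported=SUPPORTED):
    """Pick the serialization to return, or None if nothing is acceptable."""
    for media_range, q in parse_accept(accept_header or "*/*"):
        if q <= 0:
            continue
        for media_type in supported:
            if matches(media_range, media_type):
                return media_type
    return None  # a server would answer 406 Not Acceptable here
```
]]></preformat>
      <p>Under these assumptions, negotiate("application/rdf+xml;q=0.9, application/ld+json;q=0.8") selects JSON-LD, because RDF/XML is not in the hypothetical SUPPORTED tuple.</p>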
      <p>The second prong was to provide a central location (i.e. hostname) for access
to the data network. This proxy maps selected context paths, e.g. /health/drugs,
to a specific Linked Data API in the network. This central location has three
benefits. It provides a well-known location to access Elsevier's Linked Data. It
gives us control over which APIs can officially become part of
the data network. Lastly, it allows us to employ standard security measures
across the data network, minimizing overhead on the decentralized providers.
The combination of a technical enforcement point with clear guidance provided
a path to implementation. Our healthcare knowledge graph is currently
being used in the development of a number of new products.</p>
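      <p>The mapping from context paths to APIs can be pictured as a longest-prefix routing table. The Python sketch below is an assumption about how such a proxy might resolve requests; only the /health/drugs path comes from the text, while the upstream host names, the second route, and the resolve function are hypothetical.</p>
      <preformat><![CDATA[
```python
# Hypothetical registry: only context paths listed here are part of the
# data network, which is how the proxy controls which APIs are admitted.
ROUTES = {
    "/health/drugs": "https://drugs-api.internal.example/ld",
    "/health/diseases": "https://diseases-api.internal.example/ld",
}


def resolve(path, routes=ROUTES):
    """Return the upstream URL for a request path, or None if no route matches."""
    best = None
    for prefix in routes:
        # Match whole path segments only, so /health/drugstore does not
        # accidentally hit the /health/drugs route.
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return None
    # Forward the remainder of the path unchanged to the selected API.
    return routes[best] + path[len(best):]
```
]]></preformat>
      <p>Because every request passes through this single lookup, authentication and other security checks could be applied once, before forwarding, rather than in each provider.</p>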
      <p>Conclusion &amp; Lessons Learned: Overall, Linked Data has provided a viable
mechanism for producing a cohesive data network. In the process, we came away
with a number of key lessons: 1. an API-centric approach is crucial, as
it fits in with standard web app models; 2. using existing developer tools (Jira,
Git, Slack, Confluence) is important for integrating into the development cycle;
3. specifying which parts of the Linked Data technologies should be used is
critical; 4. providing internal examples eases adoption; and 5. the platform-agnostic
nature of Linked Data is helpful in heterogeneous environments within the
enterprise.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>