<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Schema Extraction for Privacy Preserving Processing of Sensitive Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lars Christoph Gleim</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Md. Rezaul Karim</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lukas Zimmermann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oliver Kohlbacher</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Holger Stenzhorn</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefan Decker</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oya Beyan</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chair of Methods in Medical Informatics, University of Tübingen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Fraunhofer FIT</institution>
          ,
          <addr-line>Sankt Augustin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Informatik 5, RWTH Aachen University</institution>
          ,
          <addr-line>Aachen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Sharing privacy-sensitive data across organizational boundaries is commonly not a viable option due to legal and ethical restrictions. Regulations such as the EU General Data Protection Regulation impose strict requirements concerning the protection of personal data. Therefore, new approaches are emerging that utilize data right in their original repositories without giving direct access to third parties, such as the Personal Health Train initiative [16]. Circumventing limitations of previous systems, this paper proposes an automated schema extraction approach compatible with existing Semantic Web-based technologies. The extracted schema enables ad-hoc query formulation against privacy-sensitive data sources without requiring data access, and subsequent execution of that request in a secure enclave under the data provider's control. The developed approach permits us to extract structural information from non-uniform resources and merge it into a single schema while preserving the privacy of each data source. Initial experiments show that our approach overcomes the reliance of previous approaches on agreeing upon a shared schema and encoding a priori in favor of more flexible schema extraction and introspection.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic Web</kwd>
        <kwd>Linked Data</kwd>
        <kwd>RDF</kwd>
        <kwd>Schema</kwd>
        <kwd>Privacy</kwd>
        <kwd>Data Access</kwd>
        <kwd>Distributed Systems</kwd>
        <kwd>Query Design</kwd>
        <kwd>Personal Health Train</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Data-driven methods play an increasingly important role for cost-efficient and
timely research results and effective decision support [
        <xref ref-type="bibr" rid="ref45">45</xref>
        ] throughout numerous
domains such as economics [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], education [
        <xref ref-type="bibr" rid="ref42">42</xref>
        ], manufacturing [
        <xref ref-type="bibr" rid="ref49">49</xref>
        ], healthcare and
life sciences [
        <xref ref-type="bibr" rid="ref1 ref39 ref48">1, 39, 48</xref>
        ].
      </p>
      <p>
        At the same time, the data that build the foundation of these models
are oftentimes subject to strict sharing requirements. For example, in the sensitive
healthcare domain, although first responders, hospitals, and many other stakeholders
already collect valuable data for data-driven research and treatment today, large
portions of these data remain inaccessible to the majority of stakeholders, largely
due to ethical, administrative, legal and political hurdles that render data
sharing infeasible [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>In practice, this leads to an inability to access large amounts of data crucial
for a variety of tasks such as the optimization of decision support systems, first
response systems and data-driven research. At the core of this issue lies the lack
of an effective mechanism to allow for data access in a legally certain, sustainable
and cost-efficient manner without extensive delays.</p>
      <p>
        For example, learning health systems, allowing for data-driven research on
sensitive data such as electronic health records (EHRs), have long been said
to bear the potential to "fill major knowledge gaps about health care costs,
the benefits and risks of drugs and procedures, geographic variations,
environmental health influences, the health of special populations, and personalized
medicine." [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. While a variety of such systems have been proposed [
        <xref ref-type="bibr" rid="ref18 ref19 ref22 ref26">18, 19, 22,
26</xref>
        ], practical implementation has so far not become a reality, likely due to the
aforementioned hurdles.
      </p>
      <p>In order to enable a data economy in privacy-sensitive domains and the effective
reuse of existing data and research, novel approaches are emerging to
overcome these limitations. One of those approaches is the Personal Health Train
(PHT) framework, which aims to bring algorithms and statistical models to data
sources, rather than sharing data with third parties such as researchers. The
main benefit of the PHT approach is its ability to utilize all the data,
including sensitive and private information, without the data having to leave the
original data source. One of the main challenges of the approach is that data
users such as researchers are required to develop their models without having a
grasp of the actual data. Unless there are universally agreed data set
descriptions, there is a need to create and communicate a schema (that is, information
about the structure of the data) to enable writing queries for heterogeneous
data resources.</p>
      <p>The key contributions of this paper consist of an automated approach for
extracting task-relevant schema from RDF data sources for the formulation of
data selection and integration queries without direct access to the data and a
corresponding integration with an information system architecture that allows
for the subsequent evaluation of that query in a secure enclave.</p>
      <p>The rest of the paper is structured as follows: Section 2 describes
related work and the basic foundations. Section 3 formulates the key challenges of
schema extraction from sensitive data without sacrificing privacy. Section 4
describes our proposed approach for extracting a schema from existing data
and demonstrates how to perform data selection and integration using the
extracted schema. Section 5 then outlines initial
experimental results based on a sample use case. Finally, Section 6 concludes
the paper and outlines future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        In order to facilitate knowledge discovery for both humans and machines, the
FAIR data principles [
        <xref ref-type="bibr" rid="ref53">53</xref>
        ] have been proposed: a set of guiding principles to make
research and scientific data Findable, Accessible, Interoperable, and Re-usable.
These guiding principles promise to help in the discovery, access, integration
and analysis of task-appropriate scientific data and associated algorithms and
workflows. Thus, FAIR is gaining a lot of attention and increasing adoption.
      </p>
      <p>
        Core to realizing these principles are Semantic Web technologies [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which
provide a framework for data sharing and reuse by making the semantics of
data machine interpretable. Particularly the directed, graph-based data model
RDF [
        <xref ref-type="bibr" rid="ref10 ref36 ref40">10, 36, 40</xref>
        ] in conjunction with formal conceptualizations of information
models, semantics and encoding conventions in RDF vocabularies and ontologies
plays an important role.
      </p>
      <p>
        As such, RDF Schema (RDFS) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and the Web Ontology Language (OWL) [
        <xref ref-type="bibr" rid="ref43">43</xref>
        ]
provide a proven framework in order to describe (but not necessarily enforce)
the structure and semantics of data. Essentially, RDFS introduces the
concepts of classes and properties as well as basic relations between them. OWL,
a computational logic-based language, extends these concepts in order
to represent rich and complex knowledge about things, groups of things, and
relations between them.
      </p>
      <p>In the context of this work, we use the term 'schema' to refer to the semantic
and structural annotation of data using especially these two vocabularies.</p>
      <p>
        On the other hand, the classical notion of schema as the formal definition of
the shape that data needs to comply with in order to be valid (i.e. schema
validation and enforcement) also exists in the Semantic Web with the Shape
Expressions language (ShEx) [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ] and the Shapes Constraint Language (SHACL) [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ].
At this time, they are however not part of common data encoding standards,
vocabularies or ontologies and as such will not be regarded further in this work.
      </p>
      <p>Nevertheless, using RDFS and OWL, it is possible to create domain-specific,
optionally interoperable vocabularies and ontologies, which may declare e.g. term
or concept equivalences and dependencies between each other and subsequently
enable interoperability across individual encodings.</p>
      <p>
        Popular examples include the Ontology for Biomedical Investigations (OBI) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
in the biology and healthcare domain, the GoodRelations ontology [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ] in
eBusiness and the DCAT vocabulary [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ], which is used for the general purpose
metadata annotation of datasets and data catalogs.
      </p>
      <p>
        In the context of eHealth systems, first-class support for the Semantic Web
is becoming more and more prominent with popular candidates such as HL7
FHIR [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and SNOMED CT [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] providing corresponding ontologies, as well as
the establishment of clear guidelines for dataset descriptions such as the HCLS
Community Profile [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
      </p>
      <p>
        Various high-quality catalogs of freely reusable vocabularies, which provide
descriptions of the data, are available for easy discovery of suitable
ontologies. Examples include the Linked Open Vocabularies (LOV) [
        <xref ref-type="bibr" rid="ref51">51</xref>
        ] and the
BioPortal [
        <xref ref-type="bibr" rid="ref44">44</xref>
        ] project.
      </p>
      <p>
        Related ideas using schema export and import for federated data access date
back to as early as 1985 [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] but it is only recently that the idea has received
more attention in the context of the Semantic Web.
      </p>
      <p>
        Kellou-Menouer and Kedad [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] propose a schema discovery approach based
on hierarchical clustering instead of data annotations thus leading to an
approximate schema. Florenzano et al. [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], Weise et al. [
        <xref ref-type="bibr" rid="ref52">52</xref>
        ] and Dudas et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]
introduce approaches focused on schema extraction for visualization of the data
structure but do not consider publishing or reuse of the extracted schema. Benedetti
et al. [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ] propose an interesting related approach for schema extraction,
visualization and query generation but do not consider interoperability issues and
rely on custom mechanisms for schema storage.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Motivation</title>
      <p>
        Recently, Jochems et al. [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] and Deist et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] introduced two related
promising Semantic Web-based approaches in the context of the PHT initiative, founded
on the key concept of bringing research to the data rather than bringing data
to the research. As such the underlying information system architecture enables
learning from privacy sensitive data without the data ever crossing
organizational boundaries, maintaining control over the data, preserving data privacy
and thereby overcoming legal and ethical issues common to other forms of data
exchanges.
      </p>
      <p>
        The general approach of this underlying system may be outlined as follows:
1. Initially, both the client and data provider agree upon a set of attributes
or features, such that all participating data providers have corresponding
sources of (privacy sensitive) data.
2. Then each data provider encodes their data using an (also agreed upon)
ontology or vocabulary, converting it into RDF representation. This
process yields proper Linked Data [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] and thus enables semantic
interoperability [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
3. The resulting RDF data is deployed to a private triple store at each
location, providing a private SPARQL [
        <xref ref-type="bibr" rid="ref47">47</xref>
        ] query endpoint, which is not directly
accessible by the client.
4. A SPARQL data query is then formulated based on the previously agreed
upon encoding and a corresponding distributable processing algorithm
is defined.
5. The shared query is then executed locally at each data provider against their
respective triple store and the returned data processed using the
corresponding algorithm.
6. The local results are combined into a global one.
7. Depending on the approach, steps 5 and 6 may be further iterated.
      </p>
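      <p>Steps 5 and 6 of the outline above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the DataProvider class, the age field and the mean as the distributed algorithm are hypothetical, and a plain predicate function stands in for the shared SPARQL query. It is not the implementation used by the cited systems.</p>
      <preformat>
```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class DataProvider:
    """Hypothetical stand-in for one site with a private triple store."""
    name: str
    records: List[Record]  # private data; never returned to the client

    def run(self, query: Callable[[Record], bool],
            algorithm: Callable[[List[Record]], Dict[str, float]]) -> Dict[str, float]:
        # Step 5: evaluate the shared query locally, then apply the algorithm.
        selected = [r for r in self.records if query(r)]
        return algorithm(selected)


def local_sum_and_count(records: List[Record]) -> Dict[str, float]:
    # Aggregate-only local result: no individual record is returned.
    return {"sum": sum(r["age"] for r in records), "count": float(len(records))}


def combine(partials: List[Dict[str, float]]) -> float:
    # Step 6: merge the local results into a global mean.
    total = sum(p["sum"] for p in partials)
    count = sum(p["count"] for p in partials)
    return total / count if count else float("nan")


providers = [
    DataProvider("site_a", [{"age": 30.0}, {"age": 50.0}]),
    DataProvider("site_b", [{"age": 40.0}, {"age": 17.0}]),
]
adults = lambda r: r["age"] >= 18  # stands in for the shared SPARQL query
global_mean = combine([p.run(adults, local_sum_and_count) for p in providers])
```
      </preformat>
      <p>Only the per-site sums and counts cross organizational boundaries in this sketch; the records themselves stay with each provider.</p>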
      <p>While these approaches, introduced in the context of the PHT initiative,
work well when multiple parties agree on jointly collecting, encoding and
evaluating data in advance (as is the case for conducting individual coordinated
studies), they solve the issue of interoperability by agreeing on a single shared
knowledge representation and encoding methodology a priori (steps 1-3 in the
above process). In an optimal setting, where all parties could agree on a single shared and
global information model and encoding, reuse of diverse and existing data could
always be directly accomplished with this approach.</p>
      <p>However, to our knowledge, so far all corresponding efforts have been
unsuccessful. At the time of writing, the popular https://fairsharing.org/ portal
indexes 1055 databases using 1136 standards, suggesting that in practice each
collected dataset and domain tends to introduce its own encoding
methodology.</p>
      <p>
        Thus, when trying to reuse diverse existing data, especially without direct
access to the data, ad-hoc data selection and integration facilities (corresponding
to the first two steps of the classical Knowledge Discovery in Databases (KDD)
process [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]) are indispensable.
      </p>
      <p>For a client without direct access to the data, this process is however
typically infeasible since it inherently relies upon inspection of the structure of the
data. In this setting, in order to allow for the effective design of such queries
(corresponding to step 4 in the above approach), a proper description of the
structure of the available data, i.e. a schema of the data, is required.</p>
      <p>This schema should further be compatible with standard Semantic Web
tools for interoperability and thus be available as an RDFS and OWL
vocabulary via a SPARQL endpoint. While OWL provides a powerful set of
modeling primitives, in the context of this work we focus on RDFS and the OWL
owl:equivalentClass, owl:equivalentProperty and owl:sameAs predicates,
which we deem most relevant in order to enable interoperability and the effective
formulation of selection and integration queries.</p>
      <p>As a result, the schema not only contains everything that is needed in order
to create data queries (e.g. using SPARQL), but also conveys far less privacy-critical
information than the actual data. As such, it can be published publicly
without privacy concerns in many scenarios.</p>
      <p>In the following we describe an automated approach for schema extraction
from RDF data which allows for the formulation of data selection and integration
queries without direct access to the data and the subsequent evaluation of that
query in a secure enclave.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Proposed Approach</title>
      <p>In this section, we discuss the proposed approach. First, we describe the schema
extraction technique. Then we show how the extracted schema can be used
further for the data selection and integration.
      </p>
      <p>The schema extraction.
We propose an approach for schema extraction based on exploiting the key
characteristics of RDF, RDFS, and OWL. RDF data encoded in compliance with the
aforementioned vocabularies inherently includes metadata about its semantics
and structural relationships.</p>
      <p>For the schema extraction, the rdf:type relation plays the key role, as it
declares data points to be instances of specific data types or classes. Anything
that is a type in the sense of occurring as the target of this relation thus
automatically becomes part of the schema. Additionally, any relation (that is, any
identifier occurring in the predicate position of a subject-predicate-object triple)
which occurs in the data is itself a part of the schema and is included as well.</p>
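      <p>The rule described above (every target of rdf:type is a class, every predicate is a property) can be sketched in plain Python over an in-memory list of triples. This is only an illustration under stated assumptions: the function names, the use of prefixed names as plain strings and the toy vocabulary triples are hypothetical, and no entailment is performed.</p>
      <preformat>
```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]
RDF_TYPE = "rdf:type"


def instantiated_schema_terms(triples: Iterable[Triple]) -> Tuple[Set[str], Set[str]]:
    """Return (classes, properties) that are actually used by the data."""
    classes: Set[str] = set()
    properties: Set[str] = set()
    for _subject, predicate, obj in triples:
        properties.add(predicate)      # every predicate is part of the schema
        if predicate == RDF_TYPE:
            classes.add(obj)           # every target of rdf:type is a class
    return classes, properties


def extract_schema(data: Iterable[Triple], vocabulary: Iterable[Triple]) -> Set[Triple]:
    """Keep only vocabulary triples whose subject is an instantiated term."""
    classes, properties = instantiated_schema_terms(data)
    used = classes | properties
    return {t for t in vocabulary if t[0] in used}


# Toy data and vocabulary triples, made up for this example.
data = [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
]
vocabulary = [
    ("foaf:Person", "rdfs:label", "Person"),
    ("foaf:name", "rdfs:label", "name"),
    ("foaf:Organization", "rdfs:label", "Organization"),
]
schema = extract_schema(data, vocabulary)
```
      </preformat>
      <p>In this toy example the description of foaf:Organization is dropped because no instance of it occurs in the data, while the descriptions of foaf:Person and foaf:name are kept.</p>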
      <p>
        As such, the entire schema of a given RDF data set can be extracted
using a single SPARQL CONSTRUCT query as depicted in listing 1.1. For this
approach, we assume OWL entailment regime [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] support of the SPARQL
endpoint and proper inclusion of the used vocabularies in the triple store.
      </p>
      <p>Listing 1.1. SPARQL schema extraction query using entailment regime</p>
      <p>The query of listing 1.1 constructs an RDF graph (line 1)
containing all the directly describing triples ?s ?p ?o that occur in the triple
store, restricted to the following subjects:</p>
      <sec id="sec-4-1">
        <title>1. ?s that are used as RDF types (line 3),</title>
      </sec>
      <sec id="sec-4-2">
        <title>2. ?s that are used as predicates (line 4)</title>
        <p>According to the SPARQL entailment regime, all the subclass relationships,
transitive properties etc. used in the data are automatically resolved and included
too. The query however only extracts direct properties and as such some complex
constraints such as OWL disjointness axioms are not extracted properly, which
we however consider to be irrelevant for the task of query formulation.</p>
        <p>Note that we define the relevant subset of all available schema information
to be that which is actually used by the data, i.e. the instantiated schema, and
thus only extract that.</p>
        <p>
          Since in practice few SPARQL endpoints actually support any kind of
entailment, it is alternatively also possible to extract the schema directly using the
SPARQL 1.1 Property Paths [
          <xref ref-type="bibr" rid="ref38">38</xref>
          ] feature, independent of entailment support on
the endpoint. A corresponding SPARQL query is depicted in listing 1.2.
        </p>
        <preformat>
1 CONSTRUCT {
2   ?a ?b ?c ; a rdfs:Class .
3   ?d ?e ?f ; a rdf:Property .
4 } {
5   { SELECT ?a ?b ?c { [] a/(owl:equivalentClass|owl:sameAs|rdfs:subClassOf)* ?a
      OPTIONAL { ?a ?b ?c } } }
6   UNION
7   { SELECT ?d ?e ?f { [] ?x [] . ?x (owl:equivalentProperty|owl:sameAs|rdfs:subPropertyOf)* ?d
      OPTIONAL { ?d ?e ?f } } }
8 }
        </preformat>
        <p>Listing 1.2. SPARQL 1.1 schema extraction query using property paths</p>
        <p>The query constructs a graph of all RDFS classes and RDF properties,
together with their direct properties, that are directly instantiated or used in the
dataset, that generalize a resource used in the dataset, or that are declared
equivalent to one.</p>
        <p>
          Corresponding to RDF 1.1 Semantics [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] we detect instantiated RDF
properties as any IRI used in predicate position (cf. rdfD2) and annotate them
accordingly in lines 3 and 7. We detect instantiated classes based on the RDFS
axiomatic triple rdf:type rdfs:range rdfs:Class as any object of the RDF
type predicate in line 2 and annotate them accordingly in line 5.
        </p>
        <p>For both properties and classes, we resolve corresponding generalizations
directly using the relevant RDFS entailment patterns (rdfs5, rdfs7, rdfs9, rdfs11)
and concept equivalences using OWL's owl:equivalentClass,
owl:equivalentProperty and owl:sameAs predicates. While owl:sameAs is
only supposed to be used for the declaration of equivalence between
individuals, it is commonly misused in practice and as such deliberately included in
this query.</p>
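        <p>The effect of the zero-or-more property path on the class branch can be illustrated as a reflexive-transitive closure over the followed predicates. The sketch below is a plain-Python illustration under stated assumptions: the reachable helper and the ex: terms are hypothetical, and like the property path it follows links in the forward direction only.</p>
        <preformat>
```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
# Predicates followed zero or more times, as in the class branch of the query.
CLASS_LINKS = {"owl:equivalentClass", "owl:sameAs", "rdfs:subClassOf"}


def reachable(start: Set[str], triples: Iterable[Triple], links: Set[str]) -> Set[str]:
    """Reflexive-transitive closure of `start` along the given predicates."""
    edges: Dict[str, Set[str]] = {}
    for s, p, o in triples:
        if p in links:
            edges.setdefault(s, set()).add(o)
    seen = set(start)
    stack = list(start)
    while stack:
        node = stack.pop()
        for target in edges.get(node, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


# Toy vocabulary triples, made up for this example.
vocabulary = [
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    ("ex:Person", "owl:equivalentClass", "ex:Human"),
    ("ex:Robot", "rdfs:subClassOf", "ex:Machine"),
]
classes = reachable({"ex:Student"}, vocabulary, CLASS_LINKS)
```
        </preformat>
        <p>Starting from the single instantiated class ex:Student, the closure also yields ex:Person and ex:Human, while the unrelated ex:Robot and ex:Machine are not included.</p>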
        <p>As such the presented approach is capable of extracting the relevant (i.e.
instantiated) schema from a given RDF dataset which can subsequently be used
for SPARQL query design without requiring access to the original data.
        </p>
        <p>
          Data selection and integration using the extracted schema.
Once the schema is extracted, it can be publicly exposed using
a dedicated SPARQL endpoint. It is then possible to use existing SPARQL query
writing assistance tools (i.e. query builders) such as OWLPath [
          <xref ref-type="bibr" rid="ref50">50</xref>
          ], QueryVOWL [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]
or VSB [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] together with the extracted schema for the introspection-aided
design of data selection and integration queries. An overview of available tools
can be found in [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ].
        </p>
        <p>The workflow of the proposed architecture is illustrated in figure 1, which
depicts the communication between client and data provider over a public network.
In this scenario, the data provider's internal communication within its private
network is highlighted by the bounding box. In preparation for client usage, the
schema of the data stored in the private triple store is extracted in step 0
using the approach presented above and deployed to a publicly accessible schema
endpoint.</p>
        <p>The client can now start to create a SPARQL query in step 1, using a query
builder of their choice in conjunction with the schema endpoint for
introspection. The query is then sent to a submission endpoint acting as the gateway
between the data provider and the client in step 2. For the scope of this work,
we assume that this request includes algorithmic means of data anonymization,
ensuring that its results are no longer privacy sensitive, and that validation is done
manually.</p>
        <p>
          Once validated, the request is scheduled in step 4 for processing within a
secure enclave (processing), where the query and algorithm are evaluated (step
5). This is analogous to the approach proposed by Jochems et al. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] and Deist
et al. [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] as detailed in section 3. Finally, only the processing result is returned
to the client in step 6 without ever directly granting access to the data.
        </p>
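        <p>The submission, validation and processing steps described above can be sketched as a small in-memory model. This is a hedged illustration only: the Gateway and Request classes are hypothetical, manual validation is reduced to a reviewer flag, a predicate function stands in for the SPARQL query, and no real enclave or network is involved.</p>
        <preformat>
```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

Row = dict


@dataclass
class Request:
    query: Callable[[Row], bool]             # stands in for the SPARQL query
    algorithm: Callable[[List[Row]], float]  # expected to return aggregate output only
    approved: bool = False


@dataclass
class Gateway:
    """Hypothetical submission endpoint in front of a private data store."""
    private_rows: List[Row]
    queue: List[Request] = field(default_factory=list)

    def submit(self, request: Request) -> None:
        # Step 2: the client hands over query and algorithm.
        self.queue.append(request)

    def validate(self, request: Request, reviewer_ok: bool) -> None:
        # Validation is manual in the text; modeled here as a reviewer flag.
        request.approved = reviewer_ok

    def process_next(self) -> Optional[float]:
        # Steps 4 to 6: evaluate an approved request inside the provider's
        # network and hand back only the processing result.
        request = self.queue.pop(0)
        if not request.approved:
            return None
        selected = [row for row in self.private_rows if request.query(row)]
        return request.algorithm(selected)


gateway = Gateway(private_rows=[{"age": 34}, {"age": 51}, {"age": 29}])
request = Request(query=lambda row: row["age"] >= 30,
                  algorithm=lambda rows: float(len(rows)))
gateway.submit(request)
gateway.validate(request, reviewer_ok=True)
result = gateway.process_next()
```
        </preformat>
        <p>A request that has not been approved yields no result in this model, which mirrors the requirement that nothing is evaluated against the private data before validation.</p>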
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Experimental Results</title>
      <p>In order to allow for a first evaluation of the proposed approach, a simple test
case was constructed. A number of records containing personal information of
individuals such as name, birthday and phone number were constructed, encoded
using the foaf and schema.org vocabularies and deployed to a private triple store.
The relevant schema subset was then extracted using the query depicted in
listing 1.2 and its correctness and completeness validated manually. Subsequently,
the public schema was used in order to create a data selection and integration
query, which was successfully executed against the private data endpoint.</p>
      <p>
        In order to evaluate the effectiveness of the schema extraction process, we
employ the HCLS core statistical measures to compare the characteristics of the full
schema.org [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] and foaf [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] vocabularies, their union and the extracted schema.
The results, depicted in table 1, show that the number of extracted triples is
significantly lower compared to the original count and roughly 3.3 percent of
the union of both vocabularies. As intended, only the subset of the vocabularies
that actually describes the private dataset is extracted, allowing for focused
query design based on only the relevant schema, thus saving cognitive as well as
computational effort during schema introspection.
      </p>
      <p>Since the extracted schema also contains explicit equivalence information (for
example between foaf:Person and schema:Person, which is in this case
only declared in the schema.org vocabulary), it is possible to explicitly design
queries considering the corresponding implications at query design time without
relying upon inference support of the SPARQL endpoints. As such, it may provide
an additional building block for enabling efficient interoperability across different
data encodings.</p>
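      <p>As a rough illustration of using such equivalence information at query design time, the following Python sketch expands a class into its declared equivalents and emits a SPARQL query string with a VALUES clause. The helper names are hypothetical, only single-hop owl:equivalentClass statements are considered, and prefix declarations are omitted from the generated query.</p>
      <preformat>
```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def equivalents(term: str, schema: Iterable[Triple], predicate: str) -> List[str]:
    """Terms declared equivalent to `term` in either direction, plus `term`."""
    found: Set[str] = {term}
    for s, p, o in schema:
        if p == predicate and s == term:
            found.add(o)
        elif p == predicate and o == term:
            found.add(s)
    return sorted(found)


def select_instances_query(class_iri: str, schema: Iterable[Triple]) -> str:
    """Build a SELECT query that matches instances of any equivalent class."""
    values = " ".join(equivalents(class_iri, schema, "owl:equivalentClass"))
    return (
        "SELECT ?person WHERE { "
        f"VALUES ?class {{ {values} }} "
        "?person a ?class }"
    )


# The equivalence is stated on the schema.org side only, as in the text above.
extracted_schema = [("schema:Person", "owl:equivalentClass", "foaf:Person")]
query = select_instances_query("foaf:Person", extracted_schema)
```
      </preformat>
      <p>Because the equivalence is resolved while the query is written, the resulting query does not depend on inference support at the endpoint that evaluates it.</p>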
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Outlook</title>
      <p>In this paper, we proposed an automated way of extracting schemas from Linked
Data in RDF format. Our proposed approach enables the introspection-supported
development of SPARQL queries without access to the actual data. We
presented a system architecture to realize the overall workflow of the approach.
From the user's perspective, our approach enables query formulation against
privacy-sensitive data sources and subsequent evaluation of that request in a
secure enclave at the data provider's end.</p>
      <p>With this architecture, we can overcome the reliance of previous approaches
on agreeing upon a shared schema and encoding a priori in favor of more flexible
schema extraction and introspection.</p>
      <p>This method promises to provide a key building block in enabling efficient
reuse of data across a variety of domains. In conjunction with advanced
distributed learning and processing systems, the approach could be used in order
to overcome existing data sharing hurdles and unlock hidden value in existing
data silos.</p>
      <p>
        In the future, we plan to extend this work by exploring integrations with
query federation engines and access control, such as the SAFE query federation
engine [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. We also plan to provide a more extensive performance analysis and
evaluation in order to show the effectiveness of this approach.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abernethy</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etheredge</surname>
            ,
            <given-names>L.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ganz</surname>
            ,
            <given-names>P.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wallace</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>German</surname>
            ,
            <given-names>R.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neti</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bach</surname>
            ,
            <given-names>P.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murphy</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          :
          <article-title>Rapid-learning system for cancer care</article-title>
          .
          <source>Journal of Clinical Oncology</source>
          <volume>28</volume>
          (
          <issue>27</issue>
          ),
          <fpage>4268</fpage>
          –
          <lpage>4274</lpage>
          (
          <year>2010</year>
          ), https://doi.org/10.1200/JCO.2010.28.5478, PMID: 20585094
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bandrowski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brinkman</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brochhausen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brush</surname>
            ,
            <given-names>M.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bug</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chibucos</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clancy</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Courtot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Derom</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumontier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.:
          <article-title>The ontology for biomedical investigations</article-title>
          .
          <source>PLoS ONE</source>
          <volume>11</volume>
          (
          <issue>4</issue>
          ),
          <elocation-id>e0154556</elocation-id>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Basole</surname>
            ,
            <given-names>R.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Russell</surname>
            ,
            <given-names>M.G.</given-names>
          </string-name>
          , Huhtamaki, J.,
          <string-name>
            <surname>Rubens</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Still</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Park</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Understanding business ecosystem dynamics: a data-driven approach</article-title>
          .
          <source>ACM Transactions on Management Information Systems (TMIS)</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ),
          <elocation-id>6</elocation-id>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Benedetti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergamaschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Po</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Online index extraction from linked open data sources</article-title>
          .
          In:
          <source>LD4IE@ISWC</source>
          . pp.
          <fpage>9</fpage>
          –
          <lpage>20</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Benedetti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergamaschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Po</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Visual querying LOD sources with LODeX</article-title>
          .
          In:
          <source>Proceedings of the 8th International Conference on Knowledge Capture</source>
          . p.
          <fpage>12</fpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Beredimas</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kilintzis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chouvarda</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maglaveras</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>A reusable ontology for primitive and complex HL7 FHIR data types</article-title>
          .
          In:
          <source>Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE</source>
          . pp.
          <fpage>2547</fpage>
          –
          <lpage>2550</lpage>
          .
          <publisher-name>IEEE</publisher-name>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hendler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lassila</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>The semantic web</article-title>
          .
          <source>Scientific American</source>
          <volume>284</volume>
          (
          <issue>5</issue>
          ),
          <fpage>34</fpage>
          –
          <lpage>43</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Brickley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>RDF vocabulary description language 1.0: RDF schema</article-title>
          . http://www.w3.org/TR/rdf-schema/ (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Brickley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>FOAF vocabulary specification 0.91</article-title>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wood</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lanthaler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klyne</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carroll</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McBride</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>RDF 1.1 concepts and abstract syntax</article-title>
          .
          <source>W3C Recommendation</source>
          <volume>25</volume>
          (
          <issue>02</issue>
          )
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Melnik</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Harmelen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fensel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Broekstra</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erdmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>The semantic web: The roles of XML and RDF</article-title>
          .
          <source>IEEE Internet Computing</source>
          <volume>4</volume>
          (
          <issue>5</issue>
          ),
          <fpage>63</fpage>
          –
          <lpage>73</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Deist</surname>
            ,
            <given-names>T.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jochems</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Soest</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nalbantov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oberije</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walsh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eble</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bulens</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coucke</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dries</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dekker</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lambin</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Infrastructure and distributed learning methodology for privacy-preserving multi-centric rapid learning health care: euroCAT</article-title>
          .
          <source>Clinical and Translational Radiation Oncology</source>
          <volume>4</volume>
          ,
          <fpage>24</fpage>
          –
          <lpage>31</lpage>
          (
          <year>2017</year>
          ), http://linkinghub.elsevier.com/retrieve/pii/S2405630816300271
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Donnelly</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>SNOMED-CT: The advanced terminology and coding system for eHealth</article-title>
          .
          <source>Studies in Health Technology and Informatics</source>
          <volume>121</volume>
          ,
          <fpage>279</fpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Doshi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jefferson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Del Mar</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>The imperative to share clinical study reports: recommendations from the Tamiflu experience</article-title>
          .
          <source>PLoS Medicine</source>
          <volume>9</volume>
          (
          <issue>4</issue>
          ),
          <elocation-id>e1001201</elocation-id>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Dudas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Svatek</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mynarz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Dataset summary visualization with LODSight</article-title>
          .
          In:
          <source>International Semantic Web Conference</source>
          . pp.
          <fpage>36</fpage>
          –
          <lpage>40</lpage>
          . Springer (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <collab>Dutch Tech Center for Life Sciences</collab>
          :
          <article-title>Manifesto of the Personal Health Train consortium</article-title>
          (
          <year>2017</year>
          ), https://www.dtls.nl/wp-content/uploads/2017/12/PHT_Manifesto.pdf
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Eipert</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Metadatenextraktion und Vorschlagssysteme im Visual SPARQL Builder</article-title>
          .
          <source>INFORMATIK 2015</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Embi</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Payne</surname>
            ,
            <given-names>P.R.</given-names>
          </string-name>
          :
          <article-title>Clinical research informatics: challenges, opportunities and definition for an emerging domain</article-title>
          .
          <source>Journal of the American Medical Informatics Association</source>
          <volume>16</volume>
          (
          <issue>3</issue>
          ),
          <fpage>316</fpage>
          –
          <lpage>327</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Etheredge</surname>
            ,
            <given-names>L.M.</given-names>
          </string-name>
          :
          <article-title>A rapid-learning health system</article-title>
          .
          <source>Health Affairs</source>
          <volume>26</volume>
          (
          <issue>2</issue>
          ),
          <fpage>w107</fpage>
          –
          <lpage>w118</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Fayyad</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piatetsky-Shapiro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smyth</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>From data mining to knowledge discovery in databases</article-title>
          .
          <source>AI Magazine</source>
          <volume>17</volume>
          (
          <issue>3</issue>
          ),
          <fpage>37</fpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Florenzano</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reutter</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Venegas</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>A visual aide for understanding endpoint data</article-title>
          .
          <source>Visualization and Interaction for Ontologies and Linked Data (VOILA! 2016)</source>
          p.
          <fpage>102</fpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Friedman</surname>
            ,
            <given-names>C.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blumenthal</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Achieving a nationwide learning health system</article-title>
          .
          <source>Science Translational Medicine</source>
          <volume>2</volume>
          (
          <issue>57</issue>
          ),
          <elocation-id>57cm29</elocation-id>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>C.O.B.</given-names>
          </string-name>
          :
          <article-title>SPARQL 1.1 entailment regimes</article-title>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Grafkin</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mironov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fellmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lantow</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sandkuhl</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smirnov</surname>
            ,
            <given-names>A.V.</given-names>
          </string-name>
          :
          <article-title>SPARQL query builders: Overview and comparison</article-title>
          .
          In:
          <source>BIR Workshops</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>A.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baran</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marshall</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumontier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Dataset descriptions: HCLS community profile</article-title>
          . Interest group note,
          <source>W3C</source>
          (May 2015), http://www.w3.org/TR/hcls-dataset (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Greene</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reid</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Larson</surname>
            ,
            <given-names>E.B.</given-names>
          </string-name>
          :
          <article-title>Implementing the learning health system: from concept to action</article-title>
          .
          <source>Annals of Internal Medicine</source>
          <volume>157</volume>
          (
          <issue>3</issue>
          ),
          <fpage>207</fpage>
          –
          <lpage>210</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Guha</surname>
            ,
            <given-names>R.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brickley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macbeth</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Schema.org: evolution of structured data on the web</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>59</volume>
          (
          <issue>2</issue>
          ),
          <fpage>44</fpage>
          –
          <lpage>51</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Haag</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lohmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siek</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ertl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>QueryVOWL: Visual composition of SPARQL queries</article-title>
          .
          In:
          <source>International Semantic Web Conference</source>
          . pp.
          <fpage>62</fpage>
          –
          <lpage>66</lpage>
          . Springer (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Hayes</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.F.</given-names>
          </string-name>
          :
          <article-title>RDF 1.1 semantics</article-title>
          . W3C Recommendation, February 2014. World Wide Web Consortium. Retrieved from https://www.w3.org/TR/2014/REC-rdf11-mt-20140225
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Linked data: Evolving the web into a global data space</article-title>
          .
          <source>Synthesis Lectures on the Semantic Web: Theory and Technology</source>
          <volume>1</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>136</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Heimbigner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McLeod</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>A federated architecture for information management</article-title>
          .
          <source>ACM Transactions on Information Systems (TOIS)</source>
          <volume>3</volume>
          (
          <issue>3</issue>
          ),
          <fpage>253</fpage>
          –
          <lpage>278</lpage>
          (
          <year>1985</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Hepp</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>GoodRelations: An ontology for describing products and services offers on the web</article-title>
          .
          In:
          <source>International Conference on Knowledge Engineering and Knowledge Management</source>
          . pp.
          <fpage>329</fpage>
          –
          <lpage>346</lpage>
          . Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Jochems</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deist</surname>
            ,
            <given-names>T.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Soest</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eble</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bulens</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coucke</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dries</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lambin</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dekker</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Distributed learning: Developing a predictive model based on data from multiple hospitals without data leaving the hospital – A real life proof of concept</article-title>
          .
          <source>Radiotherapy and Oncology</source>
          <volume>121</volume>
          (
          <issue>3</issue>
          ),
          <fpage>459</fpage>
          –
          <lpage>467</lpage>
          (
          <year>2016</year>
          ), http://dx.doi.org/10.1016/j.radonc.2016.10.002
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Kellou-Menouer</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kedad</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Schema discovery in RDF data sources</article-title>
          .
          In:
          <source>International Conference on Conceptual Modeling</source>
          . pp.
          <fpage>481</fpage>
          –
          <lpage>495</lpage>
          . Springer (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saleem</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehdi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehmood</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rebholz-Schuhmann</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sahay</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>SAFE: SPARQL federation over RDF data cubes with access control</article-title>
          .
          <source>Journal of biomedical semantics</source>
          <volume>8</volume>
          (
          <issue>1</issue>
          ),
          <fpage>5</fpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Klyne</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carroll</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          :
          <article-title>Resource description framework (RDF): Concepts and abstract syntax</article-title>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <surname>Knublauch</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ryman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Shapes constraint language (SHACL)</article-title>
          .
          <source>W3C Candidate Recommendation</source>
          <volume>11</volume>
          ,
          <issue>8</issue>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <surname>Kostylev</surname>
            ,
            <given-names>E.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reutter</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romero</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vrgoč</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>SPARQL with property paths</article-title>
          .
          <source>In: International Semantic Web Conference</source>
          . pp.
          <fpage>3</fpage>
          –
          <lpage>18</lpage>
          . Springer (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <surname>Lambin</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rios-Velazquez</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leijenaar</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carvalho</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Stiphout</surname>
            ,
            <given-names>R.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Granton</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zegers</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gillies</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boellard</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dekker</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.:
          <article-title>Radiomics: extracting more information from medical images using advanced feature analysis</article-title>
          .
          <source>European journal of cancer</source>
          <volume>48</volume>
          (
          <issue>4</issue>
          ),
          <fpage>441</fpage>
          –
          <lpage>446</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          40.
          <string-name>
            <surname>Lassila</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swick</surname>
            ,
            <given-names>R.R.</given-names>
          </string-name>
          :
          <article-title>Resource description framework (RDF) model and syntax specification</article-title>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          41.
          <string-name>
            <surname>Maali</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erickson</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Archer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Data catalog vocabulary (DCAT)</article-title>
          .
          <source>W3C Recommendation</source>
          <volume>16</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          42.
          <string-name>
            <surname>Marsh</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pane</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamilton</surname>
            ,
            <given-names>L.S.</given-names>
          </string-name>
          :
          <article-title>Making sense of data-driven decision making in education</article-title>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          43.
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Harmelen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , et al.:
          <article-title>OWL web ontology language overview</article-title>
          .
          <source>W3C recommendation</source>
          <volume>10</volume>
          (
          <issue>10</issue>
          ),
          <year>2004</year>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          44.
          <string-name>
            <surname>Noy</surname>
            ,
            <given-names>N.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>N.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Whetzel</surname>
            ,
            <given-names>P.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dai</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dorf</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Griffith</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jonquet</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rubin</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Storey</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chute</surname>
            ,
            <given-names>C.G.</given-names>
          </string-name>
          , et al.:
          <article-title>Bioportal: ontologies and integrated data resources at the click of a mouse</article-title>
          .
          <source>Nucleic acids research</source>
          <volume>37</volume>
          (
          <issue>suppl 2</issue>
          ),
          <fpage>W170</fpage>
          –
          <lpage>W173</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          45.
          <string-name>
            <surname>Power</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sharda</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burstein</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Decision support systems</article-title>
          . Wiley Online Library (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          46.
          <string-name>
            <surname>Prud'hommeaux</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Labra Gayo</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Solbrig</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Shape expressions: an RDF validation and transformation language</article-title>
          .
          <source>In: Proceedings of the 10th International Conference on Semantic Systems</source>
          . pp.
          <fpage>32</fpage>
          –
          <lpage>40</lpage>
          . ACM (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          47.
          <string-name>
            <surname>Prud'hommeaux</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seaborne</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.:
          <article-title>SPARQL query language for RDF</article-title>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          48.
          <string-name>
            <surname>Shiboski</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shiboski</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Criswell</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baer</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Challacombe</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lanfranchi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schiødt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Umehara</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vivino</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , et al.:
          <article-title>American College of Rheumatology classification criteria for Sjögren's syndrome: A data-driven, expert consensus approach in the Sjögren's International Collaborative Clinical Alliance cohort</article-title>
          .
          <source>Arthritis care &amp; research</source>
          <volume>64</volume>
          (
          <issue>4</issue>
          ),
          <fpage>475</fpage>
          –
          <lpage>487</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          49.
          <string-name>
            <surname>Simchi-Levi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>OM forum – OM research: From problem-driven to data-driven research</article-title>
          .
          <source>Manufacturing &amp; Service Operations Management</source>
          <volume>16</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>22</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          50.
          <string-name>
            <surname>Valencia-García</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García-Sánchez</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castellanos-Nieves</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.:
          <article-title>OWLPath: An OWL ontology-guided query editor</article-title>
          .
          <source>IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans</source>
          <volume>41</volume>
          (
          <issue>1</issue>
          ),
          <fpage>121</fpage>
          –
          <lpage>136</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          51.
          <string-name>
            <surname>Vandenbussche</surname>
            ,
            <given-names>P.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Atemezing</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poveda-Villalón</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vatant</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Linked open vocabularies (LOV): a gateway to reusable semantic vocabularies on the web</article-title>
          .
          <source>Semantic Web</source>
          <volume>8</volume>
          (
          <issue>3</issue>
          ),
          <fpage>437</fpage>
          –
          <lpage>452</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          52.
          <string-name>
            <surname>Weise</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lohmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haag</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>LD-VOWL: Extracting and visualizing schema information for linked data</article-title>
          .
          <source>Visualization and Interaction for Ontologies and Linked Data (VOILA! 2016)</source>
          p.
          <fpage>120</fpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          53.
          <string-name>
            <surname>Wilkinson</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumontier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aalbersberg</surname>
            ,
            <given-names>I.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Appleton</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Axton</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baak</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blomberg</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boiten</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>da Silva Santos</surname>
            ,
            <given-names>L.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bourne</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          , et al.:
          <article-title>The FAIR guiding principles for scientific data management and stewardship</article-title>
          .
          <source>Scientific data</source>
          <volume>3</volume>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>