<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Data and Permissions from Relational Databases via the Solid Protocol</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lukas Kubelka</string-name>
          <email>lukas.kubelka@student.kit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benjamin Meyjohann</string-name>
          <email>benjamin.meyjohann@student.kit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph H.-J. Braun</string-name>
          <email>braun@kit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tobias Käfer</string-name>
          <email>tobias.kaefer@kit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Linked Data</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Solid</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Web Access Control</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Karlsruhe Institute of Technology (KIT)</institution>
          ,
          <addr-line>Karlsruhe</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>1828</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>We address the challenge of integrating relational databases with the Solid Protocol. The goal is to make legacy tabular data accessible as virtual Web resource representations in RDF graphs, while propagating and re-using the database's existing roles and permissions for access management. We present our solution, comprising an SQL database, an RDF graph and Web resource virtualisation layer using the mapping language R2RML, and the Solid Protocol for authentication, authorization and data access.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR Workshop Proceedings (ceur-ws.org)</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        The Solid Protocol [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] aims to break open data silos by decoupling users’ identity, the data they provide,
and the applications that consume it. Especially in medium-sized and large organizations, however, data
silos are still commonplace. For data management, relational databases (RDBs) are the
most prevalent type of system in practice and have been around for decades.
Supporting RDBs as data sources is therefore a must for any data integration endeavor [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], for
example when adopting the Solid Protocol. When doing so, two questions arise:
1. Data: How can we expose the RDB’s data, represented as records and relations, in
Solid’s data representation, i.e. as RDF-based Web resources and collections (Linked Data Platform
container resources)?
2. Identity: How can we expose the RDB’s Identity and Access Management (IAM), which
grants permissions to the RDB’s users and roles, through Solid’s access control mechanism,
which grants permissions to WebIDs of agents and groups?
Answering these questions allows Solid Apps to consume data from existing RDBs.
      </p>
      <p>
        To bridge this gap between legacy RDBs and the Solid Protocol, we follow the Ontology-Based
Data Access (OBDA) approach [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]: We use the mapping language R2RML [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to map
the data that follows a relational schema into an RDF dataset with RDF terms, thereby creating
a virtual (i.e. not materialised) layer on top of the actual data. This RDF dataset is then exposed
according to the Solid Protocol. Internally, HTTP requests [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] according to the Solid Protocol
are turned into SPARQL queries over the virtual layer (cf. the SPARQL Graph Store Protocol [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]),
which in turn are translated into SQL queries over the actual data layer using the defined R2RML
mappings [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
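      <p>The first translation step, from an HTTP request to a SPARQL query, can be sketched as follows. This is a minimal sketch that assumes each Web resource is exposed as a named graph identified by the resource URI, which is the correspondence the SPARQL Graph Store Protocol defines for GET requests; whether the presented system organises its virtual layer in exactly this way is an assumption. The function name get_to_sparql is hypothetical.</p>
      <preformat><![CDATA[
```python
def get_to_sparql(resource_uri: str) -> str:
    """Translate an HTTP GET on a resource into a SPARQL CONSTRUCT query.

    Assumption: the resource's triples live in a named graph whose name
    is the resource URI (Graph Store Protocol style).
    """
    # Reject characters that are not allowed inside a SPARQL IRI reference,
    # so that a crafted URI cannot alter the query.
    forbidden = set('<>"{}|\\^` ')
    if any(ch in forbidden for ch in resource_uri):
        raise ValueError("not a valid IRI: " + repr(resource_uri))
    return (
        "CONSTRUCT { ?s ?p ?o } "
        "WHERE { GRAPH <" + resource_uri + "> { ?s ?p ?o } }"
    )


query = get_to_sparql("http://example.org/characters/1")
```
]]></preformat>
      <p>In an OBDA setting, a query of this shape would then be handed to the middleware, which rewrites it into SQL using the R2RML mappings.</p>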
      <p>
        We apply OBDA to both raw data and to roles and permissions as stored in the RDB. The
RDB’s raw data and corresponding tabular information are transformed to be accessible via the
Solid Protocol. Roles are mapped to groups of WebIDs [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and corresponding permissions are
transformed to Web Access Control rules [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for checking authorization in the Solid Protocol.
      </p>
      <p>We present our work’s foundations, our architecture, and how we lift data and identities.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Preliminaries</title>
      <p>Relational Databases (RDBs). RDBs organise data in tables whose structure and properties
are specified in each table’s schema. An SQL view is a virtual table that is based on the result of an
SQL query. A role in an RDB is a collection of privileges or permissions to perform specific
actions or operations on database objects.</p>
      <p>
        Ontology-based Data Access (OBDA). OBDA aims to bridge the gap between users and an
organization’s data layer containing heterogeneous data sources [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The idea is to build a mediator
access layer [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] using a shared conceptualization of a common domain of interest, i.e. an
ontology [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], for users to interact with [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Technical details about the underlying data
sources are abstracted away [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and data access across data sources is homogenised. One way of
realizing this approach is Ontology-Mediated Query Rewriting [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], where SPARQL queries to
the conceptual mediator layer, e.g. a SPARQL endpoint, are reformulated into database-specific
queries on the data access layer, e.g. SQL queries to a legacy RDB. This rewriting of queries is
enabled by declarative mappings between RDF and relational database schemas. The RDB to RDF
Mapping Language (R2RML) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is the W3C Recommendation for expressing such mappings.
An R2RML mapping consists of one or more triples maps that are applied to every row of a
logical database table. They express the rules by which zero or more RDF triples are generated
from each such row. A logical table is either an existing database table or view, or it defines a
virtual view (R2RML view) through a valid SQL query.
      </p>
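      <p>To illustrate how a triples map produces triples from the rows of a logical table, consider the following minimal sketch. It is not R2RML itself: the dictionary TRIPLES_MAP is a hypothetical, simplified stand-in for a triples map (one subject template plus predicate and column pairs), and the column names and sample rows are invented for illustration. Note also that the sketch materialises triples for clarity, whereas an OBDA system rewrites queries and leaves the RDF layer virtual.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical, simplified stand-in for an R2RML triples map.
TRIPLES_MAP = {
    "subject_template": "http://example.org/characters/{id}",
    "predicate_object_maps": [
        ("http://xmlns.com/foaf/0.1/name", "name"),
        ("http://example.org/vocab#hometown", "hometown"),
    ],
}


def apply_triples_map(triples_map: dict, row: Dict[str, object]) -> List[Triple]:
    """Generate zero or more triples from one row of a logical table."""
    subject = triples_map["subject_template"].format(**row)
    triples = []
    for predicate, column in triples_map["predicate_object_maps"]:
        value = row.get(column)
        if value is None:
            continue  # a NULL column value yields no triple
        triples.append((subject, predicate, str(value)))
    return triples


# Invented sample rows of a logical table.
rows = [
    {"id": 1, "name": "Ash", "hometown": "Pallet Town"},
    {"id": 2, "name": "Misty", "hometown": None},
]
graph = [t for row in rows for t in apply_triples_map(TRIPLES_MAP, row)]
```
]]></preformat>
      <p>The second row contributes only one triple because its hometown value is NULL, which mirrors the "zero or more RDF triples" per row mentioned above.</p>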
      <p>
        The Solid Protocol. The Solid Protocol [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is a bundle of specifications covering agent
identification, authentication, authorization and data interaction. For agent identification, Solid relies
on WebIDs [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], i.e. HTTP URIs that identify agents. For agent authentication, Solid relies on
Solid-OIDC [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], a modified version of OpenID Connect [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. For agent authorization, Solid
relies on specifying access control rules following the Web Access Control (WAC)
specification [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] or the more recently introduced, not yet widely adopted Access Control Policy (ACP) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
Using these access control rules, once an agent is authenticated at a Web server or service, the
server or service will determine if the agent is allowed to proceed with a certain action on data.
Finally, for data interaction, Solid borrows from the Linked Data Platform (LDP) [16], effectively
extending LDP with access control mechanisms. LDP specifies a RESTful [17] resource interface
and defines the behaviour and RDF representation of collection resources as LDP Containers. That
is, Solid adopts the document-centric style of data management from LDP, where (information)
resources, i.e. documents or containers, are contained in containers. Solid does not, however,
specify how the actual data is to be stored, e.g. in a triple store or a relational database. A Web
server that implements the Solid Protocol is called a Solid Pod (Personal Online Datastore).
      </p>
      <p>[Figure 1: System architecture. Labels in the figure: SQL2Solid; Relational Database (e.g. PostgreSQL); OBDA Middleware (e.g. Ontop); custom table views; SQL.]</p>
    </sec>
    <sec id="sec-4">
      <title>3. System Architecture</title>
      <p>To leverage data from a legacy RDB, we combine the components presented in Section 2. The
RDB provides access to legacy data, roles and permissions via an SQL interface, which is
consumed by an OBDA middleware. The OBDA middleware uses R2RML mappings to translate
incoming SPARQL queries into SQL queries and to lift the SQL results to SPARQL results, e.g.
RDF graphs. The SPARQL interface provided by the middleware is consumed by a Solid Pod.
Following the Solid Protocol, the Pod provides an LDP-based data interaction interface under
access control to clients. Figure 1 illustrates this system architecture.</p>
      <p>Consider an HTTP request being handled by this system architecture: Upon receiving an
HTTP request, the Solid Pod authenticates the client according to Solid-OIDC. After successful
authentication, the Pod generates SPARQL queries to (a) check the Web Access Control rules
on the requested resource and then (b) retrieve the requested data. The OBDA middleware
receives the SPARQL requests, translates them into SQL queries using the R2RML mappings,
and issues the SQL queries to the RDB. Lifting the SQL query results using the mappings again,
the middleware provides the results of the Pod’s SPARQL queries. The Pod checks if the client
is authorized according to the retrieved access control rules, and grants or denies data access.</p>
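      <p>The request handling described above can be sketched as follows. This is a simplified illustration, not the Community Solid Server implementation: authentication and the two SPARQL queries are passed in as stub functions, the Authorization class is a hypothetical reduction of a Web Access Control rule to agents and modes, and retrieving the data only after a successful check is an assumption about the ordering.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Authorization:
    """Simplified WAC rule: which WebIDs are granted which access modes."""
    agents: Set[str]
    modes: Set[str]


def handle_get(
    resource: str,
    authenticate: Callable[[], Optional[str]],
    query_acl: Callable[[str], List[Authorization]],
    query_data: Callable[[str], str],
) -> Tuple[int, str]:
    """Return (HTTP status, body) for a read request on `resource`."""
    webid = authenticate()  # Solid-OIDC, stubbed: a WebID or None
    if webid is None:
        return 401, ""
    rules = query_acl(resource)  # (a) SPARQL query for the access control rules
    if not any(webid in r.agents and "Read" in r.modes for r in rules):
        return 403, ""
    return 200, query_data(resource)  # (b) SPARQL query for the data


ALICE = "https://alice.example/profile/card#me"  # hypothetical WebID
ACL_RULES = {"http://example.org/characters/": [Authorization({ALICE}, {"Read"})]}


def lookup_rules(resource: str) -> List[Authorization]:
    # Stands in for the SPARQL query answered by the OBDA middleware.
    return ACL_RULES.get(resource, [])


def lookup_data(resource: str) -> str:
    # Stands in for the SPARQL query that retrieves the resource's RDF.
    return "# RDF representation of " + resource


status, body = handle_get(
    "http://example.org/characters/", lambda: ALICE, lookup_rules, lookup_data
)
```
]]></preformat>
      <p>Passing the collaborators in as functions keeps the sketch independent of any concrete Pod, middleware or database.</p>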
      <p>
        In our demo, see Figure 1, we use PostgreSQL (https://www.postgresql.org/) as the relational database, Ontop (https://ontop-vkg.org/) with R2RML
mappings [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] for OBDA, and the Community Solid Server (CSS, https://github.com/CommunitySolidServer/CommunitySolidServer) for the Solid-based interaction
interface that is publicly exposed to clients via the Web.
      </p>
    </sec>
    <sec id="sec-4a">
      <title>4. From Legacy Data, Roles and Permissions to the Solid Protocol</title>
      <p>In this section, we present how the mapping works to use RDB legacy data in the Solid Protocol.</p>
      <p>Data. To lift the relational legacy data to RDF, we define conventional R2RML mappings using
domain-specific ontologies. To make the data accessible via the Solid Protocol, we additionally map
the structural information of the table itself to LDP Containers and Resources.</p>
      <p>Consider the following example: The RDB contains a table named characters, whose rows
represent data on characters of Pokémon trainers. We choose to define the R2RML mappings
such that the table is interpreted as an LDP Container and the individual table rows as resources
contained in this container, see Figure 2a.</p>
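      <p>The table-as-container interpretation can be sketched as a small function that emits the structural triples for one table. This is an illustrative sketch rather than the R2RML mapping used in the demo: the function name container_triples is hypothetical, the URI layout (table name as path, row ID as resource name under http://example.org/) is taken as the example layout, and typing rows as ldp:Resource is an assumption.</p>
      <preformat><![CDATA[
```python
from typing import Iterable, List, Tuple

LDP = "http://www.w3.org/ns/ldp#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def container_triples(
    base: str, table: str, row_ids: Iterable[int]
) -> List[Tuple[str, str, str]]:
    """Describe a table as an LDP Container that contains one resource per row."""
    container = f"{base}{table}/"
    triples = [(container, RDF_TYPE, LDP + "Container")]
    for row_id in row_ids:
        resource = f"{container}{row_id}"
        triples.append((container, LDP + "contains", resource))
        triples.append((resource, RDF_TYPE, LDP + "Resource"))
    return triples


structure = container_triples("http://example.org/", "characters", [1, 2])
```
]]></preformat>
      <p>For the two example rows, the container http://example.org/characters/ is stated to contain http://example.org/characters/1 and http://example.org/characters/2.</p>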
      <p>The container is made accessible through the Solid Pod using the table’s name as the path
of the URI, e.g. http://example.org/characters/. The resources corresponding to table
rows are accessible using their row ID, e.g. http://example.org/characters/{rowID}. Other
mapping approaches are possible, of course, depending on the use case at hand.</p>
      <p>Roles and Permissions. To also re-use legacy permissions with the Solid Protocol, the RDB’s
existing roles are interpreted as groups of users having the same set of access privileges: RDB
roles are mapped to URIs that identify groups of WebIDs. Managing the members of these WebID
groups is then independent of the RDB, while the permissions assigned to each group are still specified
by the RDB. In this way, any WebID may be assigned a legacy RDB role. This approach decouples
the identity of users from the data source they access.</p>
      <p>Under the hood, the Role-WebID mapping is defined in a custom SQL view using a
simple SQL query. This custom SQL view is then used in R2RML mappings to lift the legacy
roles and permissions to be made accessible as .acl files via the Solid-based interface, e.g.
http://example.org/characters/.acl. Using the R2RML mappings, the middleware
transforms the RDB’s tabular data about a role’s permitted operations on a specific RDB table into
an RDF graph representing a resource’s access control list, see Figure 2b.</p>
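      <p>How role permissions on a table could be turned into an access control list is sketched below. The sketch rests on several assumptions that are not stated in the text: the correspondence between SQL privileges and WAC access modes in PRIVILEGE_TO_MODE, the group URI pattern, the role names, and the use of acl:default so that row resources inherit the container's rule. The WAC terms themselves (acl:Authorization, acl:agentGroup, acl:accessTo, acl:default, acl:mode) come from the WAC vocabulary.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List, Tuple

ACL = "http://www.w3.org/ns/auth/acl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed correspondence between SQL privileges and WAC access modes.
PRIVILEGE_TO_MODE = {
    "SELECT": "Read",
    "INSERT": "Append",
    "UPDATE": "Write",
    "DELETE": "Write",
}


def acl_triples(
    base: str, table: str, grants: Dict[str, List[str]]
) -> List[Tuple[str, str, str]]:
    """Build WAC authorizations for a table container from role grants."""
    container = f"{base}{table}/"
    triples = []
    for role, privileges in sorted(grants.items()):
        modes = sorted(
            {PRIVILEGE_TO_MODE[p] for p in privileges if p in PRIVILEGE_TO_MODE}
        )
        if not modes:
            continue  # no mappable privilege, so no authorization is emitted
        rule = f"{container}.acl#{role}"
        group = f"{base}groups/{role}#group"  # hypothetical WebID group URI
        triples.append((rule, RDF_TYPE, ACL + "Authorization"))
        triples.append((rule, ACL + "agentGroup", group))
        triples.append((rule, ACL + "accessTo", container))
        triples.append((rule, ACL + "default", container))  # rows inherit the rule
        triples.extend((rule, ACL + "mode", ACL + mode) for mode in modes)
    return triples


# Hypothetical roles and their privileges on the characters table.
grants = {"trainer": ["SELECT"], "admin": ["SELECT", "UPDATE", "DELETE"]}
acl_graph = acl_triples("http://example.org/", "characters", grants)
```
]]></preformat>
      <p>Since the presented solution currently supports read access only, the SELECT to acl:Read correspondence is the one relevant in practice; the remaining entries merely indicate how write privileges might be handled.</p>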
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>We presented an approach to integrate legacy RDBs with the Solid Protocol: We set up OBDA
for the raw data as well as for roles and permissions for usage according to the Solid Protocol.
Our solution currently only supports reading legacy data under pre-existing roles and
permissions; support for writing new data, including access rules, is ongoing work.</p>
      <p>[16] S. Speicher, J. Arwe, A. Malhotra, Linked Data Platform 1.0, W3C Recommendation, W3C,
2015. URL: https://www.w3.org/TR/ldp/.</p>
      <p>[17] R. T. Fielding, Architectural styles and the design of network-based software architectures,
Ph.D. thesis, University of California, Irvine, CA, USA, 2000.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Capadisli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Verborgh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kjernsmo</surname>
          </string-name>
          ,
          <source>Solid Protocol, Version 0.9.0</source>
          , W3C Solid Community Group,
          <year>2021</year>
          . URL: https://solidproject.org/TR/protocol.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rodriguez-Muro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rezk</surname>
          </string-name>
          ,
          <article-title>Efficient SPARQL-to-SQL with R2RML mappings</article-title>
          ,
          <source>Journal of Web Semantics</source>
          <volume>33</volume>
          (
          <year>2015</year>
          )
          <fpage>141</fpage>
          -
          <lpage>169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          , G. De Giacomo,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <article-title>Linking data to ontologies</article-title>
          , in:
          <source>Journal on Data Semantics X</source>
          , Springer,
          <year>2008</year>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>173</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sundara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <article-title>R2RML: RDB to RDF Mapping Language</article-title>
          ,
          <source>W3C Recommendation, W3C</source>
          ,
          <year>2012</year>
          . URL: https://www.w3.org/TR/r2rml/.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fielding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Reschke</surname>
          </string-name>
          ,
          <article-title>Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing, Internet Standards Track document</article-title>
          ,
          <source>IETF</source>
          ,
          <year>2014</year>
          . URL: https://www.ietf.org/rfc/rfc7230.txt.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ogbuji</surname>
          </string-name>
          ,
          <article-title>SPARQL 1.1 Graph Store HTTP Protocol</article-title>
          ,
          <source>W3C Recommendation, W3C</source>
          ,
          <year>2013</year>
          . URL: https://www.w3.org/TR/sparql11-graph-store-protocol/.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          ,
          <article-title>Ontology-based data access: A survey</article-title>
          ,
          <source>International Joint Conferences on Artificial Intelligence</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sambra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Story</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <article-title>WebID 1.0 - Web Identity and Discovery</article-title>
          ,
          <source>W3C Editor's Draft, W3C</source>
          ,
          <year>2014</year>
          . URL: https://www.w3.org/2005/Incubator/webid/spec/identity/.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Capadisli</surname>
          </string-name>
          , Web Access Control,
          <source>Editor's Draft</source>
          , W3C Solid Community Group,
          <year>2022</year>
          . URL: https://solid.github.io/web-access-control-spec/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>G.</given-names>
            <surname>Wiederhold</surname>
          </string-name>
          ,
          <article-title>Mediators in the architecture of future information systems</article-title>
          ,
          <source>Computer</source>
          <volume>25</volume>
          (
          <year>1992</year>
          )
          <fpage>38</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N.</given-names>
            <surname>Guarino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Oberle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <article-title>What is an ontology?</article-title>
          , Handbook on ontologies (
          <year>2009</year>
          )
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cogrel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Komla-Ebri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rezk</surname>
          </string-name>
          , M. Rodriguez-Muro, G. Xiao,
          <article-title>Ontop: Answering SPARQL queries over relational databases</article-title>
          ,
          <source>Semantic Web</source>
          <volume>8</volume>
          (
          <year>2017</year>
          )
          <fpage>471</fpage>
          -
          <lpage>487</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Coburn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>elf</given-names>
            <surname>Pavlik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zagidulin</surname>
          </string-name>
          , Solid-OIDC,
          <source>W3C Editor's Draft</source>
          , W3C Solid Community Group,
          <year>2022</year>
          . URL: https://solidproject.org/TR/oidc.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>N.</given-names>
            <surname>Sakimura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bradley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jones</surname>
          </string-name>
          , B. de Medeiros, C. Mortimore,
          <source>OpenID Connect Core 1.0</source>
          , Final Specification,
          <year>2014</year>
          . URL: https://openid.net/specs/openid-connect-core-1_0.html.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bosquet</surname>
          </string-name>
          , Access Control Policy (ACP),
          <source>Editor's Draft</source>
          , W3C Solid Community Group,
          <year>2022</year>
          . URL: https://solid.github.io/authorization-panel/acp-specification/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>