<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>RML-star: A Declarative Mapping Language for RDF-star Generation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Thomas Delva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julián Arenas-Guerrero</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ana Iglesias-Molina</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oscar Corcho</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Chaves-Fraga</string-name>
          <email>david.chavesg@upm.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anastasia Dimou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IDLab, Department of Electronics and Information Systems, Ghent University - imec</institution>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ontology Engineering Group, Universidad Politécnica de Madrid</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>RDF-star was recently proposed as a convenient representation to annotate statements in RDF with metadata by introducing the so-called RDF-star triples, bridging the gap between RDF and property graphs. However, even though there are many solutions to generate RDF graphs, there is no systematic approach so far to generate RDF-star graphs from heterogeneous data sources. In this paper, we propose RML-star, an extension of the RML mapping language to generate RDF-star. We introduce the extension of the RML ontology and the associated specification with representative examples. URL: https://w3id.org/kg-construct/rml-star</p>
      </abstract>
      <kwd-group>
        <kwd>RML</kwd>
        <kwd>R2RML</kwd>
        <kwd>RDF-star</kwd>
        <kwd>Knowledge Graphs</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>RDF-star was proposed as a compact representation to annotate statements in RDF with metadata [<xref ref-type="bibr" rid="ref4">4</xref>]. For instance, the following declares that Bob claims Alice was born in 1996: :bob :claims &lt;&lt;:alice :birthYear 1996&gt;&gt;. Following the uptake of the proposed solution, a W3C Community Group was formed and a W3C Draft Report [<xref ref-type="bibr" rid="ref5">5</xref>] was recently released with improvements over the original proposal. By now, several RDF-related programming libraries, e.g., Eclipse RDF4J, Apache Jena, RDF.rb, and N3.js, and RDF graph database systems, e.g., Blazegraph, AnzoGraph, Stardog and GraphDB, have adopted RDF-star.</p>
      <p>
        However, no mapping language supports the generation of RDF-star graphs so far. Most data are still heterogeneous, represented in different formats (e.g., relational databases, CSV, JSON, or XML). One of the most common approaches nowadays to integrate them into RDF graphs is the use of declarative mapping
        languages such as R2RML [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and RML [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. R2RML is the mapping language recommended by the W3C
for generating RDF graphs from relational databases. RML
is a superset of R2RML that generates RDF graphs from data formats beyond
relational databases, such as CSV, JSON, or XML. Extending a mapping
language to specify how RDF-star datasets can be generated from heterogeneous
data sources could increase the number of available RDF-star datasets
and, thus, foster the adoption of RDF-star.
      </p>
      <p>In this paper, we propose RML-star, an extension of RML to generate
RDF-star graphs from heterogeneous data sources. We introduce a set of new classes
and properties that allow describing how RDF-star datasets can be created from
heterogeneous data sources in a systematic manner, using the same mapping
language that is used to generate RDF datasets. We also introduce the RML-star specification,
which explains in detail how these extensions should be used and implemented.</p>
    </sec>
    <sec id="sec-2">
      <title>2 RML-star</title>
      <p>The aim of RML-star is to generate RDF-star triples by applying a set of
additions and extensions over RML. The changes over the RML vocabulary include
two new classes and three object properties, and the modification of one
object property (Figure 1). The specification of RML-star with the corresponding
ontology is available online at https://w3id.org/kg-construct/rml-star.</p>
      <p>Throughout this section, we rely on an example to demonstrate RML-star.
We consider two data sources: the CSV file in Listing 1 and the JSON file in
Listing 2. The RML-star mapping for these data sources is given in Listings 3
and 4. Finally, in Listing 5 we show the generated RDF-star graph.</p>
      <p>RML. Before we explain the RML-star extensions, we summarize how an RML
mapping is defined. An RML mapping consists of a set of Triples Maps, each of which includes a Logical
Source (lines 2 and 11 of Listing 3) to access the data sources, a Subject Map
to generate the subjects of the triples (lines 3-4 and 12-13 of Listing 3), and
multiple Predicate-Object Maps to generate the predicates and objects (lines 5-9
and 14-17 of Listing 3). Predicate-Object Maps are in turn composed of Predicate
Maps and (Referencing) Object Maps (lines 6 and 7-9 of Listing 3, respectively).
A Referencing Object Map uses the Subject Map of another Triples Map to
generate the objects. Since a Referencing Object Map may involve two different
data sources, join conditions can be specified.</p>
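      <p>As a rough illustration of how a Subject Map and a Predicate-Object Map turn rows into triples, the following Python sketch applies the template and the reference used by :innerTM in Listing 3 to the rows of Listing 1, treating it as an ordinary Triples Map. The function names expand_template and apply_triples_map are hypothetical and do not belong to any RML processor; datatypes and prefix resolution are deliberately ignored.</p>
      <preformat><![CDATA[
```python
import csv
import io

# Rows of the logical source :birthyears (Listing 1).
BIRTHYEARS_CSV = """PERSON,BIRTHYEAR,CLAIMER,CONFIDENCE
alice,1996,bob,0.9
charlie,2002,daniel,0.3
"""


def expand_template(template, row):
    """Fill an rr:template such as ':{PERSON}' with the values of one row."""
    return template.format(**row)


def apply_triples_map(rows, subject_template, predicate, object_reference):
    """Generate one (subject, predicate, object) tuple per row (hypothetical helper)."""
    triples = []
    for row in rows:
        subject = expand_template(subject_template, row)  # Subject Map
        obj = row[object_reference]                       # Object Map (rml:reference)
        triples.append((subject, predicate, obj))
    return triples


rows = list(csv.DictReader(io.StringIO(BIRTHYEARS_CSV)))
triples = apply_triples_map(rows, ":{PERSON}", ":birthYear", "BIRTHYEAR")
for s, p, o in triples:
    print(s, p, o, ".")
```
]]></preformat>
      <p>Each row yields exactly one triple here because the sketch uses a single Predicate-Object Map; a Triples Map with several Predicate-Object Maps would yield one triple per row and per map.</p>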
      <sec id="sec-2-1">
        <title>Listing 1: Contents of the logical source :birthyears (CSV).</title>
        <preformat>PERSON,BIRTHYEAR,CLAIMER,CONFIDENCE
alice,1996,bob,0.9
charlie,2002,daniel,0.3</preformat>
        <p>Listing 2: Contents of the logical source :hospitalrecords (JSON).</p>
        <preformat>[ { "PATIENT": "alice",
    "HOSPITAL": "Juan Ramon Jimenez" },
  { "PATIENT": "charlie",
    "HOSPITAL": "AZ Maria-Middelares" } ]</preformat>
      </sec>
      <sec id="sec-2-2">
        <title>Listing 3: Example of an RML-star mapping. It creates embedded triples that are not asserted.</title>
        <preformat>:innerTM a rml:NonAssertedTriplesMap ;
  rml:logicalSource :birthyears ;
  rml:subjectMap [
    rr:template ":{PERSON}" ] ;
  rr:predicateObjectMap [
    rr:predicate :birthYear ;
    rml:objectMap [
      rml:reference "BIRTHYEAR" ;
      rr:datatype xsd:integer ]] .
:outerTM a rr:TriplesMap ;
  rml:logicalSource :birthyears ;
  rml:subjectMap [
    rr:template ":{CLAIMER}" ] ;
  rr:predicateObjectMap [
    rr:predicate :claims ;
    rml:objectMap [
      rml:embeddedTriplesMap :innerTM ]] .</preformat>
      </sec>
      <sec id="sec-2-3">
        <title>Listing 4: Mapping extension of Listing 3, containing nested triples and multiple data sources.</title>
        <preformat>:outerOuterTM a rr:TriplesMap ;
  rml:logicalSource :birthyears ;
  rml:subjectMap [
    rml:embeddedTriplesMap :outerTM ] ;
  rr:predicateObjectMap [
    rr:predicate :confidence ;
    rml:objectMap [
      rml:reference "CONFIDENCE" ;
      rr:datatype xsd:float ]] .
:joiningTM a rr:TriplesMap ;
  rml:logicalSource :hospitalrecords ;
  rml:subjectMap [
    rml:embeddedTriplesMap :innerTM ;
    rr:joinCondition [
      rr:child "PATIENT" ;
      rr:parent "PERSON" ]] ;
  rr:predicateObjectMap [
    rr:predicate :recordedBy ;
    rml:objectMap [
      rml:reference "HOSPITAL" ]] .</preformat>
        <p>For that reason, it belongs to the domain of the rml:subjectMap and rml:objectMap
properties. The original properties, rr:subjectMap and rr:objectMap, had
cardinality restrictions that prevent extending them to include Star Map in their
domain. In every other respect, the new properties are used exactly like the original ones.</p>
        <p>The object property rml:embeddedTriplesMap connects the Star Map to the
Triples Map that defines how the RDF-star triples will be generated. A simple
example of a Star Map is shown on lines 16-17 of Listing 3: it embeds triples
generated by the Triples Map :innerTM in the objects of the triples generated
by the Triples Map :outerTM. This results in the triples shown on lines 1-2 of
Listing 5 when given Listing 1 as input.</p>
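        <p>The effect of this Star Map can be sketched in a few lines of Python, assuming that a triple is modelled as a tuple and that an embedded triple is simply a tuple nested inside another one. The helper names inner_tm, outer_tm and to_turtle_star are hypothetical, and the serialization is a simplification that ignores datatypes and prefixes.</p>
        <preformat><![CDATA[
```python
# Rows of the logical source :birthyears (Listing 1), as plain dictionaries.
ROWS = [
    {"PERSON": "alice", "BIRTHYEAR": "1996", "CLAIMER": "bob", "CONFIDENCE": "0.9"},
    {"PERSON": "charlie", "BIRTHYEAR": "2002", "CLAIMER": "daniel", "CONFIDENCE": "0.3"},
]


def inner_tm(row):
    """Triple described by :innerTM for one row."""
    return (":" + row["PERSON"], ":birthYear", row["BIRTHYEAR"])


def outer_tm(row):
    """Triple described by :outerTM; its object is the embedded :innerTM triple."""
    return (":" + row["CLAIMER"], ":claims", inner_tm(row))


def to_turtle_star(term):
    """Serialize a term; nested tuples become << s p o >> embedded triples."""
    if isinstance(term, tuple):
        return "<< " + " ".join(to_turtle_star(t) for t in term) + " >>"
    return term


statements = [
    " ".join(to_turtle_star(t) for t in outer_tm(row)) + " ."
    for row in ROWS
]
for line in statements:
    print(line)
```
]]></preformat>
        <p>Under these assumptions the sketch prints one :claims statement per row of Listing 1, each with an embedded :birthYear triple as its object.</p>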
        <p>Non-Asserted Triples Map. An asserted RDF-star triple is a triple that is
an element of an RDF-star graph, as opposed to an embedded RDF-star triple,
which appears only in the subject or object of another RDF-star triple. In
RML-star, all generated triples are by default considered asserted RDF-star triples.
To specify that a generated triple is embedded but not asserted, we introduce
the Non-Asserted Triples Map (rml:NonAssertedTriplesMap) as a subclass of
Triples Map (rr:TriplesMap). This Triples Map has the same expressiveness as
every other Triples Map and only adds the information that its triples are non-asserted.
For instance, :innerTM (line 1 of Listing 3) is declared to be a Non-Asserted
Triples Map and, as a result, the :birthYear triples it generates are not present
in Listing 5 as asserted triples: they only occur as embedded triples.</p>
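        <p>The distinction between asserted and non-asserted triples can be illustrated with the following Python sketch. It assumes, purely for illustration, that each Triples Map is represented by a generator function paired with a Boolean flag, where False plays the role of rml:NonAssertedTriplesMap; none of these names come from the RML-star specification.</p>
        <preformat><![CDATA[
```python
# Rows of the logical source :birthyears (Listing 1), as plain dictionaries.
ROWS = [
    {"PERSON": "alice", "BIRTHYEAR": "1996", "CLAIMER": "bob"},
    {"PERSON": "charlie", "BIRTHYEAR": "2002", "CLAIMER": "daniel"},
]


def inner_tm(row):
    """Triple described by :innerTM for one row."""
    return (":" + row["PERSON"], ":birthYear", row["BIRTHYEAR"])


def outer_tm(row):
    """Triple described by :outerTM; it embeds the :innerTM triple as object."""
    return (":" + row["CLAIMER"], ":claims", inner_tm(row))


# Hypothetical representation: (generator, asserted flag).
# asserted=False stands for rml:NonAssertedTriplesMap.
TRIPLES_MAPS = [
    (inner_tm, False),
    (outer_tm, True),
]


def generate_graph(rows, triples_maps):
    """Collect the asserted triples; non-asserted maps add nothing to the graph."""
    graph = []
    for generate, asserted in triples_maps:
        if asserted:
            graph.extend(generate(row) for row in rows)
    return graph


graph = generate_graph(ROWS, TRIPLES_MAPS)
# Triples that occur only inside another triple (here: in object position).
embedded = [obj for (_, _, obj) in graph if isinstance(obj, tuple)]
print(len(graph), "asserted triples,", len(embedded), "embedded triples")
```
]]></preformat>
        <p>In this sketch the :birthYear triples never become elements of the graph themselves; they are reachable only through the :claims triples that embed them.</p>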
        <p>This structure allows Triples Maps to be nested recursively, so that as many
levels of embedded triples as needed can be generated. For example, the Triples Map :outerOuterTM generates
triples that have embedded triples generated by :outerTM as their subject, and
:outerTM in turn generates triples with embedded triples from :innerTM as
their object. As a result, :outerOuterTM generates triples containing two levels
of embedded triples (lines 3-4 of Listing 5).</p>
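        <p>The nesting described above can be made concrete with a small Python sketch that builds the :outerOuterTM triple for the first row of Listing 1 and counts its embedding levels. Triples are modelled as nested tuples, and nesting_depth and to_turtle_star are hypothetical helpers; the literal is printed without its datatype.</p>
        <preformat><![CDATA[
```python
# First row of the logical source :birthyears (Listing 1).
ROW = {"PERSON": "alice", "BIRTHYEAR": "1996", "CLAIMER": "bob", "CONFIDENCE": "0.9"}


def inner_tm(row):
    """Triple described by :innerTM."""
    return (":" + row["PERSON"], ":birthYear", row["BIRTHYEAR"])


def outer_tm(row):
    """Triple described by :outerTM; embeds the :innerTM triple as object."""
    return (":" + row["CLAIMER"], ":claims", inner_tm(row))


def outer_outer_tm(row):
    """Triple described by :outerOuterTM; embeds the :outerTM triple as subject."""
    return (outer_tm(row), ":confidence", row["CONFIDENCE"])


def nesting_depth(term):
    """Number of embedding levels in a term (0 for a plain IRI or literal)."""
    if not isinstance(term, tuple):
        return 0
    return 1 + max(nesting_depth(t) for t in term)


def to_turtle_star(term):
    """Serialize a term; nested tuples become << s p o >> embedded triples."""
    if isinstance(term, tuple):
        return "<< " + " ".join(to_turtle_star(t) for t in term) + " >>"
    return term


triple = outer_outer_tm(ROW)
depth = max(nesting_depth(t) for t in triple)
statement = " ".join(to_turtle_star(t) for t in triple) + " ."
print(depth)
print(statement)
```
]]></preformat>
        <p>Because both helpers recurse over nested tuples, the same code would handle deeper chains of Triples Maps without modification.</p>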
        <p>An Embedded Triples Map can generate triples using different data sources.
Thus, the Star Map needs to have join conditions to combine such data sources.
To achieve this, the property rr:joinCondition is extended to include Star Map
in its domain. This property, in contrast to rr:objectMap and rr:subjectMap,
is easily extended due to the lack of restrictions in the original vocabulary. On
lines 12-16 of Listing 4 a Star Map is declared which joins the data sources in
Listings 1 and 2 on equal values of the PERSON column and the PATIENT attribute.
It creates the triples on lines 5-6 of Listing 5.</p>
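        <p>A minimal sketch of such a join in Python is given below. It assumes a naive nested-loop join between the JSON records of Listing 2 (the child source) and the CSV rows of Listing 1 (the parent source); the function joining_tm and its parameters are hypothetical, and an actual RML-star processor may evaluate joins differently.</p>
        <preformat><![CDATA[
```python
import json

# Parent source: rows of :birthyears (Listing 1).
BIRTHYEARS = [
    {"PERSON": "alice", "BIRTHYEAR": "1996", "CLAIMER": "bob", "CONFIDENCE": "0.9"},
    {"PERSON": "charlie", "BIRTHYEAR": "2002", "CLAIMER": "daniel", "CONFIDENCE": "0.3"},
]

# Child source: records of :hospitalrecords (Listing 2).
HOSPITAL_RECORDS = json.loads("""
[ { "PATIENT": "alice",   "HOSPITAL": "Juan Ramon Jimenez" },
  { "PATIENT": "charlie", "HOSPITAL": "AZ Maria-Middelares" } ]
""")


def inner_tm(row):
    """Triple described by :innerTM for one row of the parent source."""
    return (":" + row["PERSON"], ":birthYear", row["BIRTHYEAR"])


def joining_tm(child_records, parent_rows, child_key, parent_key):
    """Embed the :innerTM triple of every parent row that matches a child record."""
    triples = []
    for child in child_records:
        for parent in parent_rows:
            # Join condition: rr:child value equals rr:parent value.
            if child[child_key] == parent[parent_key]:
                triples.append((inner_tm(parent), ":recordedBy", child["HOSPITAL"]))
    return triples


triples = joining_tm(HOSPITAL_RECORDS, BIRTHYEARS, "PATIENT", "PERSON")
for subject, predicate, obj in triples:
    print(subject, predicate, repr(obj))
```
]]></preformat>
        <p>The nested loop keeps the sketch short; for larger sources, indexing the parent rows by the join key would avoid comparing every pair of records.</p>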
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 Conclusions and Next Steps</title>
      <p>In this paper, we present RML-star, an extension of RML that allows
generating RDF-star graphs from heterogeneous data sources. We include a set of new
classes and properties while maintaining the general structure of R2RML and
RML. With this proposal, we aim to promote the adoption of RDF-star and
pave the way for other mapping languages to provide similar extensions. RML-star is
discussed within the Knowledge Graph Construction W3C Community Group
(https://w3id.org/kg-construct)
and will be part of the suite of specifications developed by the group. Building
on this proposal, we foresee a promising line of future work on the development of
efficient and scalable systems to generate RDF-star graphs.</p>
      <p>Acknowledgments
This research is financially supported by Ministerio de Ciencia e Innovación,
Spain, under grant Knowledge Spaces (PID2020-118274RB-I00) and by an FPI
grant (BES-2017-082511).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sundara</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>R2RML: RDB to RDF Mapping Language</article-title>
          .
          <source>W3C Recommendation</source>
          ,
          <source>W3C</source>
          (
          <year>2012</year>
          ), http://www.w3.org/TR/r2rml/
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Dimou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vander Sande</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Colpaert</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verborgh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannens</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van de Walle</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data</article-title>
          .
          <source>In: Proceedings of the 7th Workshop on Linked Data on the Web. CEUR Workshop Proceedings</source>
          , vol.
          <volume>1184</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Feria</surname>
            ,
            <given-names>S.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García-Castro</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poveda-Villalón</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Converting UML-based ontology conceptualizations to OWL with Chowlk</article-title>
          .
          <source>In: The Semantic Web: ESWC 2021 Satellite Events</source>
          . pp.
          <fpage>44</fpage>
          -
          <lpage>48</lpage>
          . Springer International Publishing (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Hartig</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Foundations of RDF* and SPARQL* (An Alternative Approach to Statement-Level Metadata in RDF)</article-title>
          .
          <source>In: Proceedings of the 11th Alberto Mendelzon International Workshop on Foundations of Data Management and the Web. CEUR Workshop Proceedings</source>
          , vol.
          <volume>1912</volume>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hartig</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Champin</surname>
            ,
            <given-names>P.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kellogg</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seaborne</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>RDF-star and SPARQL-star</article-title>
          .
          <source>W3C Draft Community Group Report, W3C</source>
          (
          <year>2021</year>
          ), https://w3c.github.io/rdf-star/cg-spec/
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>