<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">RML-star: A Declarative Mapping Language for RDF-star Generation</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Thomas</forename><surname>Delva</surname></persName>
							<email>thomas.delva@ugent.be</email>
							<idno type="ORCID">0000-0001-9521-2185</idno>
							<affiliation key="aff0">
								<orgName type="department">Department of Electronics and Information Systems</orgName>
								<orgName type="laboratory">IDLab</orgName>
								<orgName type="institution">Ghent University - imec</orgName>
								<address>
									<country key="BE">Belgium</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Julián</forename><surname>Arenas-Guerrero</surname></persName>
							<email>julian.arenas.guerrero@upm.es</email>
							<idno type="ORCID">0000-0002-3029-6469</idno>
							<affiliation key="aff1">
								<orgName type="laboratory">Ontology Engineering Group</orgName>
								<orgName type="institution">Universidad Politécnica de Madrid</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ana</forename><surname>Iglesias-Molina</surname></persName>
							<idno type="ORCID">0000-0001-5375-8024</idno>
							<affiliation key="aff1">
								<orgName type="laboratory">Ontology Engineering Group</orgName>
								<orgName type="institution">Universidad Politécnica de Madrid</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Oscar</forename><surname>Corcho</surname></persName>
							<email>oscar.corcho@upm.es</email>
							<idno type="ORCID">0000-0002-9260-0753</idno>
							<affiliation key="aff1">
								<orgName type="laboratory">Ontology Engineering Group</orgName>
								<orgName type="institution">Universidad Politécnica de Madrid</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">David</forename><surname>Chaves-Fraga</surname></persName>
							<idno type="ORCID">0000-0003-3236-2789</idno>
							<affiliation key="aff1">
								<orgName type="laboratory">Ontology Engineering Group</orgName>
								<orgName type="institution">Universidad Politécnica de Madrid</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Anastasia</forename><surname>Dimou</surname></persName>
							<email>anastasia.dimou@ugent.be</email>
							<idno type="ORCID">0000-0003-2138-7972</idno>
							<affiliation key="aff0">
								<orgName type="department">Department of Electronics and Information Systems</orgName>
								<orgName type="laboratory">IDLab</orgName>
								<orgName type="institution">Ghent University - imec</orgName>
								<address>
									<country key="BE">Belgium</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">RML-star: A Declarative Mapping Language for RDF-star Generation</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">D061796F04F61CB8DBBCE079F6076B68</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T01:34+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>RML</term>
					<term>R2RML</term>
					<term>RDF-star</term>
					<term>Knowledge Graphs</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>RDF-star was recently proposed as a convenient representation to annotate statements in RDF with metadata by introducing the so-called RDF-star triples, bridging the gap between RDF and property graphs. However, even though there are many solutions to generate RDF graphs, there is so far no systematic approach to generate RDF-star graphs from heterogeneous data sources. In this paper, we propose RML-star, an extension of the RML mapping language to generate RDF-star. We introduce the extension of the RML ontology and the associated specification with representative examples. URL: https://w3id.org/kg-construct/rml-star</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>RDF-star was proposed as a compact representation to annotate statements in RDF with metadata <ref type="bibr" target="#b4">[4]</ref>. For instance, the following RDF-star triple declares that Bob claims Alice was born in 1996: :bob :claims &lt;&lt; :alice :birthYear 1996 &gt;&gt; . Following the uptake of the proposed solution, a W3C Community Group was formed, and a W3C Draft Report <ref type="bibr" target="#b5">[5]</ref> was recently released with improvements over the original proposal. By now, several RDF-related programming libraries (e.g., Eclipse RDF4J, Apache Jena, RDF.rb, and N3.js) and RDF graph database systems (e.g., Blazegraph, AnzoGraph, Stardog, and GraphDB) have adopted RDF-star.</p><p>However, no mapping language supports the generation of RDF-star graphs so far. Most data are still heterogeneous, represented in different formats (e.g., relational databases, CSV, JSON, or XML). One of the most common approaches nowadays to integrate them into RDF graphs is the use of declarative mapping languages such as R2RML <ref type="bibr" target="#b0">[1]</ref> and RML <ref type="bibr" target="#b1">[2]</ref>. R2RML is the W3C Recommendation mapping language to generate RDF graphs from relational databases. RML is a superset of R2RML that generates RDF graphs from data formats beyond relational databases, such as CSV, JSON, or XML. Extending a mapping language to specify how RDF-star datasets can be generated from heterogeneous data sources can potentially increase the number of available RDF-star datasets and, thus, foster the adoption of the RDF-star proposal.</p><p>In this paper, we propose RML-star, an extension of RML to generate RDF-star graphs from heterogeneous data sources. We introduce a set of new classes and properties that allow describing how RDF-star datasets can be created from heterogeneous data sources in a systematic manner, using the same mapping language that is used to generate RDF datasets. We also introduce the RML-star specification, which explains in detail how these extensions should be used and implemented.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 1:</head><label>1</label><figDesc>The RML-star extension (Chowlk notation <ref type="bibr" target="#b2">[3]</ref>). Orange classes and dark orange object properties show the additions to the RML ontology; light orange object properties represent extensions (i.e., a change in domain and/or range).</figDesc></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">RML-star</head><p>The aim of RML-star is to generate RDF-star triples by applying a set of additions and extensions to RML. The changes to the RML vocabulary comprise two new classes, three new object properties, and the modification of one existing object property (Figure <ref type="figure">1</ref>). The specification of RML-star with the corresponding ontology is available online at https://w3id.org/kg-construct/rml-star.</p><p>Throughout this section, we rely on an example to demonstrate RML-star. We consider two data sources: the CSV file in Listing 1 and the JSON file in Listing 2. The RML-star mapping for these data sources is given in Listings 3 and 4. Finally, in Listing 5 we show the generated RDF-star graph.</p><p>RML. Before we explain the RML-star extensions, we summarize how an RML mapping is defined. An RML mapping consists of a set of Triples Maps, each of which includes a Logical Source (lines 2 and 11 of Listing 3) to access the data source, a Subject Map to generate the subjects of the triples (lines 3-4 and 12-13 of Listing 3), and multiple Predicate-Object Maps to generate the predicates and objects (lines 5-9 and 14-17 of Listing 3). Predicate-Object Maps are in turn composed of Predicate Maps and (Referencing) Object Maps (lines 6 and 7-9 of Listing 3, respectively). A Referencing Object Map uses the Subject Map of another Triples Map to generate the objects. Since a Referencing Object Map may involve two different data sources, join conditions can be specified.</p><p>Star Map. We introduce the Star Map class (rml:StarMap) to generate RDF-star triples. A Star Map can take the place of either a Subject Map (lines 3-4 and 12-16 of Listing 4) or an Object Map (lines 16-17 of Listing 3), generating RDF-star triples in either the subject or the object position, following the RDF-star specification <ref type="bibr" target="#b5">[5]</ref>. Because a Star Map can form either a subject or an object, it belongs to the range of the rml:subjectMap and rml:objectMap properties. The original properties, rr:subjectMap and rr:objectMap, have cardinality restrictions that prevent extending them to include Star Map in their range. In every other respect, the new properties are used exactly like the original ones.</p><p>The object property rml:embeddedTriplesMap connects the Star Map to the Triples Map that defines how the RDF-star triples will be generated. A simple example of a Star Map is shown on lines 16-17 of Listing 3: it embeds triples generated by the Triples Map :innerTM in the objects of the triples generated by the Triples Map :outerTM. This results in the triples shown on lines 1-2 of Listing 5 when given Listing 1 as input.</p></div>
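<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the effect of the Star Map on lines 16-17 of Listing 3 more concrete, the following Python sketch imitates it by hand. It is not an RML-star processor: the function names inner_triple and outer_triples are hypothetical, the contents of Listing 1 are inlined as a string, and triples are rendered as plain strings. All of these are assumptions made for illustration only.</p>
<figure type="listing"><figDesc><![CDATA[
```python
import csv
import io

# Contents of Listing 1 (logical source :birthyears), inlined for this sketch.
BIRTHYEARS_CSV = """PERSON,BIRTHYEAR,CLAIMER,CONFIDENCE
alice,1996,bob,0.9
charlie,2002,daniel,0.3
"""


def inner_triple(row):
    """Render the triple described by :innerTM as an embedded RDF-star triple."""
    return "<< :{} :birthYear {} >>".format(row["PERSON"], row["BIRTHYEAR"])


def outer_triples(csv_text):
    """Render the triples described by :outerTM.

    The Star Map puts the embedded triple from :innerTM in the object position.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        ":{} :claims {} .".format(row["CLAIMER"], inner_triple(row))
        for row in rows
    ]


for line in outer_triples(BIRTHYEARS_CSV):
    print(line)
```
]]></figDesc></figure>
<p>Each CSV row yields one :claims triple whose object is an embedded triple, which is the shape of the triples on lines 1-2 of Listing 5.</p></div>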
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Non-Asserted Triples Map</head><p>An asserted RDF-star triple is a triple that is an element of an RDF-star graph, as opposed to an embedded RDF-star triple, which only appears in the subject or object of another RDF-star triple. In RML-star, all generated triples are by default considered asserted RDF-star triples. To specify that a generated triple is embedded but not asserted, we introduce the Non-Asserted Triples Map (rml:NonAssertedTriplesMap) as a subclass of Triples Map (rr:TriplesMap). This Triples Map has the same expressiveness as every other Triples Map and only adds the information that its triples are non-asserted. For instance, :innerTM (line 1 of Listing 3) is declared to be a Non-Asserted Triples Map and, as a result, the :birthYear triples it generates are not present in Listing 5 as asserted triples: they only occur as embedded triples.</p><p>This structure allows Triples Maps to be embedded recursively, nesting as many levels of embedded triples as needed. For example, the Triples Map :outerOuterTM generates triples that have embedded triples generated by :outerTM as their subject, and :outerTM in turn generates triples with embedded triples from :innerTM as their object. As a result, :outerOuterTM generates triples containing two levels of embedded triples (lines 3-4 of Listing 5).</p><p>An embedded Triples Map can generate triples from a data source different from that of the Triples Map that embeds it. Thus, the Star Map needs join conditions to combine such data sources. To achieve this, the property rr:joinCondition is extended to include Star Map in its domain. This property, in contrast to rr:objectMap and rr:subjectMap, is easily extended due to the lack of restrictions in the original vocabulary. On lines 12-16 of Listing 4, a Star Map is declared that joins the data sources in Listings 1 and 2 on equal values of the PERSON column and the PATIENT attribute. It creates the triples on lines 5-6 of Listing 5.</p></div>
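<div xmlns="http://www.tei-c.org/ns/1.0"><p>The interplay of non-asserted triples, nesting, and join conditions described above can be sketched in a few lines of Python. This is a hand-written imitation of the four Triples Maps in Listings 3 and 4, not an RML-star engine: the helper names embed and generate are hypothetical, the contents of Listings 1 and 2 are inlined as strings (with the hospital name spelled as in Listing 5), and triples are plain strings.</p>
<figure type="listing"><figDesc><![CDATA[
```python
import csv
import io
import json

# Inlined copies of Listing 1 (CSV) and Listing 2 (JSON); assumptions of this sketch.
BIRTHYEARS_CSV = """PERSON,BIRTHYEAR,CLAIMER,CONFIDENCE
alice,1996,bob,0.9
charlie,2002,daniel,0.3
"""

HOSPITAL_RECORDS_JSON = """[
  {"PATIENT": "alice", "HOSPITAL": "Juan Ramon Jimenez"},
  {"PATIENT": "charlie", "HOSPITAL": "AZ Maria Middelares"}
]"""


def embed(triple):
    """Wrap a triple (without trailing dot) as an embedded RDF-star triple."""
    return "<< {} >>".format(triple)


def generate(csv_text, json_text):
    """Return the asserted triples produced by the four Triples Maps."""
    people = list(csv.DictReader(io.StringIO(csv_text)))
    records = json.loads(json_text)
    asserted = []
    for row in people:
        # :innerTM is non-asserted: the triple is built but never added by itself.
        inner = ":{} :birthYear {}".format(row["PERSON"], row["BIRTHYEAR"])
        # :outerTM embeds the inner triple in the object position and is asserted.
        outer = ":{} :claims {}".format(row["CLAIMER"], embed(inner))
        asserted.append(outer + " .")
        # :outerOuterTM embeds the outer triple as subject: two nesting levels.
        asserted.append(
            "{} :confidence {} .".format(embed(outer), row["CONFIDENCE"])
        )
        # :joiningTM: join condition with child PATIENT and parent PERSON.
        for record in records:
            if record["PATIENT"] == row["PERSON"]:
                asserted.append(
                    '{} :recordedBy "{}" .'.format(embed(inner), record["HOSPITAL"])
                )
    return asserted


for line in generate(BIRTHYEARS_CSV, HOSPITAL_RECORDS_JSON):
    print(line)
```
]]></figDesc></figure>
<p>The sketch returns six triples, and the bare :birthYear triples are not among them, because the inner triples are only ever used in embedded form. A real processor would additionally handle datatypes, IRI templates, and arbitrary logical sources, which are left out here.</p></div>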
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Conclusions and Next Steps</head><p>In this paper, we present RML-star, an extension of RML that allows generating RDF-star graphs from heterogeneous data sources. We include a set of new classes and properties while maintaining the general structure of R2RML and RML. With this proposal, we aim to promote the adoption of RDF-star and to pave the way for other mapping languages to provide similar extensions. RML-star is discussed within the Knowledge Graph Construction W3C Community Group <ref type="foot" target="#foot_0">5</ref> and will be part of the suite of specifications developed by the group. Building on this solution, we envision a promising line of future work on the development of efficient and scalable systems to generate RDF-star graphs.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Listing 4:</head><label>4</label><figDesc>Mapping extension of Listing 3, containing nested triples and multiple data sources.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head>Listing 5:</head><label>5</label><figDesc>RDF-star triples generated by the RML-star mappings in Listings 3 and 4 from data sources in Listings 1 and 2.</figDesc><table><row><cell>1</cell><cell>:bob :claims &lt;&lt; :alice :birthYear 1996 &gt;&gt; .</cell></row><row><cell>2</cell><cell>:daniel :claims &lt;&lt; :charlie :birthYear 2002 &gt;&gt; .</cell></row><row><cell>3</cell><cell>&lt;&lt; :bob :claims &lt;&lt; :alice :birthYear 1996 &gt;&gt; &gt;&gt; :confidence 0.9 .</cell></row><row><cell>4</cell><cell>&lt;&lt; :daniel :claims &lt;&lt; :charlie :birthYear 2002 &gt;&gt; &gt;&gt; :confidence 0.3 .</cell></row><row><cell>5</cell><cell>&lt;&lt; :alice :birthYear 1996 &gt;&gt; :recordedBy "Juan Ramon Jimenez" .</cell></row><row><cell>6</cell><cell>&lt;&lt; :charlie :birthYear 2002 &gt;&gt; :recordedBy "AZ Maria Middelares" .</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>Predicate-Object Maps are in turn composed of Predicate Maps and (Referencing) Object Maps (lines 6 and 7-9 of Listing 3, respectively). A Referencing Object Map uses the Subject Map of another Triples Map to generate the objects. Since a Referencing Object Map may involve two different data sources, join conditions can be specified.</figDesc><table><row><cell cols="2">PERSON, BIRTHYEAR, CLAIMER, CONFIDENCE</cell></row><row><cell cols="2">alice, 1996, bob, 0.9</cell></row><row><cell cols="2">charlie, 2002, daniel, 0.3</cell></row><row><cell cols="2">Listing 1: Contents of the logical source :birthyears (CSV).</cell></row><row><cell cols="2">[ { "PATIENT": "alice", "HOSPITAL": "Juan Ramon Jimenez" },</cell></row><row><cell cols="2">  { "PATIENT": "charlie", "HOSPITAL": "AZ Maria Middelares" } ]</cell></row><row><cell cols="2">Listing 2: Contents of the logical source :hospitalrecords (JSON).</cell></row><row><cell>1</cell><cell>:innerTM a rml:NonAssertedTriplesMap ;</cell></row><row><cell>2</cell><cell>  rml:logicalSource :birthyears ;</cell></row><row><cell>3</cell><cell>  rml:subjectMap [</cell></row><row><cell>4</cell><cell>    rr:template ":{PERSON}" ] ;</cell></row><row><cell>5</cell><cell>  rr:predicateObjectMap [</cell></row><row><cell>6</cell><cell>    rr:predicate :birthYear ;</cell></row><row><cell>7</cell><cell>    rml:objectMap [</cell></row><row><cell>8</cell><cell>      rml:reference "BIRTHYEAR" ;</cell></row><row><cell>9</cell><cell>      rr:datatype xsd:integer ]] .</cell></row><row><cell>10</cell><cell>:outerTM a rr:TriplesMap ;</cell></row><row><cell>11</cell><cell>  rml:logicalSource :birthyears ;</cell></row><row><cell>12</cell><cell>  rml:subjectMap [</cell></row><row><cell>13</cell><cell>    rr:template ":{CLAIMER}" ] ;</cell></row><row><cell>14</cell><cell>  rr:predicateObjectMap [</cell></row><row><cell>15</cell><cell>    rr:predicate :claims ;</cell></row><row><cell>16</cell><cell>    rml:objectMap [</cell></row><row><cell>17</cell><cell>      rml:embeddedTriplesMap :innerTM ]] .</cell></row><row><cell cols="2">Listing 3: Example of an RML-star mapping. It creates embedded triples that are not asserted.</cell></row><row><cell>1</cell><cell>:outerOuterTM a rr:TriplesMap ;</cell></row><row><cell>2</cell><cell>  rml:logicalSource :birthyears ;</cell></row><row><cell>3</cell><cell>  rml:subjectMap [</cell></row><row><cell>4</cell><cell>    rml:embeddedTriplesMap :outerTM ] ;</cell></row><row><cell>5</cell><cell>  rr:predicateObjectMap [</cell></row><row><cell>6</cell><cell>    rr:predicate :confidence ;</cell></row><row><cell>7</cell><cell>    rml:objectMap [</cell></row><row><cell>8</cell><cell>      rml:reference "CONFIDENCE" ;</cell></row><row><cell>9</cell><cell>      rr:datatype xsd:float ]] .</cell></row><row><cell>10</cell><cell>:joiningTM a rr:TriplesMap ;</cell></row><row><cell>11</cell><cell>  rml:logicalSource :hospitalrecords ;</cell></row><row><cell>12</cell><cell>  rml:subjectMap [</cell></row><row><cell>13</cell><cell>    rml:embeddedTriplesMap :innerTM ;</cell></row><row><cell>14</cell><cell>    rr:joinCondition [</cell></row><row><cell>15</cell><cell>      rr:child "PATIENT" ;</cell></row><row><cell>16</cell><cell>      rr:parent "PERSON" ]] ;</cell></row><row><cell>17</cell><cell>  rr:predicateObjectMap [</cell></row><row><cell>18</cell><cell>    rr:predicate :recordedBy ;</cell></row><row><cell>19</cell><cell>    rml:objectMap [</cell></row><row><cell>20</cell><cell>      rml:reference "HOSPITAL" ]] .</cell></row><row><cell cols="2">Listing 4: Mapping extension of Listing 3, containing nested triples and multiple data sources.</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_0">https://w3id.org/kg-construct</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This research is financially supported by Ministerio de Ciencia e Innovación, Spain, under grant Knowledge Spaces (PID2020-118274RB-I00) and by an FPI grant (BES-2017-082511).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">R2RML: RDB to RDF Mapping Language</title>
		<author>
			<persName><forename type="first">S</forename><surname>Das</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sundara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Cyganiak</surname></persName>
		</author>
		<ptr target="http://www.w3.org/TR/r2rml/" />
	</analytic>
	<monogr>
		<title level="m">W3C Recommendation</title>
				<meeting><address><addrLine>W3C</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data</title>
		<author>
			<persName><forename type="first">A</forename><surname>Dimou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Vander Sande</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Colpaert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verborgh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Mannens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Van De Walle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 7th Workshop on Linked Data on the Web. CEUR Workshop Proceedings</title>
				<meeting>the 7th Workshop on Linked Data on the Web. CEUR Workshop Proceedings</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="volume">1184</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Converting UML-based ontology conceptualizations to OWL with Chowlk</title>
		<author>
			<persName><forename type="first">S</forename><surname>Chávez-Feria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>García-Castro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Poveda-Villalón</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Semantic Web: ESWC 2021 Satellite Events</title>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="44" to="48" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Foundations of RDF* and SPARQL* (An Alternative Approach to Statement-Level Metadata in RDF)</title>
		<author>
			<persName><forename type="first">O</forename><surname>Hartig</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 11th Alberto Mendelzon International Workshop on Foundations of Data Management and the Web. CEUR Workshop Proceedings</title>
				<meeting>the 11th Alberto Mendelzon International Workshop on Foundations of Data Management and the Web. CEUR Workshop Proceedings</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">1912</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">O</forename><surname>Hartig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">A</forename><surname>Champin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Kellogg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Seaborne</surname></persName>
		</author>
		<ptr target="https://w3c.github.io/rdf-star/cg-spec/" />
		<title level="m" type="main">RDF-star and SPARQL-star</title>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>W3C Draft Community Group Report</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
