<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">rudof: A Rust Library for handling RDF data models and Shapes</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Jose-Emilio</forename><surname>Labra-Gayo</surname></persName>
							<affiliation key="aff0">
								<orgName type="laboratory">WESO Lab</orgName>
								<orgName type="institution">University of Oviedo</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Angel</forename><surname>Iglesias-Préstamo</surname></persName>
							<affiliation key="aff0">
								<orgName type="laboratory">WESO Lab</orgName>
								<orgName type="institution">University of Oviedo</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Diego</forename><surname>Martín-Fernández</surname></persName>
							<affiliation key="aff0">
								<orgName type="laboratory">WESO Lab</orgName>
								<orgName type="institution">University of Oviedo</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Marc-Antoine</forename><surname>Arnaud</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Luminvent</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">rudof: A Rust Library for handling RDF data models and Shapes</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">437322FEBC435A4F522A9741D69A14B7</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:47+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>RDF</term>
					<term>ShEx</term>
					<term>SHACL</term>
					<term>Data Quality</term>
					<term>Rust</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper we present rudof, a Rust library for RDF and RDF Data shapes. The library can be used as a command line tool but it can also be invoked from different systems. The flexibility of Rust enables the creation of binaries in Windows, Linux and Mac, as well as offering Python bindings and WebAssembly components. The library supports ShEx and SHACL, as well as other RDF data modeling languages like DCTAP, offering conversion mechanisms between them. It can also be used to generate UML-like visualizations of those data models.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>With the increasing adoption of RDF data shapes, there is a need for efficient libraries and tools to validate and process RDF. ShEx <ref type="bibr" target="#b0">[1]</ref> was introduced in 2014 as a concise, human-readable language for RDF validation and was adopted by Wikidata in 2019, while SHACL became a W3C Recommendation in 2017. Both ShEx and SHACL are increasingly used to improve the quality of RDF data. At the same time, other technologies have appeared to help domain experts declare their expectations of RDF-based knowledge graphs, such as DCTAP, a tabular template that can be used to define data shapes.</p><p>The Rust programming language offers features intended to increase safety through static typing while keeping performance competitive with that of other low-level languages. Another advantage of Rust is the possibility of developing Python bindings, which allow Python programmers to invoke Rust libraries that have better performance, and even of compiling to WebAssembly<ref type="foot" target="#foot_0">4</ref>, which can interoperate with JavaScript code.</p><p>In this paper, we present a new library implemented in Rust, called rudof<ref type="foot" target="#foot_1">5</ref>, which offers support for both ShEx and SHACL, conversion between different data models like DCTAP, generation of UML-like visualizations and RDF data validation. The different components of the library are published on crates.io<ref type="foot" target="#foot_2">6</ref>, the Rust package registry, and binary releases are provided for Windows, Linux and macOS, as well as Debian packages, Docker images and Python bindings<ref type="foot" target="#foot_3">7</ref>.</p><note place="foot">Posters, Demos, and Industry Tracks at ISWC 2024, November 13-15, 2024, Baltimore, USA. Email: labra@uniovi.es (J. Labra-Gayo); angel.iglesias.prestamo@gmail.com (A. Iglesias-Préstamo); diegomartinfnz@gmail.com (D. Martín-Fernández); marc-antoine.arnaud@luminvent.com (M. Arnaud). Web: http://labra.weso.es/ (J. Labra-Gayo); https://angelip2303.github.io/ (A. Iglesias-Préstamo); https://luminvent.com/ (M. Arnaud). ORCID: 0000-0001-8907-5348 (J. Labra-Gayo); 0009-0004-0686-4341 (A. Iglesias-Préstamo); 0009-0003-6640-9474 (D. Martín-Fernández); 0009-0004-2130-3366 (M. Arnaud).</note></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">rudof modules and features</head><p>The library consists of different modules which are also published as Rust crates. In order to minimize dependencies on external libraries, we created a simple RDF trait called SRDF which offers the basic RDF functionality required for validation (mainly accessing the neighborhood of a node). We provide two implementations of SRDF, one based on RDF files like Turtle and another based on SPARQL, which allows the library to validate RDF graphs obtained either from files or through SPARQL endpoints.</p><p>The library defines other crates, called shex_ast and shacl_ast, that contain the abstract syntax tree representations of ShEx and SHACL, as well as their corresponding parsers and validators. A module called shapes_converter contains converters between different data models, such as DCTAP to ShEx or ShEx to UML visualizations. A special module is rudof_cli, which implements the command-line tool that is published as a binary called rudof on different platforms like Linux, Mac and Windows.</p><p>The library already supports the following features<ref type="foot" target="#foot_4">8</ref>:</p><p>• Show information about RDF data and convert between different formats like Turtle, N-Triples, RDF/XML, etc.</p><p>• Support for data shapes languages like ShEx and SHACL: show information about shapes and schemas, and validate RDF data to check conformance.</p><p>• Parsing DCTAP data models and conversion to shapes schemas. As an example, Figure <ref type="figure">1a</ref> contains a DCTAP file which can be obtained from a spreadsheet in CSV, and Figure <ref type="figure">1b</ref> shows the result of converting it to ShEx.</p><p>• Generating UML visualizations of shapes data models. As an example, Figure <ref type="figure">1c</ref> shows the UML generated from the schema in Figure 1b.</p>
<p>• Generating HTML representations of those schemas, which can be useful when the schemas contain a large number of shapes. In these cases, the UML visualizations can become too big to be usable, while representing each shape in its own web page makes it possible to browse the shapes in the schema.</p><p>• Obtaining information about the neighborhood of a node in an RDF graph (either incoming or outgoing arcs), which can be useful to create a schema or to debug the validation results.</p><p>• Other conversions: we are exploring the conversion between ShEx/SHACL and SPARQL, as it can be useful to create queries based on some shapes.</p></div>
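<div xmlns="http://www.tei-c.org/ns/1.0"><p>The idea of a minimal RDF trait with interchangeable backends, and of querying the neighborhood of a node, can be illustrated with the following Rust sketch. It is not the actual SRDF definition: the trait name SimpleRdf, its two methods, the MemoryGraph type and the arc_count helper are hypothetical names chosen for illustration, and nodes are plain strings rather than typed RDF terms.</p>
<p><![CDATA[
```rust
use std::collections::BTreeSet;

/// A triple of plain strings; a real library would use typed IRIs, blank nodes and literals.
type Triple = (String, String, String);

/// Hypothetical minimal trait (NOT rudof's actual SRDF definition): the only
/// capability it exposes is access to the neighborhood of a node.
trait SimpleRdf {
    /// Outgoing arcs of `node` as (predicate, object) pairs.
    fn outgoing(&self, node: &str) -> BTreeSet<(String, String)>;
    /// Incoming arcs of `node` as (subject, predicate) pairs.
    fn incoming(&self, node: &str) -> BTreeSet<(String, String)>;
}

/// In-memory backend, standing in for a graph parsed from a file such as Turtle.
struct MemoryGraph {
    triples: Vec<Triple>,
}

impl MemoryGraph {
    fn new(triples: &[(&str, &str, &str)]) -> Self {
        let triples = triples
            .iter()
            .map(|(s, p, o)| (s.to_string(), p.to_string(), o.to_string()))
            .collect();
        MemoryGraph { triples }
    }
}

impl SimpleRdf for MemoryGraph {
    fn outgoing(&self, node: &str) -> BTreeSet<(String, String)> {
        self.triples
            .iter()
            .filter(|(s, _, _)| s.as_str() == node)
            .map(|(_, p, o)| (p.clone(), o.clone()))
            .collect()
    }

    fn incoming(&self, node: &str) -> BTreeSet<(String, String)> {
        self.triples
            .iter()
            .filter(|(_, _, o)| o.as_str() == node)
            .map(|(s, p, _)| (s.clone(), p.clone()))
            .collect()
    }
}

/// Code written against the trait works with any backend (file-based, SPARQL-based, ...).
fn arc_count<G: SimpleRdf>(graph: &G, node: &str) -> usize {
    graph.outgoing(node).len() + graph.incoming(node).len()
}

fn main() {
    let g = MemoryGraph::new(&[
        (":alice", ":knows", ":bob"),
        (":alice", ":name", "\"Alice\""),
        (":carol", ":knows", ":alice"),
    ]);
    println!("outgoing: {:?}", g.outgoing(":alice"));
    println!("incoming: {:?}", g.incoming(":alice"));
    println!("arcs: {}", arc_count(&g, ":alice"));
}
```
]]></p>
<p>Under this assumption, a second backend that answers the same two methods by sending queries to a SPARQL endpoint could be passed to arc_count, or to a validator, without changing the calling code.</p></div>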
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Performance benchmarks</head><p>Using Rust as the implementation language can improve performance compared to Java or Python. Some preliminary benchmarks have been conducted<ref type="foot" target="#foot_5">9</ref> to compare the tool against several state-of-the-art SHACL implementations. As of September 2, 2024, the tool validates SHACL faster than Apache Jena<ref type="foot" target="#foot_6">10</ref> and TopQuadrant<ref type="foot" target="#foot_7">11</ref>, although it is slower than rdf4j<ref type="foot" target="#foot_8">12</ref>. In addition to these SHACL validation benchmarks, the Python bindings provided by rudof can be compared with other Python libraries for SHACL, such as pySHACL <ref type="bibr" target="#b1">[2]</ref>. In this case, we measure the loading, parsing and validation time for the same graph. A summary of the performance results is available in Table <ref type="table" target="#tab_1">1</ref>.</p></div>
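<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the comparison easier to read, the relative execution times can be derived from the values reported in Table 1. The following Rust sketch only performs this arithmetic: the figures are copied from the table, while the names TIMES_MS, time_of and ratio are hypothetical helpers introduced here and are not part of rudof or of the benchmark suite.</p>
<p><![CDATA[
```rust
/// Execution times in milliseconds for the 10-LUBM dataset, copied from Table 1.
const TIMES_MS: [(&str, f64); 6] = [
    ("rudof", 7.8971),
    ("rdf4j", 1.6447),
    ("Apache Jena", 60.3583),
    ("TopQuadrant", 85.7421),
    ("pyrudof", 39_364.2842),
    ("pySHACL", 72_227.2940),
];

/// Looks up the time reported for `tool`, if the table contains it.
fn time_of(tool: &str) -> Option<f64> {
    TIMES_MS
        .iter()
        .find(|(name, _)| *name == tool)
        .map(|(_, t)| *t)
}

/// Time of `other` divided by time of `baseline`:
/// values above 1 mean that `baseline` is the faster of the two.
fn ratio(other: &str, baseline: &str) -> Option<f64> {
    Some(time_of(other)? / time_of(baseline)?)
}

fn main() {
    let pairs = [
        ("Apache Jena", "rudof"),
        ("TopQuadrant", "rudof"),
        ("rudof", "rdf4j"),
        ("pySHACL", "pyrudof"),
    ];
    for (other, baseline) in pairs {
        if let Some(r) = ratio(other, baseline) {
            println!("{other} / {baseline} = {r:.2}");
        }
    }
}
```
]]></p>
<p>With these values, Apache Jena and TopQuadrant take roughly 7.6 and 10.9 times as long as rudof, rudof takes about 4.8 times as long as rdf4j, and pySHACL takes about 1.8 times as long as the rudof Python bindings. These ratios describe a single dataset and a single SHACL shape, so they should be read as preliminary.</p></div>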
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Related work</head><p>Although there are several libraries for ShEx, SHACL or DCTAP, most of them focus on one of those technologies and do not offer a common mechanism for conversion between shapes-based data models. The main exception could be the SHaclEX library<ref type="foot" target="#foot_9">14</ref>, written in Scala by the first author of this paper.</p><p>In the Rust ecosystem, Sophia <ref type="bibr" target="#b2">[3]</ref> is a toolkit for RDF and linked data which contains several traits, although it does not support shapes yet. Oxigraph<ref type="foot" target="#foot_10">15</ref> is a graph database written in Rust that supports SPARQL and publishes several crates related to RDF, but it does not support shapes yet either. Recently, a W3C RDF Rust Common Crates community group<ref type="foot" target="#foot_11">16</ref> has been created, and we are planning to align our dependencies with the expected results of that group.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions and future work</head><p>Although rudof is still a work in progress, we consider that it can fill a gap in the Rust ecosystem of tools for RDF data shapes. The Rust programming language offers advantages in terms of performance and memory safety. It also makes it possible to generate binaries for different operating systems like Windows, Linux and Mac, as well as Python bindings. We are exploring the use of the library in WebAssembly<ref type="foot" target="#foot_12">17</ref>. Our goal is to gradually migrate the code of our RDFShape playground <ref type="bibr" target="#b3">[4]</ref>, which was implemented in React and Scala, to a new version based on WebAssembly and Rust.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc>Performance comparison of SHACL validation between rudof and state-of-the-art implementations. Execution times are reported in milliseconds, using the LUBM dataset with the same SHACL shape across all comparisons.</figDesc><table><row><cell>Dataset</cell><cell>rudof</cell><cell>rdf4j</cell><cell>Apache Jena</cell><cell>TopQuadrant</cell><cell>pyrudof</cell><cell>pySHACL</cell></row><row><cell>10-LUBM</cell><cell>7.8971</cell><cell>1.6447</cell><cell>60.3583</cell><cell>85.7421</cell><cell>39,364.2842</cell><cell>72,227.2940</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_0">https://webassembly.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_1">https://rudof-project.github.io/rudof/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_2">https://crates.io/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_3">https://github.com/rudof-project/rudof/releases</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_4">The project Wiki page contains instructions and how-to guides</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_5">https://github.com/weso/shacl-validation-benchmarks</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_6">https://jena.apache.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_7">https://github.com/TopQuadrant/shacl</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_8">https://rdf4j.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="14" xml:id="foot_9">https://www.weso.es/shaclex/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="15" xml:id="foot_10">https://github.com/oxigraph/oxigraph</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="16" xml:id="foot_11">https://www.w3.org/community/r2c2/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="17" xml:id="foot_12">A prototype based on WebAssembly is available at https://uo271080.github.io/TFG_UO271080/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Shape Expressions: An RDF Validation and Transformation Language</title>
		<author>
			<persName><forename type="first">E</forename><surname>Prud'hommeaux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E</forename><surname>Labra Gayo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Solbrig</surname></persName>
		</author>
		<idno type="DOI">10.1145/2660517.2660523</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 10th International Conference on Semantic Systems, SEMANTICS 2014</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Sack</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Filipowska</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Lehmann</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Hellmann</surname></persName>
		</editor>
		<meeting>the 10th International Conference on Semantic Systems, SEMANTICS 2014<address><addrLine>Leipzig, Germany</addrLine></address></meeting>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2014">September 4-5, 2014</date>
			<biblScope unit="page" from="32" to="40" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">pyshacl</title>
		<author>
			<persName><forename type="first">A</forename><surname>Sommer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Car</surname></persName>
		</author>
		<idno type="DOI">10.5281/zenodo.10958008</idno>
		<ptr target="https://doi.org/10.5281/zenodo.10958008" />
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Sophia: A Linked Data and Semantic Web toolkit for Rust</title>
		<author>
			<persName><forename type="first">P.-A</forename><surname>Champin</surname></persName>
		</author>
		<ptr target="https://www2020devtrack.github.io/site/schedule" />
	</analytic>
	<monogr>
		<title level="m">The Web Conference 2020: Developers Track</title>
				<editor>
			<persName><forename type="first">E</forename><surname>Wilde</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Amundsen</surname></persName>
		</editor>
		<meeting><address><addrLine>Taipei, TW</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">RDFShape: An RDF Playground Based on Shapes</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E</forename><surname>Labra Gayo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fernández Álvarez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>García-González</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the ISWC 2018 Posters and Demonstrations, Industry and Blue Sky Ideas Tracks, co-located with 17th International Semantic Web Conference</title>
				<meeting>the ISWC 2018 Posters and Demonstrations, Industry and Blue Sky Ideas Tracks, co-located with 17th International Semantic Web Conference</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">2180</biblScope>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
