<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Declarative Linked Data Backends. Generating the RELEVEN Graph API from RDF Path Expressions.</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lukas Plank</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kevin Stadler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Austrian Center for Digital Humanities and Cultural Heritage</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The RELEVEN project implements a heavily reified CIDOC-CRM-based knowledge graph to represent historical claims with full contextual provenance. Exposing such complex semantic data to consuming applications presents significant technical challenges. Our demonstration showcases a model-driven approach for building REST APIs on top of SPARQL endpoints by combining RDFProxy-a Python library that maps SPARQL query results to Pydantic models-with WissKAS, a command-line tool that automatically generates RDFProxy endpoints from a declarative RDF path expression language.</p>
      </abstract>
      <kwd-group>
        <kwd>RDF Graph</kwd>
        <kwd>REST API</kwd>
        <kwd>Python</kwd>
        <kwd>Pydantic</kwd>
        <kwd>SPARQL</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <list list-type="bullet">
          <list-item>
            <p>GraphDB as the triplestore persistence layer,</p>
          </list-item>
          <list-item>
            <p>
              WissKI [
              <xref ref-type="bibr" rid="ref4">4</xref>
              ], a LOD-focused virtual research environment that serves as the CMS and data management
interface and also provides a declarative path expression language for defining semantic data shapes,
            </p>
          </list-item>
          <list-item>
            <p>
              RDFProxy [
              <xref ref-type="bibr" rid="ref5">5</xref>
              ], a Python library developed within the RELEVEN project for mapping SPARQL
result sets to Pydantic models [
              <xref ref-type="bibr" rid="ref6">6</xref>
              ], enabling a REST layer powered by FastAPI [
              <xref ref-type="bibr" rid="ref7">7</xref>
              ].
            </p>
          </list-item>
        </list>
        <p>Although WissKI path expressions do not enforce or validate graph constraints as, for example,
SHACL does, they provide a concise means of formally declaring semantic shapes and serve as the
structural modelling framework for the core RELEVEN knowledge graph implementation.</p>
        <p>
          Building on this, another RELEVEN-developed tool, WissKAS (the WissKI Adapter Serializer) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ],
enables the automatic generation of RDFProxy-compliant SPARQL queries and corresponding
Pydantic models directly from WissKI path definitions. This allows the entire RELEVEN REST API
to be constructed declaratively and kept fully aligned with the underlying data model. The
proposed demo will showcase this pipeline in action, highlighting the integration of semantic
modelling, data access, and declarative API generation within the RELEVEN graph architecture.
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. RDFProxy</title>
      <p>While knowledge graphs and CRM-based ontologies provide the semantic expressiveness
required for modeling contextualized historical data, the RDF technology stack offers only limited
support for delivering graph data to consuming applications. SPARQL, the W3C-standard query
language for RDF graphs, is primarily designed for pattern-based querying and returns the variable
bindings that satisfy a given graph pattern as flat, essentially tabular result sets. Although effective
for graph navigation and data extraction, SPARQL provides no inherent means of declaratively
transforming result shapes and does not natively support the retrieval of deeply nested data
structures, subset aggregation, result pagination, or schema validation. As a result, the expressive
power of RDF models, and of reified graphs in particular, stands in contrast to the essentially
row-based structure of SPARQL result sets, and integrating RDF-based data into client applications
typically requires extensive, endpoint-specific post-processing logic to convert SPARQL results into
consumable data shapes.</p>
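      <p>The kind of post-processing this mismatch calls for can be illustrated with a minimal sketch that folds flat binding rows into a nested structure, using only the Python standard library. The variable names (person, name, work) and the sample rows are hypothetical and are not taken from the RELEVEN graph.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Hypothetical flat SPARQL bindings: one row per (person, work) combination.
rows = [
    {"person": "p1", "name": "Anna", "work": "Chronicle"},
    {"person": "p1", "name": "Anna", "work": "Letters"},
    {"person": "p2", "name": "Basil", "work": "Homilies"},
]


def nest_rows(rows, key, scalar_fields, list_field):
    """Group flat rows by `key` and collect `list_field` values into a list."""
    grouped = {}
    for row in rows:
        # The first row seen for a key creates the entry; later rows reuse it.
        entry = grouped.setdefault(
            row[key],
            {
                key: row[key],
                **{field: row[field] for field in scalar_fields},
                list_field + "s": [],
            },
        )
        entry[list_field + "s"].append(row[list_field])
    return list(grouped.values())


nested = nest_rows(rows, key="person", scalar_fields=["name"], list_field="work")
```
]]></preformat>
      <p>Each endpoint with a different response shape would need its own variant of such a function, which is the repetition a model-driven mapping is meant to remove.</p>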
      <p>
        The RDFProxy Python library [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] addresses these limitations by implementing a mapping
mechanism that allows the projection of SPARQL result sets onto Pydantic models, enabling the
structured, type-enforced representation of RDF graph data based on Python's type annotation
system. This integration allows RDFProxy to leverage the powerful capabilities of Pydantic and, by
extension, FastAPI for working with RDF datasets, including model pre- and post-validation hooks,
along with custom data serializers, native support for asynchronous API calls and automatic
generation of detailed API documentation according to the OpenAPI standard.
      </p>
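      <p>The idea of projecting a binding row onto a typed model can be sketched as follows. To keep the example free of third-party dependencies, a standard-library dataclass stands in for a Pydantic model, and the helper to_model is a hypothetical name; this is not RDFProxy's mapping code, and real Pydantic validation is considerably richer than the simple type coercion shown here.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from dataclasses import dataclass, fields


@dataclass
class Work:
    """Hypothetical response model; a dataclass stands in for a Pydantic model."""

    title: str
    year: int


def to_model(model_cls, binding):
    """Build a model instance from one binding row.

    Each value is coerced to the annotated field type, which raises
    ValueError for missing or malformed values.
    """
    kwargs = {}
    for field in fields(model_cls):
        if field.name not in binding:
            raise ValueError(f"missing binding for field {field.name!r}")
        kwargs[field.name] = field.type(binding[field.name])
    return model_cls(**kwargs)


# SPARQL JSON results deliver literal values as strings, hence "1071".
work = to_model(Work, {"title": "Chronicle", "year": "1071"})
```
]]></preformat>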
      <p>Internally, RDFProxy dynamically modifies the incoming SPARQL query (for instance, by
injecting clauses and subqueries that implement purely SPARQL-based pagination) and
uses a dataframe abstraction for efficient grouping and aggregation operations over
SPARQL result sets before mapping the processed data onto a given Pydantic model definition.</p>
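      <p>One way such SPARQL-based pagination could work is sketched below: a subquery selects one page of distinct key values, so that LIMIT and OFFSET count entities rather than raw result rows. This is an illustrative assumption about the general technique, not a reproduction of RDFProxy's query rewriting; the function name paginate_query, its parameters, and the ex: prefix are hypothetical, and PREFIX declarations are omitted.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def paginate_query(projection, where_body, key_var, page, size):
    """Return a SELECT query restricted to one page of distinct `key_var` values.

    The page is chosen in a subquery, so LIMIT/OFFSET apply to entities
    (distinct key values) and all rows of a selected entity are returned.
    """
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive integers")
    offset = (page - 1) * size
    return (
        f"SELECT {projection} WHERE {{\n"
        f"  {where_body}\n"
        f"  {{\n"
        f"    SELECT DISTINCT ?{key_var} WHERE {{ {where_body} }}\n"
        f"    ORDER BY ?{key_var} LIMIT {size} OFFSET {offset}\n"
        f"  }}\n"
        f"}}"
    )


query = paginate_query(
    projection="?person ?name",
    where_body="?person ex:name ?name .",
    key_var="person",
    page=3,
    size=10,
)
```
]]></preformat>
      <p>Applying LIMIT and OFFSET directly to the outer query would instead cut pages in the middle of an entity whenever that entity spans several rows.</p>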
      <p>The primary code interface for defining API routes with RDFProxy is the
rdfproxy.SPARQLModelAdapter class, which is designed to establish and handle a connection
between a triplestore, a SPARQL query and a Pydantic model on a per-route basis.</p>
      <p>To summarize, the RDFProxy Python library aims to provide a generic solution for building
modern REST APIs on top of SPARQL endpoints, allowing backend implementers to leverage the
powerful model-driven abstractions of Pydantic and enabling performant transformations of
SPARQL query results into structured, type-safe API responses suitable for production workloads.</p>
    </sec>
    <sec id="sec-3">
      <title>3. WissKAS</title>
      <p>Building an RDFProxy endpoint requires two main components:</p>
      <list list-type="bullet">
        <list-item>
          <p>a SPARQL SELECT query for retrieving data from a triplestore,</p>
        </list-item>
        <list-item>
          <p>a (potentially nested) Pydantic model that defines the desired API response shape and data
validation rules.</p>
        </list-item>
      </list>
      <p>Typically, RDFProxy-compliant SPARQL queries and their corresponding Pydantic models are
written by hand by backend implementers. However, given a sufficiently expressive
structural model-to-RDF mapping, RDFProxy endpoints can also be derived automatically.</p>
      <p>
        WissKI [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], a Drupal-based content management system that serves as the data management
and CMS layer within the RELEVEN graph technology stack, allows users to define hierarchical data
models through a web interface, where each field specifies and maps a sequence (or "path") of RDF
classes and predicates. A group of such paths originating from the same RDF class forms a model
definition, which can be nested and include relational references to other model types.
      </p>
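      <p>To make the notion of a path more concrete, the following sketch turns an alternating sequence of RDF classes and predicates into SPARQL triple patterns. It assumes a path is given as a plain Python list that starts and ends with a class; the function name path_to_patterns and the variable naming scheme are hypothetical and do not reflect how WissKI or WissKAS represent paths internally.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def path_to_patterns(path, root_var="root"):
    """Translate [class, predicate, class, ...] into SPARQL triple patterns."""
    if len(path) % 2 == 0:
        raise ValueError("path must alternate class, predicate, ..., class")
    subject = f"?{root_var}"
    patterns = [f"{subject} a {path[0]} ."]
    # Walk the path in (predicate, class) steps, chaining fresh variables.
    for i in range(1, len(path), 2):
        predicate, cls = path[i], path[i + 1]
        obj = f"?{root_var}_{(i + 1) // 2}"
        patterns.append(f"{subject} {predicate} {obj} .")
        patterns.append(f"{obj} a {cls} .")
        subject = obj
    return patterns


patterns = path_to_patterns(
    ["crm:E21_Person", "crm:P1_is_identified_by", "crm:E41_Appellation"],
    root_var="person",
)
```
]]></preformat>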
      <p>
        These model-to-RDF path relations — called Pathbuilder Definitions in WissKI — are internally
represented as XML structures, which makes them amenable to automated processing. The WissKI
Adapter Serializer (WissKAS) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is a RELEVEN-developed Python command-line tool designed to
generate complex RDFProxy endpoints directly from such Pathbuilder Definitions. Based on CLI
options that allow users to specify the desired REST API response shape using a simple filter syntax,
WissKAS can automatically derive RDFProxy endpoints that provide data views of configurable
granularity, depending on how relational boundaries between model elements are traversed. This
flexibility allows dynamic generation of multiple views of the same data at different levels of detail,
going beyond the fixed structures defined in the ontology.
      </p>
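      <p>Because the definitions are XML, they can be read with standard tooling. The sketch below parses a deliberately simplified, hypothetical XML layout into a mapping from model groups to field paths. The element names (paths, path, id, group, step) are assumptions made for illustration and do not reproduce the actual WissKI Pathbuilder export schema.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a Pathbuilder Definition file.
SAMPLE_XML = """
<paths>
  <path>
    <id>person_name</id>
    <group>person</group>
    <step>crm:E21_Person</step>
    <step>crm:P1_is_identified_by</step>
    <step>crm:E41_Appellation</step>
  </path>
  <path>
    <id>person_birth</id>
    <group>person</group>
    <step>crm:E21_Person</step>
    <step>crm:P98i_was_born</step>
    <step>crm:E67_Birth</step>
  </path>
</paths>
"""


def load_paths(xml_text):
    """Return {group: {field_id: [step, ...]}} from the simplified XML."""
    root = ET.fromstring(xml_text)
    models = {}
    for path in root.findall("path"):
        group = path.findtext("group")
        field_id = path.findtext("id")
        steps = [step.text for step in path.findall("step")]
        models.setdefault(group, {})[field_id] = steps
    return models


models = load_paths(SAMPLE_XML)
```
]]></preformat>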
      <p>Given a Pathbuilder Definition file and a set of endpoint filter specifications, the WissKAS
command-line tool is able to construct:</p>
      <list list-type="bullet">
        <list-item>
          <p>for each desired endpoint:</p>
          <list list-type="bullet">
            <list-item>
              <p>a Pydantic model (often composed of nested classes) with types inferred from the
WissKI definitions,</p>
            </list-item>
            <list-item>
              <p>a SPARQL query with a projection that includes all variables required by the model,
reflecting the union of all RDF paths defined for the selected fields,</p>
            </list-item>
          </list>
        </list-item>
        <list-item>
          <p>a FastAPI entry point with Python function definitions for all derived routes, each of which
instantiates an rdfproxy.SPARQLModelAdapter and calls its methods to execute queries
against a triplestore upon endpoint invocation.</p>
        </list-item>
      </list>
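      <p>The first two artifacts can be approximated with plain string templating, as in the following sketch. It renders Pydantic model source code as text and a SELECT query whose projection covers every model field. The function names render_model and render_query, the ex: predicates, and the choice to wrap each field in OPTIONAL are assumptions made for illustration; the output format of WissKAS itself may differ.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def render_model(name, field_types):
    """Render Pydantic model source code from {field: annotation} as text."""
    if not field_types:
        raise ValueError("a model needs at least one field")
    lines = ["from pydantic import BaseModel", "", "", f"class {name}(BaseModel):"]
    lines += [f"    {field}: {annotation}" for field, annotation in field_types.items()]
    return "\n".join(lines)


def render_query(root_class, field_predicates):
    """Render a SELECT query that projects one variable per model field."""
    projection = " ".join(["?root"] + [f"?{field}" for field in field_predicates])
    patterns = [f"?root a {root_class} ."]
    # OPTIONAL keeps entities that lack a value for some field (a design choice).
    patterns += [
        f"OPTIONAL {{ ?root {predicate} ?{field} . }}"
        for field, predicate in field_predicates.items()
    ]
    body = "\n  ".join(patterns)
    return f"SELECT {projection} WHERE {{\n  {body}\n}}"


model_src = render_model("Person", {"name": "str", "birth_year": "int | None"})
query = render_query(
    "crm:E21_Person", {"name": "ex:name", "birth_year": "ex:birthYear"}
)
```
]]></preformat>
      <p>Deriving both artifacts from the same field specification is what keeps the query projection and the model definition in step with each other.</p>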
      <p>By automating the generation of RDFProxy models and queries from RDF path declarations,
WissKAS significantly reduces the effort required to create and maintain structured API endpoints.
Within the RELEVEN project, WissKAS serves as a central integration facility in a broader semantic
modeling workflow, linking domain-specific data modeling in WissKI to accessible,
standards-compliant RESTful interfaces.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Demo Proposal</title>
      <p>Our demonstration will showcase the RDFProxy-based REST layer of the RELEVEN graph
implementation and automatic API generation from declarative RDF path expressions using the
WissKAS backend serializer, highlighting the model-based transformation of complex RDF data into
typed API endpoints.</p>
      <sec id="sec-4-1">
        <title>Declaration on Generative AI</title>
        <p>During the preparation of this work, the author(s) used GPT-4o for grammar and spelling checks.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] RELEVEN Project. University of Vienna. Available at: https://releven.univie.ac.at/ (visited on 2025-07-03).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] CIDOC Conceptual Reference Model (CIDOC CRM). International Committee for Documentation of the International Council of Museums (ICOM-CIDOC). Available at: https://cidoc-crm.org/ (visited on 2025-07-03).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Andrews</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          et al. (
          <year>2024</year>
          ). “
          <article-title>Re-Evaluating the Eleventh Century through Linked Events and Entities</article-title>
          ”.
          <source>Historical Studies on Central Europe</source>
          ,
          <volume>4</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>217</fpage>
          -
          <lpage>245</lpage>
          . DOI:
          <pub-id pub-id-type="doi">10.47074/HSCE.2024-1.12</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] WissKI. Wissenschaftliche Kommunikationsinfrastruktur (Scientific Communication Infrastructure). Available at: https://wiss-ki.eu/de (visited on 2025-07-03).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] RDFProxy. ACDH-CH, Austrian Academy of Sciences. GitHub repository. Available at: https://github.com/acdh-oeaw/rdfproxy (visited on 2025-07-03).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] Pydantic Documentation. Available at: https://docs.pydantic.dev/latest/ (visited on 2025-07-03).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] FastAPI Documentation. Available at: https://fastapi.tiangolo.com/ (visited on 2025-07-03).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] WissKAS. ACDH-CH, Austrian Academy of Sciences. GitHub repository. Available at: https://github.com/acdh-oeaw/wisskas/ (visited on 2025-07-03).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>