<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Federated Vocabulary Hubs as a Building Block for Semantic Layers in Data Spaces</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Robert David</string-name>
          <email>robert.david@graphwise.ai</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vladimir Alexiev</string-name>
          <email>vladimir.alexiev@graphwise.ai</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Petar Ivanov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wouter van den Berg</string-name>
          <email>wouter.vandenberg@tno.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Pieter Wijbenga</string-name>
          <email>jan_pieter.wijbenga@tno.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michiel Stornebrink</string-name>
          <email>michiel.stornebrink@tno.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ontotext AD</institution>
          ,
          <country country="BG">Bulgaria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Semantic Web Company GmbH</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>TNO</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Vienna University of Economics and Business</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper proposes a layered architecture for federating vocabulary hubs, the components responsible for shared agreement on semantics within a single data space. Until now, each data space has had its own vocabulary hub, as prescribed by the IDS-RAM. The proposed design addresses the vocabulary hub federation problem and enables a semantic layer that provides semantic services over the federated data spaces. The European Commission's recent standardization request for a European Trusted Data Framework underlines the relevance of this work. The paper discusses this relevance, requirements such as scalability and data sovereignty, and a layered architecture that integrates IDS and Semantic Web standards. Existing tools that implement parts of the architecture are discussed; they can form a basis for a future implementation of the full architecture for this semantic layer over data spaces.</p>
      </abstract>
      <kwd-group>
        <kwd>federation</kwd>
        <kwd>dataspaces</kwd>
        <kwd>vocabulary hub</kwd>
        <kwd>vocabularies</kwd>
        <kwd>semantic layer</kwd>
        <kwd>semantic interoperability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Data space (DS) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] architectures, such as the one defined by IDS in the IDS-RAM<sup>1</sup>, are designed for distributed,
decentralized data sharing. To make best use of data in a DS, data assets require interoperability. While
this is well solved on the syntactic side by established standards such as JSON<sup>2</sup>, it remains a
challenge on the semantic side. To cope with this challenge, IDS defines vocabulary hubs<sup>3</sup> as central
components of DSs, which provide standardized semantic descriptions for the metadata and data
of shared assets. IDS requires these vocabularies to be machine-readable and to use RDF as the data
model for representation. Vocabulary hubs host, maintain, publish, and document vocabularies and
make them available to all DS participants.
      </p>
      <p>
        However, having one central component makes it challenging to govern and operate vocabularies
in a DS. In this paper, we propose an architecture for federated vocabulary hubs, which addresses these
challenges and provides a flexible and extendable approach to scaling vocabulary and interoperability
services, while preserving sovereignty for governing parties. Our work builds on earlier research on
federated vocabulary hubs [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and on extending their role for semantic layers [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and combines industry
expertise from TNO and Graphwise (the merger of Semantic Web Company and Ontotext) to expand the
architectural vision with concrete implementation experiences from both open-source and commercial
vocabulary hub platforms.
      </p>
      <p>This work is especially timely because it aligns with the European Commission’s recent
standardization request for a European Trusted Data Framework. Our proposed architecture for federated
vocabulary hubs directly addresses the need for "an implementation framework for trusted ontologies
and data models" as outlined in the standardization request to CEN/CENELEC. This framework is
currently being developed under the newly established CEN/CLC JTC 25 ‘Data management, Dataspaces,
Cloud and Edge’ technical committee<sup>4</sup>.</p>
      <sec id="sec-1-1">
        <title>1.1. Use Cases and Requirements from Projects</title>
        <p>We investigate use cases of EU-funded DS projects to identify specific requirements for vocabulary
hubs.</p>
        <p><bold>DataBri-X - Legal Data Space.</bold> The Horizon Europe project DataBri-X<sup>5</sup> looks into three use cases
where DSs provide value and make it possible to operate data management and processing tools in a distributed
and sovereign environment. One of these use cases is the Legal Data Space, which aims to develop
a preliminary version of a European DS that fulfills legal requirements related to AI and
benefits European industry in general. The following three scenarios were identified in the project:</p>
        <list list-type="bullet">
          <list-item><p>Creation of a European Legal Data Space Nucleus.</p></list-item>
          <list-item><p>Enrichment and analysis of Legal Data.</p></list-item>
          <list-item><p>Addressing legal requirements on AI and data.</p></list-item>
        </list>
        <p>All these scenarios require governed legal vocabularies that are shared among the users
while their authors retain sovereignty, so that clear and unambiguous semantic descriptions can address
legal requirements.</p>
        <p>
          <bold>UNDERPIN - Time Series Schemas.</bold> The Digital Europe project UNDERPIN<sup>6</sup> develops a DS for
manufacturing, mainly to drive predictive maintenance on refineries and wind farms. The data sources
of UNDERPIN provide time series data for training machine learning algorithms on consolidated data.
For consolidation of this data, different schema mapping and transformation approaches can be used
[
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Schemas created for this purpose give semantic meaning to time series data and
enable integration. Often the meaning of data is difficult to determine from the data itself, so schemas
are provided by experts, who are often the providers of the data, to be shared and used by participants in
the DS. However, the sovereignty and governance of the schemas should remain with these experts.
        </p>
        <p><bold>Digital Product Passports (DPP).</bold> The European DPP initiative is one of the most urgent use cases
for federated vocabulary hubs. Products often need to conform to multiple passport templates based on
their product type(s), which requires the integration of various semantic models and vocabularies.
DPPs of physical products can bring to light the life cycle of the product and its compliance with relevant
regulations, whereas DPPs of software products, as employed in the UNDERPIN project, can inform
about licensing and conformity with standards. A federated vocabulary search capability would help
manufacturers and regulators identify and implement the right combination of passport templates for
specific products.</p>
        <p>All these use cases show the need for vocabularies in a broader scope, which we aim to cover
with our proposed architecture for federated vocabulary hubs. The following section describes the
specific challenges identified for vocabulary hubs.</p>
        <p><sup>4</sup> https://standards.cencenelec.eu/dyn/www/f?p=205:7:0::::FSP_ORG_ID:3485479&amp;cs=1EF27AE97B5DBDA9B990D3DAF8BD63366</p>
        <p><sup>5</sup> https://databri-x.eu/</p>
        <p><sup>6</sup> https://underpinproject.eu/</p>
      </sec>
      <sec id="sec-1-2">
        <title>1.2. Challenges for Vocabulary Hubs</title>
        <p><bold>Vocabulary sovereignty.</bold> A key principle of DSs is the sovereignty of data owners, who decide how
and when their data is shared and used. The same principle applies to vocabularies. Usually,
a governing party is responsible for the development and maintenance of a vocabulary. This party needs to
decide how and when its vocabularies are exposed to the DS, and change requests from outside need to be
governed by it as well. Exposure itself remains as defined by the IDS-RAM: every DS participant can
access published vocabularies and retrieve semantic descriptions.</p>
        <p><bold>Vocabulary governance.</bold> Another important aspect is the governance of vocabularies. Besides having
an internal working process and exposing vocabularies only once they are ready to be shared and used,
we face the challenge of versioning published vocabularies. DS participants should be able to use
specific versions of a vocabulary, and they should be supported by services when switching versions
to help them manage the changes that come with these different versions.</p>
        <p><bold>Vocabulary findability.</bold> A significant challenge for vocabulary hubs is ensuring that valuable
vocabularies can be discovered across different data spaces. If a data space participant is not aware of
the existence of an important vocabulary that happens to be maintained in a different data space,
they might miss opportunities for semantic interoperability or end up recreating similar vocabularies.
Federated vocabulary hubs can help address this challenge by enabling participants to discover and
access vocabularies from other hubs as if they were present in their own hub.</p>
        <p><bold>Scalability &amp; High Availability.</bold> Scalability and high availability are two important technical
attributes of vocabulary hubs. As central components, they need to provide high uptime so that
the DS can use their services. Depending on the use case, scalability might also be an issue because of
large data sets or the access load generated by DS participants.</p>
        <p><bold>Extending the Vocabulary Ecosystem.</bold> Another challenge is to make vocabulary hubs easily
extendable. On the technical side, this is partially covered by scalability. On the data governance
side, however, it is a different challenge. We want to easily and transparently extend the vocabulary ecosystem
and introduce new governing parties and their vocabularies without changes to the existing ones.</p>
        <p><bold>Services.</bold> Finally, exposing vocabularies to a DS is an important part of achieving semantic
interoperability, but this often requires additional services, such as data integration, mapping, or reasoning services,
to fully leverage their potential. Such services should also be able to build on a scalable and sovereign
vocabulary infrastructure and provide transparent access to DS participants.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <p>Preliminaries include the IDS architecture for DSs, which defines vocabulary hubs, the DSSC blueprint,
the concept of linked data to create federated data structures, and technologies such as SPARQL to implement
federated use of these structures.</p>
      <sec id="sec-2-1">
        <title>2.1. IDS Data Spaces</title>
        <p>The International Data Spaces Association (IDSA)<sup>7</sup> is an organisation which aims to standardize DS
architectures for sovereign data sharing to drive the digital economy in Europe and beyond. IDSA
publishes the IDS Reference Architecture Model (IDS-RAM)<sup>8</sup>, which defines the architecture of a DS
and the components it uses. One of these components is the vocabulary hub<sup>9</sup>, which is a central service
for providing standardized vocabularies to all DS participants to enable semantic interoperability.</p>
        <p>The IDS vocabulary hub is defined on a high level and does not go into details of governance for
vocabularies or even federation. Federation is currently not supported for any services in IDS, which
limits the vocabulary hub, being a central component, regarding flexibility, scalability, and sovereignty
of vocabularies.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. DSSC blueprint</title>
        <p>The Data Space Support Centre (DSSC)<sup>10</sup> blueprint provides guidelines for data interoperability in data
spaces via vocabulary services, which ensure consistent use of common data models based on semantic
standards within a data space. The blueprint also suggests linking data sets to their corresponding data
models through the use of the Data Catalog Vocabulary (DCAT)<sup>11</sup>, which is the W3C Recommendation
for describing data sets and services using RDF.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. RDF and (inter)linking</title>
        <p>
          The Semantic Web [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] is a set of standards for knowledge representation based on the World Wide Web.
The basic data model, defined by the Resource Description Framework (RDF), is a knowledge graph,
where nodes and edges are represented as resources on the Web. Information about a resource can be
retrieved via its URI, and one can follow edges, which are hyperlinks, between nodes to navigate this
web of data. Furthermore, by creating links to published resources, the web of data can be expanded in a
decentralized and open way.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. SPARQL federation</title>
        <p>SPARQL<sup>12</sup> is the query language for RDF data. It can be used to query distributed data sources exposed
via SPARQL endpoints. These endpoints need to conform to the W3C Recommendation to be
interoperable, but otherwise they can be backed by databases that store RDF natively or by services
that map non-RDF data to RDF and expose it in the context of queries. SPARQL queries match graph patterns
and support mandatory and optional patterns, conjunction, disjunction, aggregation, negation, and many
more capabilities. SPARQL endpoints can support federation<sup>13</sup> (since SPARQL 1.1), which enables them
to delegate parts of a query to other SPARQL endpoints and thus query the web of data as a distributed
knowledge base.</p>
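        <p>As a minimal sketch of query delegation, the following Python snippet builds a SPARQL 1.1 query whose SERVICE clause hands the label lookup to a second endpoint. The endpoint URL, the concept IRI, and the function name are hypothetical and chosen for illustration only; the use of skos:exactMatch as the linking property is likewise an assumption.</p>
        <preformat><![CDATA[
```python
from textwrap import dedent

# Hypothetical endpoint of a remote vocabulary hub (assumption for illustration).
REMOTE_HUB = "https://hub-b.example.org/sparql"


def federated_label_query(remote_endpoint: str, concept_iri: str) -> str:
    """Build a SPARQL 1.1 query that resolves mappings locally and
    delegates the label lookup to a remote endpoint via SERVICE."""
    return dedent(f"""\
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        SELECT ?match ?label WHERE {{
          <{concept_iri}> skos:exactMatch ?match .
          SERVICE <{remote_endpoint}> {{
            ?match skos:prefLabel ?label .
          }}
        }}""")


# The concept IRI below is a made-up example, not a published vocabulary term.
query = federated_label_query(REMOTE_HUB, "https://hub-a.example.org/vocab/Turbine")
print(query)
```
]]></preformat>
        <p>The resulting string could be sent to any endpoint that supports SPARQL 1.1 Federated Query; the local endpoint evaluates the first triple pattern and forwards the pattern inside SERVICE to the remote hub.</p>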
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. An Architecture for Federated Vocabulary Hubs</title>
      <p>Vocabulary hubs, as defined by IDS, represent central DS components, which host and expose
vocabularies. These vocabularies are intended to be used by participants for common semantic descriptions
with the aim of interoperability. However, this architecture limits the DS to a centralized approach
and does not address the challenges identified above. Therefore, we propose an extended architecture,
which is built on top of the IDS-RAM, and which implements a decentralized approach to enable
federated vocabulary hub services. DS participants can use any hub in the federation and hubs can
also be used across multiple DSs. The architecture aims to address the challenges described above and
provide concrete technical solutions. On the technical level, we reuse existing standards, like SPARQL
federation. We call this decentralized architecture consisting of multiple loosely coupled vocabulary
hubs a vocabulary hub ecosystem.</p>
      <p><sup>7</sup> https://internationaldataspaces.org/</p>
      <p><sup>8</sup> https://docs.internationaldataspaces.org/ids-ram-4/</p>
      <p><sup>9</sup> https://docs.internationaldataspaces.org/ids-knowledgebase/ids-ram-4/layers-of-the-reference-architecture-model/3-layers-of-the-reference-architecture-model/3_5_0_system_layer/3_5_6_vocabulary_hub</p>
      <p><sup>10</sup> https://dssc.eu/</p>
      <p><sup>11</sup> https://www.w3.org/TR/vocab-dcat/</p>
      <p><sup>12</sup> https://www.w3.org/TR/sparql11-query/</p>
      <p><sup>13</sup> https://www.w3.org/TR/sparql11-federated-query/</p>
      <p>In the following, we introduce our proposed architecture for federated vocabulary hubs. We build this
architecture incrementally, starting with (i) a basic architecture built on standards, then expanding it
towards (ii) a service-oriented architecture, and finally concluding with (iii) a semantic layer built
on a federated vocabulary hub ecosystem.</p>
      <sec id="sec-3-1">
        <title>3.1. Connecting the Nodes - A basic Architecture</title>
        <p>As described above in section 2.3, RDF can be used in a flexible manner to link together graphs on the
Web. This open approach works in a decentralized way, and there is no central or governing authority which
needs to establish or confirm such links. SPARQL query endpoints expose the vocabularies,
and by enabling federation on the query engines we can implement a basic vocabulary federation
architecture built purely on Semantic Web standards. Figure 1 shows the basic architecture. Data
is exposed as RDF via SPARQL query endpoints, which support federation. Links between different
vocabularies can be established by referring to resources in vocabularies, which can be hosted locally
or remotely, and can be retrieved by query federation. With this basic architecture, we implement
vocabulary federation based on existing standards, i.e. W3C Recommendations.</p>
        <p>The architecture naturally addresses findability via linked resources between vocabularies. It
addresses the challenge of vocabulary sovereignty, because vocabulary hub operators manage their hosted
vocabularies and decide on their exposure, while still being able to interconnect with vocabularies hosted
elsewhere via links. It also partially addresses the challenge of vocabulary governance, because the exposed
vocabularies, being RDF data, can use common practices for web service versioning, e.g. providing the
version number as part of the IRI. Scalability is also addressed by this decentralized architecture, which
allows for high flexibility in building the hosting ecosystem. Finally, extendability is also provided
because of the flexibility of RDF to interconnect vocabularies independent of the hosting location.</p>
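        <p>The versioning practice mentioned above, carrying the version number in the IRI, can be sketched as follows. The IRI layout (base, then a dotted numeric version, then the term name) and all IRIs and names are assumptions for illustration; actual hubs may use a different scheme.</p>
        <preformat><![CDATA[
```python
import re
from typing import NamedTuple, Tuple

# Assumed (hypothetical) IRI layout: <base>/<dotted numeric version>/<term>
_VERSIONED = re.compile(
    r"^(?P<base>.+?)/(?P<version>\d+(?:\.\d+)*)/(?P<term>[^/]+)$"
)


class VersionedIri(NamedTuple):
    base: str
    version: Tuple[int, ...]
    term: str


def parse_versioned_iri(iri: str) -> VersionedIri:
    """Split an IRI into vocabulary base, numeric version, and term name."""
    match = _VERSIONED.match(iri)
    if match is None:
        raise ValueError(f"no version segment found in {iri!r}")
    version = tuple(int(part) for part in match["version"].split("."))
    return VersionedIri(match["base"], version, match["term"])


def newest(iris):
    """Return the IRI with the highest version; versions compare numerically."""
    return max(iris, key=lambda iri: parse_versioned_iri(iri).version)


# Made-up example IRIs for two published versions of the same term.
published = [
    "https://hub.example.org/vocab/energy/1.2.0/Meter",
    "https://hub.example.org/vocab/energy/1.10.0/Meter",
]
print(parse_versioned_iri(published[0]))
print(newest(published))
```
]]></preformat>
        <p>Comparing versions as integer tuples rather than strings avoids ranking 1.2.0 above 1.10.0, which matters when a participant wants to pin or upgrade a specific vocabulary version.</p>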
        <p>Our basic architecture addresses many challenges well. However, there are still gaps regarding
governance and extension. We also aim to add services to support easy integration and use of vocabulary
data on a higher abstraction level than SPARQL endpoints provide.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Added Value via Federation Services</title>
        <p>To expand the basic architecture, we introduce a service layer on top of the technical components, such as
the SPARQL query endpoint. These services abstract from the technical details. They expose web services
via commonly used integration standards, such as OpenAPI<sup>14</sup>, JSON<sup>15</sup>, and GraphQL<sup>16</sup>, and they provide
a facade for transparent access to the vocabulary ecosystem. Specific details of the federation are
hidden, and accessing one vocabulary hub node gives users access to find and use the full set
of vocabularies of the ecosystem. Services can also manage the challenges of having different versions
of vocabularies in different hubs. Such challenges arise when using service endpoints for exposure
instead of linked data. Furthermore, services such as reasoning over combined data sets are provided.</p>
        <p>
          This architecture improves vocabulary governance, because services can decide which functionality
they expose. They can collect, pre-process, and transform RDF data sources and present them in a way
that is convenient for integrators to consume. Extendability is better addressed by services, because the
service layer makes accessing the vocabulary hub ecosystem transparent. Finally, we can add arbitrary
services on top of RDF vocabularies to provide functionality which cannot be covered by SPARQL
endpoints and RDF directly. These services raise the role of an IDS vocabulary hub [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to provide
value-added services, which can solve challenges of semantic interoperability within data spaces.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. The Vocabulary Hub Ecosystem as a Building Block for Semantic Layers</title>
        <p>As the final step in our architectural evolution, we discuss how the vocabulary hub ecosystem relates
to the principle of semantic layers. Vocabulary hubs drive semantic interoperability. In the context of
data spaces, this primarily means semantic interoperability of metadata of assets. However, in many
scenarios of data integration or consolidation, we require semantic interoperability at the data level
to make best use of distributed, and possibly heterogeneous, data sources. With federated vocabulary
hubs, we provide an essential building block for a unified semantic view on data via vocabularies.</p>
        <p>The connection between vocabulary hubs and data space catalogs is crucial for enabling semantic
interoperability at the data level. If every data set or data service catalog entry contains a link to its
semantic specification in a vocabulary hub (e.g. through the use of dcterms:conformsTo in DCAT), it
becomes possible to query across the catalog and discover relevant data assets based on their semantic
descriptions. Data consumers can then search for data sets that use specific concepts from known
vocabularies. Additionally, semantic relationships between different vocabularies can be used to expand
searches across related concepts (e.g. through relations like owl:equivalentProperty).</p>
        <p><sup>14</sup> https://github.com/OAI/OpenAPI-Specification</p>
        <p><sup>15</sup> https://www.json.org/</p>
        <p><sup>16</sup> https://graphql.org/</p>
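        <p>The catalog-based discovery described above can be illustrated with a small in-memory sketch. Triples are plain Python tuples; all data set, vocabulary, and property IRIs are invented for the example, and the use of rdfs:isDefinedBy to tie a property to its vocabulary is an assumption rather than something the architecture prescribes.</p>
        <preformat><![CDATA[
```python
from collections import deque

DCT_CONFORMS_TO = "http://purl.org/dc/terms/conformsTo"
OWL_EQUIVALENT_PROPERTY = "http://www.w3.org/2002/07/owl#equivalentProperty"
RDFS_IS_DEFINED_BY = "http://www.w3.org/2000/01/rdf-schema#isDefinedBy"

# Hypothetical catalog and vocabulary triples from three hubs.
TRIPLES = {
    ("https://ds.example.org/wind-2024", DCT_CONFORMS_TO, "https://hub-a.example.org/energy"),
    ("https://ds.example.org/refinery-7", DCT_CONFORMS_TO, "https://hub-b.example.org/process"),
    ("https://ds.example.org/legal-1", DCT_CONFORMS_TO, "https://hub-c.example.org/legal"),
    ("https://hub-a.example.org/energy#power", RDFS_IS_DEFINED_BY, "https://hub-a.example.org/energy"),
    ("https://hub-b.example.org/process#outputPower", RDFS_IS_DEFINED_BY, "https://hub-b.example.org/process"),
    ("https://hub-c.example.org/legal#article", RDFS_IS_DEFINED_BY, "https://hub-c.example.org/legal"),
    ("https://hub-a.example.org/energy#power", OWL_EQUIVALENT_PROPERTY,
     "https://hub-b.example.org/process#outputPower"),
}


def equivalents(prop, triples):
    """Collect prop and every property reachable via owl:equivalentProperty,
    following the relation in both directions."""
    seen, queue = {prop}, deque([prop])
    while queue:
        current = queue.popleft()
        for s, p, o in triples:
            if p != OWL_EQUIVALENT_PROPERTY:
                continue
            other = o if s == current else s if o == current else None
            if other is not None and other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


def datasets_for_property(prop, triples):
    """Find data sets whose declared specification defines prop or an equivalent."""
    props = equivalents(prop, triples)
    vocabs = {o for s, p, o in triples if p == RDFS_IS_DEFINED_BY and s in props}
    return sorted(s for s, p, o in triples if p == DCT_CONFORMS_TO and o in vocabs)


print(datasets_for_property("https://hub-a.example.org/energy#power", TRIPLES))
```
]]></preformat>
        <p>In a deployed ecosystem the same lookup would more likely be expressed as a federated SPARQL query over the catalog and the hubs; the in-memory version only makes the expansion step explicit.</p>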
        <p>In this way, semantic layers drive business processes and associated applications and thereby enable
enterprises to make best use of their data sources. Figure 3 shows how a semantic layer semantically
connects isolated data repositories and provides a unified semantic view on data as input to applications
and business processes.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Implementing a Vocabulary Hub Ecosystem</title>
      <p>In the following, we present software products that are already used as vocabulary hubs in data spaces.
We describe how they can be used to implement the described federated vocabulary hub architecture.</p>
      <sec id="sec-4-1">
        <title>4.1. Semantic Treehouse</title>
        <p>Semantic Treehouse<sup>17</sup> is an open-source<sup>18</sup> vocabulary hub implementation that provides comprehensive
management of shared data models while supporting federation through DCAT integration. The
platform supports multiple abstraction layers of data models, ranging from basic vocabularies
to complex ontologies, application profiles, and technical schemas. Through its adoption of DCAT,
it enables standardized vocabulary exchange, making data models discoverable and accessible across
different data spaces. Collaborative governance is supported with built-in version control, issue tracking,
and review processes that facilitate community-driven vocabulary development and maintenance, and
fine-grained access control enables sovereignty in a federated ecosystem.</p>
        <p>The platform generates multiple technical artifacts from semantic models. These include JSON
Schema, XML Schema, OpenAPI, and RDF/SHACL shapes. This allows data space participants to
implement the vocabularies using familiar technologies while preserving semantic consistency.</p>
        <p>Semantic Treehouse demonstrates how vocabulary hubs can evolve from simple repositories to active
participants in a federated ecosystem. It has been in development and use since 2016. Data space
projects that have applied Semantic Treehouse include Enershare<sup>19</sup>, ZeroW<sup>20</sup> and CIRPASS 2<sup>21</sup>, among
others. The implementation of DCAT has been the first step towards the vision of a decentralized
vocabulary ecosystem that enables semantic interoperability across data spaces.</p>
        <p><sup>17</sup> https://www.semantic-treehouse.nl</p>
        <p><sup>18</sup> https://gitlab.com/semantic-treehouse</p>
        <p>Semantic Treehouse facilitates federation by providing a uniform DCAT export of each vocabulary
hub instance through an API endpoint. Content can be merged manually by taking the union
of the triple data coming from different Semantic Treehouse environments, e.g. by loading multiple
DCAT exports into a triple store. Future work includes exposing this knowledge graph by means of a
SPARQL endpoint, thereby offering a query and viewing facility that allows searching for vocabularies
based on keywords, descriptions, or content.</p>
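        <p>The union-based merge described above can be sketched with triples held as Python sets. This is not the Semantic Treehouse API: the export contents, IRIs, and function names are hypothetical, and a real setup would load the DCAT exports into a triple store instead.</p>
        <preformat><![CDATA[
```python
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_DESCRIPTION = "http://purl.org/dc/terms/description"
DCAT_KEYWORD = "http://www.w3.org/ns/dcat#keyword"
SEARCHABLE = (DCT_TITLE, DCT_DESCRIPTION, DCAT_KEYWORD)

# Hypothetical DCAT exports of two hub instances, as (subject, predicate, object) triples.
export_a = {
    ("https://hub-a.example.org/vocab/energy", DCT_TITLE, "Energy metering vocabulary"),
    ("https://hub-a.example.org/vocab/energy", DCAT_KEYWORD, "smart meter"),
}
export_b = {
    ("https://hub-b.example.org/vocab/waste", DCT_TITLE, "Food waste ontology"),
    ("https://hub-b.example.org/vocab/waste", DCT_DESCRIPTION, "Terms for waste and energy recovery"),
}


def merge_exports(*exports):
    """Union the triples of several exports; identical triples collapse to one."""
    return set().union(*exports)


def search(graph, term):
    """Return the subjects whose title, description, or keyword contains term."""
    needle = term.lower()
    return sorted({s for s, p, o in graph if p in SEARCHABLE and needle in o.lower()})


merged = merge_exports(export_a, export_b)
print(search(merged, "energy"))
```
]]></preformat>
        <p>Because RDF graphs are sets of triples, merging exports is a plain set union; the keyword search then spans vocabularies from both hubs without either hub giving up control of its own export.</p>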
      </sec>
      <sec id="sec-4-2">
        <title>4.2. PoolParty Thesaurus Server</title>
        <p>
          PoolParty Thesaurus Server<sup>22</sup>, part of the new Graphwise Platform<sup>23</sup>, provides management of
taxonomies and ontologies for AI applications. Taxonomies and ontologies are developed using SKOS<sup>24</sup>
and OWLstrict [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], an OWL<sup>25</sup> subset for unambiguous semantic descriptions, and can be exposed
via Web APIs, as linked data on the Web, and via SPARQL endpoints to create a federated graph
data architecture. Regarding the software architecture of the Graphwise Platform, PoolParty is backed
by the Ontotext GraphDB<sup>26</sup> graph database, which is highly scalable, supports a cluster architecture,
and provides ACID compliance, thereby fulfilling the scalability &amp; high availability requirements. It also
features SPARQL 1.1 support and reasoning capabilities.
        </p>
        <p>With these features, PoolParty provides several services to implement a vocabulary hub with
federation support. The basic architecture is covered by exposing a SPARQL endpoint with federation enabled
and by publishing vocabularies and ontologies as linked data on the Web. By supporting these means
of publication, PoolParty fulfills the role of an IDS vocabulary hub. With added services, we develop it
towards a federated vocabulary hub node supporting semantic layer strategies. PoolParty is designed as
middleware and provides a RESTful Web API for vocabulary and ontology management functionalities,
which makes it easy to integrate with other services. Finally, PoolParty offers various capabilities around
data consolidation and integration, like ETL and schema mapping features, which allow it to fully support
semantic layer strategies.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Complementary Approaches to Vocabulary Hub Implementation</title>
        <p>While Semantic Treehouse and PoolParty Thesaurus Server take slightly different technical approaches
to vocabulary management, they share architectural principles that align with the federated vocabulary
hub vision. Both platforms adopt and adhere to open standards, like XML, JSON, SKOS and RDFS/OWL,
so they are interoperable by design. They are both developed by organizations that bridge research
and practice, which means the use of state-of-the-art semantic technologies is combined with practical
applicability in real-world data space implementations. Indeed, both tools have successfully been
deployed in data space projects, as described in section 1.1.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>In this paper, we presented an architecture for federated vocabulary hubs as an evolution of IDS
vocabulary hubs. We identified major challenges for vocabulary hubs in data spaces and discussed how
we can address them with the proposed architecture. We concluded with a presentation of existing
software products, which form building blocks to implement the proposed architecture. The federated
vocabulary hub architecture is the first step in the evolution of IDS data spaces towards implementing
decentralized vocabulary hub ecosystems. The key to vocabulary hub federation lies in a) the adoption of
DCAT as a means to catalog vocabularies, and b) exposing the catalog of vocabularies as a knowledge
graph by means of a SPARQL endpoint. This loosely couples vocabulary hubs, links vocabularies
and allows performing federated SPARQL queries to enable the functionalities of the service layer as
described in the architecture of the vocabulary hub ecosystem.</p>
      <p><sup>19</sup> https://enershare.eu/</p>
      <p><sup>20</sup> https://www.zerow-project.eu/</p>
      <p><sup>21</sup> https://cirpass2.eu/</p>
      <p><sup>22</sup> https://www.poolparty.biz/</p>
      <p><sup>23</sup> https://graphwise.ai/</p>
      <p><sup>24</sup> https://www.w3.org/2009/08/skos-reference/skos.html</p>
      <p><sup>25</sup> https://www.w3.org/TR/owl-features/</p>
      <p><sup>26</sup> https://www.ontotext.com/products/graphdb/</p>
      <p>The timing of this work is particularly relevant given the European Commission’s recent
standardization request for a European Trusted Data Framework. Our proposed architecture for federated
vocabulary hubs provides a foundation for implementing the "trusted ontologies and data models"
framework required by this request, which specifically calls for technical specifications to "specify criteria
for the selection of semantic assets" and "specify methods for the semantic annotation of shared data".
Our federated vocabulary hub architecture directly supports these requirements and positions our work
as a potential building block for the standards being developed under CEN/CLC JTC 25. Looking
forward, we also see federated vocabulary hubs playing a crucial role in supporting
AI-powered systems, for example by improving semantic interoperability for GraphRAG and agentic AI.
These systems will leverage the federation of vocabulary hubs to automatically discover and adopt the
most appropriate semantic models for specific use cases.</p>
      <p>Our next steps are to incrementally implement the proposed architecture based on the presented
software products, to apply it to further use cases, and to contribute to standardization for data spaces.
Following our architectural proposal, we will expand the services of Semantic Treehouse and PoolParty
Thesaurus Server accordingly and enable federation support for building vocabulary hub ecosystems to
be tested in real-world projects. Furthermore, we will contribute standardization proposals for the
federated architecture to extend IDS.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work is partially supported by the Digital Europe programme project UNDERPIN (grant agreement
101123179) and the HORIZON Europe programme project DataBri-X (grant agreement 101070069).
Support has also been provided by the Centre of Excellence for Data Sharing and Cloud (CoE-DSC).</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Otto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hompel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wrobel</surname>
          </string-name>
          , Designing Data Spaces: The Ecosystem Approach to Competitive Advantage, Springer International Publishing,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bootsma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Wijbenga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Oosterheert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stornebrink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>van den Berg</surname>
          </string-name>
          ,
          <article-title>Establishing semantic interoperability across data spaces: a solution for sharing vocabularies</article-title>
          ,
          <source>Technical Report, TNO</source>
          ,
          <year>2024</year>
          . URL: https://coe-dsc.nl/knowledge-base/original-content/deliverables/.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>David</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ivanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Alexiev</surname>
          </string-name>
          ,
          <article-title>Raising the Role of Vocabulary Hubs for Semantic Data Interoperability in Dataspaces</article-title>
          , in: Third workshop on Semantic Interoperability in Data Spaces, Budapest, Hungary,
          <year>2024</year>
          . URL: https://semantic.internationaldataspaces.org/wp-content/uploads/2024/10/presentation.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Andresel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Siska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>David</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schlarb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Weißenfeld</surname>
          </string-name>
          ,
          <article-title>Adapting ontology-based data access for data spaces</article-title>
          , in: The Second International Workshop on Semantics in Dataspaces, co-located with the Extended Semantic Web Conference, May 26-27, 2024, Hersonissos, Greece,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Lassila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <article-title>The Semantic Web</article-title>
          ,
          <source>Scientific American</source>
          <volume>284</volume>
          (
          <year>2001</year>
          )
          <fpage>34</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>David</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ahmeti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ahmetaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          ,
          <article-title>OWLstrict: A Constrained OWL Fragment to avoid Ambiguities for Knowledge Graph Practitioners</article-title>
          , in: The Semantic Web: 22nd International Conference, ESWC 2025, Portorož, Slovenia, June 1-5, 2025, Proceedings,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>