<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5220/0012551300003690</article-id>
      <title-group>
        <article-title>Knowledge in Dataspaces</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Paul Moosmann</string-name>
          <email>paul.moosmann@fit.fraunhofer.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rohit A. Deshmukh</string-name>
          <email>rohit.deshmukh@fit.fraunhofer.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph Lange</string-name>
          <email>christoph.lange-bever@fit.fraunhofer.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Johannes Theissen-Lipp</string-name>
          <email>theissen-lipp@dbis.rwth-aachen.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Fraunhofer Institute for Applied Information Technology FIT</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>RWTH Aachen University</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>1</volume>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Dataspaces are increasingly critical for enabling collaboration among diverse participants (individuals, institutions, or machines) in distributed environments. However, the effective sharing of data, knowledge, and services depends on a mutual understanding, which often remains poorly defined. Existing approaches focus on individual aspects of knowledge representation, but lack a holistic framework tailored to the requirements of dataspaces. This gap hinders seamless integration and mutual understanding across diverse participants, of both dataspace-specific and domain-specific content.</p>
      </abstract>
      <kwd-group>
        <kwd>Dataspaces</kwd>
        <kwd>Data Spaces</kwd>
        <kwd>Knowledge Representation</kwd>
        <kwd>Semantic Interoperability</kwd>
        <kwd>Mutual Understanding</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Dataspaces [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ] have emerged as a key paradigm for managing diverse and distributed data, offering
flexible data sharing without requiring unification or centralization. The promise of
data sovereignty,
often referred to as power to control [own] data even when sharing it with others, is a major advantage
of dataspaces over other data sharing approaches. Dataspaces thereby facilitate collaboration among
diverse participants, which may include machines, software agents, or humans acting as end users
or on behalf of institutions [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. To operate effectively, all participants of a dataspace must share a
mutual understanding of the data, knowledge, and services to be exchanged. This mutual understanding
enables seamless interaction and supports interoperability by storing knowledge in the metadata of the
actual data that is exchanged in a dataspace. Despite advances in knowledge representation in areas
such as the Semantic Web or building blocks of the Dataspaces Support Centre (DSSC) Blueprint [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
the concept of “knowledge” in dataspaces remains poorly defined. Existing approaches often focus
on individual components, such as specific use cases or isolated applications, schema matching, or
ontology alignment. These approaches do not properly address the broader requirements for holistic
knowledge representation across the entire dataspace ecosystem.
      </p>
      <p>This lack of clarity creates several challenges. First, there is no consensus on what constitutes
knowledge in dataspaces, making it difficult to standardize or evaluate methods for representing it.
Second, there is limited guidance on where or how knowledge should be represented. Finally, current
frameworks often overlook gaps in knowledge representation tools and building blocks that are critical
for ensuring seamless collaboration and data sharing among dataspace participants. Addressing these
gaps is essential for realizing the full potential of dataspaces in fulfilling their promises, such as data
sovereignty, in complex or distributed systems.</p>
      <p>This paper addresses these gaps by providing a clear definition of knowledge in the scope of dataspaces
and identifying key considerations for its representation. We explore why, what, where, and how
knowledge needs to be represented, and present an overview of essential representations required for
dataspace operations. We also analyze existing Semantic Web artifacts and DSSC components, such
as the DSSC Building Blocks, Dataspace Services, and Toolbox, identifying strengths, limitations, and
opportunities for advancement. Through this work, we aim to provide a foundation for improving
knowledge representation in dataspaces and to guide future research and practical implementations.</p>
      <p>It is important to note that this paper focuses on declarative knowledge, which captures structured
facts, rules, constraints, and relationships within dataspaces. In contrast, procedural knowledge –
concerned with workflows, processes, and sequences of actions – is not included, as it is highly
context-dependent and often requires specialized execution mechanisms beyond static knowledge representation.
Therefore, procedural knowledge is beyond the scope of this paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Knowledge in Dataspaces</title>
      <p>
        Current dataspace initiatives are developing design principles, architectures, and technologies for
dataspaces. For instance, the DSSC is developing a Blueprint that defines terminologies and identifies
building blocks for dataspaces [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. At this stage in the evolution of dataspaces, defining the concept
of knowledge is an essential prerequisite to developing standards and technologies that facilitate its
effective management. In addition, some initiatives are working towards making dataspaces FAIR
(Findable, Accessible, Interoperable, and Reusable) [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], and having a clear definition of knowledge
can help in streamlining and harmonizing their activities.
      </p>
      <p>
        The ultimate goal of creating and using knowledge is to enable effective decision-making with
minimal risk [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which also applies to dataspaces. For example, in an industrial setting, predictive
maintenance for factory machines relies on knowledge derived from historical patterns and real-time
data, such as temperature, humidity, vibrations, and pressure. By leveraging this knowledge, potential
failures can be anticipated, allowing for timely interventions and minimizing operational disruptions.
      </p>
      <p>
        In computer science, hierarchical models such as the Data-Information-Knowledge-Wisdom (DIKW)
pyramid [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] are commonly used to illustrate the relationships between raw data, structured
information, contextualized knowledge, and ultimately, wisdom. We adapt this DIKW model to dataspaces to
explore how data, through the addition of various forms of semiotics (syntax, semantics, and
pragmatics) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], progresses to higher levels of abstraction, ultimately transforming raw data into knowledge
that supports decision-making. The table in Figure 1 maps different aspects of semiotics – syntax,
semantics, and pragmatics – to the layers of the DIKW pyramid in the context of dataspaces. Data (D)
consists of raw values, their data types, and their representation using basic syntax. Information (I)
emerges when structured, machine-readable syntax (e.g., JSON or XML) and contextual semantics are
introduced, enabling interpretability but allowing for multiple possible meanings. Knowledge (K) is
formed by enriching information with formal specifications of concepts, their semantics and
relationships, application-specific constraints, business rules, reasoning mechanisms, historical patterns, and
pragmatics, making it actionable for decision-making. Wisdom (W) extends knowledge by evaluating
and refining decisions, assessing their correctness, efficiency, and long-term impact, ultimately enabling
continuous learning and optimization. Building on this perspective, we derive a working definition of
knowledge in dataspaces as follows:
      </p>
      <p>Knowledge in dataspaces is the state that emerges when data, progressively enriched with
semiotics (syntax, semantics, and pragmatics), attains an unambiguous, machine-understandable
form that contributes to informed decision-making while minimizing the risk of errors,
inefficiencies, or unintended consequences in a given use case scenario. Knowledge can be represented
as a complete framework or as modular components that support specific aspects.</p>
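      <p>The progression from data to knowledge described above can be illustrated with a minimal Python sketch based on the theater-occupancy example of Figure 1. The class name, the constants, and the decision strings are assumptions made for this illustration only; they are not part of any dataspace specification.</p>
      <preformat>
```python
# Hypothetical sketch of the DIKW progression for the theater example.
# All names (Screening, is_valid, is_profitable, the constants) are illustrative.
from dataclasses import dataclass

RAW = "21"            # D: raw data, a bare string of symbols
occupancy = int(RAW)  # D: data type syntax and semantics yield the quantity 21


@dataclass(frozen=True)
class Screening:
    """I: a machine-readable record that adds context to the value."""
    theater: str
    movie: str
    occupancy: int


record = Screening(theater="Theater A", movie="Movie C", occupancy=occupancy)

# K: an application profile (capacity constraint) and a business rule
MAX_OCCUPANCY = 100    # Theater A has a maximum occupancy of 100
PROFIT_THRESHOLD = 50  # a screening is profitable only if occupancy > 50


def is_valid(s: Screening) -> bool:
    """Closed-world check: the occupancy must lie within the theater's capacity."""
    return s.occupancy >= 0 and MAX_OCCUPANCY >= s.occupancy


def is_profitable(s: Screening) -> bool:
    """Business rule applied to a validated record."""
    return s.occupancy > PROFIT_THRESHOLD


# The knowledge layer turns the record into an actionable decision.
decision = "keep screening" if is_profitable(record) else "review screening"
```
      </preformat>
      <p>In this sketch, the same value only becomes actionable once the constraint and the rule are attached to it; evaluating whether the resulting decision was a good one would belong to the wisdom layer and is not modelled here.</p>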
      <table-wrap id="fig1">
        <label>Figure 1</label>
        <table>
          <thead>
            <tr>
              <th>Layer</th>
              <th>Semiotics (Syntax, Semantics, Pragmatics)</th>
              <th>Example / Explanation (please read from bottom to top)</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Wisdom (W)</td>
              <td>
                <list list-type="bullet">
                  <list-item><p>Decision Evaluation</p></list-item>
                  <list-item><p>Impact Analysis</p></list-item>
                  <list-item><p>Decision Refinement</p></list-item>
                </list>
              </td>
              <td>
                <p>Evaluation and refinement of the effectiveness of decisions such as ‘screening only family-friendly movies on weekend afternoons’ (e.g., assessing and improving correctness, efficiency, and long-term impact).</p>
              </td>
            </tr>
            <tr>
              <td>Knowledge (K)</td>
              <td>
                <list list-type="bullet">
                  <list-item><p>Historical Data Patterns</p></list-item>
                  <list-item><p>Causal Semantics</p></list-item>
                  <list-item><p>Pragmatics</p></list-item>
                  <list-item><p>Business Rule and Policy Semantics</p></list-item>
                  <list-item><p>Reasoning/Inference Semantics</p></list-item>
                  <list-item><p>Application-specific Semantics</p></list-item>
                  <list-item><p>Ontological Semantics with Formal, Unambiguous, Machine-understandable Syntax</p></list-item>
                </list>
              </td>
              <td>
                <p>Historical insights: Historical data suggesting that ‘screening of family-friendly movies is usually profitable on weekend afternoons’ can be used to make decisions.</p>
                <p>Cause and effect: Higher ticket prices generally lead to lower attendance.</p>
                <p>Pragmatics: Adjusting interpretations of rules based on real-world factors. E.g., interpretation of profitability in relation to ticket rates and occupancy; interpretation of family-friendly movies based on country.</p>
                <p>Business rule: Screening at Theater A is profitable only if occupancy &gt; 50.</p>
                <p>Reasoning: Subsumption reasoning can be used to infer that accessibility features modelled for a place can be applied to theatres because theatres are modelled as places.</p>
                <p>Application profiles (constraints): Theater A has only one screen, with a max occupancy of 100.</p>
                <p>Universal knowledge representation: Formal specification of concepts (Theater, Address, Movie, Screening, Screen, Date/Time, Occupancy, Audience) and their relationships.</p>
              </td>
            </tr>
            <tr>
              <td>Information (I)</td>
              <td>
                <list list-type="bullet">
                  <list-item><p>Contextual Semantics</p></list-item>
                  <list-item><p>Property/Attribute Semantics</p></list-item>
                  <list-item><p>Machine-readable Object/Record Syntax</p></list-item>
                </list>
              </td>
              <td>
                <p>Context: 21 is the occupancy of Theater A at address B for movie C’s screening on date/time D.</p>
                <p>Syntax: Data format such as JSON, XML, or CSV, etc.</p>
                <p>(Here, even if the syntax and context are known, the exact definitions of these concepts and their relationships are not explicitly defined. Multiple interpretations are possible.)</p>
              </td>
            </tr>
            <tr>
              <td>Data (D)</td>
              <td>
                <list list-type="bullet">
                  <list-item><p>Data Type Semantics</p></list-item>
                  <list-item><p>Data Type Syntax</p></list-item>
                  <list-item><p>Raw Data</p></list-item>
                </list>
              </td>
              <td>
                <p>Data type semantics: Type interpretation: The number “21” denotes the quantity twenty-one.</p>
                <p>Data type syntax: How numbers are represented using symbols (digits, punctuation, +/- sign).</p>
                <p>Raw data: “21”</p>
              </td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>
Web Ontology Language (OWL) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] is based on Description Logic, a decidable fragment of First-Order
Predicate Logic, making it a possible candidate for representing knowledge in dataspaces. While OWL
ontologies and vocabularies are widely used for universal knowledge representation, applying them to
dataspaces raises challenges, particularly in reconciling their open-world assumption with the need for
an application-oriented representation [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The role of application profiles [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ] in addressing these
challenges remains an active area of research. We further discuss the current state of technologies for
knowledge representation in dataspaces in Section 4.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Requirements for Knowledge Representation in Dataspaces</title>
      <p>
        To derive the requirements for knowledge representation in dataspaces, we consider the functional
requirements that dataspaces must fulfill. In Section 4, we then map existing technologies to these
requirements, highlighting how current solutions can support the identified knowledge representation
needs. We survey existing dataspace specifications to establish a basis for our analysis. These
specifications include the Gaia-X Architecture Model [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and the IDS Reference Architecture Model (IDS
RAM) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. While the Gaia-X Architecture Model focuses on a more constrained model, the IDS RAM
provides a comprehensive overview of general functional requirements as well as specific instances of
these requirements. We use the IDS RAM as the basis for our analysis because it is subdivided more
finely when considering the technical requirements. The IDS RAM is a comprehensive model, and we
focus on the first four (of six) functional layers [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]:
      </p>
      <list list-type="order">
        <list-item><p>Trust</p></list-item>
        <list-item><p>Security &amp; Data Sovereignty</p></list-item>
        <list-item><p>Ecosystem of Data</p></list-item>
        <list-item><p>Standardized Interoperability</p></list-item>
      </list>
      <p>We decided to constrain the scope of our analysis to these four layers, since from our point of
view, they represent the most fundamental technical aspects required to establish a dataspace.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr>
              <th>What knowledge needs to be represented?</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>
                <list list-type="order">
                  <list-item><p>Dataspace User Roles such as Dataspace Participant (an Organization / a Person on behalf of an Organization) who can take further roles such as Data Product (aka Offering / Resource / Asset) Owner, Data Product Provider, and Data Product Consumer, etc. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]</p></list-item>
                  <list-item><p>Dataspace Administrator Roles such as Dataspace Governance Authority [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]</p></list-item>
                  <list-item><p>Dataspace Intermediary Roles are of three types: Core Intermediaries and an Operator, Participant Agent Intermediaries, and Personal Data Intermediaries [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]</p></list-item>
                </list>
              </td>
            </tr>
            <tr>
              <td>
                <p>Specification of an identity scheme and a format, and assignment of identities to:</p>
                <list list-type="order">
                  <list-item><p>Dataspace Participants (Persons, Organizations, Groups of Persons or Organizations)</p></list-item>
                  <list-item><p>Data Product (aka Offering / Resource / Asset)</p></list-item>
                  <list-item><p>Connector (aka Participant Agent)</p></list-item>
                  <list-item><p>Various entities such as instances of semantic metadata models in knowledge graphs</p></list-item>
                </list>
              </td>
            </tr>
            <tr>
              <td>Certification-related metadata: Conformity schema, Conformity criteria, Trust levels, Conformity/Compliance status, Attestation</td>
            </tr>
            <tr>
              <td>
                <p>Authentication: Same as “Identity Management” + proof of identity</p>
                <p>Authorization: Access &amp; Usage Policies and their relationships with Dataspace participants, data products, etc.</p>
              </td>
            </tr>
            <tr>
              <td>Data product usage policies.</td>
            </tr>
            <tr>
              <td>Contracts between the participants.</td>
            </tr>
            <tr>
              <td>Persistent links to participant identities.</td>
            </tr>
            <tr>
              <td><bold>Technical Certification:</bold> Requirements for successful certification.</td>
            </tr>
            <tr>
              <td><bold>Data Source Description:</bold> Semantic metadata model capable of describing the data source.</td>
            </tr>
            <tr>
              <td>Semantic models to describe the catalogue(s) of dataspace participants, data offerings, and service offerings and their interfaces.</td>
            </tr>
            <tr>
              <td>Knowledge modelled by the vocabularies.</td>
            </tr>
            <tr>
              <td>Metadata information of the vocabularies.</td>
            </tr>
            <tr>
              <td>Identity of the user of the connector.</td>
            </tr>
            <tr>
              <td>Knowledge of data flow for accurate logging.</td>
            </tr>
            <tr>
              <td><bold>Data Exchange:</bold> See Data Source Description and Usage Policies.</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-4">
      <title>4. Existing Solutions supporting Knowledge Representation in Dataspaces</title>
      <p>In this section, we assess how existing Semantic Web artifacts and dataspace components and services
address the requirements for knowledge representation in dataspaces we derived in the previous section.
We introduce key solutions relevant to dataspaces, each followed by an evaluation of their effectiveness
in meeting these requirements.</p>
      <sec id="sec-4-1">
        <title>4.1. Semantic Web Artifacts</title>
        <p>We group artifacts from the areas of Semantic Web technologies and evolving Semantic Web standards
into categories of application in dataspaces and discuss their effectiveness in addressing the requirements
derived in the previous section:</p>
        <p>Defining What Exists: Taxonomies, vocabularies, and ontologies (e.g., DCAT or Schema.org) provide
structured ways to define entities and concepts within dataspaces. Taxonomies establish hierarchical
relationships, vocabularies define controlled terms, and ontologies offer rich, formalized descriptions
with logical constraints. These artifacts contribute to knowledge representation across all the functional
requirements listed in Table 1, since they provide or at least contribute to the information models
needed to implement the functional requirements of dataspaces.</p>
        <p>
          Specifying How to Use Things: Application profiles, particularly when expressed in SHACL (Shapes
Constraint Language [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]) and ShEx (Shape Expressions [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]), enable the definition of constraints on
the structure of data instances. These mechanisms ensure that data adheres to expected structures,
supporting interoperability and validation across diverse dataspace participants. These artifacts are
paramount for dataspace actions, where it is necessary to adhere to certain constraints. This is the
case when creating semantic descriptions of dataspace participants, services, and resources, or when
designing usage policies [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ].
        </p>
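        <p>To make the role of such constraints concrete, the following Python sketch validates two theater descriptions against a small shape. It is a simplified stand-in and not SHACL or ShEx: the dictionary keys merely mirror the SHACL terms minCount, maxCount, datatype, and maxInclusive, and the shape, the property names, and the function validate are assumptions made for this illustration.</p>
        <preformat>
```python
# Simplified stand-in for shape validation; this is not SHACL or ShEx.
# The keys mirror the SHACL terms minCount, maxCount, datatype, maxInclusive.
THEATER_SHAPE = {
    "screens": {"minCount": 1, "maxCount": 1, "datatype": int},
    "maxOccupancy": {"minCount": 1, "maxCount": 1, "datatype": int, "maxInclusive": 100},
}


def validate(node, shape):
    """Return a list of violation messages; an empty list means the node conforms."""
    violations = []
    for prop, rule in shape.items():
        values = node.get(prop, [])
        if rule.get("minCount", 0) > len(values):
            violations.append(f"{prop}: fewer than {rule['minCount']} value(s)")
        if "maxCount" in rule and len(values) > rule["maxCount"]:
            violations.append(f"{prop}: more than {rule['maxCount']} value(s)")
        for value in values:
            if "datatype" in rule and not isinstance(value, rule["datatype"]):
                violations.append(f"{prop}: {value!r} has the wrong datatype")
            elif "maxInclusive" in rule and value > rule["maxInclusive"]:
                violations.append(f"{prop}: {value!r} exceeds {rule['maxInclusive']}")
    return violations


# Hypothetical instance data: each property maps to a list of values.
theater_a = {"screens": [1], "maxOccupancy": [100]}
theater_b = {"screens": [1, 2], "maxOccupancy": ["many"]}

report_a = validate(theater_a, THEATER_SHAPE)  # conforms: no violations
report_b = validate(theater_b, THEATER_SHAPE)  # two violations
```
        </preformat>
        <p>The sketch treats the shape as a closed description of what a valid instance looks like, which is the property that makes application profiles useful for validating descriptions exchanged between participants.</p>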
        <p>
          Linking Data with Other Data: Linked Data principles [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] provide mechanisms for interconnecting
datasets across distributed environments. By utilizing RDF and URIs, Linked Data fosters semantic
interoperability, enabling the integration of heterogeneous data sources. The Linked Data approach
can be used for the data exchange by providing an identifier of a data offering that is dereferenceable as
a URL where the data can be accessed. However, other solutions exist, such as listing the data offering
in a metadata catalogue that contains the endpoint information of the connector through which it is
accessible.
        </p>
        <p>
          Representing and Integrating Knowledge: Knowledge Graphs [
          <xref ref-type="bibr" rid="ref20">21, 22, 20</xref>
          ] serve as a foundational
technology for organizing, structuring, and linking information in dataspaces. They encapsulate
semantic relationships between entities, supporting context-aware data access and reasoning [23]. In
the context of dataspaces, knowledge graphs can be used to integrate data from heterogeneous data
sources or to store the metadata of participants, services, and data offerings. They can be
relevant for brokering, data source description, and data exchange.
        </p>
        <p>Querying Knowledge: SPARQL [24] and GraphQL [25, 26] for RDF enable querying and retrieving
knowledge from structured data sources. While SPARQL provides expressive querying capabilities
based on graph patterns, GraphQL for RDF allows flexible and efficient data retrieval tailored to user
needs. These artifacts matter for dataspaces because they make data and service offerings
queryable, so that participants can effectively find what they are looking for. In the functional
dataspace requirements, this is particularly relevant to the brokering aspect.</p>
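        <p>Querying by graph patterns can be illustrated with a toy matcher over a handful of catalogue triples. This is a sketch of the underlying idea only, not SPARQL; a real deployment would use a SPARQL engine. The triples, the ex: identifiers, and the function match are invented for this example.</p>
        <preformat>
```python
# Toy basic-graph-pattern matcher; real deployments would use a SPARQL engine.
# The catalogue triples and the "ex:" names are invented for illustration.
TRIPLES = [
    ("ex:offer1", "rdf:type", "dcat:Dataset"),
    ("ex:offer1", "dct:title", "Machine vibration data"),
    ("ex:offer1", "dcat:keyword", "predictive maintenance"),
    ("ex:offer2", "rdf:type", "dcat:Dataset"),
    ("ex:offer2", "dct:title", "Cinema occupancy data"),
]


def match(patterns, triples, binding=None):
    """Yield variable bindings satisfying every triple pattern (variables start with '?')."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound to another value.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, triples, candidate)


# Roughly: SELECT ?offer ?title WHERE { ?offer rdf:type dcat:Dataset ; dct:title ?title }
query = [("?offer", "rdf:type", "dcat:Dataset"), ("?offer", "dct:title", "?title")]
results = [(b["?offer"], b["?title"]) for b in match(query, TRIPLES)]
```
        </preformat>
        <p>A broker that exposes its catalogue in this way lets participants discover offerings by describing what they look for rather than by knowing where it is stored.</p>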
        <p>Identity, Trust, and Provenance: The PROV-O (Provenance Ontology [27]) model supports capturing,
representing, and reasoning over provenance information. This ensures transparency, accountability,
and trustworthiness of data exchanges in dataspaces. The Verifiable Credentials Data Model [28]
provides a solution for tamper-evident credentials whose authorship can be cryptographically verified.
Decentralized identifiers (DIDs) are a type of identifier that enables a verifiable, decentralized digital
identity. DIDs have been designed so that they may be decoupled from centralized registries, identity
providers, and certificate authorities [29]. These artifacts provide solutions to fulfill the functional
requirements of dataspaces regarding trust, security, and data sovereignty.</p>
        <p>Policy Representation and Enforcement: The Open Digital Rights Language (ODRL) [30] provides a
policy expression framework for defining access control and usage permissions. Its role in dataspaces is
critical for enforcing governance and regulatory compliance. This artifact helps with the implementation
of usage policies attached to a data exchange action. This helps to achieve data sovereignty, a key
requirement for dataspaces.</p>
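        <p>As an illustration of how a usage policy might be attached to a data exchange, the following Python sketch holds an ODRL-style policy as a dictionary and checks a request against it. The policy uses terms from the ODRL vocabulary (permission, target, assigner, assignee, action, constraint), but the example.org URIs and the evaluator is_permitted are assumptions of this sketch; ODRL defines how policies are expressed, not how they are enforced.</p>
        <preformat>
```python
# Illustrative ODRL-style policy as a Python dict plus a naive evaluator.
# The example.org URIs and the evaluator are assumptions of this sketch.
from datetime import date

policy = {
    "@context": "http://www.w3.org/ns/odrl.jsonld",
    "@type": "Agreement",
    "uid": "https://example.org/policy/1",
    "permission": [{
        "target": "https://example.org/data/vibration",
        "assigner": "https://example.org/party/provider",
        "assignee": "https://example.org/party/consumer",
        "action": "use",
        "constraint": [{
            "leftOperand": "dateTime",
            "operator": "lt",
            "rightOperand": {"@value": "2026-01-01", "@type": "xsd:date"},
        }],
    }],
}


def is_permitted(policy, party, action, target, on):
    """Return True if a permission covers the request and all its date limits hold."""
    for perm in policy.get("permission", []):
        if (perm["assignee"], perm["action"], perm["target"]) != (party, action, target):
            continue
        limits = [date.fromisoformat(c["rightOperand"]["@value"])
                  for c in perm.get("constraint", [])
                  if c["leftOperand"] == "dateTime" and c["operator"] == "lt"]
        # The "lt" operator requires the request date to be strictly before the limit.
        if all(limit > on for limit in limits):
            return True
    return False


consumer = "https://example.org/party/consumer"
dataset = "https://example.org/data/vibration"
allowed = is_permitted(policy, consumer, "use", dataset, date(2025, 6, 1))
expired = is_permitted(policy, consumer, "use", dataset, date(2026, 1, 1))
```
        </preformat>
        <p>Only the date constraint is evaluated here; other constraint types and prohibitions or duties would need additional handling in a real enforcement component.</p>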
        <p>Mapping Heterogeneous Data to RDF: RML (RDF Mapping Language) [31, 32] facilitates the
transformation of diverse data formats into RDF. This mapping process is essential for harmonizing
structured and semi-structured data sources within dataspaces [33]. This artifact is relevant to the
goal of dataspaces of exchanging data from heterogeneous data sources. After mapping the data to
RDF, it can be stored in a knowledge graph, as described in Representing and Integrating Knowledge above.</p>
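        <p>The idea behind such a mapping can be sketched in plain Python: a declarative description states how each row of a source becomes a subject and a set of predicate-object pairs. The sketch below is not RML itself; the CSV content, the subject template, the ex: predicates, and the function to_triples are invented for this illustration.</p>
        <preformat>
```python
# Sketch of the idea behind an RML mapping, written in plain Python rather than RML.
# The CSV content, the subject template, and the "ex:" predicates are invented examples.
import csv
import io

CSV_TEXT = "id,theater,occupancy\n1,Theater A,21\n2,Theater A,64\n"

MAPPING = {
    "subject_template": "https://example.org/screening/{id}",
    "predicates": {"ex:theater": "theater", "ex:occupancy": "occupancy"},
}


def to_triples(csv_text, mapping):
    """Turn each CSV row into (subject, predicate, object) triples as plain strings."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Fill the subject template with the values of the current row.
        subject = mapping["subject_template"].format(**row)
        for predicate, column in mapping["predicates"].items():
            triples.append((subject, predicate, row[column]))
    return triples


triples = to_triples(CSV_TEXT, MAPPING)
```
        </preformat>
        <p>Because the mapping is data rather than code, the same function can be reused for other sources by swapping the mapping description, which is the property that makes declarative mapping languages attractive for heterogeneous dataspaces.</p>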
        <p>The existing Semantic Web artifacts could play an important role in enabling knowledge representation
in dataspaces. They enable structured data representation, interoperability, and governance through
taxonomies, ontologies, and Linked Data principles. Knowledge Graphs and RDF mappings facilitate
data integration from heterogeneous sources, while query languages enable discoverability. Although
several artifacts exist to help enable knowledge representation, they often serve only as a starting point
and do not necessarily provide all required functionalities. For example, existing ontologies or vocabularies can
be used to create models to represent knowledge, but their capabilities might be limited, especially when
aiming to model very specific domain knowledge. Extending Semantic Web artifacts (and even using
existing ones) poses a challenge in most cases, since not everyone is proficient in the use of Semantic
Web technologies. So, while these artifacts can assist in implementing the functional
dataspace requirements, they are often adopted only reluctantly because their high complexity raises the
barrier to entry. Therefore, further research is needed on user-friendly tooling that assists with
the use of Semantic Web technologies and thereby enables their broader adoption.</p>
        <sec id="sec-5-5-1">
          <title>4.2. Dataspace Components and Services</title>
          <p>This section identifies functional building blocks and components that leverage Semantic Web artifacts
to represent knowledge in dataspaces.</p>
          <p>
            The DSSC project’s partners, along with its extensive network of stakeholders, include all major
dataspace development and deployment initiatives, as well as key dataspace practitioners in Europe.
As a result, the DSSC Blueprint [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ] serves as a convergence point for dataspaces while also advancing
their maturity [34]. We therefore use it as a de facto reference point for assessing the current dataspace
components and tools in relation to knowledge representation.
          </p>
          <p>
            The DSSC Blueprint [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ] defines business, organizational, and technical building blocks for dataspaces.
Business and organizational building blocks define knowledge elements such as business models,
governance rules (rulebook), roles and responsibilities of administrators, users, and intermediaries,
as well as legal frameworks for dataspaces. Technical building blocks define functional requirements
related to data interoperability, sovereignty, trust, and data value creation enablers. The Blueprint
recommends the use of foundational standards such as W3C Verifiable Credentials, DCAT v3, and ODRL
for representing associated knowledge. The requirements outlined by the building blocks are realized
through Dataspace Services [35], which fall into three high-level categories:
          </p>
          <p>Federation Services: Provide infrastructure components that facilitate the discovery and interaction
of participants and their data products. These services represent knowledge such as dataspace user
roles, identities, certifications, access and usage policies, and semantic metadata models (ontologies,
vocabularies, and application profiles).</p>
          <p>Participant Agent Services: Enable participants to interface with a dataspace, share their data, and
attach policies to their data products. These services instantiate knowledge models, including user
roles, identities, certifications, metadata models, and policies.</p>
          <p>Value Creation Services: Facilitate value generation on top of shared data, instantiate business models,
and enable data valorization. These services combine various types of knowledge to support
decision-making. Examples include data marketplaces, data analytics services, and training and education
services.</p>
          <p>These categories encompass various conceptual components, enabling flexible implementations. The
DSSC Toolbox [36] offers a public catalogue of dataspace tools, curated by the DSSC to support these
functionalities. While these technical frameworks provide a solid foundation, significant challenges
remain in the implementation and adoption of knowledge representation artifacts and approaches
within dataspaces. As of March 2025, six months after its release, the DSSC Toolbox lists only 20 tools.
Other dataspace tools remain at various Technology Readiness Levels and are often difficult to
discover. Some widely used tools, such as the Eclipse Dataspace Connector (EDC) [37], lack support for
proper representation of semantic models [38], while the XFSC Federated Catalogue [39], a Metadata
Broker implementation, lacks an RDF-based backend, limiting its ability to support SPARQL queries [40].</p>
          <p>
            Beyond the availability of tools, dataspaces also face significant usability challenges that directly
impact knowledge representation. The Semantic Web is often perceived as complex by developers and
domain experts [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ], requiring expert-level capabilities for onboarding and interfacing with dataspaces.
There is a pressing need for user-friendly, open-source tools such as collaborative Semantic Web IDEs
(Integrated Development Environments), as well as Generative AI-powered assistants that can aid in the
creation of semantic models by leveraging existing ontologies, vocabularies, and application profiles [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ].
Additionally, application profiles – which play a crucial role in representing semantic models, enforcing
closed-world assumptions in dataspaces, and enabling real-world use cases – remain underexplored.
Open research questions persist regarding formats, technologies, and practical implementation strategies
for application profiles. Thus, further research is needed to advance tooling, application-specific
knowledge representation technologies, and user-friendly interfaces for knowledge representation and
exploitation in dataspaces. Addressing these gaps is essential for enabling effective decision-making
with minimal risk.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Summary and Outlook</title>
      <p>In this paper, we highlighted the importance of knowledge for enabling informed decision-making in
dataspaces and systematically identified requirements for knowledge representation. First, we outlined
key gaps, including the lack of a clear definition of knowledge in dataspaces and the fact that existing
tools, technologies, building blocks, and frameworks each follow their own implicit understanding
of knowledge. This lack of standardization leads to interoperability challenges, inconsistencies, and
incomplete knowledge representation. Second, we illustrated how knowledge emerges by enriching raw
data and structured information with various forms of semiotics (syntax, semantics, and pragmatics).
We mapped different types of semiotics to the layers of the DIKW pyramid and presented a working
definition of knowledge in the context of dataspaces. Third, we analyzed the functional requirements
for dataspaces as defined by the IDS RAM and systematically identified the relevant types of knowledge
needed to fulfill each requirement. Finally, we examined how existing Semantic Web artifacts, dataspace
components, services, and tools can support knowledge representation and identified critical gaps that
hinder effective integration and knowledge sharing within dataspaces. In future work, we plan to
investigate the role of application profiles in representing semantic models, enforcing closed-world
assumptions, and enabling real-world use cases in dataspaces. More importantly, future work will
involve investigating whether the DSSC Blueprint outlines additional functional requirements when
compared to the IDS RAM we referenced, followed by a more fine-grained mapping of the existing
knowledge representation solutions discussed in Sections 4.1 (Semantic Web artifacts) and 4.2 (DSSC
components) to the DIKW pyramid introduced in Section 2, as well as a systematic evaluation of these
solutions based on this mapping. At a higher level, a long-term goal is to develop easy-to-use tools
for intuitive semantic model building and to explore generative AI as a means to increase productivity,
simplify user interaction, and improve engagement in knowledge creation and representation tasks.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>Funded by the Deutsche Forschungsgemeinschaft under Germany’s Excellence Strategy – EXC-2023
Internet of Production – 390621612, the Data Spaces Support Centre project within the European Union
Digital Europe Programme, under grant agreement number 101083412, and supported by a Fraunhofer
ICON grant through the Next Generation Dataspaces Initiative.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the writing of this paper, the author(s) used DeepL and GPT-4o for grammar, translation,
and spelling checks. After using these tool(s)/service(s), the author(s) reviewed and edited the content as
needed and take(s) full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Halevy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Franklin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Maier</surname>
          </string-name>
          ,
          <article-title>Principles of Dataspace Systems</article-title>
          ,
          <source>in: Proceedings of the Twenty-Fifth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems</source>
          , PODS '06, Association for Computing Machinery, New York, NY, USA,
          <year>2006</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          . doi:10.1145/1142351.1142352.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Otto</surname>
          </string-name>
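          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>ten Hompel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wrobel</surname>
          </string-name>
          ,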
          <article-title>International data spaces</article-title>
          ,
          <source>in: Digital Transformation</source>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2019</year>
          , pp.
          <fpage>109</fpage>
          -
          <lpage>128</lpage>
          . doi:10.1007/978-3-662-58134-6_8.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Theissen-Lipp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kocher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Decker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Paulus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pomp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Curry</surname>
          </string-name>
          ,
          <article-title>Semantics in Dataspaces: Origin and Future Directions</article-title>
          ,
          <source>in: Companion Proceedings of the ACM Web Conference 2023</source>
          , WWW '23 Companion, Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>
          , pp.
          <fpage>1504</fpage>
          -
          <lpage>1507</lpage>
          . doi:10.1145/3543873.3587689.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>Data Spaces Support Centre</string-name>
          ,
          <source>Data Spaces Blueprint v1.5</source>
          , Technical Report,
          <year>2024</year>
          . URL: https://dssc.eu/space/bv15e/766061169/Data+Spaces+Blueprint+v1.5+-+Home, accessed on 2025-02-25.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumontier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. J.</given-names>
            <surname>Aalbersberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Appleton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Axton</surname>
          </string-name>
          , et al.,
          <article-title>The FAIR guiding principles for scientific data management and stewardship</article-title>
          ,
          <source>Scientific Data</source>
          <volume>3</volume>
          (
          <year>2016</year>
          ). doi:10.1038/sdata.2016.18.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hauf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Comet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Moosmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Chrysakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Theissen-Lipp</surname>
          </string-name>
          ,
          <article-title>FAIRness in Dataspaces: The Role of Semantics for Data Management</article-title>
          ,
          <source>in: The Second International Workshop on Semantics in Dataspaces, co-located with the Extended Semantic Web Conference</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Aamodt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nygård</surname>
          </string-name>
          ,
          <article-title>Different roles and mutual dependencies of data, information, and knowledge - an AI perspective on their integration</article-title>
          ,
          <source>Data &amp; Knowledge Engineering</source>
          <volume>16</volume>
          (
          <year>1995</year>
          )
          <fpage>191</fpage>
          -
          <lpage>222</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rowley</surname>
          </string-name>
          ,
          <article-title>The wisdom hierarchy: representations of the DIKW hierarchy</article-title>
          ,
          <source>Journal of Information Science</source>
          <volume>33</volume>
          (
          <year>2007</year>
          )
          <fpage>163</fpage>
          -
          <lpage>180</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Decker</surname>
          </string-name>
          ,
          <article-title>Semantic web methods for knowledge management</article-title>
          ,
          <source>Ph.D. thesis</source>
          , University of Karlsruhe,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>O.</given-names>
            <surname>Signore</surname>
          </string-name>
          , et al.,
          <article-title>Representing knowledge in the semantic web</article-title>
          ,
          <source>in: Open Culture Conference (organised by the Italian office of W3C)</source>
          ,
          <year>2005</year>
          , pp.
          <fpage>27</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dzbor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fokoue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Golbreich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hawke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Herman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hoekstra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kendall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Parsia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. F.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rudolph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ruttenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Sattler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Wallace</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kashyap</surname>
          </string-name>
          ,
          <article-title>OWL 2 Web Ontology Language</article-title>
          , W3C Recommendation, W3C OWL Working Group,
          <year>2012</year>
          . URL: https://www.w3.org/TR/owl2-overview/, accessed on 2024-05-23.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Deshmukh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Collarana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gelhaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Theissen-Lipp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. T.</given-names>
            <surname>Arnold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Curry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Decker</surname>
          </string-name>
          ,
          <article-title>Challenges and Opportunities for Enabling the Next Generation of Cross-Domain Dataspaces</article-title>
          ,
          <source>in: The Second International Workshop on Semantics in Dataspaces, co-located with the Extended Semantic Web Conference</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Heery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <article-title>Application profiles: mixing and matching metadata schemas</article-title>
          ,
          <source>Ariadne</source>
          <volume>25</volume>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>SEMIC</string-name>
          ,
          <article-title>Application Profiles: What are they and how to model and reuse them properly? A look through the DCAT-AP example</article-title>
          ,
          <year>2023</year>
          . URL: https://europa.eu/!DmBHvH.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>Gaia-X European Association for Data and Cloud</string-name>
          ,
          <source>Gaia-X Architecture Document - 24.04 Release</source>
          , Technical Report,
          <year>2024</year>
          . URL: https://docs.gaia-x.eu/technical-committee/architecture-document/24.04/, accessed on 2025-03-06.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>B.</given-names>
            <surname>Otto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Steinbuß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Teuscher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bader</surname>
          </string-name>
          ,
          <source>Reference Architecture Model Version 4.0</source>
          , Technical Report, International Data Spaces Association,
          <year>2022</year>
          . URL: https://docs.internationaldataspaces.org/ids-ram-4/.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>H.</given-names>
            <surname>Knublauch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kontokostas</surname>
          </string-name>
          ,
          <article-title>Shapes Constraint Language (SHACL)</article-title>
          , W3C Recommendation, W3C Working Group,
          <year>2017</year>
          . URL: https://www.w3.org/TR/shacl/, accessed on 2024-05-23.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>E.</given-names>
            <surname>Prud'hommeaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Labra Gayo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Solbrig</surname>
          </string-name>
          ,
          <article-title>Shape expressions: an RDF validation and transformation language</article-title>
          ,
          <source>in: SEMANTICS</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>32</fpage>
          -
          <lpage>40</lpage>
          . doi:10.1145/2660517.2660523.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>P.</given-names>
            <surname>Moosmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Theissen-Lipp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <article-title>Enhanced and Scalable RDF Validation Techniques for Dataspaces</article-title>
          ,
          <source>in: International Conference on Semantic Systems</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <article-title>Linked Data - The Story So Far</article-title>
          ,
          <source>Int. J. Semantic Web Inf. Syst. 5</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>