<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Journal of Web Semantics 37</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1145/3308558.3313711</article-id>
      <title-group>
        <article-title>Smart City Urban Heat Monitoring using a Solid-based Dataspace</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Florian Hölken</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Paulus</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tobias Meisen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>André Pomp</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute for Technologies and Management of Digital Transformation, University of Wuppertal</institution>
          ,
          <addr-line>Wuppertal</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>9981</volume>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>The exponential growth of urban environmental data, originating from IoT sensors and weather stations, presents both opportunities and challenges, particularly in cities, where addressing climate-related issues requires the integration of heterogeneous datasets. This paper explores the use of a Solid dataspace to enhance data interoperability, sovereignty, and integration for urban heat island (UHI) monitoring. To this end, we introduce a Heat Monitoring App that leverages a Solid-based dataspace to integrate and provide decentralized temperature data from various stakeholders, such as government agencies, environmental authorities, and citizen-generated sources. Our approach employs semantic modeling on each data source to address interoperability challenges, enabling effective data discovery and integration while preserving data sovereignty. Through an experimental evaluation, we demonstrate the feasibility of this approach in a simulated urban setting. The findings highlight the potential of Solid-based dataspaces for smart city applications, while also identifying limitations related to real-time data processing, automation in schema alignment, and privacy-preserving access control. Future work aims to enhance automation, scalability, and real-time capabilities, ultimately bridging the gap between theoretical advancements in dataspaces and their practical implementation in urban climate resilience.</p>
      </abstract>
      <kwd-group>
        <kwd>Solid</kwd>
        <kwd>Dataspaces</kwd>
        <kwd>Urban Heat Islands</kwd>
        <kwd>Semantic Web</kwd>
        <kwd>Data Interoperability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The volume of urban environmental data is growing exponentially, driven by the proliferation of
IoT devices and digitalization of public services. For instance, the number of connected sensors is
projected to exceed 30 billion by 2030 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], creating vast opportunities for innovation in areas such as
climate resilience. However, much of this data remains underutilized due to a lack of structured and
interoperable infrastructures. Without properly designed data infrastructures, data remains trapped in
isolated silos, or its utilization requires extensive effort, limiting its accessibility and impact.
      </p>
      <p>
        To address this challenge, dataspaces have gained increasing attention in recent years, fostering
data-driven innovation across various domains [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These structured data ecosystems enable controlled data
sharing and interoperability, promoting efficient data exchange between stakeholders. Dataspaces have
emerged in multiple sectors, including industry, energy and utilities, agriculture as well as smart cities.
Enterprise solutions such as Catena-X, International Data Spaces (IDS) or Gaia-X provide frameworks
for secure and scalable data sharing within these sectors. Solid-based dataspaces leverage the Solid
technology, a web standard for decentralized data storage and access control, allowing users to store their
personal data in so-called Solid Pods. These Pods are user-controlled data containers that ensure data
sovereignty and portability [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Solid-based dataspaces can serve both enterprise and public
settings, supporting various use cases that require enhanced data sovereignty and interoperability
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In the context of smart cities, Solid-based dataspaces are particularly promising as they enable
citizen participation, a feature often lacking in enterprise-driven solutions.
      </p>
      <p>The Solid project was initiated by Tim Berners-Lee and aims to reshape the current web
by putting users in control of their data. It introduces a decentralized architecture for storing and
managing personal and organizational data via user-controlled Pods, promoting data interoperability,
privacy, and data sovereignty. By decoupling data from applications, Solid enables flexible and ethical
data reuse across systems and domains.</p>
      <p>Despite the rising trend of dataspaces and related initiatives, there is still a lack of concrete applications
that leverage dataspaces as a fundamental infrastructure for developing solutions for concrete problems.
While many dataspaces focus on data governance and access control, practical implementations that
utilize these infrastructures for specific urban challenges remain scarce.</p>
      <p>
        A critical urban challenge where heterogeneous public-sector data is crucial is the identification
of urban heat islands (UHIs). The rapid urbanization of the 21st century has led to unprecedented
environmental challenges, with cities becoming hotspots for climate change effects. Among these,
UHIs are particularly concerning, as they exacerbate extreme heat events, disproportionately affecting
urban populations and straining essential resources such as energy and water [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. The increasing
frequency and intensity of heatwaves, as projected by the IPCC, underscore the urgent need for improved
urban heat monitoring and mitigation strategies [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. While digital tools for environmental monitoring
exist, they often struggle with integrating heterogeneous data from multiple sources, reducing their
effectiveness in providing actionable insights [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Addressing these challenges necessitates innovative
approaches that enhance data interoperability, contextual understanding, and analytical capabilities to
support informed urban planning and decision-making.
      </p>
      <p>To effectively monitor UHIs, data from diverse public-sector stakeholders is required. High-resolution
temperature data, alongside other environmental parameters, must be sourced from meteorological
agencies, environmental authorities, research institutions, and even citizen-generated data from personal
weather stations and IoT devices. However, the disparate formats and standards of these datasets pose
challenges to integration and analysis.</p>
      <p>In this paper, we demonstrate a concrete application for UHI identification by implementing an urban
heat monitoring application leveraging a Solid-based dataspace. Figure 1 provides an overview of the
developed application. To develop the application, we introduce a first prototype of
a decentralized Solid-based dataspace consisting of multiple Solid servers. Using this dataspace, our
application integrates and visually presents heterogeneous temperature data from various stakeholders.
By leveraging the principles of dataspaces, we enhance the interoperability and usability of urban heat
data for decision-makers and citizens.</p>
      <p>To enable this interoperability, a key contribution of this work is to show the role of semantics
in dataspaces. To this end, we examine how semantic technologies can enhance the integration,
interpretation, and usability of data in dataspaces and in the corresponding application development.
By bridging the gap between raw data and actionable insights, our approach advances the discourse on
smart city innovations and contributes to more effective and sustainable urban planning.</p>
      <p>This work is conducted as part of the funded project Gesundes Tal, which aims to improve health
prevention in Wuppertal by addressing environmental challenges, such as UHIs, through digitally
integrated solutions. The project explores innovative approaches to environmental monitoring, leveraging
Semantic Web technologies and decentralized data infrastructures to bridge the gap between theoretical
advancements and practical applications in urban data ecosystems.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>The increasing interest in smart city solutions has led to a growing focus on data interoperability and
dataspaces as a key enabling infrastructure. This section examines existing dataspaces, the role of
semantic technologies, and real-world applications that leverage these technologies within dataspaces.</p>
      <sec id="sec-2-1">
        <title>2.1. (Solid) Dataspaces</title>
        <p>
          Dataspaces have emerged as an approach for managing and integrating heterogeneous data from
multiple stakeholders, particularly in domains requiring cross-organizational data exchange, such as
smart cities and environmental monitoring. Originally introduced as a flexible strategy for integrating
diverse data sources [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], the concept has evolved into real-world implementations such as Gaia-X and
IDS, which provide frameworks for secure, interoperable data sharing across industries [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
        <p>
          However, many of these existing dataspace solutions primarily cater to enterprise use cases,
emphasizing industrial data governance and sector-specific interoperability [
          <xref ref-type="bibr" rid="ref10 ref11">11, 10</xref>
          ]. In contrast, smart
city applications present a fundamentally different challenge: they require active citizen participation,
decentralized governance and interoperable data infrastructures to effectively integrate public-sector,
private-sector, and citizen-generated data. This distinction highlights the limitations of centralized
dataspace architectures in smart cities and underscores the potential of Solid-based dataspaces as a
more suitable alternative.
        </p>
        <p>
          Unlike traditional dataspace architectures, a Solid-based dataspace leverages web-native (semantic)
standards to enable decentralized, user-controlled data management [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Key standards include the
Resource Description Framework (RDF), Linked Data principles, which are a set of best practices
proposed by Tim Berners-Lee for publishing and interlinking structured data on the Web [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], and
WebID, a decentralized identity mechanism for user authentication and access control in Solid Pods.
This decentralized approach aligns closely with the requirements of smart city applications, particularly
in domains such as urban heat monitoring, where data originates from multiple heterogeneous sources,
including:
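          <list list-type="bullet">
            <list-item><p>Government agencies (meteorological offices, city planning departments)</p></list-item>
            <list-item><p>Environmental authorities (air quality and climate research institutions)</p></list-item>
            <list-item><p>Private organizations (utility providers, sensor manufacturers)</p></list-item>
            <list-item><p>Citizen-generated data (personal weather stations, IoT devices)</p></list-item>
          </list>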
        </p>
        <p>The ability of a Solid-based dataspace to facilitate direct citizen participation, while ensuring data
sovereignty and privacy, presents a unique advantage over enterprise-focused dataspace solutions like
Gaia-X or IDS. By allowing individuals and organizations to store their own environmental data in
personal Solid Pods, a collaborative, yet decentralized model for urban data management can emerge.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Semantic Technologies in Dataspaces</title>
        <p>Semantic technologies play a pivotal role in enhancing data interoperability, integration, and
discoverability within dataspaces. By leveraging ontologies, semantic annotations like semantic labels or
models, and Linked Data principles, these technologies enable automated and contextual understanding
of heterogeneous datasets [13]. In the context of urban heat monitoring and smart city applications,
semantic modeling helps align disparate environmental data sources, making data models easier to
understand and supporting informed decision-making.</p>
        <p>In general, semantic modeling provides expressive meaning and context to data by establishing
mappings between data attributes and ontology concepts [14]. This process is crucial in heterogeneous
data environments, such as smart cities, where data originates from multiple stakeholders with varying
formats and standards. The semantic modeling workflow consists of two primary phases:
          <list list-type="bullet">
            <list-item><p>Semantic Labeling: establishes initial mappings between structured datasets and concepts within a predefined ontology [15].</p></list-item>
            <list-item><p>Semantic Refinement: enhances these mappings by incorporating contextual relationships and improving alignment across diverse data sources [16, 17].</p></list-item>
          </list></p>
        <p>To facilitate the semantic mapping process, semi-automatic approaches have been developed. For
example, Knoblock et al. [18] proposed methodologies that leverage structured data sources and
integrate them into the Semantic Web, reducing manual effort and improving the efficiency of data
annotation.</p>
        <p>Recent advances have focused on improving the creation of semantic models, making large-scale
semantic integration more feasible. Systems like PLASMA [19, 16] provide platforms that combine
semantic labeling with tools for manual refinement, streamlining the creation of semantic models.
Similarly, scalable approaches such as those proposed by Taheriyan et al. [15, 20] demonstrate how
Semantic Web principles can be used to infer semantic relationships automatically, thereby enhancing
data alignment across different domains.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Real World Applications utilizing Dataspaces</title>
        <p>While the concept of dataspaces has been widely discussed in research and industry, relatively few
real-world applications demonstrate their effectiveness in solving domain-specific challenges. Most dataspace
initiatives focus on data interoperability, governance, and security, but their concrete application to
specific urban or environmental problems remains limited. This subsection examines existing dataspace
implementations across different domains, highlighting their relevance to smart cities and the role of
semantic technologies in enhancing their capabilities.</p>
        <p>One of the most prominent real-world applications of dataspaces is Catena-X, a Gaia-X initiative
aimed at creating an interoperable data infrastructure for the automotive industry [21]. This dataspace
facilitates secure, cross-company data exchange, ensuring supply chain transparency and efficient
resource allocation. Similarly, IDS provides a framework for standardized and secure data sharing across
multiple industrial sectors [22]. However, while these initiatives demonstrate the feasibility of data
interoperability, they primarily focus on enterprise and supply chain use cases, lacking applications
that directly benefit end users.</p>
        <p>In environmental monitoring, Semantic Web technologies have been explored to facilitate
cross-domain data integration and improve the discoverability of environmental resources. Schimak et al. [23]
demonstrated in 2013 how a semantic discovery framework, leveraging web-based data interlinking
principles, can enhance access to heterogeneous environmental datasets by aligning domain-specific
ontologies and enabling cross-domain search. However, no application was developed to showcase the
practical implementation of this approach.</p>
        <p>Dataspaces have been recognized as a promising concept in agriculture and healthcare, yet they
remain largely unexplored in practical implementations. Wolfert et al. [24] analyzed the role of big
data and semantic data models in precision farming, demonstrating how improved data sharing among
farmers, agribusinesses, and policymakers could enhance decision-making and efficiency. However,
while these approaches highlight the need for interoperable data ecosystems, neither dataspaces as an enabling
infrastructure nor applications built on them have yet been realized in this domain.</p>
        <p>Despite the growing adoption of dataspaces in various domains, there is a clear gap in their application
to urban heat monitoring and smart city climate resilience. Existing implementations predominantly
focus on governance, industry, and large-scale data infrastructures, while solutions that empower cities
and citizens to leverage decentralized, interoperable climate data remain underdeveloped. This study
addresses this gap by demonstrating an urban heat monitoring application that leverages a Solid-based
dataspace, combining the semantic interoperability advantages of Linked Data with the decentralized,
privacy-preserving nature of Solid Pods.</p>
        <p>By leveraging these technologies, we aim to show that dataspaces can be a viable solution for
integrating and utilizing heterogeneous environmental data from different participants, ultimately contributing to more actionable
climate adaptation strategies in smart cities.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology &amp; Implementation</title>
      <p>As discussed above, this study aims to develop a Heat Monitoring App for identifying UHIs using
a Solid-based dataspace, demonstrating a practical application within the context of dataspaces. By
leveraging semantic technologies in the form of semantic models, our approach aims primarily to improve
data discovery and understanding for application developers. It also helps developers transform
fragmented and heterogeneous datasets into meaningful representations, supporting advanced analysis
and actionable insights for urban heat monitoring.</p>
      <sec id="sec-3-1">
        <title>3.1. Setting up a Solid-based Dataspace</title>
        <p>As mentioned above, we decided to use Solid as the foundational technology for setting up our dataspace.
The choice of Solid for our work is rooted in the need for a participatory model that accommodates
diverse stakeholders for our real-world application. In addressing urban heat challenges, a dataspace
must support collaboration among city administrations, organizations, and citizens.</p>
        <p>
          Compared with the functionalities described in the International Data Spaces Association (IDSA)
reference architecture, Solid offers several features directly out of the box, as highlighted by Harth et al.
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. These include decentralized data storage through Pods, access control mechanisms, and Semantic
Web compatibility for machine-readable data representation and integration.
        </p>
        <p>Our envisioned Solid-based dataspace is a distributed environment where each stakeholder manages
their data independently within their Solid Pods. These Pods serve as modular, secure storage units,
allowing users to maintain full control over their data. Using WebID, the system ensures decentralized
identity management, streamlining authentication and enabling flexible Access Control Policies (ACP)
for privacy-preserving data sharing. We chose to adopt ACP because it offers finer-grained, more
flexible, and extensible access control capabilities, and it is aligned with the ongoing evolution of the
Solid standard.</p>
        <p>To implement this infrastructure, we utilize the Community Solid Server (CSS) as the backend for
hosting and managing Solid Pods. The CSS provides a flexible and open-source implementation of the
Solid protocol, allowing for decentralized data storage and retrieval while ensuring compliance with
Solid standards.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Implementation of a data catalog</title>
        <p>However, while Solid provides many essential building blocks, one key component is missing: a data
catalog for organizing and discovering datasets efficiently. In general, a data catalog within a dataspace
serves as a central component for enabling advanced data management. Its primary goal is to organize
and enrich metadata to facilitate data discovery, contextual understanding and interoperability across
diverse datasets. Semantics play an important role in achieving this by providing a shared vocabulary
and structure that allows heterogeneous data sources to be discovered and integrated. To enable this,
two categories of information must be modeled: (1) metadata about datasets, and (2) the semantic
content of the datasets themselves.</p>
        <p>For the first category, we employ the DCAT (Data Catalog Vocabulary) standard to manage and
describe the metadata of data sources. DCAT, developed by W3C, is widely adopted for cataloging
datasets, distributions and services in data ecosystems. It provides a structured approach to metadata
representation, supporting data discovery and alignment with dataspace guidelines.</p>
        <p>For the second category, we utilize semantic models, as outlined by Knoblock et al. [25, 14], to
represent the content of the data in a semantically rich and interoperable manner. While DCAT forms
the backbone of the metadata structure, it is extended with semantic models to capture contextual
information about the data assets. These semantic models, expressed in RDF, provide additional layers
of meaning, allowing users to understand the relationships, provenance and constraints of the datasets.
Figure 2 illustrates the integration of the semantic model within the data catalog.</p>
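        <p>As a minimal sketch of what such a catalog entry could look like, the following Python snippet builds a JSON-LD description that combines DCAT metadata with a link to a semantic model. The DCAT and Dublin Core terms are standard vocabulary; the use of dct:conformsTo to reference the semantic model, the helper name make_catalog_entry, and all example.org URLs are assumptions made for illustration and do not reproduce the actual schema of our catalog.</p>
        <preformat preformat-type="code">
```python
import json

DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
IANA_MEDIA_TYPES = "https://www.iana.org/assignments/media-types/"


def make_catalog_entry(dataset_url, title, media_type, semantic_model_url):
    """Describe one dataset in a Pod as a dcat:Dataset with one distribution."""
    return {
        "@context": {"dcat": DCAT, "dct": DCT},
        "@id": dataset_url + "#dataset",
        "@type": "dcat:Dataset",
        "dct:title": title,
        # Assumption: dct:conformsTo points at the RDF semantic model of the data.
        "dct:conformsTo": {"@id": semantic_model_url},
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            # The file itself stays in its native format (CSV or JSON).
            "dcat:downloadURL": {"@id": dataset_url},
            "dcat:mediaType": {"@id": IANA_MEDIA_TYPES + media_type},
        },
    }


entry = make_catalog_entry(
    dataset_url="https://pod1.example.org/city/temperature.csv",  # hypothetical URL
    title="District temperature readings",
    media_type="text/csv",
    semantic_model_url="https://pod1.example.org/city/temperature-model.ttl",  # hypothetical URL
)
print(json.dumps(entry, indent=2))
```
        </preformat>
        <p>Keeping the semantic model as a separate linked resource, rather than embedding it in the entry, mirrors the split between dataset metadata and semantic content described above.</p>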
        <p>A key design decision for our Solid-based dataspace is the retention of original data formats. Unlike
traditional semantic systems that convert all data to RDF, we decided that all data in our dataspace
remains in its native format, such as JSON or CSV. This decision was motivated by two factors:
first, to ensure practical integration with existing sensor infrastructures and public datasets, which
predominantly provide non-RDF data, and second, to lower the technical entry barriers for stakeholders
who might not be familiar with RDF. We designed our data catalog accordingly to enable semantic
enrichment and integration without requiring full data format conversion. This approach aligns with
practical requirements and industry standards, such as those outlined by the IDSA, which generally do
not mandate RDF-based data transformation [22]. Instead, the semantic layer focuses on representing
contextual information in RDF, utilizing semantic modeling to bridge the gap between non-RDF data
formats and semantic reasoning capabilities.</p>
        <p>Although the catalog currently focuses on indexing publicly available datasets from Solid Pods, we
acknowledge that exposing dataset URLs may raise minor privacy concerns, such as the potential
linkability or re-identification of information. To mitigate these risks, future work will explore
access-controlled catalog entries that balance discoverability with the principles of data sovereignty and</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Heat Monitoring App</title>
        <p>The Heat Monitoring App is designed to function as a consumer of data from the dataspace and
contributor of visually presented information to the end user. By integrating into the decentralized
environment, the app developer leverages the data catalog to discover and access relevant data assets.
The use of semantic models enables the developer, for instance, to understand the data models and their
context or to discover datasets with specific characteristics, such as temperature data from particular
regions or datasets enriched with geospatial information. These capabilities enable the app to retrieve
only the most relevant and contextually appropriate data for its use.</p>
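        <p>The discovery step can be illustrated with a small, hedged sketch: each catalog entry carries the set of ontology concepts used in its semantic model, and a query returns the entries that cover all requested concepts. The CatalogEntry structure, the discover function, the concept names, and the URLs are hypothetical and only stand in for the catalog's real query interface.</p>
        <preformat preformat-type="code">
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    url: str
    media_type: str
    concepts: frozenset  # ontology concepts used in the dataset's semantic model


def discover(catalog, required_concepts):
    """Return entries whose semantic model covers every required concept."""
    required = frozenset(required_concepts)
    return [entry for entry in catalog if required.issubset(entry.concepts)]


# Hypothetical catalog content; concept names are placeholders.
catalog = [
    CatalogEntry("https://pod1.example.org/temperature.csv", "text/csv",
                 frozenset({"Temperature", "District", "GeoLocation"})),
    CatalogEntry("https://pod2.example.org/readings.json", "application/json",
                 frozenset({"Temperature", "GeoLocation"})),
    CatalogEntry("https://pod3.example.org/traffic.csv", "text/csv",
                 frozenset({"TrafficCount", "GeoLocation"})),
]

matches = discover(catalog, {"Temperature", "GeoLocation"})
for entry in matches:
    print(entry.url, entry.media_type)
```
        </preformat>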
        <p>Monitoring UHIs requires fine-grained data to identify localized temperature variations effectively. In
this paper, the app uses the city of Wuppertal’s neighborhood categorization, a fine-grained subdivision
of the city into smaller neighborhoods, based on the publicly available datasets provided by Wuppertal’s
open data platform. This categorization allows the system to associate temperature readings with
specific districts, enabling a more precise analysis of urban heat patterns. By integrating data at the
district level, the app ensures that insights are actionable and tailored to localized urban planning and
public health interventions.</p>
        <p>To enable secure and efficient data retrieval, the app employs Solid authentication via the
<monospace>@inrupt/solid-client</monospace> library, ensuring authorized access to decentralized data storage. Upon
authentication, datasets stored in Solid Pods are retrieved using the <monospace>getFile()</monospace> function, supporting
CSV and JSON formats. The application processes these files by converting them into structured
formats using a parsing library, enabling integration of multiple datasets. Data normalization ensures
consistency, allowing dynamic visualization of temperature readings at the neighborhood level. By
leveraging metadata descriptions and semantic models, the developer is able to find relevant datasets and
understand the data structure to integrate them into one meaningful representation. The methodology
was validated through systematic integration testing, focusing on authentication, file retrieval, and the
processing of heterogeneous data sources. The tests ensured that data from diverse Solid Pods could
be accurately retrieved, correctly parsed, semantically aligned, and reliably integrated into the Heat
Monitoring App.</p>
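        <p>The parsing and aggregation steps can be sketched as follows. This is a Python stand-in, not the app's code, which relies on the JavaScript library named above; it starts from file contents that have already been retrieved. The function names and the sample rows are invented for illustration.</p>
        <preformat preformat-type="code">
```python
import csv
import io
import json
from collections import defaultdict
from statistics import mean


def parse_dataset(text, media_type):
    """Turn the text of a retrieved file into a list of row dictionaries."""
    if media_type == "text/csv":
        return list(csv.DictReader(io.StringIO(text)))
    if media_type == "application/json":
        data = json.loads(text)
        return data if isinstance(data, list) else [data]
    raise ValueError(f"unsupported media type: {media_type}")


def mean_temperature(rows, district_key, temperature_key):
    """Average the temperature readings per neighborhood."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[district_key]].append(float(row[temperature_key]))
    return {district: mean(values) for district, values in grouped.items()}


# Invented sample content standing in for files fetched from two Pods.
csv_text = (
    "city,QUARTIER,lat,lng,temp,activated\n"
    "Wuppertal,Elberfeld,51.25,7.15,30.0,true\n"
    "Wuppertal,Elberfeld,51.26,7.14,32.0,true\n"
    "Wuppertal,Barmen,51.27,7.20,28.0,true\n"
)
json_text = '[{"location": "Wuppertal", "district": "Barmen", "t": 29.0}]'

csv_means = mean_temperature(parse_dataset(csv_text, "text/csv"), "QUARTIER", "temp")
json_means = mean_temperature(parse_dataset(json_text, "application/json"), "district", "t")
print(csv_means)
print(json_means)
```
        </preformat>
        <p>Passing the column names explicitly keeps the sketch independent of any one dataset; in the app, the semantic models tell the developer which columns carry the district and the temperature.</p>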
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Evaluation Setting</title>
        <p>To evaluate our Solid-based dataspace and our Heat Monitoring App, we constructed a first minimal
experimental setting that replicates a real-world scenario involving decentralized data management and
heterogeneous datasets. The Solid-based dataspace includes three server instances, each contributing
and accessing data assets stored in diverse formats. The goal of the evaluation is to demonstrate the
effective discovery, integration and utilization of statistical urban heat data using semantic technologies.</p>
        <p>The evaluation setting comprises three data sources from a city, each representing temperature
readings from district-level sensors with geolocation, but provided in different formats and using
varying headers. Each of these data sources is located on a different Solid server. The first
dataset, in CSV format, includes attributes such as <monospace>city</monospace>, <monospace>QUARTIER</monospace>, <monospace>lat</monospace>, <monospace>lng</monospace>, <monospace>temp</monospace>, and <monospace>activated</monospace>.
The second dataset, in JSON format, represents similar data with attributes such as <monospace>location</monospace>, <monospace>district</monospace>,
<monospace>latitude</monospace>, <monospace>longitude</monospace>, <monospace>t</monospace>, and <monospace>activated</monospace>. The third dataset, also in CSV format, uses <monospace>c</monospace>, <monospace>q</monospace>, <monospace>lat</monospace>, <monospace>lng</monospace>, <monospace>t</monospace>,
and <monospace>activation</monospace>. This heterogeneity demonstrates the challenges of integrating decentralized and
inconsistent datasets, highlighting the necessity of semantic technologies when working with these
datasets. Table 1 provides an overview of the different data sources used in the evaluation and highlights
their structural differences.</p>
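        <p>To make the header heterogeneity concrete, the sketch below renames the three header sets to one shared vocabulary. The hand-written dictionaries are stand-ins for the mappings that the semantic models provide; the reading of c and q as city and district, the target field names, and the sample rows are assumptions made for illustration.</p>
        <preformat preformat-type="code">
```python
# Hand-written stand-ins for the mappings that the semantic models provide.
# Assumption: "c" and "q" denote city and district, "t" denotes temperature.
HEADER_MAPS = {
    "dataset_1_csv": {"city": "city", "QUARTIER": "district", "lat": "latitude",
                      "lng": "longitude", "temp": "temperature", "activated": "active"},
    "dataset_2_json": {"location": "city", "district": "district", "latitude": "latitude",
                       "longitude": "longitude", "t": "temperature", "activated": "active"},
    "dataset_3_csv": {"c": "city", "q": "district", "lat": "latitude",
                      "lng": "longitude", "t": "temperature", "activation": "active"},
}

NUMERIC_FIELDS = ("latitude", "longitude", "temperature")


def normalize(row, header_map):
    """Rename source-specific headers to the shared vocabulary and cast numbers."""
    record = {header_map[key]: value for key, value in row.items() if key in header_map}
    for name in NUMERIC_FIELDS:
        record[name] = float(record[name])
    return record


# Invented sample rows, one per data source.
row_1 = {"city": "Wuppertal", "QUARTIER": "Barmen", "lat": "51.27", "lng": "7.20",
         "temp": "28.5", "activated": "true"}
row_2 = {"location": "Wuppertal", "district": "Barmen", "latitude": 51.27,
         "longitude": 7.2, "t": 28.5, "activated": True}
row_3 = {"c": "Wuppertal", "q": "Barmen", "lat": "51.27", "lng": "7.20",
         "t": "28.5", "activation": "true"}

records = [
    normalize(row_1, HEADER_MAPS["dataset_1_csv"]),
    normalize(row_2, HEADER_MAPS["dataset_2_json"]),
    normalize(row_3, HEADER_MAPS["dataset_3_csv"]),
]
for record in records:
    print(record)
```
        </preformat>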
        <p>Additionally, the dataspace incorporates three primary datasets while also including contributions
from ten users, each maintaining a personal data Pod across three servers. While the presented datasets
serve as the core data sources, the additional user-contributed data is not directly relevant for the
application, demonstrating the nature of dataspaces, which are designed to accommodate a wide range
of heterogeneous data.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Test Setting &amp; Results</title>
      <p>A simulated scenario was chosen as there is an inherent causality dilemma in the introduction of
such systems. A fully operational dataspace must first exist to attract real-world data and users, but
its adoption hinges on proving its functionality and value. Simulating the environment allows us to
validate the core principles and interactions between components before real-world deployment.</p>
      <p>The Heat Monitoring App is developed as part of the Gesundes Tal research project, which focuses on
leveraging digital technologies to improve health and environmental resilience in urban areas. Building on
apps such as the Heat Monitoring App, the project aims to provide actionable insights for policymakers,
urban planners, and citizens, contributing to sustainable and data-driven decision-making.</p>
      <p>Currently, the sensors used in the research project are being set up, and we are actively collaborating
with the necessary departments to ensure that our simulated data accurately reflects real-world
conditions. This close alignment between simulated and actual sensor data helps validate the feasibility of
our approach while ensuring that the system remains adaptable to future real-world deployments.</p>
      <sec id="sec-4-1">
        <title>4.1. Dataspace Structure</title>
        <p>As highlighted in Section 3, the constructed Solid-based dataspace comprises three servers, each hosting
multiple individual Solid Pods. Each server operates as an independent node within the dataspace,
where users or institutions register their Pods to contribute and access data. While the data remains in
its original format, such as JSON or CSV, semantic technologies are leveraged by underpinning it with
RDF-based semantic models. This approach allows the system to retain the advantages of semantic
interoperability and contextual understanding without requiring data conversion to RDF.</p>
        <p>A CRUD-based mechanism facilitates the seamless addition of new Solid servers and Pods. New
servers are registered within the data catalog, ensuring compliance with interoperability standards.
Users can create, manage, and delete the information about their Pods through the interface of the data
catalog.</p>
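        <p>A minimal in-memory sketch of such a CRUD mechanism is shown below. The class PodRegistry, its method names, and the example URLs are hypothetical; the actual catalog interface and its persistence layer are not reproduced here.</p>
        <preformat preformat-type="code">
```python
class PodRegistry:
    """Keeps track of registered Solid servers and the Pods hosted on them."""

    def __init__(self):
        self._servers = {}  # server URL -> {Pod URL -> metadata dict}

    def register_server(self, server_url):
        self._servers.setdefault(server_url, {})

    def create_pod(self, server_url, pod_url, metadata):
        if server_url not in self._servers:
            raise KeyError(f"unknown server: {server_url}")
        self._servers[server_url][pod_url] = dict(metadata)

    def read_pod(self, server_url, pod_url):
        return dict(self._servers[server_url][pod_url])

    def update_pod(self, server_url, pod_url, **changes):
        self._servers[server_url][pod_url].update(changes)

    def delete_pod(self, server_url, pod_url):
        del self._servers[server_url][pod_url]

    def list_pods(self, server_url):
        return sorted(self._servers[server_url])


SERVER = "https://solid1.example.org/"  # hypothetical server URL
registry = PodRegistry()
registry.register_server(SERVER)
registry.create_pod(SERVER, SERVER + "alice/", {"owner": "alice"})
registry.create_pod(SERVER, SERVER + "bob/", {"owner": "bob"})
registry.update_pod(SERVER, SERVER + "alice/", description="Personal weather station")
registry.delete_pod(SERVER, SERVER + "bob/")
print(registry.list_pods(SERVER))
print(registry.read_pod(SERVER, SERVER + "alice/"))
```
        </preformat>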
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Data Catalog</title>
        <p>Solid servers and their associated Pods were registered manually within the catalog to ensure their
metadata is indexed and accessible. The catalog supports the discovery of data, leveraging the DCAT
standard to describe metadata.</p>
        <p>To enhance the organization and discoverability of data stored in Solid Pods, we developed a
semantically enriched data catalog (cf. Section 3). Participants in the dataspace can use the catalog
interface to register datasets from their Solid Pods together with the corresponding semantic models. The
catalog captures metadata on format, provenance, and access rights and links it to the RDF-based
descriptions. This enables users and developers to efficiently search for and retrieve data through
standardized interfaces.</p>
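        <p>As a rough illustration of registration and discovery, the following Python sketch models catalog records with a few DCAT-inspired fields and filters them by annotated concept, media type, and access rights. The class names, field names, Pod URLs, and concept labels are hypothetical and are not taken from the implemented catalog, which describes its metadata in RDF.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record with DCAT-inspired fields."""
    title: str            # cf. dct:title
    access_url: str       # cf. dcat:accessURL (location inside a Solid Pod)
    media_type: str       # cf. dcat:mediaType, e.g. "text/csv"
    publisher: str        # cf. dct:publisher (provenance)
    access_rights: str    # cf. dct:accessRights, e.g. "public"
    concepts: frozenset = frozenset()  # concept labels from the semantic model


class DataCatalog:
    """In-memory stand-in for the catalog; a real catalog would persist entries."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Registering the same access URL again replaces the old record.
        self._entries[entry.access_url] = entry

    def remove(self, access_url):
        self._entries.pop(access_url, None)

    def search(self, concepts=(), media_types=None, access_rights=None):
        """Return entries annotated with all given concepts that pass the optional filters."""
        wanted = frozenset(concepts)
        hits = []
        for entry in self._entries.values():
            if not wanted.issubset(entry.concepts):
                continue
            if media_types is not None and entry.media_type not in media_types:
                continue
            if access_rights is not None and entry.access_rights != access_rights:
                continue
            hits.append(entry)
        return sorted(hits, key=lambda e: e.title)


# Illustrative entries; all URLs and publishers are made up.
geo_temp = frozenset({"Temperature", "Geolocation"})
catalog = DataCatalog()
catalog.register(CatalogEntry("City weather stations", "https://pods.example.org/city/weather.csv",
                              "text/csv", "City administration", "public", geo_temp))
catalog.register(CatalogEntry("Citizen sensors", "https://pods.example.org/citizens/sensors.json",
                              "application/json", "Citizen initiative", "public", geo_temp))
catalog.register(CatalogEntry("Air quality", "https://pods.example.org/agency/air.csv",
                              "text/csv", "Environmental agency", "public",
                              frozenset({"ParticulateMatter", "Geolocation"})))
catalog.register(CatalogEntry("Private garden sensor", "https://pods.example.org/alice/garden.json",
                              "application/json", "Alice", "private", geo_temp))

results = catalog.search(concepts={"Temperature", "Geolocation"},
                         media_types={"text/csv", "application/json"},
                         access_rights="public")
titles = [entry.title for entry in results]
```
</preformat>
        <p>In this sketch, the query returns only the two public temperature datasets; the air quality dataset fails the concept filter and the private dataset fails the access-rights filter.</p>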
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Semantic Models</title>
        <p>The semantic models for our test setting were created using PLASMA, a platform designed for the
creation of semantic models, which is illustrated in Figure 3. Within PLASMA, we used the VC-Slam
ontology [26] to semantically represent temperature data and its associated geospatial attributes,
ensuring consistent and machine-readable descriptions.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Heat Monitoring App</title>
        <p>The Heat Monitoring App was developed as a client application that integrates data sources from the
Solid Dataspace. Its focus is on identifying and integrating temperature data from the city of Wuppertal.
Once the developer has found data matching the desired criteria, they can retrieve it and
integrate it into the app’s visualization. In terms of semantics, the use of DCAT ensures that the data
format is clearly identified, while the semantic models enable the developer to determine that the data
pertains to temperature observations from Wuppertal with geolocation.</p>
        <p>The developed application successfully enables secure and efficient data retrieval from Solid Pods,
leveraging Solid authentication and decentralized storage. Through systematic integration testing,
authentication, file retrieval, and data processing workflows were validated, ensuring accurate and
consistent integration of heterogeneous data sources. The Heat Monitoring App is currently capable of
processing CSV and JSON files into a unified internal data model, normalizing heterogeneous inputs for
seamless aggregation and visualization.</p>
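        <p>The normalization into a unified internal model can be pictured with the following sketch, which parses a CSV payload and a JSON payload into one list of records and selects the parser from the media type given in the catalog metadata. The Reading type, the column names, and the sample payloads are assumptions made for illustration; the app's actual internal data model is not specified here.</p>
        <preformat>
```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical unified record for one temperature observation."""
    source: str
    latitude: float
    longitude: float
    celsius: float


def parse_csv(text, source):
    # DictReader uses the header row as keys for every data row.
    rows = csv.DictReader(io.StringIO(text))
    return [Reading(source, float(r["lat"]), float(r["lon"]), float(r["temperature"]))
            for r in rows]


def parse_json(text, source):
    # Assumes a top-level JSON array of flat objects.
    items = json.loads(text)
    return [Reading(source, float(i["lat"]), float(i["lon"]), float(i["temperature"]))
            for i in items]


PARSERS = {"text/csv": parse_csv, "application/json": parse_json}


def load(text, media_type, source):
    """Dispatch on the media type announced in the catalog metadata."""
    try:
        parser = PARSERS[media_type]
    except KeyError:
        raise ValueError(f"unsupported media type: {media_type}") from None
    return parser(text, source)


# Made-up sample payloads standing in for files retrieved from two Pods.
csv_payload = "lat,lon,temperature\n51.2562,7.1508,24.5\n51.2640,7.1780,26.0\n"
json_payload = '[{"lat": 51.27, "lon": 7.2, "temperature": 25.5}]'

readings = (load(csv_payload, "text/csv", "pod-a")
            + load(json_payload, "application/json", "pod-b"))
```
</preformat>
        <p>Both sources end up as Reading records with identical fields, so aggregation and visualization code does not need to know the original file format. The sketch deliberately assumes identical attribute names in both files.</p>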
        <p>One of the key challenges is integrating heterogeneous datasets into the Heat Monitoring App.
The semantic models provide information on how the individual files are structured. Based on this
information, the developer can align the different data schemas, particularly for temperature values,
which are represented using varying attribute names and units across datasets. For instance, some
datasets use temp to denote temperature readings, while others use t or temperature_value.</p>
        <p>To address these inconsistencies, a first, simple syntactic schema alignment mechanism was
implemented within the Heat Monitoring App. This mechanism leverages semantic annotations in the
semantic models to normalize attribute names. By mapping dataset attributes to a unified ontology,
specifically the VC-Slam ontology, the system ensures that all temperature values are interpreted
consistently regardless of their original schema. The alignment process is illustrated in Figure 4,
showcasing how disparate data sources can be harmonized before integration into the Heat Monitoring
App.</p>
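        <p>A minimal sketch of such an attribute-name normalization is given below. It assumes that each semantic model can be reduced to a flat mapping from source attribute names to a concept label and a unit; the concept labels, the unit conversion, the file names, and the helper names are hypothetical and do not reproduce the VC-Slam ontology or the app's actual code.</p>
        <preformat>
```python
# Hypothetical, flattened view of three semantic models:
# source attribute name -> (concept label, unit of the raw values).
SEMANTIC_MODELS = {
    "pod-a/stations.csv": {
        "temp": ("Temperature", "celsius"),
        "lat": ("Latitude", None),
        "lon": ("Longitude", None),
    },
    "pod-b/sensors.json": {
        "t": ("Temperature", "kelvin"),
        "latitude": ("Latitude", None),
        "longitude": ("Longitude", None),
    },
    "pod-c/citizen.csv": {
        "temperature_value": ("Temperature", "fahrenheit"),
        "y": ("Latitude", None),
        "x": ("Longitude", None),
    },
}

TO_CELSIUS = {
    "celsius": lambda v: v,
    "kelvin": lambda v: v - 273.15,
    "fahrenheit": lambda v: (v - 32.0) * 5.0 / 9.0,
}


def align(record, model):
    """Rename attributes to their concept labels and convert temperatures to degrees Celsius."""
    aligned = {}
    for name, raw in record.items():
        if name not in model:
            continue  # attributes without a semantic annotation are dropped
        concept, unit = model[name]
        value = float(raw)
        if concept == "Temperature":
            # Rounding hides floating-point noise introduced by the conversion.
            value = round(TO_CELSIUS[unit](value), 2)
        aligned[concept] = value
    return aligned


a = align({"temp": "24.5", "lat": "51.2562", "lon": "7.1508"},
          SEMANTIC_MODELS["pod-a/stations.csv"])
b = align({"t": 297.65, "latitude": 51.2562, "longitude": 7.1508, "battery": 0.8},
          SEMANTIC_MODELS["pod-b/sensors.json"])
c = align({"temperature_value": 76.1, "y": 51.2562, "x": 7.1508},
          SEMANTIC_MODELS["pod-c/citizen.csv"])
```
</preformat>
        <p>All three records describe the same observation and, after alignment, share the same keys and the same temperature in degrees Celsius. A flat lookup of this kind only covers simple, non-nested schemas.</p>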
        <p>This schema alignment approach plays a crucial role in data integration: once the mappings are
defined, datasets with varying structures can be processed without further manual intervention. It
enhances interoperability within the Solid-based dataspace, allowing diverse temperature datasets to be
leveraged effectively for urban heat monitoring.</p>
        <p>However, this approach is currently limited to simple schemas where mappings remain relatively
straightforward. In cases involving, e.g., nested file structures, it falls short, as hierarchical relationships
and complex dependencies require more sophisticated strategies. At present, most of the alignment relies
on manually identifying corresponding field names and implementing normalization mechanisms.
Our goal is to automate this process further in the future: automated techniques for identifying
relevant datasets, combined with fully automated schema alignment, should enable seamless dataset
integration in app development for tasks like these. Overcoming these challenges therefore remains a
key direction for future research.</p>
        <p>Although the integration of private datasets within the Solid-based dataspace is possible, our app
can currently only interact with public data, which is a further limitation of this work. While the app
demonstrates data discovery and integration, it operates under the assumption that data is publicly
available. In a fully functional dataspace, private data should also be discoverable and requestable. The
correct sequence would involve the app identifying private data, sending a request to the data owner
for access and, upon approval, integrating the data into its workflows. Data owners would retain the
ability to revoke access rights, causing the data to disappear from the app. This dynamic interaction,
while not yet implemented, underscores the potential of the proposed system to balance privacy and
utility in real-world applications.</p>
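        <p>The intended sequence (request, approval, integration, revocation) can be sketched as a small state model. This interaction is not implemented in the app, and the sketch does not use Solid's access control mechanisms; all class names, method names, and values are hypothetical.</p>
        <preformat>
```python
from enum import Enum


class Grant(Enum):
    REQUESTED = "requested"
    APPROVED = "approved"
    REVOKED = "revoked"


class PrivateDataset:
    """Hypothetical private resource whose owner decides on access requests."""

    def __init__(self, url, payload):
        self.url = url
        self._payload = payload
        self._grants = {}  # app identifier -> Grant

    def request_access(self, app_id):
        # Called by the app; an already approved grant is left untouched.
        if self._grants.get(app_id) is not Grant.APPROVED:
            self._grants[app_id] = Grant.REQUESTED

    def approve(self, app_id):
        # Called by the data owner; only pending requests can be approved.
        if self._grants.get(app_id) is Grant.REQUESTED:
            self._grants[app_id] = Grant.APPROVED

    def revoke(self, app_id):
        # Called by the data owner at any later point in time.
        if self._grants.get(app_id) is Grant.APPROVED:
            self._grants[app_id] = Grant.REVOKED

    def read(self, app_id):
        if self._grants.get(app_id) is not Grant.APPROVED:
            raise PermissionError(f"{app_id} has no approved grant for {self.url}")
        return list(self._payload)


class HeatApp:
    def __init__(self, app_id):
        self.app_id = app_id
        self.sources = []

    def visible_readings(self):
        """Re-check permissions on every refresh so that revoked data disappears."""
        readings = []
        for dataset in self.sources:
            try:
                readings.extend(dataset.read(self.app_id))
            except PermissionError:
                continue
        return readings


dataset = PrivateDataset("https://pods.example.org/alice/garden.json", [23.9, 25.1])
app = HeatApp("heat-monitoring-app")
app.sources.append(dataset)

dataset.request_access(app.app_id)
before_approval = app.visible_readings()   # request pending, nothing visible
dataset.approve(app.app_id)
after_approval = app.visible_readings()    # owner approved, data integrated
dataset.revoke(app.app_id)
after_revocation = app.visible_readings()  # owner revoked, data gone again
```
</preformat>
        <p>Because the app re-reads its sources on every refresh instead of caching them, revocation by the data owner takes effect the next time the visualization is updated.</p>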
        <p>This proof of concept demonstrates the added value of the approach, especially in the context of
addressing UHIs. The integration of an initial data catalog for Solid enriched with semantic models
complements existing approaches by enabling meaningful descriptions of data assets. The data
catalog is shown in Figure 5. By employing the VC-Slam ontology, the system adheres to widely
recognized standards, promoting interoperability and reuse. The current focus on statistical batch data
showcases its utility for retrospective analyses, although the system cannot yet handle streaming
data effectively.</p>
        <p>In this context, it is worth noting advancements in RDF stream processing, such as the concept of
Stream Containers proposed by Schraudner and Harth [27], which offer a scalable and web-aligned
approach to streaming data integration. These advancements could enhance the system’s ability
to handle real-time data streams, bridging the current gap between batch processing and dynamic,
continuous data integration.</p>
        <p>As a result, visualizations of temperature readings at the neighborhood level were achieved as
illustrated in Figure 1. This demonstrates the practical applicability of the approach and enables the
identification of UHIs through the developed application.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This paper presented a novel approach to urban heat island (UHI) monitoring by leveraging a
decentralized Solid-based dataspace enriched with a data catalog and semantic modeling. Through the
development of a Heat Monitoring App, we demonstrated the feasibility of interoperable, user-centric
urban climate data management. Our approach addresses key challenges of urban data ecosystems,
promoting data sovereignty, interoperability, and contextual understanding via semantic modeling and
Linked Data principles.</p>
      <p>However, several limitations remain. The system primarily relies on batch data, limiting its capability
for real-time environmental monitoring. Although semantic enrichment facilitates integration, schema
alignment still requires significant manual effort, hindering scalability. Furthermore, the approach
assumes that datasets are publicly available or accessible via predefined permissions, lacking mechanisms
for dynamic access negotiation. Ensuring data consistency and reliability across multiple distributed
Pods remains challenging, and currently, only temperature data with geospatial information is
considered, restricting broader environmental analysis. These areas highlight important directions for future
improvements to enhance the robustness, scalability, and applicability of the proposed approach in
real-world urban settings.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Future Work</title>
      <p>Future research will address these limitations along several directions. First, we aim to integrate
real-time data streams by leveraging RDF Stream Processing (RSP) techniques, allowing continuous ingestion
of temperature and environmental data from IoT sensors and weather stations. Second, enhancing
automation in schema alignment and semantic enrichment, potentially using Large Language Models
(LLMs), will reduce manual effort and increase scalability. Third, managing access to private datasets
while preserving data sovereignty remains a key challenge. To address this, we will explore
privacy-preserving access control mechanisms within the Solid framework, enabling dynamic data request and
negotiation functionalities. Finally, optimizing the scalability, system performance, and user experience
through improved interfaces, intuitive visualization tools, and active stakeholder engagement will be
essential for successful deployment in larger urban settings.</p>
      <p>By addressing these challenges, future research aims to bridge the gap between theoretical
advancements in dataspaces and their practical application in urban climate resilience, ultimately fostering a
more decentralized, participatory, and data-driven approach to mitigating urban heat islands.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This work has been supported as part of the research project Gesundes Tal in collaboration with the city
of Wuppertal, funded by the Federal Ministry of Housing, Urban Development and Building (BMWSB)
and the Reconstruction Loan Corporation (KfW) through the funding program “Modellprojekte Smart
Cities: Stadtentwicklung und Digitalisierung” (grant number 19454890).</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the writing of this paper, the author(s) used DeepL and GPT-4o for grammar, translation,
and spelling checks. After using these tool(s)/service(s), the author(s) reviewed and edited the content as
needed and take(s) full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Manyika</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bughin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dobbs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Roxburgh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Byers</surname>
          </string-name>
          ,
          <article-title>Big data: The next frontier for innovation, competition, and productivity</article-title>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hupperz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gieß</surname>
          </string-name>
          ,
          <article-title>The interplay of data-driven organizations and data spaces: Unlocking capabilities for transforming organizations in the era of data spaces</article-title>
          ,
          <year>2024</year>
          . doi:10.24251/HICSS.2024.541.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Meckler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dorsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Henselmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harth</surname>
          </string-name>
          ,
          <article-title>The web and linked data as a solid foundation for dataspaces</article-title>
          ,
          in:
          <source>Companion Proceedings of the ACM Web Conference 2023</source>
          , WWW '23 Companion, Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>
          , p.
          <fpage>1440</fpage>
          -
          <lpage>1446</lpage>
          . URL: https://doi.org/10.1145/3543873.3587616. doi:10.1145/3543873.3587616.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kastner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.-J.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Both</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yeboah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Schmid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schraudner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Käfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harth</surname>
          </string-name>
          ,
          <article-title>Data-sovereign enterprise collaboration using the solid protocol</article-title>
          , in:
          <source>20th International Conference on Semantic Systems (SEMANTICS'24), Amsterdam, 17th-19th September 2024</source>
          , volume
          <volume>3759</volume>
          of CEUR Workshop Proceedings, CEUR-WS,
          <year>2024</year>
          , Art.-Nr.: 10.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Oke</surname>
          </string-name>
          ,
          <article-title>The energetic basis of the urban heat island</article-title>
          ,
          <source>Quarterly Journal of the Royal Meteorological Society</source>
          <volume>108</volume>
          (
          <year>1982</year>
          )
          <fpage>1</fpage>
          -
          <lpage>24</lpage>
          . URL: https://rmets.onlinelibrary.wiley.com/doi/abs/10.1002/qj.49710845502. doi:10.1002/qj.49710845502.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Santamouris</surname>
          </string-name>
          ,
          <article-title>Analyzing the heat island magnitude and characteristics in one hundred asian and australian cities and regions</article-title>
          ,
          <source>Science of The Total Environment</source>
          <volume>512-513</volume>
          (
          <year>2015</year>
          )
          <fpage>582</fpage>
          -
          <lpage>598</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0048969715000753. doi:10.1016/j.scitotenv.2015.01.060.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <collab>Intergovernmental Panel on Climate Change (IPCC)</collab>
          ,
          <source>Climate Change 2021: The Physical Science Basis</source>
          , Cambridge University Press,
          <year>2021</year>
          . URL: https://www.ipcc.ch/report/ar6/wg1/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I.</given-names>
            <surname>Stewart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Oke</surname>
          </string-name>
          ,
          <article-title>Local climate zones for urban temperature studies</article-title>
          ,
          <source>Bulletin of the American Meteorological Society</source>
          <volume>93</volume>
          (
          <year>2012</year>
          )
          <fpage>1879</fpage>
          -
          <lpage>1900</lpage>
          . doi:10.1175/BAMS-D-11-00019.1.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Franklin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Halevy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Maier</surname>
          </string-name>
          ,
          <article-title>From databases to dataspaces: a new abstraction for information management</article-title>
          ,
          <source>SIGMOD Rec.</source>
          <volume>34</volume>
          (
          <year>2005</year>
          )
          <fpage>27</fpage>
          -
          <lpage>33</lpage>
          . URL: https://doi.org/10.1145/1107499.1107502. doi:10.1145/1107499.1107502.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Braud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fromentoux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Radier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Le Grand</surname>
          </string-name>
          ,
          <article-title>The road to european digital sovereignty with gaia-x and idsa</article-title>
          ,
          <source>IEEE Network</source>
          <volume>35</volume>
          (
          <year>2021</year>
          )
          <fpage>4</fpage>
          -
          <lpage>5</lpage>
          . doi:10.1109/MNET.2021.9387709.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <collab>International Data Spaces Association</collab>
          ,
          <article-title>IDSA Position Paper: GAIA-X and IDS</article-title>
          , Technical Report, International Data Spaces Association,
          <year>2020</year>
          . URL: https://internationaldataspaces.org/wp-content/uploads/dlm_uploads/IDSA-Position-Paper-GAIA-X-and-IDS.pdf, accessed: 2024-12-19.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <article-title>Linked data - design issues</article-title>
          , https://www.w3.org/DesignIssues/LinkedData,
          <year>2006</year>
          . Accessed: 2025-04-28.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>