Towards an Architecture for Data Altruism in Solid
Beatriz Esteves∗
Ontology Engineering Group, Universidad Politécnica de Madrid, Spain
Abstract
This demo showcases an architecture to implement data altruism as a service using the Solid protocol
and ODRL policies to grant access to personal data for altruistic purposes in a privacy-friendly manner.
Policies are represented using OAC, the ODRL profile for Access Control, and DGAterms, a vocabulary
with terms modelled from the European Union’s Data Governance Act (DGA), including data altruism
concepts. In addition, we present a Solid Data Altruism application, SoDA, where (a) data subjects can
generate policies to share their personal data for an altruistic purpose, (b) data users can request access
to datasets for altruistic purposes, and (c) data altruism organisations can maintain metadata
regarding available datasets.
Keywords
Solid, data altruism, ODRL policies, personal data access
1. Introduction
Following current efforts to decentralise the storage of and access to data on the Web [1], the
Solid protocol [2] allows its users to store their data in personal datastores, called “Pods”,
and to control which users and applications can access it. The regulatory agenda in Europe
has followed this technological trend by putting data subjects (individuals whose personal data
is being processed) in a decision-making position with regard to their data, while improving
data availability and promoting trust in data intermediation services [3]. In particular, the Data
Governance Act (DGA) [4] introduces the concept of data altruism: the voluntary sharing of
data for the general interest of the public, such as improving healthcare systems or combating
climate change. Data altruism is managed by data altruism organisations, non-profit entities that
make personal (and non-personal) data available to data users who wish to use such data for
altruistic purposes.
In this demo, we propose using the ODRL profile for Access Control (OAC) [5] to create
policies that determine access to data stored in Solid Pods. Building on previous work on the
DGAterms vocabulary [6], these policies can also be restricted to specific altruistic purposes. We
also present an architecture and a proof-of-concept Solid Data Altruism application,
SoDA, which allows data subjects to generate these policies and data users
to request access to datasets for altruistic purposes. The paper is organised as follows: Section
2 describes related work, Section 3 presents the architecture and the proof-of-concept
demonstration, Section 4 describes the technologies used and details the data modelling,
and Section 5 presents conclusions and future work.
ISWC 2023 Posters and Demos: 22nd International Semantic Web Conference, November 6–10, 2023, Athens, Greece
∗ Corresponding author. Email: beatriz.gesteves@upm.es
ORCID: 0000-0003-0259-7560 (B. Esteves)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073
2. Related Work
A few solutions are already emerging to address the DGA’s requirements, as it will be applicable
from 24 September 2023. For instance, the Smart Citizen platform¹ allows citizens
to collect noise and air quality data through home sensors and share that data through the
platform with researchers and governments, so that they can develop targeted solutions for pressing
environmental issues such as air pollution. The German Corona-Datenspende-App² collected
health-related data from fitness bracelets and smartwatches, e.g., heart rate, body temperature,
blood pressure and sleeping patterns, so that researchers could monitor and identify possible
Covid-19 hotspots at an early stage.
3. Demonstration
Figure 1: High-level overview illustrating an architecture to implement data altruism as a service using
the Solid protocol and a Solid application, SoDA, to make available/search datasets.
Figure 1 gives a high-level overview of an architecture to implement
data altruism as a service using the Solid protocol, with the Solid Data Altruism application,
SoDA, at the centre of the system. With this architecture, we aim to provide an
early proof-of-concept that allows data subjects to share personal data and data
users to look for datasets available for reuse for altruistic purposes. This is done in a
privacy-friendly manner, as the only information disclosed about a dataset is the type of data
it contains and the purpose for which it can be used.
In this architecture, Solid users are identified by a WebID and store data and/or request
to access data stored in Solid Pods, as prescribed by the Solid protocol. In the cases where
personal data is stored in Pods, personal data protection laws apply, such as the European
Union’s General Data Protection Regulation (GDPR) and the DGA, with the user storing their
personal data in Pods being considered a data subject. Moreover, both data subjects and data
users manage access to data through Solid applications. In this context, we introduce SoDA, a
Solid Data Altruism application, which allows:
(a) data subjects to generate policies to share their personal data for an altruistic purpose;
(b) data users to request access to datasets according to the type of data they contain and the
purpose for which they can be used;
(c) organisations to provide data altruism as a service, by storing metadata regarding available
datasets in their own Solid Pod, without needing to store the data themselves,
following Solid’s decentralisation philosophy.
¹ https://smartcitizen.me/
² https://corona-datenspende.de/
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX odrl: <http://www.w3.org/ns/odrl/2/>
PREFIX dpv: <https://w3id.org/dpv#>
PREFIX oac: <https://w3id.org/oac#>
PREFIX dga: <https://w3id.org/dgaterms#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex:
ex:policy-123456 a odrl:Offer ; odrl:uid ex:policy-123456 ; odrl:profile oac: ;
    dct:creator ;  # creator of the policy
    dct:issued "2023-07-19T17:26:35"^^xsd:dateTime ;
    odrl:permission [
        odrl:assigner ;  # party granting the permission
        odrl:action oac:Read ; dpv:hasPersonalData ex:EnergyConsumption ;
        odrl:target <https://solidweb.me/userA/energyconsumption/june2023> ;
        odrl:constraint [
            odrl:leftOperand oac:Purpose ;
            odrl:operator odrl:isA ;
            odrl:rightOperand dga:CombatClimateChange ] ] .
Listing 1: ODRL Offer policy set by User A that allows read-access to a dataset with
EnergyConsumption data for the purpose of combating climate change.
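As a rough illustration of how the purpose constraint in Listing 1 could be evaluated against an incoming request, the following Python sketch compares a request with an offer. It is a sketch under stated assumptions, not part of OAC, DGAterms or SoDA: the helpers `is_a` and `permits`, the purpose hierarchy and the terms `ex:ReduceHouseholdEmissions` and `ex:AltruisticPurpose` are hypothetical, and odrl:isA is interpreted here as "is the same purpose or a more specific one". Only the terms taken from Listing 1 come from the paper.

```python
from dataclasses import dataclass

# Hypothetical purpose hierarchy (child -> parent). Only dga:CombatClimateChange
# appears in Listing 1; the other two terms are assumptions for illustration.
PARENT_PURPOSE = {
    "ex:ReduceHouseholdEmissions": "dga:CombatClimateChange",
    "dga:CombatClimateChange": "ex:AltruisticPurpose",
}


@dataclass(frozen=True)
class Offer:
    """Subset of the ODRL Offer in Listing 1 that matters for matching."""
    action: str
    personal_data: str
    purpose: str  # right operand of the oac:Purpose constraint


@dataclass(frozen=True)
class Request:
    """What a data user asks for: an action on a data type for a purpose."""
    action: str
    personal_data: str
    purpose: str


def is_a(purpose: str, expected: str) -> bool:
    """Return True if `purpose` equals `expected` or descends from it."""
    current = purpose
    seen = set()
    while current is not None and current not in seen:
        if current == expected:
            return True
        seen.add(current)
        current = PARENT_PURPOSE.get(current)
    return False


def permits(offer: Offer, request: Request) -> bool:
    """Check the action, the data type and the purpose constraint."""
    return (
        offer.action == request.action
        and offer.personal_data == request.personal_data
        and is_a(request.purpose, offer.purpose)
    )


offer = Offer("oac:Read", "ex:EnergyConsumption", "dga:CombatClimateChange")
ok = permits(
    offer,
    Request("oac:Read", "ex:EnergyConsumption", "ex:ReduceHouseholdEmissions"),
)
too_broad = permits(
    offer,
    Request("oac:Read", "ex:EnergyConsumption", "ex:AltruisticPurpose"),
)
print(ok, too_broad)
```

In this sketch a request for a more specific purpose is accepted, while a request for a broader purpose than the one offered is rejected. A real evaluator would take the hierarchy from the vocabularies themselves rather than from a hard-coded dictionary.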
Using SoDA³, data subjects can generate access policies for their personal data. The data
altruism organisation stores these policies in its own Solid Pod, which records only
metadata about the datasets and their access conditions. These records are then used
to show the available datasets to data users in SoDA, preserving the data subjects’ privacy
by showing only the type of data available and the purpose for which it can be used, without
revealing the identity of the data subject. If data users find datasets that they wish to use, the data
altruism organisation acts as an intermediary by sending the data request to the data subject,
who then decides whether to authorise or deny access. More details on this demonstration are available at
https://besteves4.github.io/iswc23demo/, including a recording of the app’s functionalities.
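The request flow described above can be sketched in Python as follows. This is a minimal in-memory model under stated assumptions, not SoDA's implementation: the names `CatalogueEntry`, `DataAltruismOrganisation`, `register`, `search` and `forward_request`, the example WebIDs, and the callback standing in for the data subject's decision are all hypothetical. The sketch only illustrates that a search result exposes the data type and purpose (plus an identifier used to place a request), while the data subject's identity stays with the organisation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the data subject's decision: it receives the
# requester's WebID and the purpose, and returns True to authorise access.
Decision = Callable[[str, str], bool]


@dataclass
class CatalogueEntry:
    dataset_id: str
    data_type: str      # e.g. "ex:EnergyConsumption"
    purpose: str        # e.g. "dga:CombatClimateChange"
    subject_webid: str  # kept by the organisation, never shown to data users


@dataclass
class DataAltruismOrganisation:
    entries: List[CatalogueEntry] = field(default_factory=list)
    subjects: Dict[str, Decision] = field(default_factory=dict)

    def register(self, entry: CatalogueEntry, decide: Decision) -> None:
        """Record dataset metadata and how to reach the data subject."""
        self.entries.append(entry)
        self.subjects[entry.subject_webid] = decide

    def search(self, data_type: str, purpose: str) -> List[Tuple[str, str, str]]:
        """Return dataset id, data type and purpose, but not the subject's identity."""
        return [
            (e.dataset_id, e.data_type, e.purpose)
            for e in self.entries
            if e.data_type == data_type and e.purpose == purpose
        ]

    def forward_request(self, dataset_id: str, requester_webid: str) -> bool:
        """Relay the request to the data subject, who authorises or denies it."""
        entry = next(e for e in self.entries if e.dataset_id == dataset_id)
        return self.subjects[entry.subject_webid](requester_webid, entry.purpose)


org = DataAltruismOrganisation()
org.register(
    CatalogueEntry(
        "ex:dataset_001",
        "ex:EnergyConsumption",
        "dga:CombatClimateChange",
        "https://example.org/userA#me",  # hypothetical WebID
    ),
    # Hypothetical rule: this data subject only authorises example.org requesters.
    decide=lambda requester, purpose: requester.startswith("https://example.org/"),
)
results = org.search("ex:EnergyConsumption", "dga:CombatClimateChange")
granted = org.forward_request(results[0][0], "https://example.org/researcher#me")
denied = org.forward_request(results[0][0], "https://other.example/company#me")
```

Keeping the decision with the data subject (here a callback) mirrors the intermediary role of the organisation: it matches and relays requests but does not grant access itself.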
4. Data Modelling
In this demo, OAC⁴ is used to define legally-aligned policies to grant access to personal
data stored in Solid Pods, since it is an RDF-based specification that uses (i) the Open Digital
Rights Language (ODRL) standard to express different types of access policies, e.g., offers,
requests or agreements, associated with data stored in decentralised datastores, and (ii) the
Data Privacy Vocabulary (DPV) [7] as a controlled vocabulary for invoking privacy and data
protection-specific terms. Moreover, OAC was chosen as it can be used to extend Solid’s access
control list mechanism, Web Access Control (WAC) [8], to have richer access control policies
where specific purposes for access can be defined, among other constraints such as restrictions
on the access duration or on the types of entities, e.g., non-profit or for-profit, that can use
the data. In addition, the DGAterms vocabulary⁵ is used to represent the altruistic purposes
defined in the DGA, such as scientific research or combating climate change. Listing 1 presents
an example of a policy set by User A, which allows data users to read the dataset stored at
https://solidweb.me/userA/energyconsumption/june2023, which contains EnergyConsumption
data, as indicated by the dpv:hasPersonalData predicate, for the altruistic purpose of
combating climate change.
³ Source code is available at https://github.com/besteves4/soda.
⁴ https://w3id.org/oac#
ex:datasets a dcat:Catalog ; dct:created "2023-06-10"^^xsd:date ;
    dct:description "Catalogue of datasets maintained by SoDACompany" ;
    dct:publisher ex:SoDACompany ; dcat:dataset ex:dataset_001 .
ex:SoDACompany a dga:DataAltruismOrganisation .
ex:dataset_001 a dcat:Dataset ; odrl:hasPolicy ex:policy-123456 ;
    dpv:hasLocation ;  # storage location of the dataset
    dct:publisher ;  # publisher of the dataset
    dct:description "Dataset with energy consumption data of June 2023" ;
    dcat:mediaType .  # media type of the dataset
Listing 2: Catalogue of datasets maintained by a Data Altruism Organisation.
In addition, W3C’s Data Catalog Vocabulary (DCAT) is used to maintain a catalogue of the
available datasets. This allows the data altruism organisation to show available datasets to data
users and to send data requests on their behalf in a privacy-friendly manner, as data users only get
access to a dataset if the data subject authorises it. Listing 2 presents an example of a catalogue
of datasets maintained by SoDACompany, a data altruism organisation. Metadata regarding the
dataset storage location, the publisher of the dataset and the policy that determines access to it
is also recorded in these catalogues.
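A catalogue record like the one in Listing 2 could also be assembled programmatically. The sketch below builds a JSON-LD representation using only Python's standard library. The helpers `dataset_record` and `catalogue`, the `https://example.org/` namespace and WebID, the use of the Listing 1 target as the storage location, and the choice of JSON-LD instead of Turtle are assumptions made for illustration; the vocabulary terms are those used in Listing 2.

```python
import json

# Namespace IRIs for the vocabularies used in Listing 2.
CONTEXT = {
    "dct": "http://purl.org/dc/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "odrl": "http://www.w3.org/ns/odrl/2/",
    "dpv": "https://w3id.org/dpv#",
    "ex": "https://example.org/",  # hypothetical namespace
}


def dataset_record(dataset_id, policy_id, location, publisher, description):
    """Build a dcat:Dataset node that links a dataset to its ODRL policy."""
    return {
        "@id": dataset_id,
        "@type": "dcat:Dataset",
        "odrl:hasPolicy": {"@id": policy_id},
        "dpv:hasLocation": {"@id": location},
        "dct:publisher": {"@id": publisher},
        "dct:description": description,
    }


def catalogue(catalogue_id, publisher, datasets):
    """Wrap dataset nodes in a dcat:Catalog maintained by the organisation."""
    return {
        "@context": CONTEXT,
        "@id": catalogue_id,
        "@type": "dcat:Catalog",
        "dct:publisher": {"@id": publisher},
        "dcat:dataset": datasets,
    }


record = dataset_record(
    "ex:dataset_001",
    "ex:policy-123456",
    # Assumed storage location: the dataset IRI targeted by the policy in Listing 1.
    "https://solidweb.me/userA/energyconsumption/june2023",
    "https://example.org/userA#me",  # hypothetical WebID of the publisher
    "Dataset with energy consumption data of June 2023",
)
doc = catalogue("ex:datasets", "ex:SoDACompany", [record])
serialised = json.dumps(doc, indent=2)
print(serialised)
```

Because the record only points to the policy and the storage location, the organisation's Pod holds metadata rather than the personal data itself, in line with the architecture described above.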
5. Conclusions and Future Work
In this demo, we presented an architecture to manage data altruism activities in a decentralised
setting, such as Solid, and an application that allows data subjects to generate policies regarding
data they wish to make available for the public good and data users to look for such datasets.
Such an architecture would help to achieve the European Commission’s vision of having
trustworthy data altruism services where data subjects are in control of who can access their data.
This system needs to be complemented by future work on: (i) SHACL shapes to validate
the policies, (ii) usability testing to assess the app’s design choices, as well as scalability testing,
which might require the use of data aggregators to deal with organisations that want to access
a large number of datasets, (iii) improving or automating the process of authorising or denying data
requests using technologies such as RDF surfaces [9] to reason over the offer and request policies,
and (iv) generating immutable agreements, e.g., using existing work on
integrating Verifiable Credentials into the Solid ecosystem [10], that record the conditions for
data usage and can be used by authorities in case the entities using the data misbehave.
⁵ https://w3id.org/dgaterms#
Acknowledgments
This research has been supported by the European Union’s Horizon 2020 research and innovation
programme under the Marie Skłodowska-Curie grant agreement No. 813497 (PROTECT) and
Horizon 2020 innovation action under grant agreement No. 101036418 (AURORA).
References
[1] S. Verbrugge, F. Vannieuwenborg, M. Van der Wee, D. Colle, R. Taelman, R. Verborgh,
Towards a personal data vault society: an interplay between technological and business
perspectives, in: FITCE 2021, 2021. doi:10.1109/FITCE53297.2021.9588540.
[2] S. Capadisli, T. Berners-Lee, R. Verborgh, K. Kjernsmo, Solid Protocol Version 0.10.0, W3C
Community Group Draft Report (2022). URL: https://solidproject.org/TR/protocol.
[3] European Commission, Communication from the Commission to the European Parliament,
the Council, the European Economic and Social Committee and the Committee of the
Regions - A European strategy for data, 2020.
[4] Regulation (EU) 2022/868 of the European Parliament and of the Council of 30 May 2022
on European data governance and amending Regulation (EU) 2018/1724 (Data Governance
Act), 2022.
[5] B. Esteves, H. J. Pandit, V. Rodríguez-Doncel, ODRL Profile for Expressing Consent
through Granular Access Control Policies in Solid, in: 2021 EuroS&PW, 2021, pp. 298–306.
doi:10.1109/EuroSPW54576.2021.00038.
[6] B. Esteves, V. Rodríguez-Doncel, H. J. Pandit, D. Lewis, Semantics for Implementing Data
Reuse and Altruism under EU’s Data Governance Act, in: To Appear on SEMANTiCS
2023 Proceedings, 2023. doi:10.5281/zenodo.8301901.
[7] H. J. Pandit, A. Polleres, B. Bos, R. Brennan, B. Bruegger, F. J. Ekaputra, J. D. Fernández,
R. G. Hamed, E. Kiesling, M. Lizar, E. Schlehahn, S. Steyskal, R. Wenning, Creating a
Vocabulary for Data Privacy: The First-Year Report of Data Privacy Vocabularies and
Controls Community Group (DPVCG), in: On the Move to Meaningful Internet Systems:
OTM 2019 Conferences, volume 11877, Springer International Publishing, 2019, pp. 714–730.
doi:10.1007/978-3-030-33246-4_44, Lecture Notes in Computer Science.
[8] S. Capadisli, Web Access Control 1.0.0, W3C Candidate Recommendation (2022). URL:
https://solidproject.org/TR/wac.
[9] P. Hochstenbach, J. De Roo, R. Verborgh, RDF Surfaces: Computer Says No, in: 1st
Workshop on Trusting Decentralised Knowledge Graphs and Web Data, 2023.
[10] C. H.-J. Braun, T. Käfer, Attribute-based Access Control on Solid Pods using Privacy-
friendly Credentials, in: Proceedings of the Poster and Demo Track and Workshop Track
of SEMANTiCS 2022, 2022.