=Paper= {{Paper |id=Vol-3306/paper6 |storemode=property |title=On the need for strong sovereignty in data ecosystems |pdfUrl=https://ceur-ws.org/Vol-3306/paper6.pdf |volume=Vol-3306 |authors=Johannes Lohmöller,Jan Pennekamp,Roman Matzutt,Klaus Wehrle |dblpUrl=https://dblp.org/rec/conf/vldb/LohmollerPMW22 }} ==On the need for strong sovereignty in data ecosystems== https://ceur-ws.org/Vol-3306/paper6.pdf
On the need for strong sovereignty in data ecosystems
Johannes Lohmöller¹,*, Jan Pennekamp¹, Roman Matzutt¹ and Klaus Wehrle¹
¹ RWTH Aachen University, Aachen, Germany


Abstract
Data ecosystems are the foundation of emerging data-driven business models as they (i) enable an automated exchange between their participants and (ii) provide them with access to huge and heterogeneous data sources. However, the corresponding benefits come with unforeseen risks, as sensitive information is potentially exposed as well. Consequently, data security is of utmost importance and, thus, a central requirement for the successful implementation of these ecosystems. Current initiatives, such as IDS and GAIA-X, hence foster sovereign participation via a federated infrastructure where participants retain local control. However, these designs place significant trust in remote infrastructure by mostly implementing organizational security measures such as certification processes prior to admission of a participant. At the same time, due to the sensitive nature of the involved data, participants are incentivized to bypass security measures to maximize their own benefit. In practice, this issue significantly weakens sovereignty guarantees. In this paper, we hence claim that data ecosystems must be extended with technical means to reestablish such guarantees. To underpin our position, we analyze promising building blocks and identify three core research directions toward stronger data sovereignty, namely trusted remote policy enforcement, verifiable data tracking, and integration of resource-constrained participants. We conclude that these directions are critical to securely implement data ecosystems in data-sensitive contexts.


1. Introduction

Data-driven business models are an invaluable pillar for modern industries, and their importance will increase with growing demands requiring more complex and globally distributed operation, as well as sophisticated collaborations to improve the status quo [1]. Data ecosystems provide the foundation for such data-driven business models as they center around automating data exchanges and value creation based on huge and heterogeneous data sources from various stakeholders [2]. Added value can be created by, for instance, improving algorithms underlying existing analytics or extracting new insights from previously recorded data [3]. Crucially, this process involves the integration of distributed data sources owned by different stakeholders. Here, data ecosystem initiatives such as International Data Spaces (IDS) [4] and GAIA-X [5] aim to provide a trustworthy environment for the discovery, sharing, and processing of available data, irrespective of specific domains.

However, current efforts to establish the necessary trust among stakeholders heavily rely on organizational agreements and processes [6, 4]. For instance, the IDS certification process asserts that participants use audited software and develop defense-in-depth strategies for protection [4]. Participants receive no additional security guarantees beyond this ahead-of-time certification and have no means to verify that other participants handle their data as intended (and required). Here, the lack of stronger guarantees effectively ends the sovereignty of participants at the moment of sharing.

In this paper, we argue that data ecosystems need to provide their participants with strong and continual guarantees about the security of their provided data to maintain each participant's data sovereignty. Moreover, driven by privacy and security concerns, recent regulatory efforts set strict rules on how data may flow across organizational borders, raising the need for fine-grained control [7]. To this end, data ecosystems are only sustainable if stakeholders are willing to participate by providing and consuming data actively. However, we argue that data-consuming parties are currently incentivized to ignore previously agreed terms for data usage. Such behavior hurts data owners as they are not adequately compensated for the value of the data they provide, and it calls into question whether data ecosystems are adequate to exchange data subject to privacy regulation. Consequently, data owners might restrict their data-sharing efforts or leave the data ecosystem entirely. Hence, data ecosystems require solid technical measures, such as cryptographically enforceable guarantees and verifiable continual security monitoring, to facilitate the establishment of trust between remote and potentially mutually unknown participants. In this paper, we provide more background on the current state of data ecosystems, identify shortcomings of ongoing data ecosystem initiatives, and derive and discuss future research directions steered toward improving the sovereignty and trust of participants in data ecosystems.

Proc. of the First International Workshop on Data Ecosystems (DEco'22), September 5, 2022, Sydney, Australia
* Corresponding author
Email: lohmoeller@comsys.rwth-aachen.de (J. Lohmöller); pennekamp@comsys.rwth-aachen.de (J. Pennekamp); matzutt@comsys.rwth-aachen.de (R. Matzutt); wehrle@comsys.rwth-aachen.de (K. Wehrle)
ORCID: 0000-0003-2101-5562 (J. Lohmöller); 0000-0003-0398-6904 (J. Pennekamp); 0000-0002-4263-5317 (R. Matzutt); 0000-0001-7252-4186 (K. Wehrle)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073




2. A Primer on Current Data Ecosystem Initiatives and their Architectures

To ensure a common understanding of the trust issues with today's data ecosystems, we first briefly introduce data ecosystems, the notion of data sovereignty, and common participants in this context. Moreover, we present a short overview of data ecosystem initiatives focusing on their currently implemented security measures.

[Figure 1 diagram labels: operator association (certify), data owner, data provider, data consumer, data user, data ecosystem]
Figure 1: Participating entities in data ecosystems. Data flows from left to right, with data provider and data consumer implementing a common ecosystem interface. The data ecosystem's operator also handles orthogonal tasks, including admission and discovery of participants and data.

Ecosystem Goals. The need to share data with collaborators within specific sectors has been recognized in a variety of domains, including supply chains [8], public health [9, 10], and mobility [11]. Here, on the one hand, data ecosystems aim to provide multi-sided platforms [2] that facilitate an automated data exchange following the FAIR principles [12], i.e., the offered data needs to be findable, accessible, interoperable, and reusable. On the other hand, today's data ecosystems aim to equip data owners with fine-grained control over their data, including with whom it will be shared and under what terms. This fine-grained control is the foundation of data sovereignty [3]. Achieving these goals requires solving issues w.r.t. organization [2], semantics and data quality [13], and interfacing [14], all of which are currently under active research.

Definitions. So far, we have seen data ecosystems only as a means for exchanging data as required in emerging data markets and other use cases [3]. In fact, data ecosystems emerged without a standard definition in mind. Oliveira and Lóscio [15] address this gap by reviewing and merging concurring data ecosystem definitions; as a result, they define a data ecosystem as a combination of independently operated networks that produce and provide data, but also other assets like software or services. Furthermore, the authors highlight that such data ecosystems are self-regulated and driven by collaboration and competition between actors [15]. Additionally, we emphasize that data ecosystems form platforms that have to define common interfaces and rules to enable collaboration across independent networks. Accordingly, we refer to data ecosystem participants as networks that implement the interfaces and accept the rules defined by a given ecosystem.

Similarly, the notion of data sovereignty, i.e., one of the critical concepts of data ecosystems, currently lacks a clear and common definition [16]. When used in the context of data ecosystems, researchers generally agree that data sovereignty relates to control and ownership of data items, together with specific claims and obligations made by involved parties [17, 18, 19, 20]. Hence, within this paper, we will focus on this aspect of data sovereignty. To set this into a broader context, the review by Hummel et al. [16] describes data sovereignty as covering multiple contexts and values ranging from legislation to clinical practice and control and power to recognition, respectively.

Initiatives. Superseding a previously rather tedious bilateral exchange, the goal of initiatives like the International Data Spaces (IDS) [3, 2, 21], GAIA-X [14, 5], Data Sharing Coalition [22], IHAN [23], FIWARE [24], CEF [25], or BDVA [26] is to establish a universal platform to regulate transactions regarding that exchange. The EU or federal offices fund such initiatives, facilitating a top-down approach toward establishing a common data platform. Some initiatives rather bundle forces toward the adoption of data ecosystems in general (Data Sharing Coalition, CEF, BDVA), while IHAN, for instance, is in an early stage, without publicly released technical documentation so far. Out of the named initiatives, IDS [4], GAIA-X [5], and FIWARE [27] have released technical documentation that permits a deeper analysis with regard to implemented data security and trust measures. Specifically, IDS and GAIA-X both work toward a standard interface to locate and access data and provide an organizational context, including identification, admission, and certification of participants [14]. Thus, in the remainder of this paper, we primarily study these general-purpose initiatives. While IDS aims to provide a framework under which data spaces can be built quickly, e.g., targeting a specific domain with coherent participants, GAIA-X plans to establish a single central cross-domain platform [14]. Moving toward domain-specific concepts, initial projects such as CATENA-X [28], an initiative inside the automotive domain, are picking up their ideas, while established platforms such as FIWARE [24], a framework to connect smart devices, start to provide compatible interfaces [29].

Architecture. Despite their slightly different scopes, IDS and GAIA-X share a similar architecture, so we analyze both initiatives together as data ecosystem implementations. Organizing the data exchange, data ecosystems commonly assign different roles to participants.




Figure 1 shows the overall scenario we are considering together with the main participants. A single data exchange can be considered bilateral, such that we can suppose the following roles [4]: First, a data owner legally owns the data to be shared and is interested in enforcing their rights on the data if it is shared. Second, a data provider takes over the technical part of offering a dataset to be exchanged on behalf of the data owner.

While a single entity certainly can take over both roles, i.e., host the infrastructure to provide their data, in certain situations, the providing entity does not formally own the data. For instance, this is the case for electronic health records owned by patients, who typically do not provide the infrastructure on their own. On the receiving side, a data consumer requests and receives the data from the provider and passes it to a data user, who processes the exchanged data, e.g., by visualizing it. Again, the consumer might also fulfill the data user role if both processes are co-located. Notably, GAIA-X does not separate the data consumer and data user [5], but we continue using both terms to separate the logical roles, as described above.

Due to the distribution of providers and consumers, data ecosystems operate as a federation of independent deployments that jointly form a decentralized system. Thereby, data owners can keep their sensitive datasets under their control until they actively decide to share them with selected participants. To this end, data ecosystems enable data sovereignty up to the point where a data sharing decision has been made and data is actually transferred to the data consumer.

Trust. To not let sovereignty end at the point of data exchange, data ecosystems currently require a certification of participants. Hence, they ensure that all entities handling data adhere to a common baseline w.r.t. data protection. Certification includes, but is not limited to, defense-in-depth strategies and security event monitoring systems [30, 31]. Specifically, the IDS requires prior certification steps and attests successful certification via a public key infrastructure, establishing a trusted identity layer [4, 14]. Contrarily, GAIA-X does not target a specific certification but requires participants to provide a standardized self-description with claims that are checked before a participant's admission [14]. In both cases, the ecosystem equips participants with the means to identify each other and establishes a common ground for mutual trust decisions.

Based on the ecosystem-wide identity layer, data ecosystems can provide fine-grained access control to data and let data owners limit the target audience they are willing to share their data with. However, access control alone is insufficient, as data sovereignty would end once the flow of data between participants has taken place after access has been legitimately granted. Usage control [32] could possibly fill this gap by granting specific rights on data and enforcing certain duties to be adhered to when processing data. Such a policy could be, for instance, the permission to use a dataset for one week, with the obligation to delete it after that time.

To implement usage control, IDS utilizes and extends ODRL [33], a policy language for digital rights management that allows fine-grained modeling of usage terms [4]. For enforcement, the data owner has to trust that the consuming party abides by the negotiated terms. To this end, they can only rely on the certification of the consumer required to join as a participant, but can neither monitor the process themselves nor receive credible proof that usage terms were enforced. However, since the negotiated contracts might also involve monetary compensation, the consuming party has incentives to disobey negotiated terms, e.g., using data more often than requested, sourcing it for other purposes, or sharing it with other systems or third parties.

Legal Context. Providing an environment for data exchange, the IDS builds upon surrounding legal contracts to equip participants with the means to establish credibility with each other [34]. Specifically, such contracts regulate the terms of usage and the overall setting, e.g., regarding a monetary compensation [4] or a penalty for breach of contract. Contracts can be bilateral or multilateral but will typically not cover the entirety of data space participants [4], thereby limiting spontaneous data access. Within negotiated legal contracts, data ecosystems such as IDS then plan to (automatically) negotiate a refined technical contract. This refined contract translates terms into machine-readable policies that grant specific permissions on the exchanged dataset and potential obligations [4].

3. Data Ecosystems Need Technical Security Guarantees

Having outlined the fundamental ideas of sovereign data exchange and the technical and organizational framework data ecosystems provide, we now critically review the design decisions of security mechanisms implemented in state-of-the-art data ecosystems. To this end, we analyze the available technical documentation and reference architecture for IDS and GAIA-X. Primarily, we identify a lack of technical means to facilitate strong security guarantees and establish strong trust between participants. Namely, the current ecosystem initiatives can only partly address the security and trust requirements with their frail certification-based approaches.

Attacker Model. Guiding our position that data ecosystems require stronger data protection mechanisms, we apply the notion of a malicious-but-cautious attacker [35]. Specifically, the malicious-but-cautious attacker can misbehave in all possible ways but aims not to leave any




verifiable evidence of its misbehavior [35]. Compared to an honest-but-curious (or semi-honest) attacker, this definition explicitly includes local deviations from protocols unless they are verifiable by external parties. With data ecosystems exchanging data within established legal contracts, we argue that participants aim to avoid being sued for their misbehavior and hence have incentives not to leave any evidence. To this end, a malicious-but-cautious attacker reflects the typical power and incentives of data ecosystem participants who source, process, and utilize somebody else's data.

Data Security. Current notions of data security include security at-rest, in-transit, and in-use [36]. At-rest security and in-transit security are considered solved problems in the context of data ecosystems as they can use widely available building blocks such as storage encryption and transport layer security (TLS), respectively [4]. Contrarily, in-use data security targets data at the moment of processing, e.g., when the decrypted data is loaded into memory, and is hence more difficult to ensure and implement. Technical or cryptographic measures to protect data by providing in-use security include, for instance, hardware-assisted security or homomorphic encryption [37, 38]. However, despite these measures, today's data ecosystems build their guarantees regarding data in-use security upon remote participants' honesty to enforce certain rights on shared data. Unfortunately, with monetary compensation handled as part of data exchange and transfers entrusted for a specific purpose, incentives to evade enforcement clearly exist.

Hence, we argue that the following questions are critical to the adoption of data ecosystem initiatives in data-sensitive domains:

• I1: How can data owners trust remote infrastructure to enforce their granted rights once data has been shared?
• I2: How can data owners track their data in a trusted way if processed by remote facilities?
• I3: How can participants with few resources maintain sovereignty without requiring them to host their own infrastructure?

In the following, we elaborate on these high-level design questions regarding strong data sovereignty when implemented in practice.

I1: Trust in Remote Rights Enforcement. A first cornerstone of end-to-end data sovereignty is the guaranteed enforcement of digital rights on remote systems, i.e., usage control. However, suppose a privileged user on the consuming side, e.g., a system administrator, copies exchanged data without leaving traces in audit-relevant logging systems. This unintended behavior renders usage control enforcement ineffective. While we anticipate that such an action would violate negotiated terms, the data owner depends on fortunate coincidence to notice malicious behavior retrospectively. Consequently, we argue that data owners will refrain from ever sharing sensitive data. With such datasets covering manufacturing plans [8], the identity of suppliers [39], or privacy-sensitive health records [40], the lack of enforcement guarantees severely limits the kind of data exchangeable. Hence, such scenarios require stronger data sovereignty guarantees than the currently envisioned (weak) organizational measures.

Partly addressing this issue, IDS can utilize trusted platform modules (TPMs) as a trust anchor on remote systems [4]. However, merely providing verification of the running software, but essentially lacking memory encryption, TPMs still contribute little to effective protection against malicious-but-cautious attackers.

I2: Trusted Data Usage Reporting. Besides effective usage control, usage transparency is a second cornerstone of strong data sovereignty and essential to increase the participation of data owners. To this end, data owners that grant permissive access to their data shall still be able to track usages of their data in remote systems transparently. Within IDS, a clearing house entity is designated to address part of this problem by enabling billing-relevant usage logging [4]. However, similarly to I1, there is currently no technically or cryptographically enforced guarantee that data usage is actually logged. Hence, data users can easily circumvent the implemented logging features of today's data ecosystems and thereby exceed granted usage terms without being caught, e.g., by evading downstream payments for data usage.

I3: Sovereign Participation without Own Infrastructure. A third cornerstone of strong data sovereignty is the free choice of data owners with whom to exchange data under which conditions. Within the currently proposed architecture (cf. Figure 1), data owners entirely rely on and trust data providers to serve their data within the ecosystem. However, if both roles are distributed between separate entities, similar trust issues as between the providing and consuming parties also apply here. Specifically, the owner needs to trust the provider to serve the agreed policies and not misuse data locally. Moreover, usage reporting systems must not assume the provider to be trusted in this case. Hence, the providing side of a data exchange requires the same measures to implement reliable trust as the consumer side.

Takeaway. Today's data ecosystems only provide data protection via organizational means, such that there is no protection against malicious-but-cautious inside attackers on remote systems. At the same time, monetary data usage compensation and usage restrictions create incentives to evade enforcement mechanisms. Currently, these shortcomings limit the applicability of data ecosystems to share sensitive datasets and thus need a remedy.
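
To make the usage-control discussion concrete, the following minimal Python sketch models the example policy mentioned in Section 2: the permission to use a dataset for one week, with the obligation to delete it after that time. The class UsagePolicy, its fields, and the dataset identifier are hypothetical illustrations and do not reflect the IDS or ODRL data models. Note that both checks run entirely on the consumer's infrastructure, so a malicious-but-cautious consumer could simply skip them; this is the enforcement gap described under I1.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class UsagePolicy:
    """Hypothetical, simplified usage policy (not the IDS/ODRL data model)."""

    dataset_id: str
    granted_at: datetime
    valid_for: timedelta  # permission: usage is allowed within this window

    @property
    def expires_at(self) -> datetime:
        # End of the usage window; from this instant on, deletion is due.
        return self.granted_at + self.valid_for

    def may_use(self, now: datetime) -> bool:
        """Permission: usage is allowed only inside the granted window."""
        return self.granted_at <= now < self.expires_at

    def must_delete(self, now: datetime) -> bool:
        """Obligation: the dataset has to be deleted once the window has closed."""
        return now >= self.expires_at


# Assumed example values: a one-week permission granted on September 5, 2022.
policy = UsagePolicy(
    dataset_id="example-dataset",
    granted_at=datetime(2022, 9, 5),
    valid_for=timedelta(weeks=1),
)
```

With these assumed values, `policy.may_use(datetime(2022, 9, 8))` holds, whereas from September 12, 2022 onward `may_use` is false and `must_delete` is true. Nothing in this sketch lets the data owner verify that the consumer actually evaluates or honors these checks.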




4. Toward Stronger Data Sovereignty

The current data ecosystem initiatives strive for seamlessly interconnecting businesses and facilitating the automation of valuable data exchanges. However, in the last section, we identified severe open issues (I1–I3) that impede each participant's data sovereignty in situations where organizational trust mechanisms, such as required certification prior to admission to the ecosystem, are insufficient. Given the competitive advantage a participant can gain by acting in a malicious-but-cautious manner (cf. Section 3), these open issues only become more pressing. Hence, with the data sovereignty of their participants in mind, data ecosystems must deploy additional means to allow them to establish trust in that new market.

In this paper, we argue that only technical means providing strong cryptographic guarantees are suitable to reach the goal of trustworthy data ecosystems that retain participants' data sovereignty. Next, we discuss how available building blocks can be integrated into data ecosystems to address each of the open issues I1–I3.

4.1. Trusted Remote Policy Enforcement (I1)

The foundation of strong data sovereignty in data ecosystems is providing data owners with an assurance that the data ecosystem will enforce terms and conditions on their behalf. Although today's data ecosystems lack trustworthy remote enforcement of data usage terms (I1), promising building blocks for addressing this issue are already available and used in other contexts. Examples of related building blocks are distributed usage control, trusted execution environments, and different cryptographic schemes. In the following, we discuss these building blocks, their application areas, and their relation to data ecosystems.

Distributed usage control [41, 42, 43, 44, 45, 46] is an established field of research that focuses on modeling and technically enforcing usage terms, so-called policies for data usage. Data ecosystems have already adopted the notion of policies in their organizational architecture [4, 47]. However, enforcing these policies proves difficult as the data owner cannot directly observe the misconduct of a data user or the consequences thereof [48]. Hilty and Pretschner [42] hence propose to provide data owners with evidence of policy enforcement and limit possible computations. Both approaches are hard to realize within a data ecosystem as they require some technical trust anchor on remote systems. Specifically, data ecosystems currently do not offer such trust anchors as the data user gains full control over the exchanged data once it has

insufficient when considering a malicious-but-cautious adversary who does not provide a trustworthy environment for storing or processing the exchanged data.

Hardware-based Trusted Execution Environments (TEEs), such as Intel SGX, AMD SEV, or ARM TrustZone, are promising candidates for closing this gap in the future [49]. The goal of TEEs is to provide a trustworthy computing environment that can be established even on untrusted remote infrastructure. To this end, a TEE provides an isolated (i.e., memory-encrypted) environment for running applications with the ability to verify the integrity of the executed program code remotely. A CPU-embedded cryptographic key provides the required trust anchor that allows the data owner to verify correct execution independently of the remote host's operating system [49]. Consequently, TEEs allow for trustworthy remote execution by hiding the program's execution state and hardening it against tampering.

Implementing policy enforcement and data processing inside such environments has the potential to resolve the trust issues data ecosystems are currently facing. However, TEE technology is an active field of research, and current implementations still experience security issues [50]. For example, today's TEE implementations are prone to side-channel attacks that allow for limited data extraction [51]. Countermeasures such as oblivious RAM [52] are being investigated to fix these vulnerabilities, and we expect that future enclave designs will provide further remedies against other technical issues as they are being discovered. Hence, TEEs are a promising building block for improving data sovereignty in data ecosystems via technically enforceable data policies. However, further research into hardening TEEs against unintended security breaches is required to improve their applicability to data ecosystems. In fact, in a related context, initial work [37] demonstrates the applicability of TEEs in a trusted data sharing setting.

We thus call for the established initiatives and researchers to further investigate the utility of TEE technology for data ecosystems to reliably address the lack of trustworthy and technically backed policy enforcement.

4.2. Verifiable Data Tracking (I2)

Besides policy enforcement, establishing transparency in data usage is equally important to gain data owners' trust. For instance, a data owner might consider granting generous accessibility to their data but require proper attribution by any data user. In such a case, the data owner would profit from technically guaranteed notifications whenever a data user accesses the data.

Currently, IDS implements a clearing house instance, which can log data usage if mandated in a policy, mak-
been obtained from the data owner. This situation is
                                                                 ing it transparent to data owners [4]. However, data




                                                             55
users have neither a strict technical constraint to log data usage, nor can the system enforce it by some means. Consequently, IDS cannot currently provide trusted monitoring unless data usage can be observed externally. Hence, the current clearing house instance does not solve the problem of verifiable data tracking (I2).
   Instead, technical or cryptographic means would help to incentivize logging. To this end, we consider transparency logging, data-flow tracking, and distributed ledger technology promising for establishing verifiable data tracking in data ecosystems.
   For instance, certificate transparency logging allows modern web browsers to reject digital certificates that are not tracked in a public log for auditors to verify [35]. A similar approach might improve data usage transparency as well. Namely, cryptographically tying the decryption of exchanged data or the transfer of results to a publicly verifiable log entry would force data users to log their actions accurately. Such approaches are being researched in the field of verifiable computing [53, 54] and data ecosystems could profit by utilizing corresponding building blocks.
   Besides logging, related work also proposes data flow tracking [55] and data fingerprinting [56] to allow for identifying the source of identified data breaches after the fact. However, the cryptographic data fingerprints required to apply these techniques necessitate knowledge of the exact data representation and a sufficient tolerance for minor statistical noise in the monitored data [56]. Unfortunately, these fingerprints typically cannot survive intermediate processing steps [56], rendering them inapplicable in some situations. Hence, more research maturing resilient data flow tracking or fingerprinting techniques is required to determine and improve their applicability in the context of data ecosystems.
   Finally, distributed ledger technology has emerged in recent years with the explicit goal of facilitating digital interactions among participants who do not fully trust each other. While Bitcoin started by establishing a decentralized and publicly accessible digital currency based on a blockchain [57], it spawned more versatile distributed ledgers for any information using smart contracts [58]. Ultimately, business-focused ledger systems emerged, such as Hyperledger Fabric or Quorum. These architectures can facilitate the event-logging within data ecosystems and provide a medium for the automated billing of data accesses.
   To avoid additional privacy or data confidentiality problems, such transparency mechanisms need to take privacy into account, e.g., by encrypting log entries [59]. Overall, technical building blocks for verifiable data tracking are already available. However, they still need to be tailored to the specific verifiable data tracking requirements for utilization in data ecosystems regarding performance, scalability, flexibility, and privacy.

4.3. Integration of Resource-Constrained Participants (I3)

With the separation between the data provider and data owner, data ecosystems also address scenarios that involve particularly resource-constrained or especially privacy-aware data owners who are unable or unwilling to run the complete infrastructure themselves. However, infrastructure control is the foundation of self-sovereign participation in distributed environments [4]. Hence, this approach is not viable for resource-constrained participants. Such participants could be, for instance, small to mid-sized enterprises (SMEs) in a supply chain context, which have no technical expertise to provide the infrastructure to participate in a data ecosystem. In this case, their customers may be capable of assuming the role of a data provider collecting data from their contracted SMEs and offering that data on their behalf within the ecosystem. For instance, large automotive manufacturers can assume the role of a data provider on behalf of their, typically numerous, suppliers [8]. In this case, however, data owners lose their sovereignty and depend on trust in their customers. Thus, appropriate (technical) guarantees for such situations are desirable.
   A scenario that would give data owners assurance that their data is treated as intended would be considering the data provider as a different party than the data owner; however, current ecosystem initiatives do not rigorously satisfy this demand [4]. Under this assumption, however, one could implement the same measures discussed in Section 4.1 also on the provider side, i.e., realize a trusted data provider. Moreover, concerning usage transparency, this scenario requires logs, as discussed in Section 4.2, to be accessible with no own infrastructure. Hence, not only the consumer-side aspect of logging must be trusted, but also the instance that provides logging on behalf of data owners.

4.4. Summary

Cryptographic building blocks that have been successfully applied in the past are promising also to address the core issues (I1–I3) currently impeding the data sovereignty of data owners in today’s data ecosystems. For instance, TEEs have the potential to provide the currently missing trust anchor during remote processing (I1). Similarly, concepts currently applied in the context of certificate transparency logging or distributed ledger technology may help satisfy the requirement for verifiable tracking in data ecosystems (I2) once they are adapted to the scalability demands of envisioned deployments. Finally, these measures can also potentially be applied when data providers operate on behalf of the original data owner to incorporate resource-constrained participants in the process (I3).
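
The idea from Section 4.2 of tying the decryption of exchanged data to a publicly verifiable log entry can be illustrated with a minimal sketch. It is not taken from IDS, GAIA-X, or any cited system: the names `UsageLog` and `KeyEscrow` are hypothetical, the hash chain is a deliberate simplification of the Merkle-tree logs used in certificate transparency, and the sketch assumes that the escrow component itself is trustworthy (for instance, because it runs inside a TEE).

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash, payload):
    """Bind a log entry to its predecessor by hashing both together."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


class UsageLog:
    """Append-only, hash-chained log of data accesses (hypothetical)."""

    def __init__(self):
        self.entries = []  # list of (payload, receipt) pairs

    def head(self):
        """Return the hash of the latest entry, or GENESIS if empty."""
        return self.entries[-1][1] if self.entries else GENESIS

    def append(self, user, dataset):
        """Record an access and return its hash as a receipt."""
        payload = {"user": user, "dataset": dataset}
        receipt = entry_hash(self.head(), payload)
        self.entries.append((payload, receipt))
        return receipt

    def verify(self):
        """Recompute the whole chain; any modified entry breaks it."""
        prev = GENESIS
        for payload, receipt in self.entries:
            if entry_hash(prev, payload) != receipt:
                return False
            prev = receipt
        return True

    def covers(self, receipt, user, dataset):
        """Check that the receipt belongs to exactly this access."""
        wanted = {"user": user, "dataset": dataset}
        return any(r == receipt and p == wanted for p, r in self.entries)


class KeyEscrow:
    """Releases a dataset key only against a logged access (hypothetical).

    Assumption: this component is trusted, e.g., hosted in a TEE.
    """

    def __init__(self, log):
        self._log = log
        self._keys = {}

    def deposit(self, dataset, key):
        """Called by the data owner to store the decryption key."""
        self._keys[dataset] = key

    def release(self, user, dataset, receipt):
        """Hand out the key only if the access is verifiably logged."""
        if not self._log.verify():
            raise PermissionError("usage log failed verification")
        if not self._log.covers(receipt, user, dataset):
            raise PermissionError("no matching log entry for this access")
        return self._keys[dataset]
```

In this sketch, a data user first calls `log.append(...)` and then presents the returned receipt to `escrow.release(...)`; without a matching entry in an intact chain, the key is withheld. A real deployment would additionally need authenticated log entries and privacy protection for the log contents, which the sketch omits.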




5. Ongoing and Past Research Efforts

The potential to improve data ecosystems and the need to address their current issues have also been recognized in previous work. All in all, data ecosystems are subject to past and active research alike, especially due to ongoing large-scale initiatives. In this section, we present notable recent research efforts in data ecosystems. Specifically, we provide an overview of fundamental research regarding the organization of data ecosystems, research efforts investigating the use cases that would benefit from data ecosystems, and works that apply technical security measures to facilitate data sharing efforts.
   Fundamental Data Ecosystem Advancements. Oliveira and Lóscio [15] survey the components data ecosystems typically comprise. Furthermore, several works discuss requirements and possible ways toward implementing data ecosystems in general, i.e., independent of specific initiatives [14, 2, 3, 60, 13, 61]. Another line of research investigates fundamental challenges faced when implementing (distributed) data sharing systems. Mainly, these challenges encompass transparency requirements [62], addressing the potential lack of trust between participants [13, 63, 64], the need for creating a common semantic understanding among all participants [65], and governance as well as legal constraints [66, 67, 68, 34]. More directly targeted to data ecosystems as they are defined in this work, research considers alternatives to the current IDS and GAIA-X initiatives. For instance, FIWARE [24, 29] provides a platform to facilitate data exchange in an Internet of Things context and is related to CEF [25]. Furthermore, special-purpose data ecosystems are being considered, e.g., by the NFDI initiative [69], which focuses on improving the accessibility of research data. Finally, NFDI and FIWARE aim to implement IDS-compatible interfaces, hence working toward ecosystem compatibility.
   Use Cases. Another critical aspect of research on data ecosystems revolves around the use cases they are particularly well-suited for. Other works have identified many relevant or desirable use cases in this regard. Among these use cases are the sharing of medical health records [70, 10], personal data [71], data emerging in the Industrial Internet of Things [72, 73], and data exchange across supply chains, such as in the automotive industry [8, 39, 28], that have unique requirements concerning data confidentiality, data volume, or long-term persistency. Further data sharing schemes do not specifically target data ecosystems but are conceptually similar, such as applications in medicine [6, 40, 9, 74], for production technology [75, 76], along supply chains [8], or in education [77]. We expect that additional domains will also start to investigate the benefits data ecosystems can provide for their use cases as well as for society in general.
   Technical Solutions for Data Sharing. Besides identifying novel use cases for sharing data via data ecosystems, other research successfully applied technical and especially cryptographic building blocks to tackle the general challenges of data sharing in more narrow scenarios. For instance, Huang et al. [78] propose a data-sharing scheme to later identify sources of data breaches based on oblivious transfers and embedded fingerprints. Moreover, a variety of work considers sharing data with cloud providers [79, 80, 81, 82, 83, 84], which can be considered conceptually similar to data ecosystems with multiple stakeholders. Such work includes querying encrypted data [85], attribute- or identity-based encryption for access control [86, 74, 87, 39], and distributed ledgers together with TEEs to enforce accountability and access control [37]. Then again, Bonatti et al. [88] identify correctness and completeness as desirable properties of transparency mechanisms in data sharing. These approaches to strengthen sovereignty guarantees apply to real-world use cases and might even be translatable for use in data ecosystems.

6. Discussion and Future Work

As we have highlighted in Section 3, today’s data ecosystems mostly rely on organizational means to implement data protection. However, technical building blocks are already available to address the remaining challenges for data sovereignty in data ecosystems by providing stronger guarantees for participants (cf. Section 4). Finally, ongoing research efforts (cf. Section 5) have envisioned that suitable applications of data ecosystems include the handling of privacy-sensitive data, such as patient records in medical contexts, but also confidentiality demands of critical business data require those guarantees. To this end, data ecosystems must provide a framework that allows users to trust the overall system w.r.t. enforcing their rights at any time, including processing in remote systems after access was granted and data was shared.
   Based on our analysis of the status quo as well as ongoing research efforts so far, we discuss in the following that overcoming current shortcomings of usage control and stronger hardware-based security measures are crucial research directions to sustainably strengthen the data sovereignty for participants of data ecosystems.
   Shortcomings of Usage Control. With (distributed) usage control, prior work already addresses the issues I1–I3 today’s data ecosystems are facing. However, the enforcement has not (yet) been thoroughly picked up by recent initiatives, possibly due to the current lack of technical guarantees [48]. Most work in this area either
targets rights modeling (e.g., [41, 89, 90]) or assumes operation on trusted infrastructure (e.g., [91, 92]), which we argue does not withstand malicious-but-cautious attackers, as applicable to data ecosystems. Given that guaranteed policy enforcement is crucial for sharing sensitive datasets within data ecosystems, this question still needs to be addressed to allow for a widespread adoption of data ecosystems.
   With cryptographic and technical solutions, the ways toward stronger guarantees are two-fold and not straightforward. The discussed cryptographic approaches toward stronger guarantees, i.e., providing usage control and transparency via cryptographic means, implement the strongest protection among the discussed techniques but currently either allow only limited expressiveness or suffer from a severe performance penalty. Hence, we argue that they are currently not suited for general application in data ecosystems but should be selectively applied for the most sensitive datasets, where the named limitations and overheads are acceptable [40].
   Need for Hardware-based Security. Hardware solutions provide a trust anchor under the malicious-but-cautious attacker model. Moreover, they are less affected by performance penalties and eventually allow the same operations as standard hardware. However, TPMs, as currently envisaged by the IDS [4], cannot provide adequate protection of sensitive data due to the lack of memory encryption. Hence, Trusted Execution Environments (TEEs), despite current known side-channel attacks and related weaknesses, seem to be a better choice for strong guarantees regarding data sovereignty expanding to remote systems.
   With hardware-based TEEs being available for a few years, the question arises as to why today’s data ecosystems do not yet implement TEE-based security. One reason might be known weaknesses, which need to be addressed in future designs. However, these weaknesses do not seem to hinder deployment in further applications, as, for instance, Microsoft Azure offers commercial support for TEEs in its cloud service [93]. Hence, we argue that data ecosystems should consider employing TEEs as a measure to enforce data owners’ rights on remote infrastructure, which would fill the current gap toward implementing end-to-end data sovereignty.
   Future Work. These required research efforts motivate our call for future work in the domain of data ecosystems. Regarding the reliable enforcement of usage terms (I1), future work must address tailoring existing data protection schemes to data ecosystems. Here, a promising idea seems to be employing TEEs as a trust anchor on remote infrastructure. However, further research must clarify to which degree current limitations, such as performance penalties, affect application within data ecosystems. Subsequently, this can be integrated with transparency mechanisms (I2) where current work demonstrates the applicability of cryptographic mechanisms, e.g., in certificate transparency. To this end, further research must investigate how these concepts can support transparency in data ecosystems, while not creating new privacy issues. Finally, the combination of technically enforceable usage control with usage transparency might also be the first step toward sovereign integration of resource-constrained participants (I3).

7. Conclusion

Today’s data ecosystems facilitate an automated exchange of data in a standardized manner while simultaneously providing access to huge and heterogeneous data sources. Given that these data exchanges and corresponding higher-level applications across domains (e.g., in the automotive industry) also frequently deal with sensitive information, including business secrets and data subject to privacy regulations, data ecosystems must implement reliable measures to prevent any undesirable exposure of sensitive data. Currently, these measures are mostly based on organizational means, which, we argue, fail to provide sufficient guarantees in settings with malicious-but-cautious participants, i.e., participants who aim to remain unnoticed while still trying to infer all possible information from the data ecosystem and associated data exchanges.
   We raise the crucial issue that today’s data ecosystems lack appropriate guarantees w.r.t. confidential processing on systems operated by third parties, transparency of data access and usage, and the participation of parties with no infrastructure under their control (I1–I3). We have further surveyed corresponding technical solutions to these issues and highlight that they are available but have not yet been adopted in practice. To this end, we argue that the success of data ecosystems directly depends on their ability to address the present need for strong data sovereignty of participants. As such, especially modern technical solutions, such as TEEs, promise to provide data owners with strong guarantees of correct data handling, increasing their willingness to participate in available data ecosystems.

Acknowledgments

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC-2023 Internet of Production – 390621612.




References

[1] J. Pennekamp, R. Glebke, M. Henze, T. Meisen, C. Quix, R. Hai, L. Gleim, P. Niemietz, M. Rudack, S. Knape, A. Epple, D. Trauth, U. Vroomen, T. Bergs, C. Brecher, A. Buhrig-Polaczek, M. Jarke, K. Wehrle, Towards an Infrastructure Enabling the Internet of Production, in: 2019 IEEE International Conference on Industrial Cyber Physical Systems (ICPS), IEEE, Taipei, Taiwan, 2019, pp. 31–37. doi:10.1109/ICPHYS.2019.8780276.
[2] B. Otto, M. Jarke, Designing a multi-sided data platform: Findings from the International Data Spaces case, Electronic Markets 29 (2019) 561–580. doi:10.1007/s12525-019-00362-x.
[3] B. Otto, Interview with Reinhold Achatz on “Data Sovereignty and Data Ecosystems”, Business & Information Systems Engineering 61 (2019) 635–636. doi:10.1007/s12599-019-00609-z.
[4] B. Otto, S. Steinbuss, A. Teuscher, S. Lohmann et al., IDS Reference Architecture Model (Version 3.0), 2019.
[5] Gaia-X Technical Committee, Gaia-X Architecture Document, 2021.
[6] D. Froelicher, P. Egger, J. S. Sousa, J. L. Raisaro, Z. Huang, C. Mouchet, B. Ford, J.-P. Hubaux, UnLynx: A Decentralized System for Privacy-Conscious Data Sharing, Proceedings on Privacy Enhancing Technologies 2017 (2017) 232–250. doi:10.1515/popets-2017-0047.
[7] D. McCabe, A. Satariano, The Era of Borderless Data is Ending, New York Times (2022).
[8] L. Bader, J. Pennekamp, R. Matzutt, D. Hedderich, M. Kowalski, V. Lücken, K. Wehrle, Blockchain-based privacy preservation for supply chains supporting lightweight multi-hop information accountability, Information Processing & Management 58 (2021) 102529. doi:10.1016/j.ipm.2021.102529.
[9] H. Ma, R. Zhang, G. Yang, Z. Song, K. He, Y. Xiao, Efficient Fine-Grained Data Sharing Mechanism for Electronic Medical Record Systems with Mobile Devices, IEEE Transactions on Dependable and Secure Computing 17 (2020) 1026–1038. doi:10.1109/TDSC.2018.2844814.
[10] A. Appenzeller, S. Bartholomaus, R. Breitschwerdt, C. Claussen, S. Geisler, T. Hartz, P. Kachel, E. Krempel, S. Robert, S. R. Zeissig, Towards Distributed Healthcare Systems – Virtual Data Pooling Between Cancer Registries as Backbone of Care and Research, in: 2021 IEEE/ACS 18th International Conference on Computer Systems and Applications (AICCSA), IEEE, Tangier, Morocco, 2021, pp. 1–8. doi:10.1109/AICCSA53542.2021.9686918.
[11] Z. Du, C. Wu, T. Yoshinaga, K.-L. A. Yau, Y. Ji, J. Li, Federated Learning for Vehicular Internet of Things: Recent Advances and Open Issues, IEEE Open Journal of the Computer Society 1 (2020) 45–61. doi:10.1109/OJCS.2020.2992630.
[12] M. D. Wilkinson et al, The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data 3 (2016) 160018. doi:10.1038/sdata.2016.18.
[13] J. Gelhaar, B. Otto, Challenges in the Emergence of Data Ecosystems, in: Pacific Asia Conference on Information Systems (PACIS), Dubai, 2020.
[14] A. Braud, G. Fromentoux, B. Radier, O. Le Grand, The Road to European Digital Sovereignty with Gaia-X and IDSA, IEEE Network 35 (2021) 4–5. doi:10.1109/MNET.2021.9387709.
[15] M. I. S. Oliveira, B. F. Lóscio, What is a data ecosystem?, in: Proceedings of the 19th Annual International Conference on Digital Government Research: Governance in the Data Age, ACM, Delft The Netherlands, 2018, pp. 1–9. doi:10.1145/3209281.3209335.
[16] P. Hummel, M. Braun, M. Tretter, P. Dabrock, Data sovereignty: A review, Big Data & Society 8 (2021) 205395172098201. doi:10.1177/2053951720982012.
[17] M. Schanzenbach, Towards Self-sovereign, Decentralized Personal Data Sharing and Identity Management, Ph.D. thesis, 2020.
[18] V. Pedreira, D. Barros, P. Pinto, A Review of Attacks, Vulnerabilities, and Defenses in Industry 4.0 with New Challenges on Data Sovereignty Ahead, Sensors 21 (2021) 5189. doi:10.3390/s21155189.
[19] S. Couture, S. Toupin, What does the notion of “sovereignty” mean when referring to the digital?, New Media & Society 21 (2019) 2305–2322. doi:10.1177/1461444819865984.
[20] K. Irion, Government Cloud Computing and National Data Sovereignty: Government Cloud Computing and National Data Sovereignty, Policy & Internet 4 (2012) 40–71. doi:10.1002/poi3.10.
[21] S. R. Bader, M. Maleshkova, SOLIOT—Decentralized Data Control and Interactions for IoT, Future Internet 12 (2020) 105. doi:10.3390/fi12060105.
[22] Data Sharing Coalition, https://datasharingcoalition.eu/about-the-data-sharing-coalition/, 2022. Accessed 2022-08-09.
[23] IHAN, https://ihan.fi/, 2022. Accessed 2022-08-09.
[24] F. Cirillo, G. Solmaz, E. L. Berz, M. Bauer, B. Cheng, E. Kovacs, A Standard-Based Open Source IoT Platform: FIWARE, IEEE Internet of Things Magazine 2 (2019) 12–18. doi:10.1109/IOTM.0001.1800022.




[25] CEF Digital, https://ec.europa.eu/cefdigital/wiki/            Kingdom, 2019, pp. 45–56. doi:10.1145/3338469.
     display/CEFDIGITAL/CEF+Digital+Home, 2022.                    3358944.
     Accessed 2022-08-09.                                     [39] S. Malik, N. Gupta, V. Dedeoglu, S. S. Kanhere, R. Jur-
[26] Big Data Value Association, https://www.bdva.eu/,             dak, TradeChain: Decoupling Traceability and Iden-
     2022. Accessed 2022-08-09.                                    tity in Blockchain enabled Supply Chains (2021).
[27] ETSI GR CIM 007 V1.1.1: Security and Privacy, Tech-           doi:10.48550/ARXIV.2105.11217.
     nical Report, France, 2022.                              [40] D. Froelicher, J. R. Troncoso-Pastoriza, J. L. Rais-
[28] O. Voß, Catena-X: Datenstandards für die Auto-                aro, M. A. Cuendet, J. S. Sousa, H. Cho, B. Berger,
     branche, Tagesspiegel Background Digitalisierung              J. Fellay, J.-P. Hubaux, Truly Privacy-Preserving
     & KI (2021).                                                  Federated Analytics for Precision Medicine with
[29] Á. Alonso, A. Pozo, J. Cantera, F. de la Vega, J. Hi-         Multiparty Homomorphic Encryption, Preprint,
     erro, Industrial Data Space Architecture Imple-               Bioinformatics, 2021. doi:10.1101/2021.02.24.
     mentation Using FIWARE, Sensors 18 (2018) 2226.               432489.
     doi:10.3390/s18072226.                                   [41] J. Park, R. Sandhu, The UCON ABC usage control
[30] N. Menz, A. Resetko, B. Otto, Framework for the IDS           model, ACM Transactions on Information and
     Certification Scheme 2.0, Technical Report, IDSA,             System Security 7 (2004) 128–174. doi:10.1145/
     2019. doi:10.5281/ZENODO.5244858.                             984334.984339.
[31] CEN European Committee for Standardization, In-          [42] M. Hilty, D. Basin, A. Pretschner, On Obligations,
     formation technology - Security techniques - In-              in: D. Hutchison, T. Kanade, J. Kittler, J. M. Klein-
     formation security management systems - Require-              berg, F. Mattern, J. C. Mitchell, M. Naor, O. Nier-
     ments (ISO/IEC 27001:2013 including Cor 1:2014                strasz, C. Pandu Rangan, B. Steffen, M. Sudan,
     and Cor 2:2015), 2017.                                        D. Terzopoulos, D. Tygar, M. Y. Vardi, G. Weikum,
[32] A. Pretschner, M. Hilty, F. Schütz, C. Schaefer,              S. d. C. di Vimercati, P. Syverson, D. Gollmann
     T. Walter, Usage Control Enforcement: Present                 (Eds.), Computer Security – ESORICS 2005, volume
     and Future, IEEE Security & Privacy Magazine 6                3679, Springer Berlin Heidelberg, Berlin, Heidel-
     (2008) 44–53. doi:10.1109/MSP.2008.101.                       berg, 2005, pp. 98–117. doi:10.1007/11555827_
[33] R. Ianella, Open digital rights language (ODRL),              7.
     Open Content Licensing: Cultivating the Creative         [43] M. Hilty, A. Pretschner, D. Basin, C. Schaefer,
     Commons (2007).                                               T. Walter, A Policy Language for Distributed Us-
[34] A. Duisberg, Legal Aspects of IDS: Data Sovereignty           age Control, in: D. Hutchison, T. Kanade, J. Kit-
     - What Does It Imply?, in: Designing Data Spaces,             tler, J. M. Kleinberg, F. Mattern, J. C. Mitchell,
     Springer, 2022.                                               M. Naor, O. Nierstrasz, C. Pandu Rangan, B. Steffen,
[35] M. D. Ryan, Enhanced Certificate Transparency                 M. Sudan, D. Terzopoulos, D. Tygar, M. Y. Vardi,
     and End-to-End Encrypted Mail, in: Proceedings                G. Weikum, J. Biskup, J. López (Eds.), Computer
     2014 Network and Distributed System Security                  Security – ESORICS 2007, volume 4734, Springer
     Symposium, Internet Society, San Diego, CA, 2014.             Berlin Heidelberg, Berlin, Heidelberg, 2007, pp. 531–
     doi:10.14722/ndss.2014.23379.                                 546. doi:10.1007/978-3-540-74835-9_35.
[36] L. Kacha, A. Zitouni, An Overview on Data Security       [44] F. Kelbert, A. Pretschner, Data usage control en-
     in Cloud Computing, volume 661, 2018,                         forcement in distributed systems, in: Proceed-
     pp. 250–261. doi:10.1007/                                     ings of the Third ACM Conference on Data and
     978-3-319-67618-0_23. arXiv:1812.09053.                       Application Security and Privacy - CODASPY ’13,
[37] H. Lei, Y. Yan, Z. Bao, Q. Wang, Y. Zhang, W. Shi,            ACM Press, San Antonio, Texas, USA, 2013, p. 71.
     SDSBT: A Secure Multi-party Data Sharing Plat-                doi:10.1145/2435349.2435358.
     form Based on Blockchain and TEE, in: J. Cheng,          [45] F. Kelbert, A. Pretschner, A Fully Decentralized
     X. Tang, X. Liu (Eds.), Cyberspace Safety and Se-             Data Usage Control Enforcement Infrastructure, in:
     curity, volume 12653, Springer International Pub-             T. Malkin, V. Kolesnikov, A. B. Lewko, M. Polychron-
     lishing, Cham, 2021, pp. 184–196. doi:10.1007/                akis (Eds.), Applied Cryptography and Network
     978-3-030-73671-2_17.                                         Security, volume 9092, Springer International Pub-
[38] F. Boemer, A. Costache, R. Cammarota, C. Wierzyn-             lishing, Cham, 2015, pp. 409–430. doi:10.1007/
     ski, nGraph-HE2: A High-Throughput Framework                  978-3-319-28166-7_20.
     for Neural Network Inference on Encrypted Data,          [46] I. Akaichi, S. Kirrane, Usage Control Specifi-
     in: Proceedings of the 7th ACM Workshop on En-                cation, Enforcement, and Robustness: A Survey,
     crypted Computing & Applied Homomorphic Cryp-                 arXiv:2203.04800 [cs] (2022). arXiv:2203.04800.
     tography - WAHC’19, ACM Press, London, United            [47] S. Steinbuss, et al., Usage Control in the Interna-
                                                                   tional Data Spaces, 2021.




[48] A. Hosseinzadeh, A. Eitel, C. Jung, A Systematic                preserving transparency logging, in: Proceed-
     Approach toward Extracting Technically Enforce-                 ings of the 12th ACM Workshop on Workshop
     able Policies from Data Usage Control Require-                  on Privacy in the Electronic Society, ACM, Berlin,
     ments, in: Proceedings of the 6th International                 Germany, 2013, pp. 83–94. doi:10.1145/2517840.
     Conference on Information Systems Security and                  2517847.
     Privacy, SCITEPRESS - Science and Technology               [60] J. Zrenner, F. O. Möller, C. Jung, A. Eitel, B. Otto,
     Publications, Valletta, Malta, 2020, pp. 397–405.               Usage control architecture options for data sover-
     doi:10.5220/0008936003970405.                                   eignty in business ecosystems, Journal of Enter-
[49] M. Schneider, R. J. Masti, S. Shinde, S. Capkun,                prise Information Management 32 (2019) 477–495.
     R. Perez, SoK: Hardware-supported Trusted Exe-                  doi:10.1108/JEIM-03-2018-0058.
     cution Environments, 2022. arXiv:2205.12742.               [61] M. Henze, M. Grossfengels, M. Koprowski,
[50] A. Nilsson, P. N. Bideh, J. Brorsson, A Survey of               K. Wehrle, Towards Data Handling Requirements-
     Published Attacks on Intel SGX, arXiv:2006.13598                Aware Cloud Computing, in: 2013 IEEE 5th Interna-
     [cs] (2020). arXiv:2006.13598.                                  tional Conference on Cloud Computing Technology
[51] M.-W. Shih, S. Lee, T. Kim, M. Peinado, T-SGX:                  and Science, IEEE, Bristol, United Kingdom, 2013,
     Eradicating Controlled-Channel Attacks Against                  pp. 266–269. doi:10.1109/CloudCom.2013.145.
     Enclave Programs, in: Proceedings 2017 Network             [62] S. Geisler, M.-E. Vidal, C. Cappiello, B. F. Lóscio,
     and Distributed System Security Symposium, Inter-               A. Gal, M. Jarke, M. Lenzerini, P. Missier, B. Otto,
     net Society, San Diego, CA, 2017. doi:10.14722/                 E. Paja, B. Pernici, J. Rehof, Knowledge-Driven Data
     ndss.2017.23193.                                                Ecosystems Toward Data Transparency, Journal
[52] S. Sasy, S. Gorbunov, C. W. Fletcher, ZeroTrace:                of Data and Information Quality 14 (2022) 1–12.
     Oblivious Memory Primitives from Intel SGX, in:                 doi:10.1145/3467022.
     Proceedings 2018 Network and Distributed System            [63] A. Munoz-Arcentales, S. López-Pernas, A. Pozo,
     Security Symposium, Internet Society, San Diego,                Á. Alonso, J. Salvachúa, G. Huecas, An Architec-
     CA, 2018. doi:10.14722/ndss.2018.23239.                         ture for Providing Data Usage and Access Control in
[53] R. Gennaro, C. Gentry, B. Parno, Non-interactive                Data Sharing Ecosystems, Procedia Computer Sci-
     Verifiable Computing: Outsourcing Computation                   ence 160 (2019) 590–597. doi:10.1016/j.procs.
     to Untrusted Workers, in: D. Hutchison, T. Kanade,              2019.11.042.
     J. Kittler, J. M. Kleinberg, F. Mattern, J. C. Mitchell,   [64] M. Huber, S. Wessel, G. Brost, N. Menz, Building
     M. Naor, O. Nierstrasz, C. Pandu Rangan, B. Steffen,            Trust in Data Spaces, in: Designing Data Spaces,
     M. Sudan, D. Terzopoulos, D. Tygar, M. Y. Vardi,                Springer, 2022.
     G. Weikum, T. Rabin (Eds.), Advances in Cryptol-           [65] S. Bader, J. Pullmann, C. Mader, S. Tramp, C. Quix,
     ogy – CRYPTO 2010, volume 6223, Springer Berlin                 A. W. Müller, H. Akyürek, M. Böckmann, B. T. Im-
     Heidelberg, Berlin, Heidelberg, 2010, pp. 465–482.              busch, J. Lipp, S. Geisler, C. Lange, The Interna-
     doi:10.1007/978-3-642-14623-7_25.                               tional Data Spaces Information Model – An Ontol-
[54] B. Parno, J. Howell, C. Gentry, M. Raykova, Pinoc-              ogy for Sovereign Exchange of Digital Content, in:
     chio: Nearly Practical Verifiable Computation, in:              J. Z. Pan, V. Tamma, C. d’Amato, K. Janowicz, B. Fu,
     2013 IEEE Symposium on Security and Privacy,                    A. Polleres, O. Seneviratne, L. Kagal (Eds.), The Se-
     IEEE, Berkeley, CA, 2013, pp. 238–252. doi:10.                  mantic Web – ISWC 2020, volume 12507, Springer
     1109/SP.2013.47.                                                International Publishing, Cham, 2020, pp. 176–192.
[55] I. Kunz, V. Casola, A. Schneider, C. Banse, J. Schütte,         doi:10.1007/978-3-030-62466-8_12.
     Towards Tracking Data Flows in Cloud Architec-             [66] C. Ducuing, Data as infrastructure? A study of data
     tures, 2020 IEEE 13th International Conference on               sharing legal regimes, Competition and Regulation
     Cloud Computing (CLOUD) (2020) 445–452.                         in Network Industries 21 (2020) 124–142. doi:10.
[56] M. Backes, N. Grimm, A. Kate, Data Lineage in                   1177/1783591719895390.
     Malicious Environments, IEEE Transactions on               [67] D. Wu, S. G. Verhulst, A. Pentland, T. Avila, K. Finch,
     Dependable and Secure Computing 13 (2016) 178–                  A. Gupta, How data governance technologies
     191. doi:10.1109/TDSC.2015.2399296.                             can democratize data sharing for community well-
[57] S. Nakamoto, Bitcoin: A peer-to-peer electronic                 being, Data & Policy 3 (2021) e14. doi:10.1017/
     cash system, Decentralized Business Review (2008)               dap.2021.13.
     21260.                                                     [68] L. Helminger, C. Rechberger, Multi-party com-
[58] V. Buterin, et al., A next-generation smart contract            putation in the GDPR, in: Privacy Symposium
     and decentralized application platform, white paper             2022 - Data Protection Law International Conver-
     3 (2014) 2–1.                                                   gence and Compliance with Innovative Technolo-
[59] T. Pulls, R. Peeters, K. Wouters, Distributed privacy-          gies (DPLICIT), 2022.




[69] N. L. Weisweiler, R. Bertelmann, P. Braesicke,                Education Material, in: 2020 International Con-
     T. Bronger, C. Curdt, F. O. Glöckner, S. Rank, O. Ste-        ference on Information Networking (ICOIN), IEEE,
     gle, Y. Sure-Vetter, N. Villacorta, Helmholtz Open            Barcelona, Spain, 2020, pp. 529–534. doi:10.1109/
     Science Briefing: Helmholtz in der Nationalen                 ICOIN48656.2020.9016478.
     Forschungsdateninfrastruktur (NFDI): Report des          [78] C. Huang, D. Liu, J. Ni, R. Lu, X. Shen, Achieving
     Helmholtz Open Science Forums, Technical Re-                  Accountable and Efficient Data Sharing in Indus-
     port, Helmholtz Open Science Office, 2021. doi:10.            trial Internet of Things, IEEE Transactions on In-
     48440/OS.HELMHOLTZ.030.                                       dustrial Informatics 17 (2021) 1416–1427. doi:10.
[70] J. Scheibner, J. L. Raisaro, J. R. Troncoso-Pastoriza,        1109/TII.2020.2982942.
     M. Ienca, J. Fellay, E. Vayena, J.-P. Hubaux, Revo-      [79] J. Shen, T. Zhou, D. He, Y. Zhang, X. Sun, Y. Xiang,
     lutionizing Medical Data Sharing Using Advanced               Block Design-Based Key Agreement for Group Data
     Privacy-Enhancing Technologies: Technical, Legal,             Sharing in Cloud Computing, IEEE Transactions
     and Ethical Synthesis, Journal of Medical Internet            on Dependable and Secure Computing 16 (2019)
     Research 23 (2021) e25120. doi:10.2196/25120.                 996–1010. doi:10.1109/TDSC.2017.2725953.
[71] R. Matzutt, D. Müllmann, E.-M. Zeissig, C. Horst,        [80] A. Fromm, V. Stepa, HDFT++ Hybrid Data Flow
     K. Kasugai, S. Lidynia, S. Wieninger, J. H. Ziegel-           Tracking for SaaS Cloud Services, in: 2017 IEEE 4th
     dorf, G. Gudergan, I. S. gen. Döhmann, K. Wehrle,             International Conference on Cyber Security and
     M. Ziefle, myneData: Towards a Trusted and User-              Cloud Computing (CSCloud), IEEE, New York, NY,
     controlled Ecosystem for Sharing Personal Data                USA, 2017, pp. 333–338. doi:10.1109/CSCloud.
     (2017). doi:10.18420/IN2017_109.                              2017.9.
[72] H. Baars, A. Tank, P. Weber, H.-G. Kemper, H. Lasi,      [81] Z. Qin, H. Xiong, S. Wu, J. Batamuliza, A Survey
     B. Pedell, Cooperative Approaches to Data Shar-               of Proxy Re-Encryption for Secure Data Sharing in
     ing and Analysis for Industrial Internet of Things            Cloud Computing, IEEE Transactions on Services
     Ecosystems, Applied Sciences 11 (2021) 7547.                  Computing (2016) 1–1. doi:10.1109/TSC.2016.
     doi:10.3390/app11167547.                                      2551238.
[73] A. L. Marra, F. Martinelli, P. Mori, A. Saracino,        [82] T. Pasquier, J. Bacon, J. Singh, D. Eyers, Data-
     A Distributed Usage Control Framework for In-                 Centric Access Control for Cloud Computing, in:
     dustrial Internet of Things, in: C. Alcaraz (Ed.),            Proceedings of the 21st ACM on Symposium on
     Security and Privacy Trends in the Industrial                 Access Control Models and Technologies, ACM,
     Internet of Things, Springer International Pub-               Shanghai, China, 2016, pp. 81–88. doi:10.1145/
     lishing, Cham, 2019, pp. 115–135. doi:10.1007/                2914642.2914662.
     978-3-030-12330-7_6.                                     [83] A. Bessani, M. Correia, B. Quaresma, F. André,
[74] X. Lu, X. Cheng, A Secure and Lightweight Data                P. Sousa, DepSky: Dependable and Secure Stor-
     Sharing Scheme for Internet of Medical Things,                age in a Cloud-of-Clouds, ACM Transactions on
     IEEE Access 8 (2020) 5022–5030. doi:10.1109/                  Storage 9 (2013) 1–33. doi:10.1145/2535929.
     ACCESS.2019.2962729.                                     [84] S. Sundareswaran, A. Squicciarini, D. Lin, Ensur-
[75] J. Pennekamp, E. Buchholz, Y. Lockner,                        ing Distributed Accountability for Data Sharing in
     M. Dahlmanns, T. Xi, M. Fey, C. Brecher,                      the Cloud, IEEE Transactions on Dependable and
     C. Hopmann, K. Wehrle, Privacy-Preserving                     Secure Computing 9 (2012) 556–568. doi:10.1109/
     Production Process Parameter Exchange, in:                    TDSC.2012.26.
     Annual Computer Security Applications Con-               [85] A. Rafique, D. Van Landuyt, E. Heydari Beni, B. La-
     ference, ACM, Austin, USA, 2020, pp. 510–525.                 gaisse, W. Joosen, CryptDICE: Distributed data
     doi:10.1145/3427228.3427248.                                  protection system for secure cloud data storage and
[76] S. Mangel, L. Gleim, J. Pennekamp, K. Wehrle,                 computation, Information Systems 96 (2021) 101671.
     S. Decker, Data Reliability and Trustworthiness               doi:10.1016/j.is.2020.101671.
     Through Digital Transmission Contracts, in: The          [86] K. Edemacu, B. Jang, J. W. Kim, CESCR: CP-ABE
     Semantic Web, volume 12731, Springer Interna-                 for efficient and secure sharing of data in collab-
     tional Publishing, Cham, 2021, pp. 265–283. doi:10.           orative ehealth with revocation and no dummy
     1007/978-3-030-77385-4_16.                                    attribute, PLOS ONE 16 (2021) e0250992. doi:10.
[77] R. Matzutt, J. Pennekamp, K. Wehrle, A Secure and             1371/journal.pone.0250992.
     Practical Decentralized Ecosystem for Shareable




[87] B. Waters, Ciphertext-Policy Attribute-Based En-                 Boston, MA, 2010, pp. 133–146. doi:10.1007/
     cryption: An Expressive, Efficient, and Provably                 978-1-4419-6794-7_11.
     Secure Realization, in: D. Hutchison, T. Kanade,            [90] Q. H. Cao, M. Giyyarpuram, R. Farahbakhsh,
     J. Kittler, J. M. Kleinberg, F. Mattern, J. C. Mitchell,         N. Crespi, Policy-based usage control for a trust-
     M. Naor, O. Nierstrasz, C. Pandu Rangan, B. Stef-                worthy data sharing platform in smart cities, Future
     fen, M. Sudan, D. Terzopoulos, D. Tygar, M. Y.                   Generation Computer Systems 107 (2020) 998–1010.
     Vardi, G. Weikum, D. Catalano, N. Fazio, R. Gen-                 doi:10.1016/j.future.2017.05.039.
     naro, A. Nicolosi (Eds.), Public Key Cryptogra-             [91] F. Cirillo, B. Cheng, R. Porcellana, M. Russo, G. Sol-
     phy – PKC 2011, volume 6571, Springer Berlin                     maz, H. Sakamoto, S. P. Romano, IntentKeeper:
     Heidelberg, Berlin, Heidelberg, 2011, pp. 53–70.                 Intent-oriented Data Usage Control for Federated
     doi:10.1007/978-3-642-19379-8_4.                                 Data Analytics, in: 2020 IEEE 45th Conference
[88] P. Bonatti, S. Kirrane, A. Polleres, R. Wenning,                 on Local Computer Networks (LCN), IEEE, Sydney,
     Transparent Personal Data Processing: The Road                   NSW, Australia, 2020, pp. 204–215. doi:10.1109/
     Ahead, in: S. Tonetta, E. Schoitsch, F. Bitsch                   LCN48667.2020.9314823.
     (Eds.), Computer Safety, Reliability, and Secu-             [92] F. Kelbert, A. Pretschner, Data Usage Control for
     rity, volume 10489, Springer International Pub-                  Distributed Systems, ACM Transactions on Pri-
     lishing, Cham, 2017, pp. 337–349. doi:10.1007/                   vacy and Security 21 (2018) 1–32. doi:10.1145/
     978-3-319-66284-8_28.                                            3183342.
[89] M. Colombo, A. Lazouski, F. Martinelli, P. Mori,            [93] F. Y. Rashid, The rise of confidential computing:
     A Proposal on Enhancing XACML with Con-                          Big tech companies are adopting a new security
     tinuous Usage Control Features, in: F. De-                       model to protect data while it’s in use-[news], IEEE
     sprez, V. Getov, T. Priol, R. Yahyapour (Eds.),                  Spectrum 57 (2020) 8–9.
     Grids, P2P and Services Computing, Springer US,



