On the need for strong sovereignty in data ecosystems

Johannes Lohmöller1,*, Jan Pennekamp1, Roman Matzutt1 and Klaus Wehrle1
1 RWTH Aachen University, Aachen, Germany

Proc. of the First International Workshop on Data Ecosystems (DEco’22), September 5, 2022, Sydney, Australia
* Corresponding author
Email: lohmoeller@comsys.rwth-aachen.de (J. Lohmöller); pennekamp@comsys.rwth-aachen.de (J. Pennekamp); matzutt@comsys.rwth-aachen.de (R. Matzutt); wehrle@comsys.rwth-aachen.de (K. Wehrle)
ORCID: 0000-0003-2101-5562 (J. Lohmöller); 0000-0003-0398-6904 (J. Pennekamp); 0000-0002-4263-5317 (R. Matzutt); 0000-0001-7252-4186 (K. Wehrle)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Abstract

Data ecosystems are the foundation of emerging data-driven business models as they (i) enable an automated exchange between their participants and (ii) provide them with access to huge and heterogeneous data sources. However, the corresponding benefits come with unforeseen risks, as sensitive information is potentially exposed as well. Consequently, data security is of utmost importance and, thus, a central requirement for the successful implementation of these ecosystems. Current initiatives, such as IDS and GAIA-X, hence foster sovereign participation via a federated infrastructure where participants retain local control. However, these designs place significant trust in remote infrastructure by mostly implementing organizational security measures such as certification processes prior to admission of a participant. At the same time, due to the sensitive nature of involved data, participants are incentivized to bypass security measures to maximize their own benefit: In practice, this issue significantly weakens sovereignty guarantees. In this paper, we hence claim that data ecosystems must be extended with technical means to reestablish such guarantees. To underpin our position, we analyze promising building blocks and identify three core research directions toward stronger data sovereignty, namely trusted remote policy enforcement, verifiable data tracking, and integration of resource-constrained participants. We conclude that these directions are critical to securely implement data ecosystems in data-sensitive contexts.

1. Introduction

Data-driven business models are an invaluable pillar for modern industries, and their importance will increase with growing demands requiring more complex and globally distributed operation, as well as sophisticated collaborations to improve the status quo [1]. Data ecosystems provide the foundation for such data-driven business models as they center around automating data exchanges and value creation based on huge and heterogeneous data sources from various stakeholders [2]. Added value can be created by, for instance, improving algorithms underlying existing analytics or extracting new insights from previously recorded data [3]. Crucially, this process involves the integration of distributed data sources owned by different stakeholders. Here, data ecosystem initiatives such as International Data Spaces (IDS) [4] and GAIA-X [5] aim to provide a trustworthy environment for the discovery, sharing, and processing of available data, irrespective of specific domains.

However, current efforts to establish the necessary trust among stakeholders heavily rely on organizational agreements and processes [6, 4]. For instance, the IDS certification process asserts that participants use audited software and develop defense-in-depth strategies for protection [4]. Participants receive no additional security guarantees beyond this ahead-of-time certification and have no means to verify that other participants handle their data as intended (and required). Here, the lack of stronger guarantees effectively ends the sovereignty of participants at the moment of sharing.

In this paper, we argue that data ecosystems need to provide their participants with strong and continual guarantees about the security of their provided data to maintain each participant’s data sovereignty. Moreover, driven by privacy and security concerns, recent regulatory efforts set strict rules on how data may flow across organizational borders, raising the need for fine-grained control [7]. To this end, data ecosystems are only sustainable if stakeholders are willing to participate by providing and consuming data actively. However, we argue that data-consuming parties are currently incentivized to ignore previously agreed terms for data usage. Such behavior hurts data owners as they are not adequately compensated for the value of the data they provide, and it calls into question whether data ecosystems are adequate to exchange data subject to privacy regulation. Consequently, data owners might restrict their data-sharing efforts or leave the data ecosystem entirely. Hence, data ecosystems require solid technical measures, such as cryptographically enforceable guarantees and verifiable continual security monitoring, to facilitate the establishment of trust between remote and potentially mutually unknown participants. In this paper, we provide more background on the current state of data ecosystems, identify shortcomings of ongoing data ecosystem initiatives, and derive and discuss future research directions steered toward improving the sovereignty and trust of participants in data ecosystems.

2. A Primer on Current Data Ecosystem Initiatives and their Architectures

To ensure a common understanding of the trust issues with today’s data ecosystems, we first briefly introduce data ecosystems, the notion of data sovereignty, and common participants in this context. Moreover, we present a short overview of data ecosystem initiatives focusing on their currently implemented security measures.

Ecosystem Goals. The need to share data with collaborators within specific sectors has been recognized in a variety of domains, including supply chains [8], public health [9, 10], and mobility [11]. Here, on the one hand, data ecosystems aim to provide multi-sided platforms [2] that facilitate an automated data exchange following the FAIR principle [12], i.e., the offered data needs to be findable, accessible, interoperable, and reusable. On the other hand, today’s data ecosystems aim to equip data owners with fine-grained control over their data, including with whom it will be shared and under what terms. This fine-grained control is the foundation of data sovereignty [3]. Achieving these goals requires solving issues w.r.t. organization [2], semantics and data quality [13], and interfacing [14], all of which are currently under active research.

Definitions. So far, we have seen data ecosystems only as a means for exchanging data as required in emerging data markets and other use cases [3]. In fact, data ecosystems emerged without a standard definition in mind. Oliveira and Lóscio [15] address this gap by reviewing and merging concurring data ecosystem definitions; as a result, they define a data ecosystem as a combination of independently operated networks that produce and provide data, but also other assets like software or services. Furthermore, the authors highlight that such data ecosystems are self-regulated and driven by collaboration and competition between actors [15]. Additionally, we emphasize that data ecosystems form platforms that have to define common interfaces and rules to enable collaboration across independent networks. Accordingly, we refer to data ecosystem participants as networks that implement the interfaces and accept the rules defined by a given ecosystem.

Similarly, the notion of data sovereignty, i.e., one of the critical concepts of data ecosystems, currently lacks a clear and common definition [16]. If used in the context of data ecosystems, researchers generally agree that data sovereignty relates to control and ownership of data items, together with specific claims and obligations made by involved parties [17, 18, 19, 20]. Hence, within this paper, we will focus on this aspect of data sovereignty. To set this into a broader context, the review by Hummel et al. [16] describes data sovereignty as covering multiple contexts and values ranging from legislation to clinical practice and control and power to recognition, respectively.

Initiatives. Superseding a previously rather tedious bilateral exchange, the goal of initiatives like the International Data Spaces (IDS) [3, 2, 21], GAIA-X [14, 5], Data Sharing Coalition [22], IHAN [23], FIWARE [24], CEF [25], or BDVA [26] is to establish a universal platform to regulate transactions regarding that exchange. The EU or federal offices fund such initiatives, facilitating a top-down approach toward establishing a common data platform. Some initiatives rather bundle forces toward the adoption of data ecosystems in general (Data Sharing Coalition, CEF, BDVA), while IHAN, for instance, is in an early stage, without publicly released technical documentation so far. Out of the named initiatives, IDS [4], GAIA-X [5], and FIWARE [27] have released technical documentation that permits a deeper analysis with regard to implemented data security and trust measures. Specifically, IDS and GAIA-X both work toward a standard interface to locate and access data and provide an organizational context, including identification, admission, and certification of participants [14]. Thus, in the remainder of this paper, we primarily study these general-purpose initiatives. While IDS aims to provide a framework under which data spaces can be built quickly, e.g., targeting a specific domain with coherent participants, GAIA-X plans to establish a single central cross-domain platform [14]. Moving toward domain-specific concepts, initial projects such as CATENA-X [28], an initiative inside the automotive domain, are picking up their ideas, while established platforms such as FIWARE [24], a framework to connect smart devices, start to provide compatible interfaces [29].

Architecture. Despite their slightly different scopes, IDS and GAIA-X share a similar architecture, so we analyze both initiatives together as data ecosystem implementations. Organizing the data exchange, data ecosystems commonly assign different roles to participants. Figure 1 shows the overall scenario we are considering together with the main participants. A single data exchange can be considered bilateral, such that we can suppose the following roles [4]: First, a data owner legally owns the data to be shared and is interested in enforcing their rights on the data if it is shared. Second, a data provider takes over the technical part of offering a dataset to be exchanged on behalf of the data owner.

Figure 1: Participating entities in data ecosystems. Data flows from left to right, with data provider and data consumer implementing a common ecosystem interface. The data ecosystem’s operator also handles orthogonal tasks, including admission and discovery of participants and data. (Entities shown: operator association, data owner, data provider, data consumer, data user.)

While a single entity certainly can take over both roles, i.e., host the infrastructure to provide their data, in certain situations, the providing entity does not formally own the data. For instance, this situation is the case for electronic health records owned by patients, who typically do not provide the infrastructure on their own. On the receiving side, a data consumer requests and receives the data from the provider and passes it to a data user, who processes the exchanged data, e.g., by visualizing it. Again, the consumer might also fulfill the data user role if both processes are co-located. Notably, GAIA-X does not separate the data consumer and data user [5], but we continue using both terms to separate the logical roles, as described above.

Due to the distribution of providers and consumers, data ecosystems operate as a federation of independent deployments that jointly form a decentralized system. Thereby, data owners can keep their sensitive datasets under their control until they actively decide to share them with selected participants. To this end, data ecosystems enable data sovereignty up to the point where a data sharing decision has been made and data is actually transferred to the data consumer.

Trust. To not let sovereignty end at the point of data exchange, data ecosystems currently require a certification of participants. Hence, they ensure that all entities handling data adhere to a common baseline w.r.t. data protection. Certification includes, but is not limited to, defense-in-depth strategies and security event monitoring systems [30, 31]. Specifically, the IDS requires prior certification steps and attests successful certification via a public key infrastructure, establishing a trusted identity layer [4, 14]. Contrarily, GAIA-X does not target a specific certification but requires participants to provide a standardized self-description with claims that are checked before a participant’s admission [14]. In both cases, the ecosystem equips participants with the means to identify each other and establishes a common ground for mutual trust decisions.

Based on the ecosystem-wide identity layer, data ecosystems can provide fine-grained access control to data and let data owners limit the target audience they are willing to share their data with. However, access control alone is insufficient, as data sovereignty would end once the flow of data between participants took place after access has been legitimately granted. Usage control [32] could possibly fill this gap by granting specific rights on data and enforcing certain duties to be adhered to when processing data. Such a policy could be, for instance, the permission to use a dataset for one week, with the obligation to delete it after that time.

To implement usage control, IDS utilizes and extends ODRL [33], a policy language for digital rights management that allows fine-grained modeling of usage terms [4]. For enforcement, the data owner has to trust that the consuming party abides by the negotiated terms. To this end, they can only rely on the certification of the consumer required to join as a participant, but can neither monitor the process themselves, nor receive a credible proof that usage terms were enforced. However, since the negotiated contracts might also involve monetary compensation, the consuming party has incentives to disobey negotiated terms, e.g., using data more often than requested, sourcing it for other purposes, or sharing it with other systems or third parties.

Legal Context. Providing an environment for data exchange, the IDS builds upon surrounding legal contracts to equip participants with the means to establish credibility with each other [34]. Specifically, such contracts regulate the terms of usage and the overall setting, e.g., regarding a monetary compensation [4] or a penalty for breach of contract. Contracts can be bilateral or multilateral but will typically not cover the entirety of data space participants [4], thereby limiting spontaneous data access. Within negotiated legal contracts, data ecosystems such as IDS then plan to (automatically) negotiate a refined technical contract. This refined contract translates terms into machine-readable policies that grant specific permissions on the exchanged dataset and potential obligations [4].

3. Data Ecosystems Need Technical Security Guarantees

Having outlined the fundamental ideas of sovereign data exchange and the technical and organizational framework data ecosystems provide, we now critically review the design decisions of security mechanisms implemented in state-of-the-art data ecosystems. To this end, we analyze the available technical documentation and reference architecture for IDS and GAIA-X. Primarily, we identify a lack of technical means to facilitate strong security guarantees and establish strong trust between participants. Namely, the current ecosystem initiatives can only partly address the security and trust requirements with their frail certification-based approaches.

Attacker Model. Guiding our position that data ecosystems require stronger data protection mechanisms, we apply the notion of a malicious-but-cautious attacker [35]. Specifically, the malicious-but-cautious attacker can misbehave in all possible ways but aims not to leave any verifiable evidence of its misbehavior [35]. Compared to an honest-but-curious (or semi-honest) attacker, this definition explicitly includes local deviations from protocols unless they are verifiable by external parties. With data ecosystems exchanging data within established legal contracts, we argue that participants aim to avoid being sued for their misbehavior and hence have incentives not to leave any evidence. To this end, a malicious-but-cautious attacker reflects the typical power and incentives of data ecosystem participants who source, process, and utilize somebody else’s data.

Data Security. Current notions of data security include security at-rest, in-transit, and in-use [36]. At-rest security and in-transit security are considered solved problems in the context of data ecosystems as they can use widely available building blocks such as storage encryption and transport layer security (TLS), respectively [4]. Contrarily, in-use data security targets data at the moment of processing, e.g., when the decrypted data is loaded into memory, and is hence more difficult to ensure and implement. Technical or cryptographic measures to protect data by providing in-use security include, for instance, hardware-assisted security or homomorphic encryption [37, 38]. However, despite these measures, today’s data ecosystems build their guarantees regarding data in-use security upon remote participants’ honesty to enforce certain rights on shared data. Unfortunately, with monetary compensation handled as part of data exchange and transfers entrusted for a specific purpose, incentives to evade enforcement clearly exist.

Hence, we argue that the following questions are critical to the adoption of data ecosystem initiatives in data-sensitive domains:

• I1: How can data owners trust remote infrastructure to enforce their granted rights once data has been shared?
• I2: How can data owners track their data in a trusted way if processed by remote facilities?
• I3: How can participants with few resources maintain sovereignty without requiring them to host their own infrastructure?

In the following, we elaborate on these high-level design questions regarding strong data sovereignty when implemented in practice.

I1: Trust in Remote Rights Enforcement. A first cornerstone of end-to-end data sovereignty is the guaranteed enforcement of digital rights on remote systems, i.e., usage control. However, suppose a privileged user on the consuming side, e.g., a system administrator, copies exchanged data without leaving traces in audit-relevant logging systems. This unintended behavior renders usage control enforcement ineffective. While we anticipate that such an action would violate negotiated terms, the data owner depends on fortunate coincidence to notice malicious behavior retrospectively. Consequently, we argue that data owners will refrain from ever sharing sensitive data. With such datasets covering manufacturing plans [8], the identity of suppliers [39], or privacy-sensitive health records [40], the lack of enforcement guarantees severely limits the kind of data exchangeable. Hence, such scenarios require stronger data sovereignty guarantees than the currently envisioned (weak) organizational measures.

Partly addressing this issue, IDS can utilize trusted platform modules (TPMs) as a trust anchor on remote systems [4]. However, merely providing verification of the running software, but essentially lacking memory encryption, TPMs still contribute little to an effective protection against malicious-but-cautious attackers.

I2: Trusted Data Usage Reporting. Besides effective usage control, usage transparency is a second cornerstone to strong data sovereignty and essential to increase the participation of data owners. To this end, data owners that grant permissive access to their data shall still be able to track usages of their data in remote systems transparently. Within IDS, a clearing house entity is designated to address part of this problem by enabling billing-relevant usage logging [4]. However, similarly to I1, there is currently no technically or cryptographically enforced guarantee that data usage must be logged. Hence, data users can easily circumvent the implemented logging features of today’s data ecosystems and thereby exceed granted usage terms without being caught, such as evading downstream payments for data usage.

I3: Sovereign Participation without Own Infrastructure. A third cornerstone of strong data sovereignty is the free choice of data owners with whom to exchange data under which conditions. Within the currently proposed architecture (cf. Figure 1), data owners entirely rely on and trust data providers to serve their data within the ecosystem. However, if both roles are distributed between separate entities, similar trust issues as between the providing and consuming parties also apply here. Specifically, the owner needs to trust the provider to serve the agreed policies and not misuse data locally. Moreover, usage reporting systems must not assume the provider to be trusted in this case. Hence, the providing side of a data exchange requires the same measures to implement reliable trust as the consumer side.

Takeaway. Today’s data ecosystems only provide data protection via organizational means, such that there is no protection against malicious-but-cautious inside attackers on remote systems. At the same time, monetary data usage compensation and usage restrictions create incentives to evade enforcement mechanisms. Currently, these shortcomings limit the applicability of data ecosystems to share sensitive datasets and thus need a remedy.

4. Toward Stronger Data Sovereignty

The current data ecosystem initiatives strive for seamlessly interconnecting businesses and facilitating the automation of valuable data exchanges. However, in the last section, we identified severe open issues (I1–I3) that impede each participant’s data sovereignty in situations where organizational trust mechanisms, such as required certification prior to admission to the ecosystem, are insufficient. Given the competitive advantage a participant can gain by acting in a malicious-but-cautious manner (cf. Section 3), these open issues only become more pressing. Hence, with the data sovereignty of their participants in mind, data ecosystems must deploy additional means to allow them to establish trust in that new market.

In this paper, we argue that only technical means providing strong cryptographic guarantees are suitable to reach the goal of trustworthy data ecosystems that retain participants’ data sovereignty. Next, we discuss how available building blocks can be integrated into data ecosystems to address each of the open issues I1–I3.

4.1. Trusted Remote Policy Enforcement (I1)

The foundation of strong data sovereignty in data ecosystems is providing data owners with an assurance that the data ecosystem will enforce terms and conditions on their behalf. Although today’s data ecosystems lack trustworthy remote enforcement of data usage terms (I1), promising building blocks for addressing this issue are already available and used in other contexts. Examples of related building blocks are distributed usage control, trusted execution environments, and different cryptographic schemes. In the following, we discuss these building blocks, their application areas, and their relation to data ecosystems.

Distributed usage control [41, 42, 43, 44, 45, 46] is an established field of research that focuses on modeling and technically enforcing usage terms, so-called policies for data usage. Data ecosystems have already adopted the notion of policies in their organizational architecture [4, 47]. However, enforcing these policies proves difficult as the data owner cannot directly observe the misconduct of a data user or the consequences thereof [48]. Hilty and Pretschner [42] hence propose to provide data owners with evidence of policy enforcement and limit possible computations. Both approaches are hard to realize within a data ecosystem as they require some technical trust anchor on remote systems. Specifically, data ecosystems currently do not offer such trust anchors as the data user gains full control over the exchanged data once it has been obtained from the data owner. This situation is insufficient when considering a malicious-but-cautious adversary who does not provide a trustworthy environment for storing or processing the exchanged data.

Hardware-based Trusted Execution Environments (TEEs), such as Intel SGX, AMD SEV, or ARM TrustZone, are promising candidates for closing this gap in the future [49]. The goal of TEEs is to provide a trustworthy computing environment that can be established even on untrusted remote infrastructure. To this end, a TEE provides an isolated (i.e., memory-encrypted) environment for running applications with the ability to verify the integrity of the executed program code remotely. A CPU-embedded cryptographic key provides the required trust anchor that allows the data owner to verify correct execution independently of the remote host’s operating system [49]. Consequently, TEEs allow for trustworthy remote execution by hiding the program’s execution state and hardening it against tampering.

Implementing policy enforcement and data processing inside such environments has the potential to resolve the trust issues data ecosystems are currently facing. However, TEE technology is an active field of research, and current implementations still experience security issues [50]. For example, today’s TEE implementations are prone to side-channel attacks that allow for limited data extraction [51]. Countermeasures such as oblivious RAM [52] are being investigated to fix these vulnerabilities, and we expect that future enclave designs will provide further remedies against other technical issues as they are being discovered. Hence, TEEs are a promising building block for improving data sovereignty in data ecosystems via technically enforceable data policies. However, further research into hardening TEEs against unintended security breaches is required to improve their applicability to data ecosystems. In fact, in a related context, initial work [37] demonstrates the applicability of TEEs in a trusted data sharing setting.

We thus call for the established initiatives and researchers to further investigate the utility of TEE technology for data ecosystems to reliably address the lack of trustworthy and technically backed policy enforcement.

4.2. Verifiable Data Tracking (I2)

Besides policy enforcement, establishing transparency in data usage is equally important to gain data owners’ trust. For instance, a data owner might consider granting generous accessibility to their data but require proper attribution by any data user. In such a case, the data owner would profit from technically guaranteed notifications whenever a data user accessed the data.

Currently, IDS implements a clearing house instance, which can log data usage if mandated in a policy, making it transparent to data owners [4]. However, data users have neither a strict technical constraint to log data usage, nor can the system enforce it by some means. Consequently, IDS cannot currently provide trusted monitoring unless data usage can be observed externally. Hence, the current clearing house instance does not solve the problem of verifiable data tracking (I2).

Instead, technical or cryptographic means would help to incentivize logging. To this end, we consider transparency logging, data-flow tracking, and distributed ledger technology promising for establishing verifiable data tracking in data ecosystems.

For instance, certificate transparency logging allows modern web browsers to reject digital certificates that are not tracked in a public log for auditors to verify [35]. A similar approach might improve data usage transparency as well. Namely, cryptographically tying the decryption of exchanged data or the transfer of results to a publicly verifiable log entry would force data users to log their actions accurately. Such approaches are being researched in the field of verifiable computing [53, 54], and data ecosystems could profit by utilizing corresponding building blocks.

Besides logging, related work also proposes data flow tracking [55] and data fingerprinting [56] to allow for identifying the source of identified data breaches after the fact. However, the cryptographic data fingerprints required to apply these techniques necessitate knowledge of the exact data representation and a sufficient tolerance for minor statistical noise in the monitored data [56]. Unfortunately, these fingerprints typically cannot survive intermediate processing steps [56], rendering them inapplicable in some situations. Hence, more research maturing resilient data flow tracking or fingerprinting techniques is required to determine and improve their applicability in the context of data ecosystems.

Finally, distributed ledger technology has emerged in recent years with the explicit goal of facilitating digital interactions among participants who do not fully trust each other. While Bitcoin started by establishing a decentralized and publicly accessible digital currency based on a blockchain [57], it spawned more versatile distributed ledgers for any information using smart contracts [58]. Ultimately, business-focused ledger systems emerged, such as Hyperledger Fabric or Quorum. These architectures can facilitate event logging within data ecosystems and provide a medium for the automated billing of data accesses.

To avoid additional privacy or data confidentiality problems, such transparency mechanisms need to take privacy into account, e.g., by encrypting log entries [59]. Overall, technical building blocks for verifiable data tracking are already available. However, they still need to be tailored to the specific verifiable data tracking requirements for utilization in data ecosystems regarding performance, scalability, flexibility, and privacy.

4.3. Integration of Resource-Constrained Participants (I3)

With the separation between the data provider and data owner, data ecosystems also address scenarios that involve particularly resource-constrained or especially privacy-aware data owners who are unable or unwilling to run the complete infrastructure themselves. However, infrastructure control is the foundation of self-sovereign participation in distributed environments [4]. Hence, this approach is not viable for resource-constrained participants. Such participants could be, for instance, small to mid-sized enterprises (SMEs) in a supply chain context, which have no technical expertise to provide the infrastructure to participate in a data ecosystem. In this case, their customers may be capable of assuming the role of a data provider collecting data from their contracted SMEs and offering that data on their behalf within the ecosystem. For instance, large automotive manufacturers can assume the role of a data provider on behalf of their, typically numerous, suppliers [8]. In this case, however, data owners lose their sovereignty and depend on trust in their customers. Thus, appropriate (technical) guarantees for such situations are desirable.

A scenario that would give data owners assurance that their data is treated as intended would be considering the data provider as a different party than the data owner; however, current ecosystem initiatives do not rigorously satisfy this demand [4]. Under this assumption, however, one could implement the same measures discussed in Section 4.1 also on the provider side, i.e., realize a trusted data provider. Moreover, concerning usage transparency, this scenario requires logs, as discussed in Section 4.2, to be accessible with no own infrastructure. Hence, not only the consumer-side aspect of logging must be trusted, but also the instance that provides logging on behalf of data owners.

4.4. Summary

Cryptographic building blocks that have been successfully applied in the past are also promising for addressing the core issues (I1–I3) currently impeding the data sovereignty of data owners in today’s data ecosystems. For instance, TEEs have the potential to provide the currently missing trust anchor during remote processing (I1). Similarly, concepts currently applied in the context of certificate transparency logging or distributed ledger technology may help satisfy the requirement for verifiable tracking in data ecosystems (I2) once they are adapted to the scalability demands of envisioned deployments. Finally, these measures can also potentially be applied when data providers operate on behalf of the original data owner to incorporate resource-constrained participants in the process (I3).

5. Ongoing and Past Research Efforts

The potential to improve data ecosystems and the need to address their current issues has also been recognized in previous work. All in all, data ecosystems are subject to past and active research alike, especially due to ongoing large-scale initiatives. In this section, we present notable recent research efforts in data ecosystems. Specifically, we provide an overview of fundamental research regarding the organization of data ecosystems, research efforts investigating the use cases that would benefit from data ecosystems, and works that apply technical security measures to facilitate data sharing efforts.

Fundamental Data Ecosystem Advancements. Oliveira and Lóscio [15] survey the components data ecosystems typically comprise. Furthermore, several works discuss requirements and possible ways toward implementing data ecosystems in general, i.e., independent of specific initiatives [14, 2, 3, 60, 13, 61]. Another line of research investigates fundamental challenges faced when implementing (distributed) data sharing systems. Mainly, these challenges encompass transparency requirements [62],

can provide for their use cases as well as for society in general.

Technical Solutions for Data Sharing. Besides identifying novel use cases for sharing data via data ecosystems, other research successfully applied technical and especially cryptographic building blocks to tackle the general challenges of data sharing in narrower scenarios. For instance, Huang et al. [78] propose a data-sharing scheme to later identify sources of data breaches based on oblivious transfers and embedded fingerprints. Moreover, a variety of work considers sharing data with cloud providers [79, 80, 81, 82, 83, 84], which can be considered conceptually similar to data ecosystems with multiple stakeholders. Such work includes querying encrypted data [85], attribute- or identity-based encryption for access control [86, 74, 87, 39], and distributed ledgers together with TEEs to enforce accountability and access control [37]. Then again, Bonatti et al. [88] identify correctness and completeness as desirable properties of transparency mechanisms in data sharing. These approaches to strengthen sovereignty guarantees apply to real-world use cases and might even be translatable for use in data ecosystems.
addressing the potential lack of trust between partici- pants [13, 63, 64], the need for creating a common se- mantic understanding among all participants [65], and 6. Discussion and Future Work governance as well as legal constraints [66, 67, 68, 34]. As we have highlighted in Section 3, today’s data ecosys- More directly targeted to data ecosystems as they are tems mostly rely on organizational means to implement defined in this work, research considers alternatives to data protection. However, technical building blocks are the current IDS and GAIA-X initiatives. For instance, already available to address the remaining challenges FIWARE [24, 29] provides a platform to facilitate data ex- for data sovereignty in data ecosystems by providing change in an Internet of Things context and is related to stronger guarantees for participants (cf. Section 4). Fi- CEF [25]. Furthermore, special-purpose data ecosystems nally, ongoing research efforts (cf. Section 5) have en- are being considered, e.g., by the NFDI initiative [69], visioned that suitable applications of data ecosystems which focuses on improving the accessibility of research include the handling of privacy-sensitive data, such as data. Finally, NFDI and FIWARE aim to implement IDS- patient records in medical contexts, but also confiden- compatible interfaces, hence working toward ecosystem tiality demands of critical business data require those compatibility. guarantees. To this end, data ecosystems must provide a Use Cases. Another critical aspect of research on framework that allows users to trust the overall system data ecosystems revolves around the use cases they are w.r.t. enforcing their rights at any time, including pro- particularly well-suited for. Other works have identi- cessing in remote systems after access was granted and fied many relevant or desirable use cases in this regard. data was shared. 
Among these use cases are the sharing of medical health Based on our analysis of the status quo as well as on- records [70, 10], personal data [71], data emerging in going research efforts so far, we discuss in the following the Industrial Internet of Things [72, 73], and data ex- that overcoming current shortcomings of usage control change across supply chains, such as in the automotive and stronger hardware-based security measures are cru- industry [8, 39, 28], that have unique requirements con- cial research directions to sustainably strengthen the data cerning data confidentiality, data volume, or long-term sovereignty for participants of data ecosystems. persistency. Further data sharing schemes do not specifi- Shortcomings of Usage Control. With (distributed) cally target data ecosystems but are conceptually similar, usage control, prior work already addresses the issues such as applications in medicine [6, 40, 9, 74], for pro- I1–I3 today’s data ecosystems are facing. However, the duction technology [75, 76], along supply chains [8], or enforcement has not (yet) been thoroughly picked up in education [77]. We expect that additional domains by recent initiatives, possibly due to the current lack of will also start to investigate the benefits data ecosystems technical guarantees [48]. Most work in this area either 57 targets rights modeling (e.g., [41, 89, 90]) or assumes op- strates the applicability of cryptographic mechanisms, eration on trusted infrastructure (e.g., [91, 92]), which e.g., in certificate transparency. To this end, further re- we argue does not withstand malicious-but-cautions at- search must investigate how these concepts can support tackers, as applicable to data ecosystems. Given that transparency in data ecosystems, while not creating new guaranteed policy enforcement is crucial for sharing sen- privacy issues. 
Finally, the combination of technically en- sitive datasets within data ecosystems, this question still forceable usage control with usage transparency might needs to be addressed to allow for a wide-spread adoption also be the first step toward sovereign integration of of data ecosystems. resource-constrained participants (I3). With cryptographic and technical solutions, the ways toward stronger guarantees are two-fold and not straight- forward. The discussed cryptographic approaches to- 7. Conclusion ward stronger guarantees, i.e., providing usage control Today’s data ecosystems facilitate an automated and transparency via cryptographic means, implement exchange of data in a standardized manner while simul- the strongest protection among the discussed techniques taneously providing access to huge and heterogeneous but currently either allow only limited expressiveness data sources. Given that these data exchanges and or suffer from a severe performance penalty. Hence, we corresponding higher-level applications across domains argue that they are currently not suited for general ap- (e.g., in the automotive industry) also frequently deal plication in data ecosystems but should be selectively with sensitive information, including business secrets applied for the most sensitive datasets, where the named and data subject to privacy regulations, data ecosystems limitations and overheads are acceptable [40]. must implement reliable measures to prevent any Need for Hardware-based Security. Hardware so- undesirable exposure of sensitive data. Currently, these lutions provide a trust anchor under the malicious-but- measures are mostly based on organizational means, cautious attacker model. Moreover, they are less affected which we argue, fail to provide sufficient guarantees in by performance penalties and eventually allow the same settings with malicious-but-cautious participants, i.e., operations as standard hardware. 
However, TPMs, as cur- participants who aim to remain unnoticed while still rently envisaged by the IDS [4], cannot provide adequate trying to infer all possible information from the data protection of sensitive data due to the lacking memory ecosystem and associated data exchanges. encryption. Hence, Trusted Execution Environments We raise the crucial issue that today’s data ecosystems (TEEs), despite current known side-channel attacks and lack appropriate guarantees w.r.t. confidential processing related weaknesses, seem to be a better choice for strong on systems operated by third parties, transparency of data guarantees regarding data sovereignty expanding to re- access and usage, and the participation of parties with mote systems. no infrastructure under their control (I1–I3). We have With hardware-based TEEs being available for a few further surveyed corresponding technical solutions to years, the question arises as to why today’s data eco- these issues and highlight that they are available but have systems do not yet implement TEE-based security. One not yet been adopted in practice. To this end, we argue reason might be known weaknesses, which need to be that the success of data ecosystems directly depends on addressed in future designs. However, these weaknesses their ability to address the present need for strong data do not seem to hinder deployment in further applications, sovereignty of participants. As such, especially modern as, for instance, Microsoft Azure offers commercial sup- technical solutions, such as TEEs, promise to provide data port for TEEs in its cloud service [93]. Hence, we argue owners with strong guarantees of correct data handling, that data ecosystems should consider employing TEEs increasing their willingness to participate in available as a measure to enforce data owner’s rights on remote data ecosystems. infrastructure, which would fill the current gap toward implementing end-to-end data sovereignty. Future Work. 
These required research efforts mo- Acknowledgments tivate our call for future work in the domain of data ecosystems. Regarding the reliable enforcement of us- Funded by the Deutsche Forschungsgemeinschaft age terms (I1), future work must address tailoring exist- (DFG, German Research Foundation) under Germany’s ing data protection schemes to data ecosystems. Here, Excellence Strategy – EXC-2023 Internet of Production – a promising idea seems to employ TEEs as a trust an- 390621612. chor on remote infrastructure. However, further research must clarify to which degree current limitations, such as performance penalties, affect application within data eco- systems. Subsequently, this can be integrated with trans- parency mechanisms (I2) where current work demon- 58 References [11] Z. Du, C. Wu, T. Yoshinaga, K.-L. A. Yau, Y. Ji, J. Li, Federated Learning for Vehicular Internet of Things: [1] J. Pennekamp, R. Glebke, M. Henze, T. Meisen, Recent Advances and Open Issues, IEEE Open C. Quix, R. Hai, L. Gleim, P. Niemietz, M. Rudack, Journal of the Computer Society 1 (2020) 45–61. S. Knape, A. Epple, D. Trauth, U. Vroomen, T. Bergs, doi:10.1109/OJCS.2020.2992630. C. Brecher, A. Buhrig-Polaczek, M. Jarke, K. Wehrle, [12] M. D. Wilkinson et al, The FAIR Guiding Princi- Towards an Infrastructure Enabling the Internet of ples for scientific data management and steward- Production, in: 2019 IEEE International Conference ship, Scientific Data 3 (2016) 160018. doi:10.1038/ on Industrial Cyber Physical Systems (ICPS), IEEE, sdata.2016.18. Taipei, Taiwan, 2019, pp. 31–37. doi:10.1109/ [13] J. Gelhaar, B. Otto, Challenges in the Emergence of ICPHYS.2019.8780276. Data Ecosystems, in: Pacific Asia Conference on [2] B. Otto, M. Jarke, Designing a multi-sided data Information Systems (PACIS), Dubai, 2020. platform: Findings from the International Data [14] A. Braud, G. Fromentoux, B. Radier, O. Le Grand, Spaces case, Electronic Markets 29 (2019) 561–580. 
The Road to European Digital Sovereignty with doi:10.1007/s12525-019-00362-x. Gaia-X and IDSA, IEEE Network 35 (2021) 4–5. [3] B. Otto, Interview with Reinhold Achatz on “Data doi:10.1109/MNET.2021.9387709. Sovereignty and Data Ecosystems”, Business & [15] M. I. S. Oliveira, B. F. Lóscio, What is a data Information Systems Engineering 61 (2019) 635– ecosystem?, in: Proceedings of the 19th Annual 636. doi:10.1007/s12599-019-00609-z. International Conference on Digital Government [4] B. Otto, S. Steinbuss, A. Teuscher, S. Lohmann et Research: Governance in the Data Age, ACM, al., IDS Reference Architecture Model (Version 3.0), Delft The Netherlands, 2018, pp. 1–9. doi:10.1145/ 2019. 3209281.3209335. [5] Gaia-X Technical Committee, Gaia-X Architecture [16] P. Hummel, M. Braun, M. Tretter, P. Dabrock, Document, 2021. Data sovereignty: A review, Big Data & So- [6] D. Froelicher, P. Egger, J. S. Sousa, J. L. Raisaro, ciety 8 (2021) 205395172098201. doi:10.1177/ Z. Huang, C. Mouchet, B. Ford, J.-P. Hubaux, UnL- 2053951720982012. ynx: A Decentralized System for Privacy-Conscious [17] M. Schanzenbach, Towards Self-sovereign, Decen- Data Sharing, Proceedings on Privacy Enhancing tralized Personal Data Sharing and Identity Man- Technologies 2017 (2017) 232–250. doi:10.1515/ agement, Ph.D. thesis, 2020. popets-2017-0047. [18] V. Pedreira, D. Barros, P. Pinto, A Review of At- [7] D. McCabe, A. Satariano, The Era of Borderless tacks, Vulnerabilities, and Defenses in Industry 4.0 Data is Ending, New York Times (2022). with New Challenges on Data Sovereignty Ahead, [8] L. Bader, J. Pennekamp, R. Matzutt, D. Hedderich, Sensors 21 (2021) 5189. doi:10.3390/s21155189. M. Kowalski, V. Lücken, K. Wehrle, Blockchain- [19] S. Couture, S. Toupin, What does the notion of based privacy preservation for supply chains sup- “sovereignty” mean when referring to the digital?, porting lightweight multi-hop information ac- New Media & Society 21 (2019) 2305–2322. doi:10. 
countability, Information Processing & Manage- 1177/1461444819865984. ment 58 (2021) 102529. doi:10.1016/j.ipm.2021. [20] K. Irion, Government Cloud Computing and Na- 102529. tional Data Sovereignty: Government Cloud Com- [9] H. Ma, R. Zhang, G. Yang, Z. Song, K. He, Y. Xiao, puting and National Data Sovereignty, Policy & Efficient Fine-Grained Data Sharing Mechanism Internet 4 (2012) 40–71. doi:10.1002/poi3.10. for Electronic Medical Record Systems with Mo- [21] S. R. Bader, M. Maleshkova, SOLIOT—Decentralized bile Devices, IEEE Transactions on Dependable Data Control and Interactions for IoT, Future Inter- and Secure Computing 17 (2020) 1026–1038. doi:10. net 12 (2020) 105. doi:10.3390/fi12060105. 1109/TDSC.2018.2844814. [22] Data Sharing Coalition, https: [10] A. Appenzeller, S. Bartholomaus, R. Breitschwerdt, //datasharingcoalition.eu/ C. Claussen, S. Geisler, T. Hartz, P. Kachel, E. Krem- about-the-data-sharing-coalition/, 2022. Ac- pel, S. Robert, S. R. Zeissig, Towards Distributed cessed 2022-08-09. Healthcare Systems – Virtual Data Pooling Between [23] IHAN, https://ihan.fi/, 2022. Accessed 2022-08-09. Cancer Registries as Backbone of Care and Re- [24] F. Cirillo, G. Solmaz, E. L. Berz, M. Bauer, B. Cheng, search, in: 2021 IEEE/ACS 18th International Con- E. Kovacs, A Standard-Based Open Source IoT ference on Computer Systems and Applications Platform: FIWARE, IEEE Internet of Things Mag- (AICCSA), IEEE, Tangier, Morocco, 2021, pp. 1–8. azine 2 (2019) 12–18. doi:10.1109/IOTM.0001. doi:10.1109/AICCSA53542.2021.9686918. 1800022. 59 [25] CEF Digital, https://ec.europa.eu/cefdigital/wiki/ Kingdom, 2019, pp. 45–56. doi:10.1145/3338469. display/CEFDIGITAL/CEF+Digital+Home, 2022. 3358944. Accessed 2022-08-09. [39] S. Malik, N. Gupta, V. Dedeoglu, S. S. Kanhere, R. Jur- [26] Big Data Value Association, https://www.bdva.eu/, dak, TradeChain: Decoupling Traceability and Iden- 2022. Accessed 2022-08-09. tity in Blockchain enabled Supply Chains (2021). 
[27] ETSI GR CIM 007 V1.1.1: Security and Privacy, Tech- doi:10.48550/ARXIV.2105.11217. nical Report, France, 2022. [40] D. Froelicher, J. R. Troncoso-Pastoriza, J. L. Rais- [28] O. Voß, Catena-X: Datenstandards für die Auto- aro, M. A. Cuendet, J. S. Sousa, H. Cho, B. Berger, branche, Tagesspiegel Background Digitalisierung J. Fellay, J.-P. Hubaux, Truly Privacy-Preserving & KI (2021). Federated Analytics for Precision Medicine with [29] Á. Alonso, A. Pozo, J. Cantera, F. de la Vega, J. Hi- Multiparty Homomorphic Encryption, Preprint, erro, Industrial Data Space Architecture Imple- Bioinformatics, 2021. doi:10.1101/2021.02.24. mentation Using FIWARE, Sensors 18 (2018) 2226. 432489. doi:10.3390/s18072226. [41] J. Park, R. Sandhu, The UCON ABC usage control [30] N. Menz, A. Resetko, B. Otto, Framework for the IDS model, ACM Transactions on Information and Certification Scheme 2.0, Technical Report, IDSA, System Security 7 (2004) 128–174. doi:10.1145/ 2019. doi:10.5281/ZENODO.5244858. 984334.984339. [31] CEN European Committee for Standardization, In- [42] M. Hilty, D. Basin, A. Pretschner, On Obligations, formation technology - Security techniques - In- in: D. Hutchison, T. Kanade, J. Kittler, J. M. Klein- formation security management systems - Require- berg, F. Mattern, J. C. Mitchell, M. Naor, O. Nier- ments (ISO/IEC 27001:2013 including Cor 1:2014 strasz, C. Pandu Rangan, B. Steffen, M. Sudan, and Cor 2:2015), 2017. D. Terzopoulos, D. Tygar, M. Y. Vardi, G. Weikum, [32] A. Pretschner, M. Hilty, F. Schütz, C. Schaefer, S. d. C. di Vimercati, P. Syverson, D. Gollmann T. Walter, Usage Control Enforcement: Present (Eds.), Computer Security – ESORICS 2005, volume and Future, IEEE Security & Privacy Magazine 6 3679, Springer Berlin Heidelberg, Berlin, Heidel- (2008) 44–53. doi:10.1109/MSP.2008.101. berg, 2005, pp. 98–117. doi:10.1007/11555827_ [33] R. Ianella, Open digital rights language (ODRL), 7. Open Content Licensing: Cultivating the Creative [43] M. Hilty, A. 
Pretschner, D. Basin, C. Schaefer, Commons (2007). T. Walter, A Policy Language for Distributed Us- [34] A. Duisberg, Legal Aspects of IDS: Data Sovereignty age Control, in: D. Hutchison, T. Kanade, J. Kit- - What Does It Imply?, in: Designing Data Spaces, tler, J. M. Kleinberg, F. Mattern, J. C. Mitchell, Springer, 2022. M. Naor, O. Nierstrasz, C. Pandu Rangan, B. Steffen, [35] M. D. Ryan, Enhanced Certificate Transparency M. Sudan, D. Terzopoulos, D. Tygar, M. Y. Vardi, and End-to-End Encrypted Mail, in: Proceedings G. Weikum, J. Biskup, J. López (Eds.), Computer 2014 Network and Distributed System Security Security – ESORICS 2007, volume 4734, Springer Symposium, Internet Society, San Diego, CA, 2014. Berlin Heidelberg, Berlin, Heidelberg, 2007, pp. 531– doi:10.14722/ndss.2014.23379. 546. doi:10.1007/978-3-540-74835-9_35. [36] L. kacha, A. Zitouni, An Overview on [44] F. Kelbert, A. Pretschner, Data usage control en- Data Security in Cloud Computing, vol- forcement in distributed systems, in: Proceed- ume 661, 2018, pp. 250–261. doi:10.1007/ ings of the Third ACM Conference on Data and 978-3-319-67618-0_23. arXiv:1812.09053. Application Security and Privacy - CODASPY ’13, [37] H. Lei, Y. Yan, Z. Bao, Q. Wang, Y. Zhang, W. Shi, ACM Press, San Antonio, Texas, USA, 2013, p. 71. SDSBT: A Secure Multi-party Data Sharing Plat- doi:10.1145/2435349.2435358. form Based on Blockchain and TEE, in: J. Cheng, [45] F. Kelbert, A. Pretschner, A Fully Decentralized X. Tang, X. Liu (Eds.), Cyberspace Safety and Se- Data Usage Control Enforcement Infrastructure, in: curity, volume 12653, Springer International Pub- T. Malkin, V. Kolesnikov, A. B. Lewko, M. Polychron- lishing, Cham, 2021, pp. 184–196. doi:10.1007/ akis (Eds.), Applied Cryptography and Network 978-3-030-73671-2_17. Security, volume 9092, Springer International Pub- [38] F. Boemer, A. Costache, R. Cammarota, C. Wierzyn- lishing, Cham, 2015, pp. 409–430. 
doi:10.1007/ ski, nGraph-HE2: A High-Throughput Framework 978-3-319-28166-7_20. for Neural Network Inference on Encrypted Data, [46] I. Akaichi, S. Kirrane, Usage Control Specifi- in: Proceedings of the 7th ACM Workshop on En- cation, Enforcement, and Robustness: A Survey, crypted Computing & Applied Homomorphic Cryp- arXiv:2203.04800 [cs] (2022). arXiv:2203.04800. tography - WAHC’19, ACM Press, London, United [47] S. Steinbuss, et. al, Usage Control in the Interna- tional Data Spaces, 2021. 60 [48] A. Hosseinzadeh, A. Eitel, C. Jung, A Systematic preserving transparency logging, in: Proceed- Approach toward Extracting Technically Enforce- ings of the 12th ACM Workshop on Workshop able Policies from Data Usage Control Require- on Privacy in the Electronic Society, ACM, Berlin ments:, in: Proceedings of the 6th International Germany, 2013, pp. 83–94. doi:10.1145/2517840. Conference on Information Systems Security and 2517847. Privacy, SCITEPRESS - Science and Technology [60] J. Zrenner, F. O. Möller, C. Jung, A. Eitel, B. Otto, Publications, Valletta, Malta, 2020, pp. 397–405. Usage control architecture options for data sover- doi:10.5220/0008936003970405. eignty in business ecosystems, Journal of Enter- [49] M. Schneider, R. J. Masti, S. Shinde, S. Capkun, prise Information Management 32 (2019) 477–495. R. Perez, SoK: Hardware-supported Trusted Exe- doi:10.1108/JEIM-03-2018-0058. cution Environments, 2022. arXiv:2205.12742. [61] M. Henze, M. Grossfengels, M. Koprowski, [50] A. Nilsson, P. N. Bideh, J. Brorsson, A Survey of K. Wehrle, Towards Data Handling Requirements- Published Attacks on Intel SGX, arXiv:2006.13598 Aware Cloud Computing, in: 2013 IEEE 5th Interna- [cs] (2020). arXiv:2006.13598. tional Conference on Cloud Computing Technology [51] M.-W. Shih, S. Lee, T. Kim, M. Peinado, T-SGX: and Science, IEEE, Bristol, United Kingdom, 2013, Eradicating Controlled-Channel Attacks Against pp. 266–269. doi:10.1109/CloudCom.2013.145. 
Enclave Programs, in: Proceedings 2017 Network [62] S. Geisler, M.-E. Vidal, C. Cappiello, B. F. Lóscio, and Distributed System Security Symposium, Inter- A. Gal, M. Jarke, M. Lenzerini, P. Missier, B. Otto, net Society, San Diego, CA, 2017. doi:10.14722/ E. Paja, B. Pernici, J. Rehof, Knowledge-Driven Data ndss.2017.23193. Ecosystems Toward Data Transparency, Journal [52] S. Sasy, S. Gorbunov, C. W. Fletcher, ZeroTrace : of Data and Information Quality 14 (2022) 1–12. Oblivious Memory Primitives from Intel SGX, in: doi:10.1145/3467022. Proceedings 2018 Network and Distributed System [63] A. Munoz-Arcentales, S. López-Pernas, A. Pozo, Security Symposium, Internet Society, San Diego, Á. Alonso, J. Salvachúa, G. Huecas, An Architec- CA, 2018. doi:10.14722/ndss.2018.23239. ture for Providing Data Usage and Access Control in [53] R. Gennaro, C. Gentry, B. Parno, Non-interactive Data Sharing Ecosystems, Procedia Computer Sci- Verifiable Computing: Outsourcing Computation ence 160 (2019) 590–597. doi:10.1016/j.procs. to Untrusted Workers, in: D. Hutchison, T. Kanade, 2019.11.042. J. Kittler, J. M. Kleinberg, F. Mattern, J. C. Mitchell, [64] M. Huber, S. Wessel, G. Brost, N. Menz, Building M. Naor, O. Nierstrasz, C. Pandu Rangan, B. Steffen, Trust in Data Spaces, in: Designing Data Spaces, M. Sudan, D. Terzopoulos, D. Tygar, M. Y. Vardi, Springer, 2022. G. Weikum, T. Rabin (Eds.), Advances in Cryptol- [65] S. Bader, J. Pullmann, C. Mader, S. Tramp, C. Quix, ogy – CRYPTO 2010, volume 6223, Springer Berlin A. W. Müller, H. Akyürek, M. Böckmann, B. T. Im- Heidelberg, Berlin, Heidelberg, 2010, pp. 465–482. busch, J. Lipp, S. Geisler, C. Lange, The Interna- doi:10.1007/978-3-642-14623-7_25. tional Data Spaces Information Model – An Ontol- [54] B. Parno, J. Howell, C. Gentry, M. Raykova, Pinoc- ogy for Sovereign Exchange of Digital Content, in: chio: Nearly Practical Verifiable Computation, in: J. Z. Pan, V. Tamma, C. d’Amato, K. Janowicz, B. 
Fu, 2013 IEEE Symposium on Security and Privacy, A. Polleres, O. Seneviratne, L. Kagal (Eds.), The Se- IEEE, Berkeley, CA, 2013, pp. 238–252. doi:10. mantic Web – ISWC 2020, volume 12507, Springer 1109/SP.2013.47. International Publishing, Cham, 2020, pp. 176–192. [55] I. Kunz, V. Casola, A. Schneider, C. Banse, J. Schütte, doi:10.1007/978-3-030-62466-8_12. Towards Tracking Data Flows in Cloud Architec- [66] C. Ducuing, Data as infrastructure? A study of data tures, 2020 IEEE 13th International Conference on sharing legal regimes, Competition and Regulation Cloud Computing (CLOUD) (2020) 445–452. in Network Industries 21 (2020) 124–142. doi:10. [56] M. Backes, N. Grimm, A. Kate, Data Lineage in 1177/1783591719895390. Malicious Environments, IEEE Transactions on [67] D. Wu, S. G. Verhulst, A. Pentland, T. Avila, K. Finch, Dependable and Secure Computing 13 (2016) 178– A. Gupta, How data governance technologies 191. doi:10.1109/TDSC.2015.2399296. can democratize data sharing for community well- [57] S. Nakamoto, Bitcoin: A peer-to-peer electronic being, Data & Policy 3 (2021) e14. doi:10.1017/ cash system, Decentralized Business Review (2008) dap.2021.13. 21260. [68] L. Helminger, C. Rechberger, Multi-party com- [58] V. Buterin, et al., A next-generation smart contract putation in the GDPR, in: Privacy Symposium and decentralized application platform, white paper 2022 - Data Protection Law International Conver- 3 (2014) 2–1. gence and Compliance with Innovative Technolo- [59] T. Pulls, R. Peeters, K. Wouters, Distributed privacy- gies (DPLICIT), 2022. 61 [69] N. L. Weisweiler, R. Bertelmann, P. Braesicke, Education Material, in: 2020 International Con- T. Bronger, C. Curdt, F. O. Glöckner, S. Rank, O. Ste- ference on Information Networking (ICOIN), IEEE, gle, Y. Sure-Vetter, N. Villacorta, Helmholtz Open Barcelona, Spain, 2020, pp. 529–534. doi:10.1109/ Science Briefing: Helmholtz in der Nationalen ICOIN48656.2020.9016478. 
Forschungsdateninfrastruktur (NFDI): Report des [78] C. Huang, D. Liu, J. Ni, R. Lu, X. Shen, Achieving Helmholtz Open Science Forums, Technical Re- Accountable and Efficient Data Sharing in Indus- port, Helmholtz Open Science Office, 2021. doi:10. trial Internet of Things, IEEE Transactions on In- 48440/OS.HELMHOLTZ.030. dustrial Informatics 17 (2021) 1416–1427. doi:10. [70] J. Scheibner, J. L. Raisaro, J. R. Troncoso-Pastoriza, 1109/TII.2020.2982942. M. Ienca, J. Fellay, E. Vayena, J.-P. Hubaux, Revo- [79] J. Shen, T. Zhou, D. He, Y. Zhang, X. Sun, Y. Xiang, lutionizing Medical Data Sharing Using Advanced Block Design-Based Key Agreement for Group Data Privacy-Enhancing Technologies: Technical, Legal, Sharing in Cloud Computing, IEEE Transactions and Ethical Synthesis, Journal of Medical Internet on Dependable and Secure Computing 16 (2019) Research 23 (2021) e25120. doi:10.2196/25120. 996–1010. doi:10.1109/TDSC.2017.2725953. [71] R. Matzutt, D. Müllmann, E.-M. Zeissig, C. Horst, [80] A. Fromm, V. Stepa, HDFT++ Hybrid Data Flow K. Kasugai, S. Lidynia, S. Wieninger, J. H. Ziegel- Tracking for SaaS Cloud Services, in: 2017 IEEE 4th dorf, G. Gudergan, I. S. gen. Döhmann, K. Wehrle, International Conference on Cyber Security and M. Ziefle, myneData: Towards a Trusted and User- Cloud Computing (CSCloud), IEEE, New York, NY, controlled Ecosystem for Sharing Personal Data USA, 2017, pp. 333–338. doi:10.1109/CSCloud. (2017). doi:10.18420/IN2017_109. 2017.9. [72] H. Baars, A. Tank, P. Weber, H.-G. Kemper, H. Lasi, [81] Z. Qin, H. Xiong, S. Wu, J. Batamuliza, A Survey B. Pedell, Cooperative Approaches to Data Shar- of Proxy Re-Encryption for Secure Data Sharing in ing and Analysis for Industrial Internet of Things Cloud Computing, IEEE Transactions on Services Ecosystems, Applied Sciences 11 (2021) 7547. Computing (2016) 1–1. doi:10.1109/TSC.2016. doi:10.3390/app11167547. 2551238. [73] A. L. Marra, F. Martinelli, P. Mori, A. Saracino, [82] T. Pasquier, J. Bacon, J. 
Singh, D. Eyers, Data- A Distributed Usage Control Framework for In- Centric Access Control for Cloud Computing, in: dustrial Internet of Things, in: C. Alcaraz (Ed.), Proceedings of the 21st ACM on Symposium on Security and Privacy Trends in the Industrial Access Control Models and Technologies, ACM, Internet of Things, Springer International Pub- Shanghai China, 2016, pp. 81–88. doi:10.1145/ lishing, Cham, 2019, pp. 115–135. doi:10.1007/ 2914642.2914662. 978-3-030-12330-7_6. [83] A. Bessani, M. Correia, B. Quaresma, F. André, [74] X. Lu, X. Cheng, A Secure and Lightweight Data P. Sousa, DepSky: Dependable and Secure Stor- Sharing Scheme for Internet of Medical Things, age in a Cloud-of-Clouds, ACM Transactions on IEEE Access 8 (2020) 5022–5030. doi:10.1109/ Storage 9 (2013) 1–33. doi:10.1145/2535929. ACCESS.2019.2962729. [84] S. Sundareswaran, A. Squicciarini, D. Lin, Ensur- [75] J. Pennekamp, E. Buchholz, Y. Lockner, ing Distributed Accountability for Data Sharing in M. Dahlmanns, T. Xi, M. Fey, C. Brecher, the Cloud, IEEE Transactions on Dependable and C. Hopmann, K. Wehrle, Privacy-Preserving Secure Computing 9 (2012) 556–568. doi:10.1109/ Production Process Parameter Exchange, in: TDSC.2012.26. Annual Computer Security Applications Con- [85] A. Rafique, D. Van Landuyt, E. Heydari Beni, B. La- ference, ACM, Austin USA, 2020, pp. 510–525. gaisse, W. Joosen, CryptDICE: Distributed data doi:10.1145/3427228.3427248. protection system for secure cloud data storage and [76] S. Mangel, L. Gleim, J. Pennekamp, K. Wehrle, computation, Information Systems 96 (2021) 101671. S. Decker, Data Reliability and Trustworthiness doi:10.1016/j.is.2020.101671. Through Digital Transmission Contracts, in: The [86] K. Edemacu, B. Jang, J. W. Kim, CESCR: CP-ABE Semantic Web, volume 12731, Springer Interna- for efficient and secure sharing of data in collab- tional Publishing, Cham, 2021, pp. 265–283. doi:10. orative ehealth with revocation and no dummy 1007/978-3-030-77385-4_16. 
attribute, PLOS ONE 16 (2021) e0250992. doi:10. [77] R. Matzutt, J. Pennekamp, K. Wehrle, A Secure and 1371/journal.pone.0250992. Practical Decentralized Ecosystem for Shareable 62 [87] B. Waters, Ciphertext-Policy Attribute-Based En- Boston, MA, 2010, pp. 133–146. doi:10.1007/ cryption: An Expressive, Efficient, and Provably 978-1-4419-6794-7_11. Secure Realization, in: D. Hutchison, T. Kanade, [90] Q. H. Cao, M. Giyyarpuram, R. Farahbakhsh, J. Kittler, J. M. Kleinberg, F. Mattern, J. C. Mitchell, N. Crespi, Policy-based usage control for a trust- M. Naor, O. Nierstrasz, C. Pandu Rangan, B. Stef- worthy data sharing platform in smart cities, Future fen, M. Sudan, D. Terzopoulos, D. Tygar, M. Y. Generation Computer Systems 107 (2020) 998–1010. Vardi, G. Weikum, D. Catalano, N. Fazio, R. Gen- doi:10.1016/j.future.2017.05.039. naro, A. Nicolosi (Eds.), Public Key Cryptogra- [91] F. Cirillo, B. Cheng, R. Porcellana, M. Russo, G. Sol- phy – PKC 2011, volume 6571, Springer Berlin maz, H. Sakamoto, S. P. Romano, IntentKeeper: Heidelberg, Berlin, Heidelberg, 2011, pp. 53–70. Intent-oriented Data Usage Control for Federated doi:10.1007/978-3-642-19379-8_4. Data Analytics, in: 2020 IEEE 45th Conference [88] P. Bonatti, S. Kirrane, A. Polleres, R. Wenning, on Local Computer Networks (LCN), IEEE, Sydney, Transparent Personal Data Processing: The Road NSW, Australia, 2020, pp. 204–215. doi:10.1109/ Ahead, in: S. Tonetta, E. Schoitsch, F. Bitsch LCN48667.2020.9314823. (Eds.), Computer Safety, Reliability, and Secu- [92] F. Kelbert, A. Pretschner, Data Usage Control for rity, volume 10489, Springer International Pub- Distributed Systems, ACM Transactions on Pri- lishing, Cham, 2017, pp. 337–349. doi:10.1007/ vacy and Security 21 (2018) 1–32. doi:10.1145/ 978-3-319-66284-8_28. 3183342. [89] M. Colombo, A. Lazouski, F. Martinelli, P. Mori, [93] F. Y. 
Rashid, The rise of confidential computing: A Proposal on Enhancing XACML with Con- Big tech companies are adopting a new security tinuous Usage Control Features, in: F. De- model to protect data while it’s in use-[news], IEEE sprez, V. Getov, T. Priol, R. Yahyapour (Eds.), Spectrum 57 (2020) 8–9. Grids, P2P and Services Computing, Springer US, 63