If You Can’t Enforce It, Contract It: Enforceability in Policy-Driven (Linked) Data Markets

Simon Steyskal∗, Sabrina Kirrane
Vienna University of Economics and Business, Vienna, Austria
[firstname.lastname]@wu.ac.at

∗Simon Steyskal has been partially funded by the Vienna Science and Technology Fund (WWTF) through project ICT12-015 and by the Austrian Research Promotion Agency (FFG) grant 845638 (SHAPE).

ABSTRACT
The Web of Data refers to a network of data, which is published from various data sources, distributed across different machines, and possibly interconnected as Linked (Open) Data. We assume that in the near future these machines will not only publish and consume data, but will also perform transactions in digital data markets without human intervention. For these digital data markets to succeed, it is crucial that published data is accessed and used in a manner that is compliant with the restrictions or regulations defined by data publishers. While it is fairly simple to express access policies using one of the numerous vocabularies available, the actual enforcement of those policies is rather difficult, especially when taking dependencies among policies into account. In this paper, we demonstrate how ODRL can be used not only to represent access policies but also to specify access requests, offers and agreements, and propose an approach to generate on-the-fly contracts that govern all explicit and implicit non-enforceable policies.

Categories and Subject Descriptors
K.4.4 [Computers and Society]: Electronic Commerce - Security; K.6.5 [Management of Computing and Information Systems]: Security and Protection

General Terms
Data Markets, Policies, Enforceability, Policy-Driven Linked Data Markets, ODRL

1. INTRODUCTION
In recent years, we have seen the emergence of online service providers who trade in what is potentially one of the most valuable commodities for any business: data. The service offering, which is commonly known as a data market, caters not only for the buying and selling of raw data, but also offers value-added services derived from this data (e.g. data cleansing, integration, analytics and visualisation). According to a recent survey conducted by the European Research Center for Information Systems (ERCIS) [9], last year there was a slight decrease in the number of service providers offering access to raw data and an increase in the provision of high quality processed data. Here high quality processed data refers to data that is represented in a manner which supports data integration and analytics (i.e. accurate data represented in a manner which is interoperable, flexible and extensible). Additionally, the survey highlighted that the number of service providers that publish data using the Resource Description Framework (RDF) is significantly less than the number that publish data using the Extensible Markup Language (XML) or Comma-Separated Values (CSV) / Excel Spreadsheets (XLS). Given that interoperability, flexibility, and extensibility are cornerstones of the RDF data model and the fact that the number of Linked Open Data publishers is growing year-on-year, this raises the question: what are the current challenges for Linked Data Markets (LDMs)? Although there are a number of challenges with respect to data quality, data lifecycle management and quality of service, in this paper we focus specifically on the challenges that relate to access control and licensing.

Möller and Dodds [7], De Virgilio et al. [2] and Kim et al. [5] all propose systems that can potentially be used to realise the LDM vision; however, very little consideration, if any, is given either to access control or to machine readable licensing. A number of authors have looked into using the Open Digital Rights Language (ODRL) to specify access constraints and licensing [1, 10, 8, 11]; however, they do not focus on the question of enforceability, nor do they apply their work to LDMs. To fill this gap, we present our vision of a Policy-Driven (Linked) Data Market (PDLDM) and discuss how our framework can be used to cater for both enforceable and non-enforceable ODRL policies. Our main contributions can be summarized as follows. We: (i) propose a workflow for PDLDM transactions and demonstrate how ODRL can be used not only to represent access policies but also to specify access requests, data offers and agreements; and (ii) present a framework which can be used both to enforce access restrictions (in the case of enforceable policies) and to automatically generate license agreements (in the case of non-enforceable policies).

The remainder of the paper is structured as follows: We demonstrate how ODRL can be used to express a variety of policies in Section 2. Our strategy for dealing with non-enforceable policies is presented in Section 3. We discuss related work in Section 4. Finally, we conclude and outline directions for future work in Section 5.

2. EXPRESSING (LINKED) DATA MARKET POLICIES IN ODRL
A Data Market is a platform where data and potentially value-added services derived from the data are bought and sold. Although data markets are not a new concept, with an ever increasing amount of data available (social data, sensor data, open data) and advances in information technology we are seeing more and more online marketplaces appear [9]. Data consumers can benefit from high quality data that is aggregated and presented in a consistent format, making it easier for them to find and use the data they require. Data producers, on the other hand, can outsource the cleansing, hosting and discoverability of their data. Both parties can take advantage of value-added services such as integration and analytics.

A Linked Data Market is a specific type of marketplace, which is built on top of the Linked Data Web (LDW) and adheres to the Linked Data principles. In this paper, we propose a Policy-Driven (Linked) Data Market (PDLDM) where data requests, data offers, access policies and agreements are encoded in machine readable policies. The various transactions required for contract negotiation are represented using the workflow illustrated in Figure 1, which consists of four major steps:

1. Make a request. A data transaction is initiated when a data consumer issues a request to the data market, which is subsequently forwarded to one or more data providers who can potentially service the request.

2. Check applicable policies. On receipt of the request the data provider retrieves the relevant access policies (relevance is determined based on the data requested and the credentials supplied by the data consumer).

3. Compose and offer contract. The data provider generates a machine readable contract (known as an offer), based on the explicit and implicit non-enforceable actions that are associated with the request. The auto-generated contract is subsequently offered to the data consumer.

4. Accept contract. If the data consumer agrees to the terms of the contract, an agreement between the data consumer and the data publisher is generated and persisted for accountability and compliance purposes.

Figure 1: PDLDM workflow

The Open Digital Rights Language (ODRL) [4] is a comprehensive policy expression language that is suitable for expressing fine-grained access restrictions, access policies, as well as licensing information for Linked Data as shown in [1, 10].

An ODRL Policy is composed of a set of ODRL Rules and an ODRL Conflict Resolution Strategy, which is used by the enforcement mechanism to ensure that when conflicts among rules occur the system either grants access, denies access or generates an error. In the sample policies that follow we use an odrl prefix for the ODRL vocabulary namespace and an ex prefix for an example namespace. Listing 1 demonstrates how ODRL can be used to specify two policies assigned by ex:provider1, one that prohibits aggregating data from ex:dataset1 and another that permits reading data from ex:dataset1.

Listing 1: Policy governing access to ex:dataset1

    ex:storedPolicy1 a odrl:Set ;
      odrl:prohibition [ a odrl:Prohibition ;
        odrl:assigner ex:provider1 ;
        odrl:target ex:dataset1 ;
        odrl:action odrl:aggregate ] .
    ex:storedPolicy2 a odrl:Set ;
      odrl:permission [ a odrl:Permission ;
        odrl:assigner ex:provider1 ;
        odrl:target ex:dataset1 ;
        odrl:action odrl:read ] .

2.1 Selected ODRL Policy Types
In this paper, we go beyond simple access control policies and licenses and demonstrate how ODRL can be used to represent access requests, data offers and agreements. Although all types of policies share the same general structure (i.e. they all consist of a set of rules and a conflict resolution strategy) they differ in terms of their scope.

ODRL Request Policies contain rules that represent the terms of usage sought by a data consumer. The policy defined in Listing 2 can be used to specify that ex:consumer1 requests read access to ex:dataset1.

Listing 2: Request read access to ex:dataset1

    ex:request a odrl:Request ;
      odrl:permission [ a odrl:Permission ;
        odrl:assignee ex:consumer1 ;
        odrl:target ex:dataset1 ;
        odrl:action odrl:read ] .

ODRL Offer Policies contain rules that propose terms of usage to data consumers. The policy defined in Listing 3 offers ex:consumer1 read access to ex:dataset1 if they agree to a contract that prohibits them from aggregating the data.

Listing 3: Offer a contract for ex:dataset1

    ex:offer a odrl:Offer ;
      odrl:prohibition [ a odrl:Prohibition ;
        odrl:assigner ex:provider1 ;
        odrl:assignee ex:consumer1 ;
        odrl:target ex:dataset1 ;
        odrl:action odrl:aggregate ] .

ODRL Agreement Policies represent contracts between data producers and consumers that stipulate all terms of usage. The policy defined in Listing 4 states that ex:consumer1 has agreed to a contract that prohibits them from aggregating the data from ex:dataset1.

Listing 4: Construct an agreement for ex:dataset1

    ex:agreement a odrl:Agreement ;
      odrl:prohibition [ a odrl:Prohibition ;
        odrl:assigner ex:provider1 ;
        odrl:assignee ex:consumer1 ;
        odrl:target ex:dataset1 ;
        odrl:action odrl:aggregate ] ;
      odrl:permission [ a odrl:Permission ;
        odrl:assigner ex:provider1 ;
        odrl:assignee ex:consumer1 ;
        odrl:target ex:dataset1 ;
        odrl:action odrl:read ] .

3. ENFORCING ODRL POLICIES
Not only in PDLDMs but also in other domains, policies and especially licenses are widely used to stipulate terms of usage for assets. From a data producer perspective, governance and ensuring compliance with non-enforceable policies is difficult and can result in litigation, which can be a lengthy and expensive process. As such, when it comes to PDLDMs, it is necessary to make the distinction between enforceable and non-enforceable policies and to propose a framework that is capable of handling both. Another consideration is the fact that data consumers might be less eager to conduct business with data providers that offer complex and verbose contracts (even if they are able to comply with the verbose policies), as opposed to data providers that keep their contracts as concise as possible. We therefore propose an access control strategy which, on receipt of a request, verifies that the requested access is allowed and auto-generates contracts for non-enforceable policies that are as concise as possible (i.e. minimal contracts).

3.1 Enforceability of ODRL Policies
A policy is enforceable if restrictions on actions defined in the policy can actually be controlled by a system. In the context of ODRL we define an ODRL Action to be controllable if its execution is permitted or, in the case where its execution is prohibited, compliance with the prohibition can be controlled by the party who assigned the policy. Thus, a policy is defined to be enforceable if none of the actions it aims to prohibit is part of the set of uncontrollable actions.

3.2 Composition of Minimal Contracts
We propose an algorithm which auto-generates contracts for non-enforceable policies based on the workflow presented in Section 2. A data request which is submitted by a data consumer (Step 1) is matched against a set of stored policies based on the credentials of the requesting party and the actions relating to assets that they request (Step 2). This matching process does not only consider actions explicitly stated in the request but also those which are implicitly related to them and the relevant conflict resolution strategy. A contract which is composed and offered (Step 3) is represented as an ODRL Offer Policy and incorporates a set of requested permissions together with the terms of usage that are retrieved from the data provider’s stored policies.

Algorithm 1: Minimal Contract Composition Algorithm
Input: A set of applicable ODRL Policies P according to a certain ODRL Request Policy π_R.
Output: A minimal ODRL Offer Policy π_O.
 1 forall the policies π in P do
 2   forall the permission rules δ in π do
 3     add δ to the set of permission rules in π_O;
 4     add all new uncontrollable actions to the set of uncontrollable actions;
 5   end
 6   forall the prohibition rules δ in π do
 7     if prohibited action α is uncontrollable then
 8       add δ to the set of prohibition rules in π_O;
 9     end
10   end
11 end

Algorithm 1 (minimal contract composition) denotes the composition procedure that is used to generate minimal ODRL Offer Policies. The algorithm takes an ODRL Request Policy π_R and a respective set of applicable ODRL Policies P retrieved from the policy store as input and iterates over all policies in P.

• For each of the permission rules, the algorithm adds all actions that become uncontrollable once the permission has been granted to the overall set of uncontrollable actions of the policy, and adds the permission rule to the set of permission rules in π_O (lines 1-5).

• For each of the prohibition rules, the algorithm checks whether the rule prohibits an action that is defined to be uncontrollable (lines 6-7). If that is the case, the respective prohibition rule is added to the ODRL Offer Policy π_O (line 8).

The final ODRL Offer Policy π_O now consists of all requested permissions a data provider is able to grant as well as all non-enforceable prohibitions that are consequences of these permissions. The ODRL Offer Policy is subsequently offered to the data consumer that initiated the transaction (Step 3). If the data consumer agrees to the terms of the contract (i.e. accepts), an ODRL Agreement Policy is generated from the ODRL Offer Policy and persisted for accountability and compliance purposes (Step 4).

4. RELATED WORK
Möller and Dodds [7], De Virgilio et al. [2] and Kim et al. [5] all propose systems that can potentially be used to realise the LDM vision. Möller and Dodds [7] describe the Kasabi information marketplace, which is built on Linked Data principles. Although data publishers are required to supply licensing metadata, the authors do not detail how access to data is restricted or how licenses are enforced. De Virgilio et al. [2] present Nyaya, a system which can be used to manage different Semantic Web datasets. The authors discuss how their system can support user defined constraints; however, no specific consideration is given either to access policies or to licenses. Kim et al. [5] present an architecture that can be used to support Linked Open Data as a Service (LODaaS); however, they do not mention either access control or licensing.

When it comes to access control for RDF, broadly speaking researchers have focused on representing existing access control models and standards using semantic technology; proposing new access control models suitable for open, heterogeneous and distributed environments; and devising languages and frameworks that can be used to facilitate access control specification and maintenance. Kirrane et al. [6] provide a comprehensive survey of existing access control proposals for RDF. To date no specific consideration has been given to enforceable versus non-enforceable policies. There have, however, been a number of digital rights management proposals that use ODRL to model their access control and licensing policies. Guth et al. [3] demonstrate how ODRL can be used to exchange access control information and present a framework which can be used to enforce access control policies. Cabrio et al. [1] discuss how ODRL can be used to model licenses as opposed to access rights. Rodriguez-Doncel et al. [8] present a legal framework for publishing and consuming Linked Data and provide an overview of the existing vocabularies for rights and licensing represented using RDF. Villata and Gandon [11] present a framework which associates licensing terms with data and auto-generates an aggregated license.

In this paper, we go beyond existing proposals by demonstrating how ODRL can be used not only to represent access policies and licenses, but also to support contract negotiation in the form of data requests, data offers and data agreements. We subsequently present a framework which is capable of dealing with both enforceable and non-enforceable policies.

5. CONCLUSIONS AND FUTURE WORK
A digital data market is an online marketplace where data and potentially value-added services such as data cleansing, integration, analytics and visualisation are bought and sold. An LDM is a specific type of marketplace, which is built on top of the LDW and adheres to the Linked Data principles. If LDMs are to succeed, it is crucial that published data is accessed and used in a manner that is compliant with access restrictions and licenses. In this paper, we demonstrated how ODRL can be used to specify auto-generated contracts. We subsequently proposed a framework which can be used both to enforce access restrictions and to automatically generate contractual agreements for non-enforceable policies. In future work, we will investigate the various mechanisms that can be used to ensure policy compliance and accountability. We also plan to extend the existing framework to support advanced contract composition and privacy protection, using a combination of negotiation and reasoning techniques.

References
[1] Elena Cabrio, Alessio Palmero Aprosio, and Serena Villata. These are your rights. In Proceedings of the 11th Extended Semantic Web Conference (ESWC), 2014.
[2] Roberto De Virgilio, Giorgio Orsi, Letizia Tanca, and Riccardo Torlone. Semantic Data Markets: a Flexible Environment for Knowledge Management. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, 2011.
[3] Susanne Guth, Gustaf Neumann, and Mark Strembeck. Experiences with the Enforcement of Access Rights Extracted from ODRL-based Digital Contracts. In Proceedings of the 3rd ACM Workshop on Digital Rights Management, DRM ’03, 2003.
[4] Renato Iannella, Susanne Guth, Daniel Pähler, and Andreas Kasten. ODRL: Open Digital Rights Language 2.1. W3C ODRL Community Group, 2012. http://www.w3.org/community/odrl/.
[5] Seonho Kim, Ivan Berlocher, and Tony Lee. RDF based Linked Open Data Management as a DaaS Platform. 2015.
[6] Sabrina Kirrane, Alessandra Mileo, and Stefan Decker. Access Control and the Resource Description Framework: A Survey. Technical Report, 2015.
[7] Knud Möller and Leigh Dodds. The Kasabi Information Marketplace. In 21st World Wide Web Conference, Lyon, France, 2012.
[8] Victor Rodriguez-Doncel, Asunción Gómez-Pérez, and Nandana Mihindukulasooriya. Rights Declaration in Linked Data. In The 3rd International Workshop on Consuming Linked Data, 2013.
[9] Florian Stahl, Fabian Schomm, and Gottfried Vossen. The Data Marketplace Survey Revisited. Technical report, Working Papers, ERCIS - European Research Center for Information Systems, 2014.
[10] Simon Steyskal and Axel Polleres. Defining Expressive Access Policies for Linked Data using the ODRL Ontology 2.0. In Proceedings of the 10th International Conference on Semantic Systems, SEMANTICS 2014, 2014.
[11] Serena Villata and Fabien Gandon. Licenses Compatibility and Composition in the Web of Data. In The 3rd International Workshop on Consuming Linked Data, 2012.