=Paper=
{{Paper
|id=None
|storemode=property
|title=A Privacy Preference Ontology (PPO) for Linked Data
|pdfUrl=https://ceur-ws.org/Vol-813/ldow2011-paper01.pdf
|volume=Vol-813
|dblpUrl=https://dblp.org/rec/conf/www/SaccoP11
}}
==A Privacy Preference Ontology (PPO) for Linked Data==
Owen Sacco and Alexandre Passant

Digital Enterprise Research Institute, National University of Ireland, Galway, Galway, Ireland

firstname.lastname@deri.org

This work is funded by the Science Foundation Ireland under grant number SFI/08/CE/I1380 (Líon 2), by an IRCSET scholarship and by a Google Research Award.

Copyright is held by the author/owner(s). LDOW2011, March 29, 2011, Hyderabad, India.

===ABSTRACT===
Linked Data enables people to access other users' data stored in several places, distributed across the Web. Current Linked Data mechanisms mostly provide an open environment where all data is freely accessible, which could discourage some people from providing sensitive data in the Linking Open Data (LOD) cloud. Although the existing Web Access Control (WAC) vocabulary restricts RDF documents to specified users, it does not provide fine-grained privacy measures that specify complex restrictions on access to the data. In this paper, we propose a lightweight vocabulary on top of WAC, called the Privacy Preference Ontology (PPO), that enables users to create fine-grained privacy preferences for their data. The vocabulary is designed to restrict any resource to certain attributes which a requester must satisfy.

===Categories and Subject Descriptors===
I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods; K.4.1 [Public Policy Issues]: Privacy; K.6.5 [Management of Computing and Information Systems]: Security and Protection

===General Terms===
Design, Security

===Keywords===
Privacy, Linked Data, WebID, Web Access Control, FOAF, RDF, Named Graphs

===1. INTRODUCTION===
The Linked Data community encourages Web users to format their data and publish it on the Web in machine-processable formats so that other datasets can be linked to this published data. However, as pointed out in [2], one of the challenges of Linked Data is privacy. Datasets are being published in the Linking Open Data (LOD) cloud without any metadata that describes privacy restrictions, and therefore the data is publicly accessible. One vocabulary that describes access control privileges is the Web Access Control (WAC) vocabulary (http://www.w3.org/ns/auth/acl). This vocabulary enables owners to create access control lists (ACL) that specify the access privileges of the users that can access the data. The WAC vocabulary defines the Read and Write access control privileges (for reading or updating data) as well as the Control privilege, which grants access to modify the ACL. This vocabulary is designed to specify access control for a full RDF document rather than for specific data contained within the RDF document.

In [13], the authors argue that protecting data does not only mean granting or denying full access: in certain instances, fine-grained access control mechanisms are required to restrict individual pieces of information. For instance, users could define which specific microblog posts in SMOB (http://smob.me) are shared with certain users only, based on #tags. The Linked Data infrastructure currently lacks mechanisms for creating such fine-grained privacy preferences that define which data can be accessed by whom. This might discourage Web users from publishing sensitive data such as the personal information contained in FOAF profiles.

In this paper, we propose the Privacy Preference Ontology (PPO), a lightweight vocabulary on top of the Web Access Control ontology that aims to provide users with the means to define fine-grained privacy preferences for restricting (or granting) access to specific RDF data. As we rely on Semantic Web technologies to enable these privacy preferences, our proposed vocabulary is platform independent and can thus be used by any system relying on these technologies.

The remainder of the paper is organised as follows. Section 2 presents some use cases where privacy is a concern. In Section 3, we present our Privacy Preference Ontology (PPO), online at http://vocab.deri.ie/ppo#, and discuss how to apply it to protect sensitive data. Section 4 provides a brief description of current privacy research related to protecting RDF data and social networks. Finally, Section 5 concludes this paper and gives an overview of future work.

===2. MOTIVATIONS===
Open social networks can contain users' information described in RDF, using common vocabularies such as FOAF. Applications are being developed to export user information stored within closed social networks into RDF, while various projects, such as Drupal 7 (http://drupal.org/), now directly support these models to represent user data. Current social networks provide minimal privacy settings, such as granting all people belonging to one's social graph the privilege to access her/his information. Imagine a social network where users would be able to specify which information can be shared only with some contacts or friends, e.g. the ones having similar interests. This would make users feel more confident when publishing such information, without being concerned that it could be reused. Moreover, such a system would let users fully control who can access their personal information and who can access their published RDF data.

Ideally, data owners can specify a set of attributes which requesters must satisfy in order to be granted access to the requested information. For example, a user can set a privacy preference to share an e-mail address only with those who belong to his company. This could be achieved by executing a SPARQL query combining a privacy preference pattern and the FOAF description of the requester, as suggested in [14]. In this social network scenario, the WebID protocol [15] can be used to authenticate a user, and it also provides a secure connection to a user's personal information stored in a FOAF profile [7]. Therefore, once a user authenticates using WebID when visiting another user's profile, the privacy preferences could be checked to determine which information can be accessed.

Another scenario relates to online publications, and in particular microblogging. Currently, most microblogging systems allow any user to access posts created by others. As pointed out in [13], sensitive posts, such as the ones shared within an organisation, require more complex access restrictions. In SMOB [12], microblog posts are described in RDF using ontologies such as SIOC (for describing posts) and FOAF (for describing user profiles). Additionally, SMOB provides the ability to tag microblog posts with concepts taken from datasets such as DBpedia (http://dbpedia.org/) and GeoNames (http://www.geonames.org/), unlike microblogging systems such as Twitter that only allow text-based tags. While SMOB relies on the Online Presence Ontology (OPO) [14] so that messages can be directed to particular users, further privacy preferences are required, such as restricting access to posts only to some people, for example the ones having interests related to the post's topic (based on its tags). Since SMOB relies on Semantic Web technologies and Linked Data, advanced privacy preferences can be easily applied. For example, if a user wants to restrict a microblog post tagged with a particular topic to a group of friends, this privacy preference can be applied by restricting the post to users who are interested in one of the "semantic tags" used in the post, each tag being identified by its own URI, e.g. from DBpedia.

===3. PRIVACY PREFERENCE ONTOLOGY===
The previous use cases illustrate situations where fine-grained privacy preferences are required. We therefore created a dedicated vocabulary called the Privacy Preference Ontology (PPO) to describe privacy preferences that can restrict access to information represented as Linked Data. Since Linked Data uses RDF as a representation format, the privacy preferences must be able to restrict access to particular RDF data. In particular, the vocabulary should provide the ability to restrict access to: (1) a particular statement; (2) a group of statements (i.e. an RDF graph); or (3) a resource, either as a subject or an object of a particular statement.

Access is restricted according to patterns which users (that want to access data) must satisfy, for instance having a particular interest or being a member of a group. We rely on the Web Access Control vocabulary to describe the access privilege to the data: either Read, Write or both. Therefore, a privacy preference contains properties defining: (1) which resource, statement or graph to restrict access to; (2) the type of restriction; (3) the access control type; and (4) a SPARQL query containing a graph pattern representing what must be satisfied by the user requesting information.

Currently we assume that the user's information is trustworthy. Eventually, we plan to extend the vocabulary to cater for situations where the user's information is not a reliable source, by relying on trust measures [6].

One way to use this ontology is to define a personal Privacy Preference Manager (PPM), providing users with the means to specify preferences based on their FOAF profile. The PPM can then be used to grant privileges to requesters that want to access the user's information. Figure 2 illustrates the related concept: (1) a requester authenticates to the other user's PPM using the WebID protocol; (2) the privacy preferences are queried to identify which preference applies; (3) the preferences are matched against the requester's profile to test what the requester can access; (4) the requested information (in this case, FOAF data) is retrieved based on what can be accessed; and (5) the requester is provided with the data she/he can access. This privacy manager is not limited to data described in FOAF, but applies to any RDF data since PPO is ontology-agnostic. For instance, it can be used to restrict microblog posts described using SIOC and the other ontologies used in SMOB.

Figure 2: The Privacy Preference Manager

====3.1 Ontology====
Figure 1: The Privacy Preference Ontology

The Privacy Preference Ontology (PPO) illustrated in Figure 1 provides: (1) a main class called PrivacyPreference that defines a privacy preference; (2) some properties to define which statement, resource and/or graph is to be restricted; (3) some properties that define conditions in order to create specific privacy preferences; (4) some properties to define which access privilege should be granted; and (5) some properties that define which attribute patterns a requester must satisfy. Moreover, a user may want to define global preferences such as restricting access to values that have a specific property. For instance, if one wants to restrict access to all statements containing foaf:homepage, rather than only the ones linking to a specific homepage, s/he can create a condition that restricts every statement containing the foaf:homepage property. Hence, the restriction levels provided by PPO can be seen as a tree that has at its top node an instance of a class or property from any ontology and descends to specific data value nodes found in RDF statements. The privacy preference restrictions can be applied to any node within this tree. The other classes and properties provided by PPO are explained below.

'''Condition.''' The Condition class is used to define restrictions within a privacy preference. These restrictions are applied using the properties provided by this class. The resourceAsSubject property provides a condition whereby a resource is used as a subject in a statement. Similarly, the resourceAsObject property is used to apply a condition whenever a resource is used as an object. In certain cases, users want to specify instances of a particular class. This is achieved by using the classAsSubject or the classAsObject properties. When the classAsSubject property is used, the privacy preference applies to those statements whose subject is an instance of the specified class. Likewise, if the classAsObject property is used, the privacy preference applies to those statements whose object is an instance of the class. The hasProperty property restricts all occurrences of a particular property used in RDF statements. This means that if one uses hasProperty together with foaf:phone, all statements containing this property will be restricted. In certain scenarios, users need to restrict access to statements based on a particular literal value contained within statements. This can be achieved by using the hasLiteral property. This property is useful when the user is not aware of which property describes the literal. In this scenario, the hasLiteral property must be used with care: if another statement has the same value but a different property, that statement is also restricted. This property can also be used together with hasProperty. For instance, if a user wants to restrict a particular value of a specific property, then both hasProperty and hasLiteral must be used, such as restricting access to a mobile phone number (foaf:phone) but allowing access to the land-line phone number, based on the number prefix (hasLiteral).

'''appliesToResource.''' The appliesToResource property is used to specify which resource must be restricted. This property restricts statements that contain the resource's URI, whether as a subject or as an object. The user may create a condition to distinguish whether the resource is a subject or an object by using the resourceAsSubject or resourceAsObject properties respectively.

'''appliesToStatement.''' The appliesToStatement property is used to specify which statement must be restricted. When using this property, the user must specify the subject, predicate and object of the statement which needs to be restricted.

'''appliesToNamedGraph.''' In certain cases, users require a group of statements to be restricted using similar conditions, and it would be cumbersome to create a preference for each statement using the appliesToStatement property. Hence, users can use named graphs [1] to combine statements and apply a privacy preference to the graph, using the appliesToNamedGraph property. Named graphs are identified by URIs, which can be used to refer to a particular named graph that needs to be restricted. Although named graphs are not yet standardised within the RDF specification, they are accepted by the SPARQL specification and are in the scope of the new W3C RDF Working Group.

'''hasAccessSpace.''' In the previous scenarios, we mentioned that it may be cumbersome for users to update their preferences by adding or removing users manually, since users' interests or relationships change over time. Rather than specifying who can access the resources, we suggest specifying a set of attributes that are required to access some data. This can be done by using a SPARQL ASK query that contains a graph pattern specifying which attributes and properties must be satisfied. By executing the query on the requester's FOAF profile, we know whether the requester satisfies these attributes or not. The SPARQL query is described as a Literal in the privacy preference using the hasAccessQuery property. The hasAccessQuery property is defined within a class called AccessSpace, which denotes a space of access test queries. Finally, the property hasAccessSpace represents the relationship between the privacy preference and the access space. Unfortunately, the current SPARQL specification does not cater for triggers similar to the DBMS trigger concept (although some SPARQL engines, such as ARC2, http://arc.semsol.org, provide triggers). Therefore, the query defined in hasAccessQuery has to be executed by an explicit system call rather than being called automatically when a hasAccessQuery property appears within the privacy preference.

'''hasAccess.''' The Privacy Preference Ontology provides a property that describes the type of access control to be granted when a privacy preference applies. The hasAccess property defines the access control using the Web Access Control vocabulary described in Section 1.

====3.2 Creating Privacy Preferences====
Privacy preferences can easily be created using the PPO and the Web Access Control vocabulary. For example, if a user wants to create a privacy preference that restricts the phone number to whoever works at DERI, the following has to be defined. We assume that a PPO interpreter knows the common prefixes used in the SPARQL queries; they could also be defined in the ASK pattern.

<pre>
<http://www.example.org/pp1>
    a ppo:PrivacyPreference ;
    ppo:hasCondition
        [ ppo:hasProperty foaf:phone ] ;
    ppo:hasAccess acl:Read ;
    ppo:hasAccessSpace
        [ ppo:hasAccessQuery
            "ASK { ?x foaf:workplaceHomepage <http://www.deri.ie> }" ] .
</pre>

This example states that every statement in the user's profile that contains the foaf:phone property is restricted. If the user requires only a particular foaf:phone value to be restricted, then the user must also define the phone number in the condition by using the hasLiteral property. As mentioned in the previous section, the SPARQL query is executed on the requester's FOAF profile by the system once it detects that the preference contains a query. Since the query is a SPARQL ASK query, it returns either true or false depending on whether the requester's information satisfies the graph pattern. If the query returns true, the requester is granted access to the statement; otherwise the requester is not allowed access. As previously mentioned, we currently assume that the statements in the requester's FOAF file can be trusted, and we tackle this issue separately.

The following example shows how to restrict a microblog post to users that share an interest similar to the concept used to tag the post. Restricting posts tagged with the concept of Linked Data to all users interested in Linked Data is done as follows:

<pre>
<http://www.example.org/pp2>
    a ppo:PrivacyPreference ;
    ppo:appliesToResource <http://smob.me/user/xyz/post1> ;
    ppo:hasAccess acl:Read ;
    ppo:hasCondition
        [ ppo:hasProperty tag:Tag ;
          ppo:resourceAsObject <http://dbpedia.org/resource/Linked_Data> ] ;
    ppo:hasAccessSpace
        [ ppo:hasAccessQuery
            "ASK { ?x foaf:topic_interest <http://dbpedia.org/resource/Linked_Data> }" ] .
</pre>

===4. RELATED WORK===
The Platform for Privacy Preferences (P3P, http://www.w3.org/TR/P3P/) specifies a protocol that enables Web sites to share their privacy policies with Web users. The privacy policies are expressed in XML, which can be easily parsed by user agents. This platform does not ensure that Web sites act according to their publicised policies. Moreover, since this platform aims to enable Web sites to define their privacy policies, it does not address our aim of enabling users to define their own privacy preferences. The Protocol for Web Description Resources (POWDER, http://www.w3.org/TR/powder-dr/) is designed to express statements that describe what a collection of RDF statements is about. The descriptions expressed using this protocol are text based and therefore do not contain any semantics that can define what the description states. In contrast, our approach enables users to define what the privacy preferences are about and hence makes it easier for other systems to use such preferences.

The authors in [9] propose a formal privacy preference model consisting of relationships between objects and subjects. Objects consist of resources and actions, whereas subjects are those roles that are allowed to perform the action on the resource. The privacy settings based on this formal model are implemented using Protune [3], a policy framework that consists of a policy language and a policy reasoner. This implies that any system using this method must include the Protune framework. Since our aim is to propose a lightweight vocabulary that is platform independent, relying on the Protune policy engine does not meet our goal. Moreover, the proposed formal model relies on specifying precisely who can access the resource. Our approach provides a more flexible solution in which the user specifies attributes that the requester must satisfy.

The authors in [4] propose an access control framework for social networks in which privacy rules are specified using the Semantic Web Rule Language (SWRL, http://www.w3.org/Submission/SWRL/). This approach is also based on specifying who can access which resource. Moreover, it requires the system to contain a SWRL reasoner. In [5] the authors propose a relation-based access control model called RelBac, which provides a formal model based on relationships amongst communities and resources. This approach also requires specifically defining who can access the resource(s).

The authors in [11] propose a tag-based model to create privacy settings for medical applications, which consists of annotating resources with different access policy rules. The privacy rules are denoted in a system-specific language, so only that system can interpret the access control. The authors in [10] also propose an annotation-based access control model. This approach enables users to annotate resources and also to annotate users. The access control rules then specify which resource annotations can be accessed by which user annotations. Although this approach might be more flexible than other systems, it still relies on specifying who can access the resource.

In [14] the authors propose a method to direct messages, such as microblog posts in SMOB, to specific users according to their online status. The authors also propose the idea of a SharingSpace, which represents the person or group of persons who can access the messages, and describe how a SharingSpace can be a dynamic group constructed using a SPARQL CONSTRUCT query. However, the proposed ontology only allows relating the messages to a pre-constructed group.

In [8] the authors propose a system whereby users can set access control on RDF documents. The access controls are described using the Web Access Control vocabulary by specifying who can access which RDF document. Authentication to this system is achieved using the WebID protocol [15]. This protocol uses FOAF+SSL techniques whereby a user provides a certificate which contains a URL that denotes the user's FOAF profile. The public key from the FOAF profile and the public key contained in the certificate provided by the user are matched to allow or disallow access. Our approach extends the Web Access Control vocabulary to provide more fine-grained access control to the data rather than to the whole RDF document.

===5. CONCLUSION AND FUTURE WORK===
In this paper we argued that Linked Data lacks sufficiently fine-grained privacy preferences. We therefore proposed a lightweight vocabulary which provides classes and properties to define fine-grained privacy preferences for RDF data. The privacy preferences define what needs to be protected, the conditions that create fine-grained restrictions, which access control privilege will be granted, and a space that defines which attributes a requester must satisfy in order to access the resource. The access control privileges are described using the Web Access Control vocabulary. We plan to extend the PPO to also restrict actions which are commonly found in Social Web applications, and we also plan to extend our work to cater for conflicting privacy preferences. Additionally, we will investigate a formal model for PPO and its relationships with RDFS and OWL entailments, to ensure that preferences can also apply to inferred data (for example to restrict the sub-properties or subclasses of the property or class being restricted). This step is also required to ensure that inferred statements do not introduce vulnerabilities. Moreover, we are currently developing the Privacy Preference Manager mentioned in Section 3, which provides a user-friendly interface where users can specify privacy preferences described using the Privacy Preference Ontology, and which applies the privacy preferences when the RDF data is accessed.

===6. REFERENCES===
[1] C. Bizer and J. Carroll. Modelling Context Using Named Graphs. In W3C Semantic Web Interest Group Meeting, 2004.

[2] C. Bizer, T. Heath, and T. Berners-Lee. Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems, 2009.

[3] P. Bonatti and D. Olmedilla. Driving and Monitoring Provisional Trust Negotiation with Metapolicies. In Sixth IEEE International Workshop on Policies for Distributed Systems and Networks, 2005.

[4] B. Carminati, E. Ferrari, R. Heatherly, M. Kantarcioglu, and B. Thuraisingham. A Semantic Web Based Framework for Social Network Access Control. In Proceedings of the 14th ACM Symposium on Access Control Models and Technologies, SACMAT '09, 2009.

[5] F. Giunchiglia, R. Zhang, and B. Crispo. Ontology Driven Community Access Control. Trust and Privacy on the Social and Semantic Web, SPOT'09, 2009.

[6] O. Hartig. Querying Trust in RDF Data with tSPARQL. In 6th Annual European Semantic Web Conference, ESWC'09, 2009.

[7] B. Heitmann, J. Kim, A. Passant, C. Hayes, and H. Kim. An Architecture for Privacy-Enabled User Profile Portability on the Web of Data. In Proceedings of the 1st International Workshop on Information Heterogeneity and Fusion in Recommender Systems, HetRec '10, 2010.

[8] J. Hollenbach and J. Presbrey. Using RDF Metadata to Enable Access Control on the Social Semantic Web. In Proceedings of the Workshop on Collaborative Construction, Management and Linking of Structured Knowledge, CK'09, 2009.

[9] P. Kärger and W. Siberski. Guarding a Walled Garden: Semantic Privacy Preferences for the Social Web. The Semantic Web: Research and Applications, 2010.

[10] P. Nasirifard, V. Peristeras, and S. Decker. Annotation-Based Access Control for Collaborative Information Spaces. Computers in Human Behavior, 2010.

[11] S. Nepal, J. Zic, F. Jaccard, and G. Kraehenbuehl. A Tag-Based Data Model for Privacy-Preserving Medical Applications. Current Trends in Database Technology, 2006.

[12] A. Passant, J. Breslin, and S. Decker. Rethinking Microblogging: Open, Distributed, Semantic. In Proceedings of the 10th International Conference on Web Engineering, ICWE'10, 2010.

[13] A. Passant, P. Kärger, M. Hausenblas, D. Olmedilla, A. Polleres, and S. Decker. Enabling Trust and Privacy on the Social Web. In W3C Workshop on the Future of Social Networking, 2009.

[14] M. Stankovic, A. Passant, and P. Laublet. Directing Status Messages to Their Audience in Online Communities. In Proceedings of the 5th International Conference on Coordination, Organizations, Institutions, and Norms in Agent Systems, 2010.

[15] H. Story, B. Harbulot, I. Jacobi, and M. Jones. FOAF+SSL: RESTful Authentication for the Social Web. Semantic Web Conference, 2009.
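
The access check described in Section 3.2 (match a statement against a preference's condition, then run the preference's ASK query on the requester's FOAF profile) can be illustrated with a minimal sketch. This is not the authors' Privacy Preference Manager: the `Preference` class, the function names and the sample data are hypothetical, triples are plain Python tuples rather than a real RDF store, and the ASK query is simplified to a single triple pattern. Two policy choices are assumptions made here and are not stated in the paper: statements covered by no preference remain readable, and every matching preference must be satisfied.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]
# A single triple pattern; None plays the role of a SPARQL variable such as ?x.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]

FOAF = "http://xmlns.com/foaf/0.1/"


@dataclass(frozen=True)
class Preference:
    """Hypothetical in-memory stand-in for a ppo:PrivacyPreference."""
    has_property: str            # ppo:hasProperty: the restricted predicate
    has_literal: Optional[str]   # ppo:hasLiteral: None restricts every value
    access_pattern: Pattern      # simplified ppo:hasAccessQuery (one ASK pattern)


def ask(pattern: Pattern, profile: List[Triple]) -> bool:
    """Mimic a one-pattern SPARQL ASK: True if any profile triple matches."""
    return any(
        all(want is None or want == got for want, got in zip(pattern, triple))
        for triple in profile
    )


def applies_to(pref: Preference, triple: Triple) -> bool:
    """Check the condition part of a preference against one statement."""
    _, predicate, obj = triple
    if predicate != pref.has_property:
        return False
    return pref.has_literal is None or pref.has_literal == obj


def visible_triples(data: List[Triple], profile: List[Triple],
                    prefs: List[Preference]) -> List[Triple]:
    """Return the owner's statements that the requester may read."""
    visible = []
    for triple in data:
        matching = [p for p in prefs if applies_to(p, triple)]
        # Assumption: statements covered by no preference stay readable, and
        # every matching preference must be satisfied by the requester.
        if all(ask(p.access_pattern, profile) for p in matching):
            visible.append(triple)
    return visible


# Hypothetical sample data mirroring the pp1 example (phone restricted to DERI).
ME = "http://www.example.org/me"
owner_data = [
    (ME, FOAF + "name", "Example User"),
    (ME, FOAF + "phone", "tel:+000-000-0000"),
]
pp1 = Preference(
    has_property=FOAF + "phone",
    has_literal=None,
    access_pattern=(None, FOAF + "workplaceHomepage", "http://www.deri.ie"),
)
colleague = [("http://www.example.org/bob", FOAF + "workplaceHomepage",
              "http://www.deri.ie")]
stranger = [("http://www.example.org/eve", FOAF + "workplaceHomepage",
             "http://www.example.com")]

print(visible_triples(owner_data, colleague, [pp1]))  # name and phone
print(visible_triples(owner_data, stranger, [pp1]))   # name only
```

A real implementation would parse the preference from RDF and hand the literal ASK query to a SPARQL engine; the tuple matching above only stands in for that step.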