=Paper= {{Paper |id=None |storemode=property |title=A Privacy Preference Ontology (PPO) for Linked Data |pdfUrl=https://ceur-ws.org/Vol-813/ldow2011-paper01.pdf |volume=Vol-813 |dblpUrl=https://dblp.org/rec/conf/www/SaccoP11 }} ==A Privacy Preference Ontology (PPO) for Linked Data== https://ceur-ws.org/Vol-813/ldow2011-paper01.pdf
     A Privacy Preference Ontology (PPO) for Linked Data∗

                                             Owen Sacco and Alexandre Passant
                                               Digital Enterprise Research Institute
                                               National University of Ireland, Galway
                                                          Galway, Ireland
                                                firstname.lastname@deri.org


ABSTRACT
Linked Data enables people to access other users’ data stored in several places, distributed across the Web. Current Linked Data mechanisms mostly provide an open environment where all data is freely accessible, which could discourage some people from providing sensitive data in the Linking Open Data (LOD) cloud. Although the existing Web Access Control (WAC) vocabulary restricts RDF documents to specified users, it does not provide fine-grained privacy measures that specify complex restrictions on access to the data. In this paper, we propose a lightweight vocabulary, built on top of WAC, called the Privacy Preference Ontology (PPO), which enables users to create fine-grained privacy preferences for their data. The vocabulary is designed to restrict any resource to certain attributes which a requester must satisfy.

Categories and Subject Descriptors
I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods; K.4.1 [Public Policy Issues]: Privacy; K.6.5 [Management of Computing and Information Systems]: Security and Protection

General Terms
Design, Security

Keywords
Privacy, Linked Data, WebID, Web Access Control, FOAF, RDF, Named Graphs

1. INTRODUCTION
The Linked Data community encourages Web users to format their data and publish it on the Web in machine-processable formats so that other datasets can be linked to this published data. However, as pointed out in [2], one of the challenges of Linked Data is privacy. Datasets are being published in the Linking Open Data (LOD) cloud without any metadata that describes privacy restrictions, and therefore the data is publicly accessible. A vocabulary that describes access control privileges is the Web Access Control (WAC) vocabulary¹. This vocabulary enables owners to create access control lists (ACL) that specify access privileges for the users that can access the data. The WAC vocabulary defines the Read and Write access control privileges (for reading or updating data) as well as the Control privilege to grant access to modify the ACL. This vocabulary is designed to specify access control to the full RDF document rather than to specific data contained within the RDF document.

In [13], the authors argue that protecting data does not only mean granting full access or none, but that in certain instances fine-grained access control mechanisms are required to restrict pieces of information. For instance, users could define which specific microblog posts in SMOB² are shared with certain users only, based on #tags. The Linked Data infrastructure, however, currently lacks mechanisms for creating fine-grained privacy preferences that define which data can be accessed by whom. This might discourage Web users from publishing sensitive data such as the personal information contained in FOAF profiles.

In this paper, we propose the Privacy Preference Ontology (PPO), a lightweight vocabulary on top of the Web Access Control ontology that aims to provide users with means to define fine-grained privacy preferences for restricting (or granting) access to specific RDF data. As we rely on Semantic Web technologies to enable these privacy preferences, our proposed vocabulary is platform independent and can thus be used by any system relying on these technologies.

The remainder of the paper is organised as follows: Section 2 presents some use cases where privacy is a concern. In Section 3, we present our Privacy Preference Ontology (PPO), online at http://vocab.deri.ie/ppo#, and discuss how to apply it to protect sensitive data. Section 4 provides a brief description of current privacy research related to protecting RDF data and social networks. Finally, Section 5 concludes this paper and gives an overview of future work.

∗ This work is funded by the Science Foundation Ireland under grant number SFI/08/CE/I1380 (Líon 2), by an IRCSET scholarship and by a Google Research Award.
¹ WAC: http://www.w3.org/ns/auth/acl
² SMOB: http://smob.me
³ Drupal: http://drupal.org/

Copyright is held by the author/owner(s).
LDOW2011 March 29, 2011, Hyderabad, India.

2. MOTIVATIONS
Open social networks can contain users’ information described in RDF, using common vocabularies such as FOAF to describe this data. Applications are being developed to export user information stored within closed social networks into RDF, while various projects, such as Drupal 7³, now directly support these models to represent user data. Current
social networks provide only minimal privacy settings, such as granting all people belonging to one’s social graph the privilege to access her/his information. Imagine a social network where users would be able to specify which information can be shared only with some contacts or friends, e.g. the ones having similar interests. This would make users feel more confident when publishing such information, without being concerned that it could be reused. Moreover, such a system would let users fully control who can access their personal information and who can access their published RDF data. Ideally, data owners can specify a set of attributes which requesters must satisfy in order to be granted access to the requested information. For example, a user can set a privacy preference to share an e-mail address only with those who belong to his company. This could be achieved by executing a SPARQL query combining a privacy preference pattern and the FOAF description of the requester, as suggested in [14]. In this social network scenario, the WebID protocol [15] can be used to authenticate a user; it also provides a secure connection to a user’s personal information stored in a FOAF profile [7]. Therefore, once a user authenticates using WebID when visiting another user’s profile, the privacy preferences could be checked to determine which information can be accessed.

Another scenario relates to online publications, and in particular microblogging. Currently, most microblogging systems allow any user to access posts created by others. As pointed out in [13], sensitive posts, such as the ones shared within an organisation, require more complex access restrictions. In SMOB [12], microblog posts are described in RDF using ontologies such as SIOC (for describing posts) and FOAF (for describing user profiles). Additionally, SMOB provides the ability to tag microblog posts with concepts taken from databases such as DBpedia⁴ and GeoNames⁵, unlike microblogging systems such as Twitter that only allow text-based tags. While SMOB relies on the Online Presence Ontology (OPO) [14] so that messages can be directed to particular users, further privacy preferences are required, such as restricting access to posts only to some people, for example the ones having interests related to the post’s topic (based on its tags). Since SMOB relies on Semantic Web technologies and Linked Data, advanced privacy preferences can be easily applied. For example, if a user wants to restrict a microblog post tagged with a particular topic to a group of friends, this privacy preference can be applied by restricting the post to users interested in one of the “semantic tags” used in the post, each tag being defined with its own URI, e.g. from DBpedia.

⁴ DBpedia: http://dbpedia.org/
⁵ GeoNames: http://www.geonames.org/

3. PRIVACY PREFERENCE ONTOLOGY
The previous use cases illustrate situations where fine-grained privacy preferences are required. We therefore created a dedicated vocabulary called the Privacy Preference Ontology (PPO) to describe privacy preferences that can restrict access to information represented as Linked Data. Since Linked Data uses RDF as a representation format, this requires the privacy preferences to restrict access to particular RDF data. In particular, the vocabulary should provide the ability to restrict access to: (1) a particular statement; or (2) a group of statements (i.e. an RDF graph); or (3) a resource, either as a subject or an object of a particular statement.

Access is restricted according to patterns which users (that want to access data) must satisfy, for instance having a particular interest or being a member of a group. We rely on the Web Access Control vocabulary to describe the access privilege to the data: either Read, Write or both. Therefore, a privacy preference contains properties defining: (1) which resource, statement or graph to restrict access to; (2) the type of restriction; (3) the access control type; and (4) a SPARQL query containing a graph pattern representing what must be satisfied by the user requesting information.

Currently we assume that the user’s information is trustworthy. Eventually, we plan to extend the vocabulary to cater for situations where the user’s information is not a reliable source, by relying on trust measures to do so [6].

One way to use this ontology is to define a personal Privacy Preference Manager (PPM), providing users with means to specify preferences based on their FOAF profile. The PPM can then be used to grant privileges to requesters that want to access the user’s information. Figure 2 illustrates the related concept: (1) a requester authenticates to the other user’s PPM using the WebID protocol; (2) the privacy preferences are queried to identify which preference applies; (3) the preferences are matched against the requester’s profile to test what the requester can access; (4) the requested information (in this case, FOAF data) is retrieved based on what can be accessed; and (5) the requester is provided with the data she/he can access. This privacy manager is not limited to data described in FOAF, but applies to any RDF data since PPO is ontology-agnostic. For instance, it can be used to restrict microblog posts described using SIOC and other ontologies used in SMOB.

Figure 2: The Privacy Preference Manager

3.1 Ontology
The Privacy Preference Ontology (PPO) illustrated in Figure 1 provides: (1) a main class called PrivacyPreference that defines a privacy preference; (2) some properties to define which statement, resource and/or graph is to be restricted; (3) some properties that define conditions in order to create specific privacy preferences; (4) some properties to define which access privilege should be granted; and (5) some properties that define which attribute patterns a requester must satisfy. Moreover, a user may want to define global preferences such as restricting access to values that have a specific property. For instance, if one wants to restrict access to all statements containing foaf:homepage, rather than only the ones linking to a specific homepage, s/he can create a condition that restricts every statement containing the foaf:homepage property. Hence, the restriction levels
provided by PPO can be seen as a tree graph that contains at the top node an instance of a class or property from any ontology, down to specific data value nodes found in RDF statements. The privacy preference restrictions can be applied to any node within this tree graph. The other classes and properties provided by PPO are explained below.

Figure 1: The Privacy Preference Ontology

Condition. The Condition class is used to define restrictions within a privacy preference. These restrictions can be applied using the properties provided by this class. The resourceAsSubject property provides a condition whereby a resource is used as a subject in a statement. Similarly, the resourceAsObject property is used to apply a condition whenever a resource is defined as an object. In certain cases, users would want to specify instances of a particular class. This is achieved by using the classAsSubject or the classAsObject properties. When the classAsSubject property is used, the privacy preference applies to those statements whose subject is an instance of the specified class. Similarly, if the classAsObject property is used, the privacy preference applies to those statements whose object is an instance of the class. The property hasProperty restricts all instances of a particular property used in RDF statements. This means that if one is using hasProperty together with foaf:phone, all statements containing this property will be restricted. In certain scenarios, users may need to restrict access to statements based on a particular literal value contained within statements. This can be achieved by using the hasLiteral property. This property is useful when the user is not aware of which property describes the literal. In this scenario, the hasLiteral property must be used with care: if another statement has the same value but a different property, that statement is also restricted. This property can also be used together with hasProperty. For instance, if a user wants to restrict a particular value of a specific property, then both hasProperty and hasLiteral must be used, such as restricting access to a mobile phone number (foaf:phone) but allowing access to the land-line phone number, based on the number prefix (hasLiteral).

appliesToResource. The appliesToResource property is used to specify which resource must be restricted. This property restricts statements that contain the resource’s URI, whether it appears as a subject or as an object. The user may create a condition to distinguish whether the resource is a subject or an object by using the resourceAsSubject or resourceAsObject properties respectively.

appliesToStatement. The appliesToStatement property is used to specify which statement must be restricted. When using this property, the user must specify the subject, predicate and object of the statement which needs to be restricted.

appliesToNamedGraph. In certain cases, users require a group of statements to be restricted using similar conditions. Yet, it would be cumbersome to create a preference for each statement using the appliesToStatement property. Hence, users can use named graphs [1] to combine statements and apply a privacy preference to the graph, using the appliesToNamedGraph property. Named graphs are identified with URIs which can be used to refer to a particular named graph that needs to be restricted. Although named graphs are not yet standardised within the RDF specification, they are accepted by the SPARQL specification and are in the scope of the new W3C RDF Working Group.

hasAccessSpace. It may be cumbersome for users to update their preferences by adding or removing users manually, since users’ interests or relationships change over time. Rather than specifying who can access the resources, we suggest using a set of attributes that a requester must have in order to access some data. This can be done by using a SPARQL ASK query that contains a graph pattern specifying which attributes and properties must be satisfied. By executing the query on the requester’s FOAF profile, we know whether the requester satisfies these attributes or not. The SPARQL query is described as a Literal in the privacy preferences using the hasAccessQuery property. The hasAccessQuery property is defined within a class called AccessSpace which denotes a space of access test queries. Finally, the property hasAccessSpace represents the relationship between the privacy preference and the access space. Unfortunately, the current SPARQL specification does not cater for triggers similar to the DBMS trigger concept⁶. Therefore, the query defined in hasAccessQuery has to be executed by an explicit system call rather than being called automatically whenever a hasAccessQuery property appears within the privacy preference.

hasAccess. The Privacy Preference Ontology provides a property that describes the type of access control to be granted when a privacy preference applies. The hasAccess property defines the access control described using the Web Access Control vocabulary described in Section 1.

3.2 Creating Privacy Preferences
Privacy preferences can easily be created using the PPO and the Web Access Control vocabulary. For example, if a user wants to create a privacy preference that restricts the phone number to whoever works at DERI, the following has to be defined⁷.

    <http://www.example.org/pp1>
       a ppo:PrivacyPreference ;

       ppo:hasCondition
         [
             ppo:hasProperty foaf:phone
         ] ;

       ppo:hasAccess acl:Read ;

       ppo:hasAccessSpace
         [
             ppo:hasAccessQuery
             "ASK { ?x foaf:workplaceHomepage <http://www.deri.ie> }"
         ] .

This example states that every statement in the user’s profile that contains the property foaf:phone is restricted. If the user requires a particular foaf:phone to be restricted, then the user must also define the phone number in the condition by using the hasLiteral property. As mentioned in the previous section, the SPARQL query is executed on the requester’s FOAF profile by the system once it detects that the preference contains a query. Since the query is a SPARQL ASK query, it returns either True or False depending on whether the requester’s information satisfies the graph pattern. If the query returns True then the requester is granted access to the statement, otherwise the requester is not allowed access⁸.

The following example shows how to restrict a microblog post to users that share an interest similar to the concept used to tag the post. Restricting posts tagged with the concept of Linked Data to all users interested in Linked Data is done as follows:

    <http://www.example.org/pp2>
       a ppo:PrivacyPreference ;

       ppo:appliesToResource <http://smob.me/user/xyz/post1> ;
       ppo:assignAccess acl:Read ;

       ppo:hasCondition [
           ppo:hasProperty tag:Tag ;
           ppo:resourceAsObject <http://dbpedia.org/resource/Linked_Data> ] ;

       ppo:hasAccessSpace [
           ppo:hasAccessQuery
           "ASK { ?x foaf:topic_interest <http://dbpedia.org/resource/Linked_Data> }" ] .

⁶ However, some SPARQL engines provide triggers, such as ARC2: http://arc.semsol.org
⁷ We assume that a PPO interpreter would know the common prefixes for SPARQL queries, while they could also be defined in the ASK pattern.
⁸ As previously mentioned, so far, we assume that we can trust the statements defined in the requester’s FOAF file, and we tackle this issue separately.

4. RELATED WORK
The Platform for Privacy Preferences (P3P)⁹ specifies a protocol that enables Web sites to share their privacy policies with Web users. The privacy policies are expressed in XML which can be easily parsed by user agents. This platform does not ensure that Web sites act according to their publicised policies. Moreover, since this platform aims to enable Web sites to define their privacy policies, it does not address our aim of enabling users to define their own privacy preferences. The Protocol for Web Description Resources (POWDER)¹⁰ is designed to express statements that describe what a collection of RDF statements is about. The descriptions expressed using this protocol are text based and therefore do not contain any semantics that can define what the description states. In contrast, our approach enables users to define what the privacy preferences are about and hence makes it easier for other systems to use such preferences.

⁹ P3P: http://www.w3.org/TR/P3P/
¹⁰ POWDER: http://www.w3.org/TR/powder-dr/
¹¹ SWRL: http://www.w3.org/Submission/SWRL/

The authors in [9] propose a privacy preference formal model consisting of relationships between objects and subjects. Objects consist of resources and actions, whereas subjects are those roles that are allowed to perform the action on the resource. The privacy settings based on this formal model are implemented using Protune [3], a policy framework that consists of a policy language and a policy reasoner. This implies that any system using this method must include the Protune framework. Since our aim is to propose a lightweight vocabulary that is platform independent, relying on the Protune policy engine does not meet our goal. Moreover, the proposed formal model relies on specifying precisely who can access the resource. Our approach provides a more flexible solution which requires the user to specify attributes which the requester must satisfy. The authors in [4] propose an access control framework for Social Networks by specifying privacy rules using the Semantic Web Rule Language (SWRL)¹¹. This approach is also based on specifying who can access which resource. Moreover, this approach relies on the system containing a SWRL reasoner. In [5] the authors propose a relation-based access control model called RelBac which provides a formal model based on relationships amongst communities and resources.
This approach also requires specifically defining who can access the resource(s).

The authors in [11] propose a tag-based model to create privacy settings for medical applications that consists of annotating resources with different access policy rules. The privacy rules are expressed in a system-specific language, so only that system can interpret the access control rules. The authors in [10] also propose an annotation-based access control model. This approach enables users to annotate the resource and also to annotate users. The access control rules therefore specify which resource annotations can be accessed by which user annotations. Although this approach might be more flexible than other systems, it still relies on specifying who can access the resource.

In [14] the authors propose a method to direct messages, such as microblog posts in SMOB, to specific users according to their online status. The authors also propose the idea of a SharingSpace which represents the persons or group of persons who can access the messages. The authors also describe that a SharingSpace can be a dynamic group constructed using a SPARQL CONSTRUCT query. However, the proposed ontology only allows relating the messages to a pre-constructed group.

In [8] the authors propose a system whereby users can set access control on RDF documents. The access controls are described using the Web Access Control vocabulary by specifying who can access which RDF document. Authentication to this system is achieved using the WebID protocol [15]. This protocol uses FOAF+SSL techniques whereby a user provides a certificate which contains a URL that denotes the user’s FOAF profile. The public key from the FOAF profile and the public key contained in the certificate which the user provides are matched to allow or disallow access. Our approach extends the Web Access Control vocabulary to provide more fine-grained access control to the data rather than to the whole RDF document.

5. CONCLUSION AND FUTURE WORK
In this paper we argue that there is insufficient support for fine-grained privacy preferences for Linked Data. We therefore proposed a lightweight vocabulary which provides classes and properties to define fine-grained privacy preferences for RDF data. The privacy preferences define what needs to be protected, the conditions to create fine-grained restrictions, which access control privilege will be granted, and a space to define which attributes a requester must satisfy in order to access the resource. The access control privileges are described using the Web Access Control vocabulary. We plan to extend the PPO to also restrict actions which are commonly found in Social Web applications and we also plan to extend our work to cater for conflicting privacy preferences.

Additionally, we will investigate a formal model for PPO and its relationships with RDFS and OWL entailments, to ensure that preferences can also apply to inferred data (for example to restrict the sub-properties or subclasses of the property or class being restricted). This step is also required to ensure that inferred statements do not introduce vulnerabilities. Moreover, we are currently developing the Privacy Preference Manager mentioned in Section 3, which provides a user-friendly interface where users can specify privacy preferences described using the Privacy Preference Ontology, and which applies the privacy preferences when the RDF data is accessed.

6. REFERENCES
[1] C. Bizer and J. Carroll. Modelling Context Using Named Graphs. In W3C Semantic Web Interest Group Meeting, 2004.
[2] C. Bizer, T. Heath, and T. Berners-Lee. Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems, 2009.
[3] P. Bonatti and D. Olmedilla. Driving and Monitoring Provisional Trust Negotiation with Metapolicies. In Sixth IEEE International Workshop on Policies for Distributed Systems and Networks, 2005.
[4] B. Carminati, E. Ferrari, R. Heatherly, M. Kantarcioglu, and B. Thuraisingham. A Semantic Web Based Framework for Social Network Access Control. In Proceedings of the 14th ACM Symposium on Access Control Models and Technologies, SACMAT ’09, 2009.
[5] F. Giunchiglia, R. Zhang, and B. Crispo. Ontology Driven Community Access Control. Trust and Privacy on the Social and Semantic Web, SPOT’09, 2009.
[6] O. Hartig. Querying Trust in RDF Data with tSPARQL. In 6th Annual European Semantic Web Conference, ESWC’09, 2009.
[7] B. Heitmann, J. Kim, A. Passant, C. Hayes, and H. Kim. An Architecture for Privacy-Enabled User Profile Portability on the Web of Data. In Proceedings of the 1st International Workshop on Information Heterogeneity and Fusion in Recommender Systems, HetRec ’10, 2010.
[8] J. Hollenbach and J. Presbrey. Using RDF Metadata to Enable Access Control on the Social Semantic Web. In Proceedings of the Workshop on Collaborative Construction, Management and Linking of Structured Knowledge, CK’09, 2009.
[9] P. Kärger and W. Siberski. Guarding a Walled Garden: Semantic Privacy Preferences for the Social Web. The Semantic Web: Research and Applications, 2010.
[10] P. Nasirifard, V. Peristeras, and S. Decker. Annotation-Based Access Control for Collaborative Information Spaces. Computers in Human Behavior, 2010.
[11] S. Nepal, J. Zic, F. Jaccard, and G. Kraehenbuehl. A Tag-Based Data Model for Privacy-Preserving Medical Applications. Current Trends in Database Technology, 2006.
[12] A. Passant, J. Breslin, and S. Decker. Rethinking Microblogging: Open, Distributed, Semantic. In Proceedings of the 10th International Conference on Web Engineering, ICWE’10, 2010.
[13] A. Passant, P. Kärger, M. Hausenblas, D. Olmedilla, A. Polleres, and S. Decker. Enabling Trust and Privacy on the Social Web. In W3C Workshop on the Future of Social Networking, 2009.
[14] M. Stankovic, A. Passant, and P. Laublet. Directing Status Messages to their Audience in Online Communities. In Proceedings of the 5th International Conference on Coordination, Organizations, Institutions, and Norms in Agent Systems, 2010.
[15] H. Story, B. Harbulot, I. Jacobi, and M. Jones. FOAF+SSL: RESTful Authentication for the Social Web. Semantic Web Conference, 2009.
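
As a rough illustration of how the privacy preference of Section 3.2 could be applied when a requester asks for data, the following Python sketch mirrors steps (2) to (5) of the Privacy Preference Manager flow. It is not the PPM implementation, which the paper does not describe. All names (Preference, applies_to, ask, readable_statements) and the example URIs are hypothetical, the SPARQL ASK query is replaced by a single triple pattern matched in memory, and statements that no preference covers are assumed to stay public.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A statement is a (subject, predicate, object) triple; plain strings stand in
# for URIs and literals in this sketch.
Triple = Tuple[str, str, str]
# Simplified stand-in for a SPARQL ASK query: one triple pattern, None = variable.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]

FOAF = "http://xmlns.com/foaf/0.1/"
ACL_READ = "http://www.w3.org/ns/auth/acl#Read"


@dataclass(frozen=True)
class Preference:
    """Hypothetical in-memory form of one ppo:PrivacyPreference."""
    access: str                         # value of ppo:hasAccess
    access_query: Pattern               # stands in for ppo:hasAccessQuery
    has_property: Optional[str] = None  # ppo:hasProperty condition, if any
    has_literal: Optional[str] = None   # ppo:hasLiteral condition, if any


def applies_to(pref: Preference, statement: Triple) -> bool:
    """Check whether the preference's conditions select this statement."""
    _, predicate, obj = statement
    if pref.has_property is not None and predicate != pref.has_property:
        return False
    if pref.has_literal is not None and obj != pref.has_literal:
        return False
    return True


def ask(pattern: Pattern, profile: List[Triple]) -> bool:
    """Return True if any profile statement matches the pattern."""
    return any(
        all(want is None or want == got for want, got in zip(pattern, statement))
        for statement in profile
    )


def readable_statements(data, preferences, requester_profile):
    """Return the owner's statements that the requester may read."""
    readable = []
    for statement in data:
        matching = [p for p in preferences if applies_to(p, statement)]
        # Assumption: a statement not covered by any preference stays public.
        if not matching or any(
            p.access == ACL_READ and ask(p.access_query, requester_profile)
            for p in matching
        ):
            readable.append(statement)
    return readable


# Hypothetical data mirroring the phone-number example of Section 3.2.
owner_data = [
    ("http://example.org/alice#me", FOAF + "name", "Alice"),
    ("http://example.org/alice#me", FOAF + "phone", "tel:+353-00-000-0000"),
]
phone_pref = Preference(
    access=ACL_READ,
    access_query=(None, FOAF + "workplaceHomepage", "http://www.deri.ie"),
    has_property=FOAF + "phone",
)
colleague = [("http://example.org/bob#me", FOAF + "workplaceHomepage", "http://www.deri.ie")]
stranger = [("http://example.org/eve#me", FOAF + "workplaceHomepage", "http://www.example.com")]

print(readable_statements(owner_data, [phone_pref], colleague))  # both statements
print(readable_statements(owner_data, [phone_pref], stranger))   # only the foaf:name statement
```

A real interpreter would instead load the preferences and the requester's FOAF profile as RDF and evaluate the stored ASK query with a SPARQL engine; the open-by-default policy used here is only one possible choice and is not prescribed by the paper.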