Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                       A User-Centric Authentication and Privacy Control
                      Mechanism for User Model Interoperability in Social
                                      Networking Sites

                                               Yuan Wang and Julita Vassileva,

                                              Department of Computer Science
                                          University Of Saskatchewan Saskatoon SK,
                                              Canada S7N5C9 1-306-966-4744

                                              yuw193@mail.usask.ca jiv@cs.usask.ca


                         Abstract. In this paper, we present a new authentication and privacy control
                         mechanism for personalized mashups of social networking sites. Current
                         authentication and privacy control mechanisms lack flexibility and
                         transparency. This mechanism can make the user model interoperation process
                         for mashups more transparent to users. Users can have a clear understanding
                         and control about which part of their data is being accessed by the mashup
                         application. This mechanism is an important part of user model interoperability
                         framework.
                         Keywords: Personalized mashup, social networking sites, user model
                         interoperation, authentication, privacy


                  1 Introduction

                  One of the most important features of Web 2.0 is that it is social: users can share
                  content with their friends and can develop social ties among each other. Social
                  features can be combined with domain-specific applications, e.g. a music application
                  like LastFM, to empower a community of users. Reusing existing user model data
                  from the domain-specific application (e.g. preferences for particular groups or music
                  genres) can minimize the effort for users, allow useful adaptations and
                  recommendations to be provided by other applications, and thus may help bridge the
                  gap across their presence in different communities. Many researchers in the User
                  Modeling field have investigated how to ensure User Model Interoperability (UMI)
                  by exchanging user model data between applications. Web-based APIs and mashups
                  provide an easier way to implement UMI. A mashup is a web- or desktop- application
                  that combines information and/or services from one or more external sources [1].
                  Social networking site mashup applications combine user social data with some
                  domain-specific application (e.g. music player/recommender, shopping, or mapping
                  application). At the time of writing, there are more than 50,000 facebook mashup
                  applications. There are two mashup application modes. The first one is where the
                  mashup application runs inside the data provider’s page within a frame or gadget,


                                                              110
Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                  such as facebook application or OpenSoical gadget. This makes it convenient for the
                  user, allowing interaction with several applications within the same website, avoiding
                  duplication of effort when logging in. The second mashup mode is where the mashup
                  application runs on its own web page. In this mode the user may have to log in in
                  more than one application, if user data is shared among them. There are no significant
                  differences between those models from developer perspective. In both cases, the
                  mashup application actually runs on its own server. In first mode, the social network
                  site is simply a proxy. It displays the mashup application’s page within its gadget.


                  2 Current Technologies
                  A complete UMI framework must have four parts [4]: (I) user data exposing and
                  discovery; (II) user identification mapping (III) authentication and privacy controls,
                  (IV) user data exchange. A personalized mashup application as light-weight mashup
                  application also has these four parts. The following sections will explain each part
                  briefly. This research mainly focuses on the third part: authentication and privacy
                  controls. To clarify the terms, we use “data provider” to denote the application or
                  service which publishes an open API to share user model data; we use “mashup
                  application” to denote the application which requests external user model data and
                  uses it to adapt its own service.
                             (I) User data exposing and discovery: During this process, the data provider
                  publishes user data APIs and information about the semantic and syntactic meaning of
                  the data it provides. Currently, the mashup application developer has to discover data
                  providers manually and read their APIs documents. There are some promising
                  techniques to automate this process: SAWSDL (Semantic annotation of WSDL)
                  http://www.w3.org/2002/ws/sawsdl/,         SA-REST       (Semantic      annotation     of
                  REpresentational State Transfer) [7], XRDS (extensible resource descriptor sequence)
                  and XRDS-simple (a simple and subset version of XRDS).
                             (II) User identification mapping: In order to use external user data, the
                  mashup application has to know the user’s identity in the data provider’s system.
                  Currently, the end user has to provide this information manually in the mashup
                  application. However, there are some universal identity management platforms
                  available. OpenID is the most popular one [5]. Open ID is a decentralized,
                  interoperable, extensible platform for user-centric Internet identity management.
                  OpenID provides users with a universal internet identity which can be used for many
                  online applications. Right now, there are dozens of OpenID providers (Google,
                  Yahoo, Flickr, AOL and etc) and users can choose the ones they trust as their identity
                  providers. With a universal identification management, data provider and mashup
                  provider do not need to map user’s identity across two systems.
                             (III) Authentication and privacy controls: The user data is behind the lock of
                  username and password [2]. In order to access user model data from a data provider,
                  the mashup application needs to authenticate itself to the data provider. Here, access
                  means read, edit, add or delete operation on user data. Authentication has two parts:
                  first, validating the mashup application’s identity, and second, validating whether the
                  mashup application has the right to access user data. Validating the mashup
                  application’s identity is a relatively simple task. The current solution is through API


                                                            111
Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                  key and secret key. In order to use the data provider’s APIs, the mashup application
                  needs to register at the data provider by presenting some basic information. After the
                  registration, the data provider assigns a pair of API key and secret key to the mashup
                  application. The API key and secret key is like username and password for identifying
                  the mashup application to the data provider application.
                            Validating whether the mashup application has the right to access user data is
                  a more difficult task. The current solution requires sending username and password of
                  the user in the data provider’s system. There are two ways to do the authentication.
                  First, the mashup application directly asks the user’s name and password of the data
                  provider system and does the authentication itself. This is risky from a user point of
                  view. Alternatively, the mashup application can redirect the user’s web page to the
                  data provider. On the data provider’s web page, the user is required to login to
                  authenticate him- or herself. After the user is authenticated, the data provider will
                  inform the user that the mashup application is trying to access his or her data and will
                  request permission to allow the mashup application to access the user data. If the user
                  gives permission, the data provider will “callback”, i.e. transfer the user’s web page
                  back to the mashup application; the data provider will also send a session key to the
                  mashup application (see Fig. 1).
                                                               Data Provider


                                              Fig.1. Safe Authentication Model

                  With this session key, the mashup application can access user data. This session key
                  is used just for one user. Different data providers have their own rules about this
                  “session” authentication. For some data provider, the session key may expire after
                  hours and the user has to authenticate again. For other data providers, the session key
                  may not expire. Some user data require authentication and some do not. This
                  “session” mechanism is inconvenient for users when a mashup invokes several data
                  providers, and needs to do many authentications to many user data providers. Before
                  actually using the mashup application, the user has to authenticate with each data
                  provider, and wait for the page redirecting. This authentication mode is not only
                  inconvenient. It doesn’t give users control over the user data interoperation process.
                  Even though only data that is publicly available online is currently shared among
                  applications, privacy concerns have been voiced and users are concerned about
                  having little understanding or control over how data is shared. Users are unable to see
                  which data is shared, how it is used, how long it is kept and have no control other than
                  not adding the third party application (the mashup). The opaqueness of the user model
                  data sharing process often makes users hesitant to use the available services.


                                                            112
Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                                              Fig.2 Risky Authentication Model

                  There are several mature authentication and privacy control frameworks address some
                  of these issues, such as OAuth and Shibboleth. But there are still some limitations of
                  these frameworks, which will be mentioned in section 3 related works.
                       (IV) User data exchange: this process is where user model interoperation really
                  happens. Currently, Web-based API is the major technology used for this task. The
                  most popular protocols of Web-based API are SOAP message and REST. Web-based
                  API is reliable technology. Among the four parts of the user model interoperation,
                  user data exchange is the most mature one.


                  3 Related Work

                  Berkovsky et al. [8] pointed out four major challenges for achieving UMI.
                  1. Systems are unwilling to share user models; 2. Privacy issues;
                  3. Technical considerations; 4. Semantic heterogeneity among applications
                  A lot of research has addressed the issue of semantic heterogeneity [4], [8], [9]. This
                  research mainly focuses on the second challenge. There has been also a lot of research
                  on privacy in user modeling [10], [11], [12]. Since the 1980ies, researchers have
                  studied users’ attitudes about internet privacy. They found out that users can be
                  divided into three clusters [10], [12]:
                     1. Privacy fundamentalists, comprising approximately 17% of the entire user pool,
                  generally express extreme concern about any use of their data and an unwillingness to
                  disclose information, even when privacy protection mechanisms would be in place.
                     2. Pragmatic majority, approximately 56% of the entire user pool, are generally
                  concerned about their privacy as well, but less than the fundamentalists. They are also
                  far more willing to disclose personal information when they are see potential benefits
                  and protection.
                    3. Privacy unconcerned, who takes 27%, of the entire group, tends to express mild
                  concern for privacy.
                          In the recent decade, the number of privacy fundamentalists and privacy
                  unconcerned is declining, and there is increase in the number of privacy pragmatic
                  users [10]. In other words, privacy pragmatics is the majority of internet users and


                                                            113
Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                  their number is still increasing. Therefore, most internet users care about privacy but
                  are also interested in personalized services. As developers, we need to motivate users
                  to disclose their data and protect their privacy. Previous research, e.g. [10], [12],
                  reports about ways to motivate users to disclose their data. In order to motivate users
                  to disclose their data, the application should tell users the benefits of personalization.
                  Moreover, users want to know how their personal information is being used and to
                  have control over this usage. Applications should be able to explain to users what
                  facts and assumptions about them are being stored, and how these are going to be
                  used. Users should be given ample control over the storage and usage of this data.
                  Trust in a web site is a very important motivational factor for the disclosure of
                  personal information. Trust is built on positive past experience, so applications should
                  allow users to incrementally supply more information as their trust in the application
                  increases.
                       Therefore, the authentication and privacy control mechanism should make the
                  UMI process transparent to the user. This direction – to give the user control over
                  which partial models should be made available to which applications - was suggested
                  recently by Kay [14]. The user should be aware of the user model data required by a
                  mashup application and the terms of use of the data. Based on this information, the
                  user can decide whether or not to allow the mashup to access and use user data. As
                  mentioned before, there are some frameworks that attempt to achieve that: such as
                  OAuth and Shibboleth. OAuth is an open protocol to allow secure API authorization
                  in a simple and standard method [2], [5]. It is a light-weight framework which has
                  already been adopted by some social networking sites, like Twitter. But OAuth cannot
                  let the user decide how to do the authorization. For example, when the user trusts a
                  mashup application and feels comfortable about letting it access his or her data, the
                  user does not want to be involved in the authorization (it is viewed as an extra
                  burden). The Shibboleth protocol is another mature framework which ensures safe
                  user data sharing between systems [13]. The user can define an attribute release
                  policy to each outside system which requires user data. There are many prerequisites
                  for using Shibboleth: the system must have secure identity management and must
                  install the required software. Shibboleth is ideal for universities and other larger
                  organization.
                        We propose a new authentication and privacy control mechanism. This
                  mechanism can facilitate privacy control by letting users customize their privacy
                  settings depending on each individual mashup application and their different privacy
                  preferences. Moreover, the user can decide how to do the authorization. This
                  mechanism does not deal with data provider discovery or semantic heterogeneity
                  directly, but it can be integrated with other mechanisms to achieve a complete user
                  model interoperation framework.


                  4 Authentication and privacy mechanism
                  As mentioned in the introduction, there are two mashup application modes. In both
                  modes, mashup applications need authentication.


                                                             114
Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                  4.1 Application registration
                  When a mashup application registers at a data provider, the mashup application needs
                  to list all the user data it will access during the service and the type of action on the
                  data: such as read, edit, add, and remove. Besides that, the mashup application also
                  needs to describe the terms of use of the user data, and information about who
                  provides this mashup application. The mashup application is not able to access any
                  user data which are not listed in the registration. The registration information is
                  visible for the user. Therefore, the user knows what kind of data will be used by the
                  service. When the user wants to use a new mashup application, the user data provider
                  will show this application’s registration information to the user.

                   4.2 Authorization
                  When the user invokes the mashup application for the first time, the mashup
                  application will redirect the user to the data provider. The data provider will ask the
                  user to login. After user login, the data provider will show the registration information
                  about the mashup application (as shown on Fig. 4) and ask the user if he or she wants
                  to authorize the mashup application. The user can grant the mashup application one of
                  three levels of access. The first level is access without user authentication, i.e. the
                  mashup application can access the user data it registered without user authenticating.
                  This would be very convenient for the user since it will require no further effort for
                  authentication; however, it gives the mashup application the right to access the user
                  data it registered whenever it wants. The second level of access is single
                  authentication. When the mashup application requests user data, the mashup
                  application needs to redirect the user to the data provider, and the data provider will
                  ask the user to authenticate him or her. After that, the data provider will ask the user
                  whether he or she authorizes the mashup application to access all the user data that the
                  mashup application has in its registration file. The user can choose the time period of
                  authorization: for example, 1 hour, 3 hour, or 24 hours. Within that time limitation,
                  the mashup application can access any user data in its registration file. The third level
                  is individual authentication. The user can specify which user data require an
                  individual authentication, so when a mashup application tries to access this data, it
                  will always require user authorization.


                                          Fig. 4. The Components of the mechanism


                                                            115
Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                  4.3. User applications list
                  Each data provider hosts a list of mashup applications which have requested to
                  receive user data for each user about whom the data provider keeps a user model. In
                  that list, the user can see overview of the all mashup applications that the user has
                  authorized. The user can discontinue or change the authorization at any time.

                  4.4 User Data Policy
                  To facilitate user decision making, the user can save his or her privacy settings. The
                  User Data Policy is a file hosted at the data provider system. It has two parts: policy
                  about application providers and policy about data usage. The policy about
                  application providers contains a list of trusted application providers and a list of
                  blocked application providers. Note that this list contains providers, not individual
                  applications. Users can view and manipulate these lists based on external information,
                  e.g. provider reputation services, press, etc, which may change their level of trust in
                  particular application providers. Some providers, of course, may be unknown to the
                  user, and not be in either of the two lists. In the policy about data usage, users can
                  classify the data kept about them by the data provider into three levels: open-level,
                  important-level, and crucial–level. The open-level data is accessible to all application
                  providers except those in the blocked provider list. The important-level data is only
                  open to the application providers on the “trusted” list. The crucial-level data is not to
                  be undisclosed to any providers. The user can change both parts of the User Data
                  Policy at any time. The purpose of the User Data Policy is to facilitate authorizing
                  new mashup applications. When a user is authorizing a new mashup application, the
                  data provider system can automatically compare the mashup application registration
                  information with the User Data Policy to see whether there is a conflict. For example,
                  the mashup application requires important-level data, but the application provider is
                  not on the user’s trusted list. If there is no conflict, the application will be authorized.
                  Otherwise, the user data provider system will inform the user about the conflict, and
                  the user can decide whether to change the policy (add the application provider to the
                  trusted provider list) or not authorize the application. If the mashup application
                  requires crucial-level data, the user has the choice of rejecting the authorization or
                  still allowing it by changing the user data policy by moving the data to important level
                  and adding the application provider to the trusted list.

                  4.5 Update application
                  Mashups can change their requirements for user data at any time. An updated mashup
                  application has to update its registration information at all user data providers from
                  which it receives user data. Data providers maintain version control on mashup
                  application registration and user application list. The registration file of a mashup
                  application keeps a version number for the application. In the user’s application list,
                  the mashup applications are also listed with their version numbers. When a mashup
                  application updates its registration information, the data provider will increase the
                  mashup application’s version number. So, the version numbers for this mashup in its
                  registration and in a user’s application list will not match anymore.


                                                              116
Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                  4.6 Overall Workflow
                  Every time, when mashup application tries to access user data from the data provider,
                  the data provider will check if the versions of the registration and user’s application
                  list match. If they do not match, the data provider will request user authorization. The
                  mashup application will redirect user to the data provider, the data provider will show
                  the update of the mashup registration to the user, and ask for authorization. If the user
                  authorizes the updated mashup application, the version number on the user’s
                  application list will be updated according to the registration information. When a
                  mashup application requests user data, first the data provider will check the mashup
                  application’s API and secret key. After that, the data provider will check whether the
                  user data mashup application has requested is listed in the mashup application’s
                  registration. After that, the data provider will check whether the mashup application is
                  on the user’s application list. After that, the data provider will check if there is a
                  match between the versions of the mashup application registration information and
                  the user’s application list. In the final step, the data provider will check the authorized
                  access rights of the mashup from the user’s application list.


                  5 Discussion
                  The proposed mechanism is user-centric; the user can see what kind of data is
                  required for the mashup application and can authorize the mashup application’s
                  access to user data in a flexible way. The proposed mechanism refines the
                  authentication process. The user can control his/her level of involvement in the
                  authentication. If the user trusts the mashup application, he or she does not need to be
                  involved in authentication at all. If user wants, he or she can control each step of user
                  model interoperation. The user can chose to control only on the sensitive data’s
                  interoperation. Comparing with Shibboleth, this mechanism is light-weight; it does
                  not require installing any software. It is ideal for small and middle-level application.
                            This mechanism also has some limitations. It makes the authorization
                  process more complex. It requires more user involvement the first time when the user
                  uses a mashup application. It also puts limitations on the mashup application
                  development. Developers have to openly declare what kind of user data is required.
                  Developers of applications that share user data and serve as data providers have to
                  implement the components of the mechanism (see Fig. 4): a component that receives
                  and updates the registration files of mashup applications, the user application list and
                  user data policy, as well as an interface for the user to view and modify the user
                  application list and user data policy.
                            The impact of mashup performance is not clear yet. If the user grants the
                  mashup application the highest access rights, the performance should be the same as
                  without the mechanism. But if user requires individual data authentication, the
                  performance would be worse. Yet the user may be willing to accept the worse
                  performance in exchange for enhanced privacy. The scope of the mechanism does not
                  allow it to enforce how the mashup application treats user data. In the registration, the
                  mashup application has to declare how it is going to treat the data: how long it will
                  keep it, whether it will transfer the data to other parties or not, if it will disclose the
                  data to other users or not. However, this mechanism cannot enforce the mashup


                                                             117
Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                  application’s compliance to its own declaration. Trust and reputation management
                  mechanisms can be used as an orthogonal approach for ensuring that mashup
                  application providers have an incentive to treat user data according to the registration.
                           Finally, social network systems and existing mashup applications on the
                  social web store also a lot of user-contributed data that can be used later in data-
                  mining to develop new user profile data, not explicitly represented at the moment of
                  sharing. It is an open question how to handle the potential privacy threats arising from
                  harvesting user-contributed data.
                           The data about the user’s social network presents further issues. So far we
                  were talking about exchanging user data across applications only, but these
                  applications typically have many users. Will these users be allowed to see the user’s
                  data or not? Can users define rights for accessing data to their friends / social network
                  that can be propagated from one application to another? Some work on sharing data in
                  blogs addresses this issue [14].


                  6 Future Work

                  We have implemented the proposed mechanism in a mock-up social network site
                  environment. We plan to design several scenarios involving some sensitive user data
                  and do an evaluation of the mechanism with real users based on these scenarios.
                  There are several hypotheses we want to test during the evaluation: First, that the user
                  data that can be shared is shown to the user in an understandable way. Second, that
                  the user can easily express his or her privacy control settings through the User Data
                  Policy. Third, that the user understands from the mashup application registration file
                  (displayed in an appropriate way) why the application needs his or her model, the
                  benefits for user model interoperation and how the application treats the user data.
                  Fourth, this mechanism should help to increase user participation with respect to
                  adding new mashup applications in an experimental group that uses the framework, in
                  comparison with a control group which use the traditional authentication and privacy
                  mechanism. We hope to be able to test these hypotheses with a large number of users
                  on a social network site and will use questionnaires and collect statistics about user’s
                  participation that will be analyzed to validate or refute the hypotheses. In the next
                  stages, we will combine this mechanism with services for user model data semantic
                  translation, service discovery, and user identity mapping mechanisms to achieve a
                  complete user model interoperation framework.


                  7 Summary

                  Personalized mashups provide a new way to do user model interoperation. Current
                  mashup solutions face several challenges, including insufficient authentication and
                  privacy control. This paper proposes a user-centric mechanism to facilitate
                  authentication and improve user privacy control. Sharing user data on the social web
                  raises many important issues. This mechanism addresses the privacy of sharing user
                  model data that is explicitly represented by the application.


                                                            118
Workshop on Adaptation and Personalization for Web 2.0, UMAP'09, June 22-26, 2009


                  References

                  1.    Thang, M.D., Dimitrova, V., Djemame, K.: Personalised Mashups: Opportunities and
                        Challenges for User Modelling. In Proceedings of User Modeling. 2007, 415-419. (2007)
                  2.    Musser, J.: Open APIs: State of the Market presentation. Qcon San Franciso (2008)
                        http://qconsf.com/sf2008/file?path=/qcon-sanfran
                        2008/slides//JohnMusser_Web_As_Platform.pdf
                  3.    Liu, H., Mae, P.: InterestMap: Harvesting Social Network Profile for Recommendations.
                        Proc. 10th International Conference on Intelligent User Interfaces (2005)
                  4.    Carmagnola, F., Dimitrova, V.: An Evidence-Based Approach to handle Semantic
                        Heterogenity in Interoperable Distributed User Model. Proc. 13th International Conference
                        on Intelligent User Interfaces (2008)
                  5.    Recordon, D.: “Blowing up” Social Networks by Going Open (2008)
                        http://www.slideshare.net/daveman692/blowing-up-social-networks-by-going-open-
                        presentation
                  6.    A.P. Sheth, K, Gomadam, J. Lathem.: SA-REST: semantically interoperable and Easier-
                        to-Use services and Mashup. IEEE Internet Computing, vol.11, no.6, 2007, pp.91-94
                        (2007)
                  7.    Reed, D., Chasen, L., Tan, W.: OpenID identity discovery with XRI and XRDS ACM
                        International Conference Proceeding Series, Vol.283 Proceeding of the 7th symposium on
                        identity and trust on the Internet Pages 19-25 2008, ACM, New York,USA (2008)
                  8.    Berkovsky, S., Kuflik, T., Ricci, F.: Mediation of user models for enhanced
                        personalization in recommender systems. In Proceedings of User Model. User-Adapt.
                        Interact. 245-286. (2008)
                  9.    Heckmann, D., Schwartz, T., Brandherm, B., Kroner, A.: Decentralized User Modeling
                        with UserML and GUMO. Proceedings 10th International Conference on User Modeling,
                        (2005) LNAI 3538 Springer, Berlin-Heidelberg, pp.428-432
                  10.   Kobsa, A.: Privacy-Enhanced Personalization. Communication of ACM, vol.55, pp24-33
                        ACM, New York, NY, USA (2007)
                  11.   Anwar, M.,Greer,J.: Role and Relationship-Based Identity Management for Private yet
                        Accoutable E-Learning, Trust Management II, vol.262, pp.343-358 Springer ,Boston
                        (2008)
                  12.   Cranor,L.F., Reagle,J., Ackerman, M.S.: Beyond Concern: Understanding net users’
                        attitudes about online privacy. AT&T labs-research technical report TR 99.4.3,
                        http://www.research.att.com/library/trs/trs/99/99.4/. (1999)
                  13.   Macquarie E-Learning Center of Excellence (MELCOE), Macquarie University:
                        Shibboleth , October-November 2008 , AAF Shibboleth Rollout Workshop.
                        http://www.federation.org.au/workshop/AAF%20Shibboleth%20Rollout%20Workshop%
                        202008.1.pdf
                  14.   Kay,J.: Lifelong Learner Modeling for Lifelong Personalized Pervasive Learning. IEEE
                        Transactions on Learning Technologies Volume 1, Issue: 4 (2008)
                  15.   Indratmo., Vassileva,J.:A Usability Study of an Access Control System for Group Blogs.
                        Proc. ACM International Conference on Weblogs and Social Media. Boulder, CO (2007)


                                                                119