Consent Recommender System: A Case Study on LinkedIn Settings

Rosni K V, Manish Shukla, Vijayanand Banahatti, Sachin Lodha
TCS Research Labs, India
{rosni.kv,mani.shukla,vijayanand.banahatti,sachin.lodha}@tcs.com

Abstract

Privacy is an increasing concern in the digital world, especially now that it has become common knowledge that even high-profile enterprises process data without the data-subject's consent. In certain cases where the data-subject's consent was taken, it was not linked to the proper purpose of processing. To address this growing concern, newer privacy regulations and laws are emerging to empower a data-subject with informed and explicit consent through which she can allow or revoke usage of her personal data. However, it has been shown that privacy self-management does not provide the expected results. This is mainly due to information overload, as data-subjects use multiple services entailing a variety of purposes, which results in a very large number of consent requests. This may lead to consent fatigue, as the data-subject is now expected to provide informed consent for each associated purpose. Consent fatigue in data-subjects can lead either to incorrect decision making or to opting for the default values provided by the enterprise, thus defeating the purpose of new data privacy regulations.

In this work, we discuss the factors influencing the informed consent of a data-subject. Further, we propose a 'consent recommender system' based on Factorization Machines (FMs) to assist the data-subject and thereby avoid consent fatigue. Our consent recommender system effectively models the interaction between the different factors which influence a data-subject's informed consent. We discuss how this setup extends to cold-start data-subjects facing the decision problem with consent requests from multiple enterprises. Additionally, we demonstrate the scenario of consent recommendation as a prediction problem with the minimum attributes available from LinkedIn's privacy settings.

1 Introduction

With ever-increasing digitalization, enterprises capture consumer data for understanding consumer behavior and for offering better personalized services. More often than not, the captured data contains personal and sensitive information of the consumer (also referred to as 'data-subject'), and thus leads to privacy concerns (Andrade, Kaltcheva, and Weitz 2002; Malhotra, Kim, and Agarwal 2004; Flavián and Guinalíu 2006). Until recently, the data privacy landscape was more enterprise-centric, with long and incomprehensible policy documents and default opt-in for data sharing and usage (Cranor et al. 2013). In her work, Priya Kumar (Kumar 2016) discussed the specific ways in which vague or unclear language hinders the comprehension of enterprise practices. This paradigm represented one extreme of the data privacy management landscape, where the data-subject had little or no control over her data with respect to its usage and sharing.

Some enterprises allowed data-subjects to access their data and provide consent for certain specific purposes, such as sharing of personal email or demographic data with third parties. However, such privacy preference controls provided by enterprises were either limited, or disconnected from the privacy policy (Anthonysamy, Greenwood, and Rashid 2013), or hard to use (Madden 2012). Further, these controls did not stop an enterprise from analyzing the data to gain additional insights into the data-subject's behavior. More recently, these concerns were addressed by newer privacy regulations and acts in different geographies, for example, GDPR in the EU (Voigt and Von dem Bussche 2017) and CCPA in California (de la Torre 2018). These data protection regulations are designed to protect the personal information of individuals by restricting how such information can be collected, used and disclosed, and by requiring proper informed consent from data-subjects (Barnard-Wills, Chulvi, and De Hert 2016). For example, France's National Data Protection Commission (CNIL) penalized Google for not having a valid legal basis to process the personal data of the users of its services, especially for ads personalization purposes^1.

^1 https://www.cnil.fr/en/cnils-restricted-committee-imposes-financial-penalty-50-million-euros-against-google-llc

Informed consent is beginning to form the foundation of data protection law in many jurisdictions. It is intuitively considered an appropriate method to ensure the protection of a data-subject's autonomy, as it allows her to have control over her personal data (Voigt and Von dem Bussche 2017; Dwyer III, Weaver, and Hughes 2004). However, if a data-subject interacts with multiple services that require consent for many purposes (defined in Section 3), then it leads to information overload while making decisions, and hence to consent fatigue. In the biomedical domain, consent fatigue is a well-discussed topic (Ploug and Holm 2013). Solove (Solove 2012) and Casteren (Casteren 2017) have studied consumers' privacy self-management and their ability to make meaningful decisions under information overload. A recent study (Degeling et al. 2018) discusses the impact of GDPR on web applications and services as well as new issues arising from it. Two key takeaways from their work are: a) the majority of websites updated their privacy policies in the last two years, and b) the average text length of policy documents rose from a mean of 2,145 words in March 2016 to 3,044 words in March 2018 (+41% in 2 years) and increased another 18% until late May (3,603 words). Consent fatigue may result either in wrong decision making by the data-subject or in implicit consent given by not taking any action.

In this work, we explore the problem of consent fatigue due to information overload and frequent decision making. To address this issue, we propose and implement a consent recommender system for the LinkedIn application. Our work helps a LinkedIn user identify appropriate privacy controls and their corresponding settings. It is especially useful for cold-starting a new user for whom no historical privacy preferences are available. The main contribution of our work is a novel combination of the Factorization Machine (FM) (Rendle 2010; 2012) with factors affecting an individual's decision-making process for predicting their privacy preference. The details of our contribution are as follows:

• We conducted a survey of 50 data-subjects to identify factors that can influence their decision-making process. Further, we collected LinkedIn privacy setting data for each participant for building our recommendation model.

• We show that the privacy recommendation problem can be modeled as a prediction problem. For that we used the Factorization Machine (FM) (Rendle 2010; 2012) for consent recommendation. This also helped in analyzing the pairwise interaction of attributes for learning reliable weights. Further, we show that the accuracy of our proposed model is around 88%. We also discuss the change in accuracy (in terms of precision, recall and F1-score) with respect to different combinations of features.

The rest of the paper is organized as follows. Related work is presented in Section 2. Architecture and system description are given in Section 3. The survey methodology, demographic details and result analysis are discussed in Section 4. The experimental results are shown in Section 5. Section 6 describes the implications of our work, future research possibilities and the limitations of our work, with some concluding remarks in Section 7.

2 Related Work

Services and applications often capture more user data than required, for analytics or for generating profit by selling it to third parties. An example of this was discussed in (Balebako et al. 2013), which showed that even well-known mobile applications capture sensitive data of data-subjects and then share it with third parties without their cognizance. However, with the latest data privacy regulations, a data-subject's consent becomes necessary to process her data. A substantial amount of work has been done on understanding the privacy concerns of data-subjects (Liu et al. 2016; Olejnik et al. 2017; Knijnenburg 2014; Sadeh and Hong 2014; Liu, Lin, and Sadeh 2014; Sadeh et al. 2009; Wijesekera et al. 2017).

Sadeh et al. analyzed the sensitive data requested by a mobile app and the purposes associated with it (Sadeh and Hong 2014). Liu et al. detected user profiles based on users' application permission settings (Liu, Lin, and Sadeh 2014). They further used Singular Value Decomposition (SVD) to address issues related to sparsity and dimensionality. In (Wijesekera et al. 2017), the authors reduce the burden on users by automating the decision-making process in smartphones.

Researchers have also looked into privacy preference recommender systems for social networks. Ghazinour et al. (Ghazinour, Matwin, and Sokolova 2016) proposed a recommender system for privacy settings in social networks, particularly for Facebook. They modeled a user's Facebook privacy settings for photo albums by independently considering different attributes, for example, personal profile and interests. In this paper, we also make use of the pairwise interaction of attributes, as it helps in learning reliable weights by taking the inner product of lower-dimensional vectors.

In a recent work, (Naeini et al. 2017) focused on privacy expectations and preferences in IoT data collection scenarios. Naeini et al. (2017) further showed that privacy preferences are diverse and context dependent, and that participants are more likely to consent to data collection if it benefits them. Additionally, they were able to predict data-subjects' preferences after three data-collection scenarios. The work presented in (Naeini et al. 2017) comes closest to our work. However, their main focus is on improving the privacy notices for IoT devices and developing more advanced personal privacy assistants, whereas we address the problem of information overload, and hence the issue of consent fatigue, in the post-GDPR and CCPA era.

3 System Description

Definitions: Some basic definitions of the terms as per GDPR (Voigt and Von dem Bussche 2017):

1. data-subject is an individual whose personal data is collected, held or processed. In this paper the terms consumer and data-subject are used interchangeably.
2. personal data shall mean any information relating to an identified or identifiable natural person ('data subject').
3. consent is defined as a data-subject's informed and unambiguous agreement to process her data.
4. purpose of processing data refers to the need and unambiguous reason for collecting, accessing and processing a data-subject's data.

Problem Statement: Let $U$ be the set of data-subjects such that $U = \{u_1, \ldots, u_N\}$. Further, let $S$ be a service provider (LinkedIn in our case) that processes a large number of data fields $D = \{d_1, \ldots, d_K\}$. Let $P = \{p_1, \ldots, p_X\}$ be the set of clear and unambiguous purposes under which $S$ processes $D$. For a given purpose $p_i \in P$, there is an associated $D_i \subseteq D$. The service provider $S$ will only process $D_i$ for the purpose $p_i$. Similarly, a data field $d_j \in D$ could be linked to multiple purposes $P_j \subseteq P$. Also, purpose $p_i$ is associated with a set of attributes $\alpha_i$ (e.g., description, purpose category, sensitivity of the requested data field, etc.), such that $\alpha = \bigcup_{i=1}^{X} \alpha_i$.

Figure 1 describes the overall flow of our proposed recommendation system. We selected LinkedIn for building our recommendation model because it is a popular professional networking site and we found its privacy settings very comprehensive, including the handling of GDPR-related concerns^2. The modification in its policy was notified via a banner on its landing page. If a data-subject keeps using the service without modifying any settings, this is considered implicit consent, as discussed by (Degeling et al. 2018). We extracted the privacy settings of each participant in our experiment. The collected data is processed to create a suitable feature vector for training the FM model using TensorFlow (Abadi et al. 2016). We tested the accuracy of the model by splitting the collected data into training and testing sets, and report the results in Section 5.

^2 https://www.linkedin.com/help/linkedin/topics/6701/6702

Figure 1: Recommender System Overview. [Flow diagram: survey responses and privacy settings are pre-processed (extract the required settings, one-hot encoding, user indexing, purpose-related information) into pre-processed data for the model; the query vector of a new user yields a user preference score; if score >= threshold the recommendation is Allow, otherwise Deny.]

3.1 Factorization Machines (FM)

Our data is described in the matrix format $X \in \mathbb{R}^{m \times n}$, wherein $x_i \in \mathbb{R}^n$ is the $i$th row, which represents the combination of a data-subject and a particular privacy setting with additional attributes as binary indicator variables. The response variable $y_i \in \mathbb{R}$ represents the consent value for the $i$th feature vector. Figure 2 shows the input matrix representation used in this work.

Figure 2: Input Matrix to Factorization Model, where $\alpha$ is the set of attributes, $m$ is the number of samples and $n$ is the number of features. [Matrix of $m$ binary rows and $n$ columns; the column groups are User (u: $u_1, u_2, \ldots$), Data Field (d: $d_1, d_2, \ldots$), Purpose (p: $p_1, p_2, \ldots$), Purpose Category ($pc_1, pc_2, \ldots$) and Purpose Sub-category ($psc_1, psc_2, \ldots$), the last three forming $\alpha$, followed by the Consent (c) target column.] For further description refer to Sections 3 and 3.1.

Why FM for Consent Recommendation? Equation 1 shows the traditional linear regression model, where $w_0 \in \mathbb{R}$ and $w \in \mathbb{R}^n$ are the bias and the feature weights respectively. For any two given features we can independently learn the weight parameters using the model of Equation 1 with linear time complexity. However, this model is not suitable for learning the pairwise interaction of features, as discussed in (Rendle 2010; 2012). A polynomial regression model of order 2 can capture the parameters for pairwise interaction, but its time complexity is $O(n^2)$.

\[ \hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i \tag{1} \]

In a consent recommendation system various factors interact and influence each other, and that is why we selected FM as our model. It solves the issue by factorizing the pairwise interaction weight matrix $W$ into a lower-dimensional factor matrix. The model equation from (Rendle 2012) is given below:

\[ \hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{i'=i+1}^{n} x_i x_{i'} \sum_{j=1}^{k} v_{i,j} v_{i',j} \tag{2} \]

In Equation 2, the model parameters are $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^n$ and $V \in \mathbb{R}^{n \times k}$. Further, $v_i$ and $v_{i'}$ in $V$ represent the $i$th and $i'$th variables with $k$ latent factors. The first part of the above equation models the linear interaction, and the second part models the pairwise interaction of variables with low rank $k$ using their inner product. This effectively helps to estimate the parameters in a highly sparse dataset. Equation 2 is of order 2. We can have higher-order variable interactions as shown below (Rendle 2010):

\[ \hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{l=2}^{d} \sum_{i_1=1}^{n} \cdots \sum_{i_l=i_{l-1}+1}^{n} \left( \prod_{j=1}^{l} x_{i_j} \right) \left( \sum_{f=1}^{k_l} \prod_{j=1}^{l} v^{(l)}_{i_j,f} \right) \tag{3} \]

where $V^{(l)} \in \mathbb{R}^{n \times k_l}$ and $k_l \in \mathbb{N}_0^+$ for all $l \in \{2, \ldots, d\}$, with $d$ as the order.

Prediction of Consent: Given a feature vector $x$, Equation 3 quantifies the consent. The recommendation can be generated by thresholding the value of $\hat{y}(x)$. Therefore, the predicted consent $C_p$ is defined as:

\[ C_p(x) = \begin{cases} 1 \ (\text{allow}), & \text{if } \hat{y}(x) \geq \theta \\ 0 \ (\text{deny}), & \text{if } \hat{y}(x) < \theta \end{cases} \tag{4} \]

4 Methodology

This section describes the steps involved in our data collection procedure. We selected participants with an active LinkedIn account whose last login activity was not older than 15 days. Prior to the survey, we presented a consent form that explained to each participant the data collected, its use in our study, and the retention period of the data. Only those participants who gave consent for data collection and processing were allowed to volunteer further. The data collected from participants did not contain any personally identifiable information. The study consisted of three sections: a) an online survey focused on understanding the respondent's basic demographics, b) the Internet Users' Information Privacy Concerns (IUIPC) survey (Malhotra, Kim, and Agarwal 2004), and c) some additional questions to support our design, so as to understand how active the participant is on social networking platforms, in this case especially LinkedIn (refer to Section 4.2).

The participants were asked to provide us their privacy settings information from LinkedIn. We processed the settings information and the related descriptions to build binary indicator feature vectors ($x_i \in \mathbb{R}^n$, refer to Section 3.1). We considered each section title as a purpose that falls under one of three categories (privacy, advertisement and communication) and 11 subcategories during our study. The purpose information comprised one or more control buttons, denoted as setting information (refer to Figure 3). Each type of variable, such as setting, purpose and its attributes, was encoded as a one-hot vector.

Figure 3: LinkedIn's Privacy Settings. An example of a purpose and its related attributes is highlighted and numbered: 1. Purpose Category (e.g. Account), 2. Purpose Sub Category (e.g. General advertising preferences), 3. Purpose (e.g. Insights on websites you visited), 4. Setting Information, comprising data field and consent value (e.g. toggle button representing 'yes').

4.1 Additional Survey Questions

Participants were asked to rate their comfort level with services using and sharing their personal information on a 5-point Likert scale:

Q1: I am comfortable with LinkedIn use/share my personal information or activity data for any purposes.
Q2: I am comfortable with other social networks (example, Facebook, Twitter, Google+) use/share my personal information or activity data for any purposes.

To assess the change in a participant's behavior, we asked questions Q1 and Q2 again as Q3 and Q4 respectively, with the following updated scenario:

The enterprise explicitly says that for what purpose it is using the information and its privacy practice is certified by a trusted organization.

'Q5' and 'Q6' were formulated to understand participants' opinions on the visibility of their personal data on LinkedIn and other social networking sites.

Q5: If you are disclosing your personal information in LinkedIn, who can see your personal information?
Q6: If you are disclosing your personal information in other social networks (example, Facebook, Twitter, Google+), who can see your personal information?

4.2 Survey Result Analysis

Dataset Demographics. The sampled population from our research lab consists of data-subjects who have an active LinkedIn account and are active users of at least one more social networking service. The number of participants who gave their consent for the data collection experiment was 50. Out of these 50 participants, 54% were male and 46% were female. 96% of the participants were from the age group 22-30 years. The minimum educational qualification within the sample population was an undergraduate degree, whereas the highest qualification was Doctor of Philosophy (PhD). Also, 68% of the participants were highly active (more than once a week) on LinkedIn's social networking platform.

Findings. In the entry-level survey the participants scored relatively well on the IUIPC scale for control, awareness and collection of personal information, as reported in Table 1. This indicates that participants have a reasonably high level of privacy concern. From the survey we found that 20% of participants modified their privacy settings only at the time of registration, 42% modify them once a quarter, 30% once a year, and 8% never changed their settings and have given implicit consent for their data use. Figure 4 shows the results from our survey. It is apparent that the 'Agree', 'Disagree' and 'Neutral' counts change from 'Q1' to 'Q2' and from 'Q3' to 'Q4'. We used this insight and included the purpose and its attributes when building our prediction model. In Figure 4, we can see that most of the participants tend to make their personal information visible to their social network. However, some participants kept their information visible to the public on LinkedIn but not on other social networking sites. We conjecture that a participant could benefit from disclosing professional information, as it helps them build new professional connections, and hence opens the possibility of new job opportunities. This finding is consistent with the observation of Geffet et al. (Zhitomirsky-Geffet and Bratspiess 2016). These insights suggest that the reputation of an enterprise and the potential benefits to the data-subject could influence the consent decision.

Table 1: IUIPC Score Details

IUIPC score    Range    Mean    SD
Control        1-5      4.42    0.60
Awareness      1-5      4.65    0.54
Collection     1-5      4.29    0.68

Figure 4: Survey Result. [Two bar charts of response counts: Q1 to Q4 with the legend Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree; Q5 and Q6 with the legend Public, Your network, Private.]

5 Experiment Analysis

We surveyed 50 participants for LinkedIn, with a maximum of 174 privacy settings, 42 purposes, 4 purpose categories (3 values used here) and 11 purpose subcategories. In total we had 5584 samples ($m$) with 281 features ($n = 50 + 174 + 42 + 4 + 11$); for $m$ and $n$ refer to Section 3.1. If a participant gives her consent for a given data field and purpose, then the state of the control is considered as '1', that is, the control is selected; otherwise it is '0'. Further, we utilized the TensorFlow implementation of the FM algorithm (TFFM) with the ADAM optimizer (Mikhail Trofimov 2016). The learning rate was kept at 0.001 and the threshold value ($\theta$) was set to 0.5.

In our experiment, we randomly divided all the participants into 10 bins. We iterated over these 10 bins, using one bin for testing and the remaining 9 bins for training our model. Finally, we averaged the accuracy obtained from the 10 iterations, shown in Table 2. The sensitivity analysis of the f1-score with respect to the rank is shown in Figure 5. It can be observed that the accuracy changes with the degree of feature combination (order). Further, the size of the dataset is limited, which may lead to the fluctuations in the line plot as the rank increases. It would be interesting to use contextual information, such as text from the purpose description, to understand the meaning behind the latent factors ($V \in \mathbb{R}^{n \times k}$ in Equation 2). The complexity of the different models is given in Table 3.

Mean Square Error, Precision and Recall: We analyzed the Mean Square Error (MSE), precision, recall and f1-score with different order and rank combinations. The results are shown in Table 2. Initially we considered all the purpose attributes in our TFFM model. Further, we assessed the impact of purpose attributes by removing the attributes one by one. From our experiments we found that rank $k = 17$ gives better results in terms of accuracy. Moreover, we compared the TFFM results with a linear Support Vector Machine (SVM) and a polynomial SVM. The linear SVM showed a marginal improvement over the TFFM model, as linear models work better with small amounts of data. However, as explained in Section 3.1, TFFM can work as a consent recommendation system given its linear complexity and its scalability to larger datasets, and it can accommodate different contextual factors. It can be inferred from Table 2 that the SVM with 'poly' kernel is overfitting the data. Also, Steffen Rendle (Rendle 2010) showed that an SVM with 'poly' kernel fails with two-way interactions.

Table 2: Evaluation in terms of f1-score, precision, recall and mean square error (MSE) for rank = 17 (where rank = $k$ in Equation 2) and order $d$. TFFM$_x$ is the TFFM model without purpose attributes 'x', where 'x' can be Purpose Category (A), Purpose Sub Category (B) or both (A+B). Variants of the TFFM model are compared with the linear SVM model and the SVM with 'poly' kernel. It is observed that order d=3 performs better among other orders. Linear SVM performs slightly better than TFFM. Also, TFFM with all purpose attributes performs better than the model without purpose attributes.

Group          Model                  f1-score    precision    recall    MSE
No Rank        Linear SVM             0.89        0.87         0.94      -
No Rank        SVM (kernel='poly')    0.82        0.69         1.0       -
No Rank        TFFM (d=1)             0.88        0.87         0.89      0.135
TFFM           d=2                    0.87        0.85         0.89      0.167
TFFM           d=3                    0.87        0.86         0.89      0.159
TFFM           d=4                    0.87        0.86         0.89      0.161
Order (d=3)    TFFM (x=A)             0.80        0.85         0.76      0.231
Order (d=3)    TFFM (x=B)             0.84        0.84         0.84      0.274
Order (d=3)    TFFM (x=A+B)           0.72        0.85         0.64      0.313

Table 3: Complexity of models (Rendle 2010) for different cases, where $k$ is the number of latent factors, $d$ is the order, and $\bar{s}_D$ denotes the non-zero elements of the data ($\bar{s}_D = 2$ for matrix factorization).

Model    Order    Complexity
FM       d        $O(k_d n^d)$ (straightforward)
FM       d        $O(kn)$ (reformulated)
FM       d        $O(k \bar{s}_D)$ (under sparsity)
SVM      2        $O(n^2)$

Figure 5: Performance of model by varying rank for different orders. Note that order = 1 is similar to linear models, where there is no significance of latent factors. [Four panels, order 1 to order 4; x-axis: rank of matrix (5 to 40); y-axis: f1-score (0.4 to 1.0); series: TFFM, TFFM without categorical information.]

Cold start vs warm start: The cold-start recommendation scenario arises when there are no prior preferences for users or items, whereas warm-start arises when prior preferences are available.

The FM model works with attributes or categories of input data represented as binary indicators (Rendle 2012). The flexibility of this model helps us deal with cold-start users/items even when we lack prior preferences. Here, the purpose-related attributes of the input data are helpful for predicting the new data-subject's consent.

6 Discussion and Implication

Contributions. Our work makes some useful contributions in the context of information overload and the resulting consent fatigue due to the multiple purposes for which consent is needed. We have shown that consent recommendation can be modeled as a prediction problem. Our recommender system has an accuracy of 87% for data-subjects with no prior preferences or usage history. For warm-start data-subjects the system is expected to perform even better. We also identified certain factors which may heavily influence a data-subject's decision-making process for consent. Furthermore, the survey results showed that data-subjects are more comfortable sharing information with enterprises providing professional services.

Future Work. Informed consent from the data-subject is pivotal in data privacy regulations and in safeguarding the data-subject's interests. However, privacy policies are complex, and even data-subjects with relevant educational qualifications find it difficult to make proper choices. Therefore, there is a need for a personal digital assistant that can help a data-subject make consent decisions. For future work we will refer to (Liu et al. 2016; Naeini et al. 2017) as our baseline. As consent is a pivotal concept in most of the regulations, we envision that it will be required even if the enterprise were to process homomorphically encrypted data (Gentry and Boneh 2009).

Implicit consent for data collection, sharing and processing can arise for multiple reasons. The three main reasons contributing to implicit consent are: a) consent fatigue, b) the data-subject's unawareness, and c) complex privacy policy documents. This may lead to a false sense of compliance and security (Degeling et al. 2018). A potential area to explore is identifying possible breaches of compliance regulations due to a data-subject's implicit consent.

In this work we built our recommender system by training our model on data gathered from LinkedIn. In the post-GDPR and CCPA era, service providers of all types are expected to comply with these regulations. However, more often than not it is not feasible to gather sufficient data to build a model for each one of them. To address this issue, transfer learning could be a possible area to look into, assuming the consent requests from the other service have the same flavour of purposes and related attributes.

Apart from the European Union's GDPR, many other countries are looking into their own versions of data privacy laws and regulations, for example, the Protection of Personal Information Act, 2013 (POPI Act) of South Africa, the Personal Information Protection and Electronic Documents Act (PIPEDA) of Canada, the Singapore Personal Data Protection Act, 2012, and the Data Protection Act in India. In future we would like to conduct a user study and analyze the effect of demographics on the decision-making process.

Limitations. Our findings are based on a study of the privacy settings of a single web application. The prediction model developed for LinkedIn might not be suitable for a dating site or a photograph-sharing site. However, there is a possibility of exploring the application of transfer learning and checking the efficacy of our model on other applications. We could collect the privacy settings of only a limited number of participants. In order to obtain a more reliable confidence metric, we will carry out experiments with more participants. Also, in this work we have not quantified the degree of fatigue. It will be interesting to see how it affects the recommendation model. A possible way to assess it is to observe a data-subject's interaction with the application.

The information we obtained from the self-reported responses of the participants may suffer from the 'Privacy Paradox' (Norberg, Horne, and Horne 2007). Even though most of the participants were highly concerned about their privacy, their actual behavior towards consent requests may differ in real life. Further, we could not analyze whether the participants are going to change their privacy settings later or not.

We conclude that many factors can affect a data-subject's consent, depending on the purpose of processing data. However, the unavailability of these factors in the real-world setting challenged us in our experiments. For example, the time of the consent request, the benefit to a data-subject in exchange for consent, and information about data field sensitivity and its retention period should matter, but it was hard to extract this information from the experimental setup.

7 Conclusion

In this work, we explored the issues pertaining to information overload and consent fatigue due to complex privacy policies and new regulations requiring consent for various purposes. We addressed this issue by implementing a consent recommender system for LinkedIn. Furthermore, we demonstrated that the recommendation problem can be modeled as a prediction problem. Our analysis of survey responses and LinkedIn data enabled us to identify some important factors which can influence a data-subject's decision-making process. We hope that our work will be useful in identifying the issues pertaining to consent fatigue and will build interest for further research in this area.

References

Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. 2016. Tensorflow: a system for large-scale machine learning. In OSDI, volume 16, 265–283.

Andrade, E. B.; Kaltcheva, V.; and Weitz, B. 2002. Self-disclosure on the web: The impact of privacy policy, reward, and company reputation. ACR North American Advances.

Anthonysamy, P.; Greenwood, P.; and Rashid, A. 2013. Social networking privacy: Understanding the disconnect from policy to controls. Computer 46(6):60–67.

Balebako, R.; Jung, J.; Lu, W.; Cranor, L. F.; and Nguyen, C. 2013. "Little brothers watching you": Raising awareness of data leaks on smartphones. In Proceedings of the Ninth Symposium on Usable Privacy and Security, SOUPS '13, 12:1–12:11. New York, NY, USA: ACM.

Barnard-Wills, D.; Chulvi, C. P.; and De Hert, P. 2016. Data protection authority perspectives on the impact of data protection reform on cooperation in the eu. Computer Law & Security Review 32(4):587–598.

Casteren, D. v. 2017. Consent now and then. Ph.D. Dissertation, Queensland University of Technology.

Cranor, L. F.; Idouchi, K.; Leon, P. G.; Sleeper, M.; and Ur, B. 2013. Are they actually any different? comparing thousands of financial institutions privacy practices. In Proc. WEIS, volume 13.

de la Torre, L. 2018. A guide to the california consumer privacy act of 2018. Available at SSRN.

Degeling, M.; Utz, C.; Lentzsch, C.; Hosseini, H.; Schaub, F.; and Holz, T. 2018. We value your privacy... now take some cookies: Measuring the gdpr's impact on web privacy. arXiv preprint arXiv:1808.05096.

Dwyer III, S. J.; Weaver, A. C.; and Hughes, K. K. 2004. Health insurance portability and accountability act. Security Issues in the Digital Medical Enterprise 72(2):9–18.

Flavián, C., and Guinalíu, M. 2006. Consumer trust, perceived security and privacy policy: three basic elements of loyalty to a web site. Industrial Management & Data Systems 106(5):601–620.

Gentry, C., and Boneh, D. 2009. A fully homomorphic encryption scheme, volume 20. Stanford University Stanford.

Ghazinour, K.; Matwin, S.; and Sokolova, M. 2016. Yourprivacyprotector, a recommender system for privacy settings in social networks. arXiv preprint arXiv:1602.01937.

Knijnenburg, B. P. 2014. Information disclosure profiles for segmentation and recommendation. In SOUPS2014 Workshop on Privacy Personas and Segmentation.

Kumar, P. 2016. Privacy policies and their lack of clear disclosure regarding the life cycle of user information. In 2016 AAAI Fall Symposium Series.

Liu, B.; Andersen, M. S.; Schaub, F.; Almuhimedi, H.; Zhang, S. A.; Sadeh, N.; Agarwal, Y.; and Acquisti, A. 2016. Follow my recommendations: A personalized privacy assistant for mobile app permissions. In Twelfth Symposium on Usable Privacy and Security (SOUPS 2016), 27–41.

Malhotra, N. K.; Kim, S. S.; and Agarwal, J. 2004. Internet users' information privacy concerns (iuipc): The construct, the scale, and a causal model. Information systems research 15(4):336–355.

Mikhail Trofimov, A. N. 2016. tffm: Tensorflow implementation of an arbitrary order factorization machine. https://github.com/geffy/tffm.

Naeini, P. E.; Bhagavatula, S.; Habib, H.; Degeling, M.; Bauer, L.; Cranor, L.; and Sadeh, N. 2017. Privacy expectations and preferences in an iot world. In Proceedings of the 13th Symposium on Usable Privacy and Security (SOUPS).

Norberg, P. A.; Horne, D. R.; and Horne, D. A. 2007. The privacy paradox: Personal information disclosure intentions versus behaviors. Journal of Consumer Affairs 41(1):100–126.

Olejnik, K.; Dacosta, I.; Machado, J. S.; Huguenin, K.; Khan, M. E.; and Hubaux, J. 2017. Smarper: Context-aware and automatic runtime-permissions for mobile devices. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017, 1058–1076.

Ploug, T., and Holm, S. 2013. Informed consent and routinisation. Journal of Medical Ethics 39(4):214–218.

Rendle, S. 2010. Factorization machines. In Data Mining (ICDM), 2010 IEEE 10th International Conference on, 995–1000. IEEE.

Rendle, S. 2012. Factorization machines with libfm. ACM Transactions on Intelligent Systems and Technology (TIST) 3(3):57.

Sadeh, J. L. B. L. N., and Hong, J. I. 2014. Modeling users mobile app privacy preferences: Restoring usability in a sea of permission settings. In Symposium on Usable Privacy and Security (SOUPS). Citeseer.

Sadeh, N.; Hong, J.; Cranor, L.; Fette, I.; Kelley, P.; Prabaker, M.; and Rao, J. 2009. Understanding and capturing peoples privacy policies in a mobile social networking application. Personal and Ubiquitous Computing 13(6):401–412.

Solove, D. J. 2012. Introduction: Privacy self-management and the consent dilemma. Harv. L. Rev. 126:1880.

Voigt, P., and Von dem Bussche, A. 2017. The EU General Data Protection Regulation (GDPR), volume 18. Springer.

Wijesekera, P.; Baokar, A.; Tsai, L.; Reardon, J.; Egelman, S.; Wagner, D.; and Beznosov, K. 2017. The feasibility of dynamically granted permissions: Aligning mobile privacy with user preferences. In Security and Privacy (SP), 2017 IEEE Symposium on, 1077–1093. IEEE.
Den- Zhitomirsky-Geffet, M., and Bratspiess, Y. 2016. Profes- ver, CO: USENIX Association. sional information disclosure on social networks: The case of facebook and linked in in israel. Journal of the Associa- Liu, B.; Lin, J.; and Sadeh, N. 2014. Reconciling mobile tion for Information Science and Technology 67(3):493–504. app privacy and usability on smartphones: Could user pri- vacy profiles help? In Proceedings of the 23rd International Conference on World Wide Web, WWW ’14, 201–212. New York, NY, USA: ACM. Madden, M. 2012. Privacy management on social media sites. Pew Internet Report 1–20.