Enhancing Privacy Awareness in Online Social Networks: a Knowledge-Driven Approach

Ruggero G. Pensa
University of Turin, Dept. of Computer Science
Turin, Italy I-10149
ruggero.pensa@unito.it

Copyright © CIKM 2018 for the individual papers by the papers' authors. Copyright © CIKM 2018 for the volume as a collection by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).

Extended abstract

Online social networks are permeating most aspects of our life. More than two billion active social accounts are producing petabytes of behavioral and interaction data daily. At the same time, the famous “six degrees of separation” theory has been far exceeded in Facebook, where an average degree of separation of 3.57 has recently been observed. This massive interconnection intrinsically exposes social network users to the risk of privacy leakage.

On the one hand, many users are informed about the risks linked to the disclosure of sensitive information (private life events, sexual preferences, diseases, political ideas, among others). On the other hand, the awareness of being exposed to privacy breaches each time we disclose information that is apparently not sensitive is still insufficiently widespread. In this regard, daily activities may reveal information that can be used by others in a negative manner. For example, a GPS tag far from home or pictures taken during a journey may alert potential burglars, and the disclosure of family relationships may expose our own or other family members’ privacy to criminal offence risks, as well as being a source of tort liability. Most troubling of all, it has been shown that by leveraging a Facebook user’s activity it is possible to infer some very private traits of the user’s personality [KSG13]. This inference capability has recently been exploited to help propel Donald Trump to victory in the 2016 U.S. presidential election and was at the very center of the Facebook–Cambridge Analytica scandal in early 2018. This privacy breach event has multiplied interest in the protection of human dignity and personal data, and privacy has become a primary concern among social network providers and web/data scientists.

Although social platforms often provide some kind of notification intended to inform their users about the risks of private information disclosure, many people simply overlook the dangers due to the uncontrolled disclosure of their (and others’) personal data. Following the recent scandals, most social media have considerably improved their tools for controlling the privacy settings of the user profile (e.g., Instagram can now limit the visibility of stories to “close friends”), but such tools are often hidden and not very user-friendly. Consequently, they are barely utilized by most users. Recent machine learning and data mining studies try to go beyond these limitations by proposing measures of users’ profile privacy based on the way they customize their privacy settings, or by lightening the customization process of the privacy settings by means of guided tools and wizards [FL10, SWN+18]. Privacy measures, in particular, when associated with popup alerts or other visual components, may enhance users’ perception of privacy, according to the principles of Privacy by Design specifications [Cav12]. These metrics usually require a separation-based policy configuration: in other terms, the users decide “how distant” a published item may spread in the network. Typical separation-based privacy policies for profile item/post visibility include: visible to no one, visible to friends, visible to friends of friends, public. However, this policy fails when the number of user friends becomes large. According to a well-known anthropological theory, in fact, the maximum number of people with whom one can maintain stable social (and cybersocial) relationships (known as Dunbar’s number) is around 150, but the average number of user friends in Facebook is more than double. This means that many social links are weak (offline and online interactions with them are sporadic), and a user who sets the privacy level of an item to “visible to friends” is probably not willing to make that item visible to all her friends. Other studies try to make the customization process of the privacy settings less frustrating. However, a consensus on how to identify a trade-off between privacy protection and exploitation of social network potentials is still far from being achieved.

Hence, in this talk, we show our theoretical framework (first presented in [PB17]) to i) measure the privacy risk of the users and alert them whenever their privacy is compromised and ii) help the exposed users customize their privacy level semi-automatically by limiting the number of manual operations thanks to an active learning approach. Moreover, instead of using a separation-based policy for computing the privacy risk, we adopt a circle-based formulation of the privacy score proposed in [LT10]. We show experimentally that our circle-based definition of privacy score better captures the real privacy leakage risk. Moreover, by investigating the relationship between the privacy measure and the privacy preferences of real Facebook users, we show that our framework may effectively support a safer and more fruitful experience in social networking sites.

Additionally, we argue that the privacy risk is not just a matter of users’ preferences (i.e., to which friends a user wishes to disclose each particular action/post); it is also heavily affected by the characteristics of the social network they belong to, i.e., their centrality within the network and the attitude of their friends towards privacy. According to a recent computational science study [BP17], even restrictive privacy settings are ineffective when the user is located within an unsafe network, i.e., a network where the majority of nodes have little or no awareness about their own and others’ privacy. This leads to the intuition that privacy risk in a social network may be modeled similarly to page authority in a hyperlink graph of web pages. According to a well-known theory, more authoritative web sites are likely to receive more links from other web sites that are authoritative in their turn. In this talk, we make the hypothesis that the concept of “importance” of a web page can be transposed into the concept of “privacy risk” of users in a social network as follows: the more an individual is surrounded by friends that are careless about their privacy, the more the privacy of that individual is likely to be exposed to concrete privacy leakage risks. Then, we present a new network-aware computational method for measuring the privacy risk (first published in [PBB19]), and report on a social experiment we performed, which involves more than one hundred Facebook users. Thanks to this experiment, we show the effectiveness of our privacy measure not only on two simulated networks but also on a large network of real Facebook users.

Acknowledgements

The work presented in the talk is supported by Fondazione CRT (grant numbers 2015-1638 and 2017-2323).

References

[BP17] Livio Bioglio and Ruggero G. Pensa. Impact of neighbors on the privacy of individuals in online social networks. In Proceedings of the International Conference on Computational Science, ICCS 2017, 12-14 June 2017, Zurich, Switzerland, volume 108 of Procedia Computer Science, pages 28–37. Elsevier, 2017.

[Cav12] Ann Cavoukian. Privacy by design [leading edge]. IEEE Technol. Soc. Mag., 31(4):18–19, 2012.

[FL10] Lujun Fang and Kristen LeFevre. Privacy wizards for social networking sites. In Proceedings of WWW 2010, pages 351–360. ACM, 2010.

[KSG13] Michal Kosinski, David Stillwell, and Thore Graepel. Private traits and attributes are predictable from digital records of human behavior. PNAS, 110(15):5802–5805, 2013.

[LT10] Kun Liu and Evimaria Terzi. A framework for computing the privacy scores of users in online social networks. TKDD, 5(1):6:1–6:30, 2010.

[PB17] Ruggero G. Pensa and Gianpiero Di Blasi. A privacy self-assessment framework for online social networks. Expert Syst. Appl., 86:18–31, 2017.

[PBB19] Ruggero G. Pensa, Gianpiero Di Blasi, and Livio Bioglio. Network-aware privacy risk estimation in online social networks. Social Netw. Analys. Mining, 9(1):15:1–15:15, 2019.

[SWN+18] Xuemeng Song, Xiang Wang, Liqiang Nie, Xiangnan He, Zhumin Chen, and Wei Liu. A personal privacy preserving framework: I let you know who can see what. In Proceedings of ACM SIGIR 2018, Ann Arbor, MI, USA, July 08-12, 2018, pages 295–304. ACM, 2018.
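The hypothesis that privacy risk propagates through friendships in the way page authority propagates through hyperlinks can be illustrated with a small sketch. This is not the method of [PBB19], whose details are not given in this abstract. It is only a PageRank-style fixed-point iteration under stated assumptions: each user has a hypothetical intrinsic risk in [0, 1] (for instance, a privacy score derived from their own settings), friendships are undirected, and a hypothetical `damping` parameter controls how much weight the friends' risk receives. All names below are illustrative.

```python
def network_aware_risk(friends, own_risk, damping=0.85, tol=1e-9, max_iter=1000):
    """Blend each user's own risk with the average risk of their friends.

    friends  : dict mapping a user to the list of their friends (assumed undirected)
    own_risk : dict mapping a user to an intrinsic risk in [0, 1] (assumed given)
    damping  : weight of the friends' average risk (assumption, PageRank-like)
    """
    if not own_risk:
        return {}
    risk = dict(own_risk)  # start from the intrinsic risk
    for _ in range(max_iter):
        new_risk = {}
        for user, base in own_risk.items():
            neighbours = friends.get(user, [])
            if neighbours:
                avg = sum(risk[v] for v in neighbours) / len(neighbours)
                new_risk[user] = (1 - damping) * base + damping * avg
            else:
                # An isolated user keeps their intrinsic risk.
                new_risk[user] = base
        delta = max(abs(new_risk[u] - risk[u]) for u in risk)
        risk = new_risk
        if delta < tol:  # stop once the scores no longer change
            break
    return risk


# Two users with identical settings but different neighbourhoods.
friends = {
    "alice": ["c1", "c2"], "c1": ["alice"], "c2": ["alice"],  # careless friends
    "bob": ["s1", "s2"], "s1": ["bob"], "s2": ["bob"],        # careful friends
}
own_risk = {"alice": 0.2, "c1": 0.9, "c2": 0.9, "bob": 0.2, "s1": 0.1, "s2": 0.1}
scores = network_aware_risk(friends, own_risk)
print(scores["alice"], scores["bob"])
```

In this toy network, alice and bob have the same intrinsic risk, yet alice ends up with the higher score because her friends are careless. That is the qualitative behavior the hypothesis describes. Because `damping` is below 1, the iteration is a contraction and converges regardless of the starting point.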