Enhancing Privacy Awareness in Online Social Networks: a Knowledge-Driven Approach

Ruggero G. Pensa
University of Turin, Dept. of Computer Science
Turin, Italy I-10149
ruggero.pensa@unito.it


Copyright © CIKM 2018 for the individual papers by the papers' authors. Copyright © CIKM 2018 for the volume as a collection by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).

Extended abstract

Online social networks are permeating most aspects of our lives. More than two billion active social accounts are producing petabytes of behavioral and interaction data daily. At the same time, the famous “six degrees of separation” theory has been far exceeded on Facebook, where an average degree of separation of 3.57 has been recently observed. This massive interconnection intrinsically exposes social network users to the risk of privacy leakage.

While, on the one hand, many users are informed about the risks linked to the disclosure of sensitive information (private life events, sexual preferences, diseases, political ideas, among others), on the other hand the awareness of being exposed to privacy breaches each time we disclose information that is apparently not sensitive is still insufficiently widespread. In this regard, daily activities may reveal information that can be used by others in a negative manner. For example, a GPS tag far from home or pictures taken during a journey may alert potential burglars, and the disclosure of family relationships may expose our own or other family members’ privacy to the risk of criminal offences, as well as being a source of tort liability. Most troubling of all, it has been shown that by leveraging a Facebook user’s activity it is possible to infer some very private traits of the user’s personality [KSG13]. This inference capability was exploited to help propel Donald Trump to victory in the 2016 U.S. presidential election and was at the very center of the Facebook–Cambridge Analytica scandal in early 2018. This privacy breach has multiplied the interest in the protection of human dignity and personal data, and privacy has become a primary concern among social network providers and web/data scientists.

Although social platforms often provide some kind of notification intended to inform their users about the risks of private information disclosure, many people simply overlook the dangers of the uncontrolled disclosure of their (and others’) personal data. Therefore, following the recent scandals, most social media have considerably improved their tools for controlling the privacy settings of the user profile (e.g., Instagram can now limit the visibility of stories to “close friends”), but such tools are often hidden and not very user-friendly. Consequently, they are barely used by most users. Recent machine learning and data mining studies try to go beyond these limitations by proposing measures of users’ profile privacy based on the way they customize their privacy settings, or by easing the customization of the privacy settings by means of guided tools and wizards [FL10, SWN+18]. Privacy measures, in particular, when associated with popup alerts or other visual components, may enhance users’ perception of privacy, according to the principles of Privacy by Design specifications [Cav12]. These metrics usually require a separation-based policy configuration: in other words, the users decide “how far” a published item may spread in the network. Typical separation-based privacy policies for profile item/post visibility include: visible to no one, visible to friends, visible to friends of friends, public. However, this policy fails when the number of a user’s friends becomes large. According to a well-known anthropological theory, in fact, the maximum number of people with whom one can maintain stable social (and cybersocial) relationships (known as Dunbar’s number) is around 150, but the average number of friends of a Facebook user is more than double. This means that many social links are weak (offline
and online interactions with them are sporadic), and a user who sets the privacy level of an item to “visible to friends” is probably not willing to make that item visible to all her friends. Other studies try to make the customization of the privacy settings less frustrating. However, a consensus on how to identify a trade-off between privacy protection and exploitation of the potential of social networks is still far from being achieved.

Hence, in this talk, we show our theoretical framework (first presented in [PB17]) to i) measure the privacy risk of the users and alert them whenever their privacy is compromised and ii) help the exposed users customize their privacy level semi-automatically by limiting the number of manual operations thanks to an active learning approach. Moreover, instead of using a separation-based policy for computing the privacy risk, we adopt a circle-based formulation of the privacy score proposed in [LT10]. We show experimentally that our circle-based definition of privacy score better captures the real privacy leakage risk. Moreover, by investigating the relationship between the privacy measure and the privacy preferences of real Facebook users, we show that our framework may effectively support a safer and more fruitful experience in social networking sites.

Additionally, we argue that the privacy risk is not just a matter of users’ preferences (i.e., to which friends a user wishes to disclose each particular action/post); it is also heavily affected by the characteristics of the social network they belong to, i.e., their centrality within the network and the attitude of their friends towards privacy. According to a recent computational science study [BP17], even restrictive privacy settings are ineffective when the user is located within an unsafe network, i.e., a network where the majority of nodes have little or no awareness of their own and others’ privacy. This leads to the intuition that privacy risk in a social network may be modeled similarly to page authority in a hyperlink graph of web pages. According to a well-known theory, more authoritative web sites are likely to receive more links from other web sites that are authoritative in their turn. In this talk, we make the hypothesis that the concept of “importance” of a web page can be transposed into the concept of “privacy risk” of users in a social network as follows: the more an individual is surrounded by friends that are careless about their privacy, the more the privacy of that individual is likely to be exposed to concrete privacy leakage risks. Then, we present a new network-aware computational method for measuring the privacy risk (first published in [PBB19]), and report on a social experiment we performed, which involved more than one hundred Facebook users. Thanks to this experiment, we show the effectiveness of our privacy measure not only on two simulated networks but also on a large network of real Facebook users.

Acknowledgements

The work presented in the talk is supported by Fondazione CRT (grant numbers 2015-1638 and 2017-2323).

References

[BP17] Livio Bioglio and Ruggero G. Pensa. Impact of neighbors on the privacy of individuals in online social networks. In Proceedings of the International Conference on Computational Science, ICCS 2017, 12-14 June 2017, Zurich, Switzerland, volume 108 of Procedia Computer Science, pages 28–37. Elsevier, 2017.

[Cav12] Ann Cavoukian. Privacy by design [leading edge]. IEEE Technol. Soc. Mag., 31(4):18–19, 2012.

[FL10] Lujun Fang and Kristen LeFevre. Privacy wizards for social networking sites. In Proceedings of WWW 2010, pages 351–360. ACM, 2010.

[KSG13] Michal Kosinski, David Stillwell, and Thore Graepel. Private traits and attributes are predictable from digital records of human behavior. PNAS, 110(15):5802–5805, 2013.

[LT10] Kun Liu and Evimaria Terzi. A framework for computing the privacy scores of users in online social networks. TKDD, 5(1):6:1–6:30, 2010.

[PB17] Ruggero G. Pensa and Gianpiero Di Blasi. A privacy self-assessment framework for online social networks. Expert Syst. Appl., 86:18–31, 2017.

[PBB19] Ruggero G. Pensa, Gianpiero Di Blasi, and Livio Bioglio. Network-aware privacy risk estimation in online social networks. Social Netw. Analys. Mining, 9(1):15:1–15:15, 2019.

[SWN+18] Xuemeng Song, Xiang Wang, Liqiang Nie, Xiangnan He, Zhumin Chen, and Wei Liu. A personal privacy preserving framework: I let you know who can see what. In Proceedings of ACM SIGIR 2018, Ann Arbor, MI, USA, July 08-12, 2018, pages 295–304. ACM, 2018.
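
Illustrative sketch. The hypothesis stated in this abstract (the more an individual is surrounded by friends who are careless about their privacy, the more that individual is exposed) can be illustrated with a small Python example that propagates risk over a friendship graph, in the spirit of page-authority computations. This is not the method of [PBB19]: the update rule, the `damping` parameter, the function name `network_aware_risk`, and the example users and scores are all assumptions made purely for illustration.

```python
def network_aware_risk(friends, own_risk, damping=0.5, iterations=50):
    """Blend each user's own risk with the average risk of their friends.

    friends  : dict mapping a user to the list of that user's friends
    own_risk : dict mapping a user to a hypothetical risk score in [0, 1]
               derived from that user's own settings only
    damping  : assumed weight given to the neighbourhood (0 = ignore friends)
    """
    risk = dict(own_risk)  # start from the purely individual scores
    for _ in range(iterations):
        updated = {}
        for user, score in own_risk.items():
            neighbours = friends.get(user, [])
            if neighbours:
                neighbour_avg = sum(risk[n] for n in neighbours) / len(neighbours)
            else:
                neighbour_avg = score  # isolated users keep their own score
            updated[user] = (1 - damping) * score + damping * neighbour_avg
        risk = updated
    return risk


# Hypothetical example: alice and dave have the same settings (own risk 0.2),
# but alice's friends are careless (0.9) while dave's friends are careful (0.1).
friends = {
    "alice": ["bob", "carol"], "bob": ["alice"], "carol": ["alice"],
    "dave": ["erin", "frank"], "erin": ["dave"], "frank": ["dave"],
}
own_risk = {"alice": 0.2, "bob": 0.9, "carol": 0.9,
            "dave": 0.2, "erin": 0.1, "frank": 0.1}

risk = network_aware_risk(friends, own_risk)
print(risk["alice"] > risk["dave"])  # alice ends up more exposed than dave
```

Under these assumptions, two users with identical settings receive different scores depending only on their neighbourhood, which is the intuition behind a network-aware measure. A damping value below 1 keeps the iteration a contraction, so the scores settle regardless of the starting point.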