=Paper=
{{Paper
|id=Vol-1671/paper6
|storemode=property
|title=Rigorous and Flexible Privacy Models for Utilizing Personal Spatiotemporal Data
|pdfUrl=https://ceur-ws.org/Vol-1671/paper6.pdf
|volume=Vol-1671
|authors=Yang Cao
|dblpUrl=https://dblp.org/rec/conf/vldb/Cao16
}}
==Rigorous and Flexible Privacy Models for Utilizing Personal Spatiotemporal Data==
Yang Cao
Supervised by Masatoshi Yoshikawa
Kyoto University, Japan
soyo@db.soc.i.kyoto-u.ac.jp
Proceedings of the VLDB 2016 PhD Workshop, September 9, 2016. New Delhi, India. Copyright (c) 2016 for this paper by its authors. Copying permitted for private and academic purposes.

ABSTRACT

Personal data are the new oil. Vast amounts of spatiotemporal data generated by individuals have been collected and analyzed, such as check-in data, trajectories, web browsing data and timestamped medical records. These personal data are a valuable resource for businesses and also have the potential to provide significant social benefits through sharing and reuse. However, privacy concerns hinder the wider use of these personal data. To this end, dozens of privacy protection models have been proposed for privacy-preserving data publishing (PPDP). ε-differential privacy has emerged as a de facto privacy model for PPDP because of its rigorous theoretical guarantees, but it has also been criticized as impractical in many scenarios. We attempt to design rigorous and flexible privacy models that are able to bridge the gap between theoretical limitations and practical needs. In this article, we first motivate the importance of rigorousness and flexibility for privacy models and then present two privacy models that extend differential privacy in a practical manner.

1. INTRODUCTION

New opportunities are arising to enrich our understanding of the physical world by utilizing personal big data. Such data generated by individuals that possess spatial and temporal properties are called personal spatiotemporal data and include information such as people's movements, search logs, shopping records, social activities and medical histories. Companies such as Google, Amazon and Facebook have created enormous wealth by using such personal data. These data also have the potential to support significant initiatives for the public good, such as innovative services, traffic monitoring, disease surveillance and health promotion. Therefore, it is important to enable the sharing and reuse of personal spatiotemporal data.

However, several well-known cases of privacy leakage have arisen in attempts to provide privacy-preserving data publishing (PPDP) [1]. One of the most famous privacy leakage cases occurred with regard to the Massachusetts Group Insurance Commission (GIC) medical dataset, which contains anonymized medical records and was first made available in 1996 for medical research and public health purposes. Latanya Sweeney [2] linked the anonymized GIC database (which retained the birthday, sex, and ZIP code of each patient) with voter registration records to identify the medical records of the governor of Massachusetts. In 2006, AOL Inc. released more than 650 thousand users' anonymized search logs for research purposes, but researchers [3] found it is possible to de-anonymize some users by mining these timestamped records. A recent privacy leakage case was related to the datasets released for the Netflix Prize competition for recommender algorithms. Researchers [4] verified that the datasets could be de-anonymized by linking them with the IMDB dataset; as a result, Netflix became the target of a lawsuit and was obliged to terminate this well-known competition in 2010. In Japan, the East Japan Railway Company (JR) faced a scandal in 2013 because of the selling of customer data collected through the use of SUICA cards. These privacy leakage cases illustrate the conflict between increasing public concerns about data privacy and the need for personal data publishing. Therefore, privacy issues have become a critical problem in the big data era.

Personal spatiotemporal data are highly sensitive even if they are anonymized. Studies [5, 6] have shown that only four spatiotemporal data points from anonymized mobile datasets and credit card records (with purchase times and locations) are sufficient to uniquely identify 90% of individuals. Another study [7] has demonstrated that it is possible to track smart phone locations simply by monitoring battery consumption data (timestamps and remaining battery life), which do not require the users' permission to be shared. Therefore, because of the unique and rich patterns that are present in personal spatiotemporal data, a rigorous privacy model is needed to utilize personal spatiotemporal data in a privacy-preserving manner.

At the same time, privacy models should be flexible for the following two reasons. First, because privacy protection implies the hiding or perturbing of sensitive information¹, the privacy level should be tunable (to an appropriate level) for different applications to achieve a suitable trade-off between privacy and utility. Second, privacy as a complex social concept should be modifiable in accordance with different social contexts. For example, different users might have different preferences regarding privacy; for a single user, the desired protection levels might also be different at different times and places.

¹ Although some cryptographic approaches (e.g., homomorphic encryption) can be used for PPDP, because they limit the possible data recipients, we here discuss primarily perturbation-based approaches that enable a wider sharing of data.

In the following, we briefly review the background for this research in Section 2 and then present two proposed privacy models in Section 3 and Section 4.

[Figure 1: Raw data and aggregate information. (a) Raw data (rows of the form <uID, time, loc>); (b) trajectory data (one location sequence per user); (c) aggregate data (location counts per timestamp).]
2. BACKGROUND

2.1 Previous Privacy Models

Previous studies have considered two major families of privacy models: k-anonymity [2] and ε-differential privacy [8]. k-anonymity [2] was proposed in 2002 and was the first privacy model to be widely accepted by practitioners. However, Machanavajjhala [9] identified a serious flaw of k-anonymity by demonstrating the possibility of homogeneity attacks and background knowledge attacks and, to resolve this flaw, proposed an improved privacy model based on k-anonymity named ℓ-diversity. However, ℓ-diversity had its own shortcomings, which motivated further improvement to yield subsequent new privacy models, such as t-closeness [10] and m-invariance [11]. Researchers have realized that the essential defect of the k-anonymity family is that the privacy guarantee depends on the background knowledge possessed by adversaries. In 2006, Dwork et al. proposed ε-differential privacy [8], which has received increasing attention because it is guaranteed to be rigorous and mathematically provable. Differential privacy ensures protection against adversaries with arbitrary background knowledge.
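To make the k-anonymity requirement concrete, the following minimal sketch (ours, not from the paper; the table, attribute names and values are hypothetical) checks whether every combination of quasi-identifier values in a released table appears at least k times. It also hints at why homogeneity attacks remain possible: a group can be k-anonymous while still sharing a single sensitive value.

```python
from collections import Counter

def satisfies_k_anonymity(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination appears in at least k
    rows, i.e., each record is indistinguishable from at least k-1 others."""
    groups = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

# Hypothetical released table: ZIP prefix and birth year are quasi-identifiers.
records = [
    {"zip": "606", "birth_year": 1980, "diagnosis": "flu"},
    {"zip": "606", "birth_year": 1980, "diagnosis": "cold"},
    {"zip": "605", "birth_year": 1975, "diagnosis": "asthma"},
]
print(satisfies_k_anonymity(records, ["zip", "birth_year"], k=2))  # False: the 1975 group has only 1 row
```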
2.2 Differential Privacy

Intuitively, differential privacy guarantees that any single user's data in the database have only a "slight" (bounded by ε) impact on changes to the outputs. The parameter ε that is used to control the privacy level is defined as a positive real number. A lower value of ε corresponds to a lower likelihood of privacy leakage. In the case of a suitably small ε, an adversary cannot associate any piece of information with a specific individual by mining the answers to queries.

Definition 1 (Neighboring Databases [8]). If D is a database and D′ is nearly identical to D but differs by one record, then we say that D and D′ are two neighboring databases.

Definition 2 (ε-Differential Privacy, ε-DP [8]). Let Λ be a randomized algorithm, and let R represent all possible outputs of Λ. The algorithm Λ satisfies ε-DP if the following holds for any r ∈ R and any two neighboring databases D and D′:

    Pr[Λ(D) = r] ≤ exp(ε) · Pr[Λ(D′) = r]    (1)

In the inequality expressed in (1), a positive parameter ε, called the privacy budget, is given in advance. It is used to control the level of privacy protection. A lower value of ε corresponds to higher privacy and more randomness, and vice versa. The global sensitivity (or sensitivity for short) GS(Q) of a query Q is the maximum L1 distance between the query results for any two neighboring databases.

Two widely used methods of achieving differential privacy are the Laplace mechanism [12], which adds random noise to the real data to prevent the disclosure of sensitive information, and the Exponential mechanism [13], which randomly returns the desired item with some calibrated probability.
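As a concrete illustration of the Laplace mechanism, the sketch below (ours, not code from the paper; function and parameter names are illustrative) calibrates the noise scale to GS(Q)/ε, the standard calibration from [12].

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Perturb a (scalar or vector) query answer with Laplace noise of scale
    GS(Q)/epsilon, which suffices for epsilon-DP under Definition 2."""
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return true_answer + rng.laplace(loc=0.0, scale=scale, size=np.shape(true_answer))

# A counting query has global sensitivity 1: adding or removing one record
# changes the count by at most 1.
noisy_count = laplace_mechanism(true_answer=42, sensitivity=1.0, epsilon=0.5)
```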
Below, we summarize two crucial weaknesses of differential privacy that hinder its extensive use in data publishing.

• One size fits all. ε-differential privacy uses a single parameter ε to represent the protection level for all users in a database rather than providing a personalized privacy protection level.

• Data independence assumption. Differential privacy assumes that the tuples in a database are independent; however, real-world data tend to be temporally or spatially correlated.

2.3 Problem Setting

Here, we describe a scenario of publishing aggregate spatiotemporal data: a trusted server collects spatiotemporal data points from users and continuously stores them in a database D_i at each timestamp i ∈ [1, t]. Let t denote the current timestamp. Each data point <uID, time, loc> is a row in D_i (Fig. 1(a)). Suppose that locs is the set of all locations in which we are interested (POIs); then, the server wishes to publish a vector r_i at each timestamp i that consists of the counts of each loc ∈ locs that appears in D_i (Fig. 1(c)), i.e., the answer to the count query Q_c: D_i → R^|locs|. Without loss of generality, we assume that each user appears at only one location at most at each timestamp. We need to transform the aggregate data depicted in Figure 1(c) into a secure form for publishing because they are computed directly from the sensitive raw data. The most straightforward method is to employ the Laplace mechanism [12] (add random Laplace noise) to achieve DP. In accordance with the limitations of DP discussed above, two problems are encountered when applying DP in practice, as follows:

• How can personalized differential privacy be achieved?

• How can differential privacy be guaranteed even if the data are temporally or spatially correlated?

We present two privacy models, ℓ-trajectory privacy (Section 3) and spatiotemporal privacy (Section 4), to solve the above problems.
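Before turning to these models, the baseline pipeline of Section 2.3 can be sketched as follows (a minimal illustration under our own assumptions, not the paper's implementation; the POI list and names are hypothetical). Under Definition 1's add/remove-one-record notion of neighboring databases, adding or removing one data point changes one count by at most 1, so the L1 sensitivity of Q_c is 1.

```python
import numpy as np

LOCS = ["l1", "l2", "l3", "l4"]  # the POIs of interest (illustrative)

def count_vector(D_i, locs=LOCS):
    """Answer Q_c: count how many records of D_i fall in each POI.
    D_i is a list of (uID, time, loc) rows as in Fig. 1(a)."""
    counts = np.zeros(len(locs))
    index = {loc: j for j, loc in enumerate(locs)}
    for _, _, loc in D_i:
        if loc in index:
            counts[index[loc]] += 1
    return counts

def publish(D_i, epsilon, rng=None):
    """Release a noisy version of r_i; the per-timestamp count query has
    L1 sensitivity 1, so Lap(1/epsilon) noise per entry gives epsilon-DP."""
    rng = rng or np.random.default_rng()
    return count_vector(D_i) + rng.laplace(scale=1.0 / epsilon, size=len(LOCS))

D_1 = [("u1", "t1", "l1"), ("u3", "t1", "l1")]
r_1_noisy = publish(D_1, epsilon=1.0)
```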
3. ℓ-TRAJECTORY PRIVACY

In a previous work [14], we defined a new privacy model that we called ℓ-trajectory privacy to guarantee that any ℓ-length trajectories for each user are protected under differential privacy. We formalized our privacy definition, proved its feasibility, and designed efficient mechanisms to implement it. This privacy model is personalized, i.e., different users can specify different lengths of ℓ-trajectories (ℓ successive data points) depending on their privacy requirements, as illustrated in Figure 2(b). A closely related privacy model called w-event privacy [15] protects the data within sliding windows (all data points in w successive timestamps), as shown in Figure 2(a). In the following, we present the definition of the ℓ-trajectory privacy model and related experimental results.

[Figure 2: Illustration of ℓ-trajectory privacy compared with w-event privacy [15]. (a) w-event privacy (w = 3); (b) ℓ-trajectory privacy (ℓ = 3).]
We first define the neighboring databases in our setting.

Definition 3 (ℓ-Trajectory Neighboring Stream Prefixes). Let S_t′ be a near copy of the trajectory stream prefixes S_t that differs in ℓ neighboring databases D_i′. S_t and S_t′ are neighboring ℓ-trajectory stream prefixes if one can be obtained from the other by modifying a single loc or multiple locs in any one ℓ-trajectory ℓ_{u,k} (recall that an ℓ-trajectory is a set of ℓ spatiotemporal data points). We say that S_t and S_t′ are neighboring with respect to ℓ_{u,k}.

Intuitively, ℓ-trajectory privacy attempts to ensure that any ℓ-trajectory of any single user has only a "slight" (bounded by ε) impact on the outputs.

Definition 4 (ℓ-Trajectory ε-Differential Privacy). Let Λ be an algorithm that takes prefixes of trajectory streams S_t = {D_1, ..., D_t} as inputs. Let N_t = {n_1, ..., n_t} be a possible perturbed output stream of Λ. If the following holds for any ℓ-trajectory neighboring S_t and S_t′,

    Pr[Λ(S_t) = N_t] ≤ exp(ε) · Pr[Λ(S_t′) = N_t]    (2)

then we say that Λ satisfies ℓ-trajectory ε-differential privacy (or ℓ-trajectory privacy for brevity).

The following theorem provides insight regarding the implementation of ℓ-trajectory privacy.

Theorem 1. Let Λ be an integrated algorithm that takes stream prefixes S_t = {D_i, i ∈ [1, t]} as inputs and produces N_t = {n_i, i ∈ [1, t]} as outputs. Λ consists of a series of sub-algorithms {A_i, i ∈ [1, t]}, each of which takes D_i as its input and outputs noisy data n_i with independent randomness. Presume that A_i ensures ε_i-DP, and let τ_{u,k} be the set of timestamps dominated by an arbitrary trajectory ℓ_{u,k}; then, Λ satisfies ℓ-trajectory privacy if

    ∀u, ∀k:  Σ_{i ∈ τ_{u,k}} ε_i ≤ ε    (3)

Based on the above theorem, we designed an algorithmic framework for publishing differentially private aggregates that satisfy ℓ-trajectory privacy. The experimental results show that our approach is effective and efficient compared with previous works [16] and [15]. More details can be found in our paper [14].
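Condition (3) suggests a simple runtime check before spending budget at the current timestamp: the per-timestamp budgets on any user's current ℓ-trajectory must sum to at most ε. The sketch below is our own illustration of that bookkeeping (not the allocation mechanism of [14]); it assumes the set τ_{u,k} for each user is already known.

```python
def can_release(spent, tau, epsilon_total, epsilon_i):
    """Check condition (3) of Theorem 1 before spending epsilon_i now.

    spent         -- dict: timestamp -> budget already consumed there
    tau           -- dict: user -> set of timestamps in that user's current
                     l-trajectory (tau_{u,k}), including the current one
    epsilon_total -- the overall budget epsilon of l-trajectory privacy
    epsilon_i     -- the budget sub-algorithm A_i wants to spend now
    """
    for user, timestamps in tau.items():
        already = sum(spent.get(ts, 0.0) for ts in timestamps)
        if already + epsilon_i > epsilon_total:
            return False  # would violate sum over tau_{u,k} of eps_i <= eps
    return True

# Illustrative use: user u1 chose l = 3 and has reported at timestamps 3, 4, 5.
spent = {3: 0.3, 4: 0.3}
tau = {"u1": {3, 4, 5}}
print(can_release(spent, tau, epsilon_total=1.0, epsilon_i=0.3))  # True  (0.9 <= 1.0)
print(can_release(spent, tau, epsilon_total=1.0, epsilon_i=0.5))  # False (1.1 > 1.0)
```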
4. ST-DIFFERENTIAL PRIVACY

In existing studies, traditional DP mechanisms (e.g., the Laplace mechanism) are employed as primitives. These mechanisms assume that the data are independent or that adversaries do not have knowledge of the data correlations. However, data collected from the real world tend to be temporally or spatially correlated, and such correlations could be acquired by adversaries. In this study, we introduce a new class of adversaries named realistic adversaries who have a varying degree of knowledge of the spatiotemporal correlations of the data of interest. We demonstrate that the risk of a DP mechanism may increase over time given the existence of such adversaries. To prevent such privacy leakage, we propose a new privacy definition, ST-differential privacy, and present novel mechanisms to satisfy this privacy definition. In the following, we first demonstrate the problem by means of a motivating example, present the definition of realistic adversaries, and briefly describe measures for protecting against such adversaries. This work has been submitted to an international conference.

4.1 A Motivating Example

The following example (illustrated in Figure 3) shows that the existence of adversaries with knowledge of the spatiotemporal correlations of the data may degrade the expected privacy guarantee of DP.

[Figure 3: The vulnerability of differential privacy to correlated spatiotemporal data. (a) Road network (temporal correlation for a single user); (b) social ties (spatial correlation between users); (c) spatiotemporal data; (d) raw aggregates.]

Example 1. Consider the scenario of spatiotemporal data publishing illustrated in Figure 3. A trusted server continuously collects users' locations at each time point. Our goal is to publish the aggregates shown in Figure 3(d) while protecting against adversaries who may have some knowledge regarding membership in the database. We assume that each user appears at only one location at most at each time point. Then, according to the Laplace mechanism [12], adding Lap(1/ε) noise² to perturb the data can achieve ε-DP. However, from auxiliary information such as road network constraints, an attacker may know the mobility patterns of a targeted victim; for example, the adversary may know that the victim "always arrives at loc5 after visiting loc4," as illustrated in Figure 3(a), which leads to the patterns indicated by the solid lines in Figure 3(c) and (d). Consequently, because of the composability of DP, adding Lap(1/ε) noise guarantees only 2ε-DP against this attacker. We refer to the transition pattern of a single user as temporal correlation. By contrast, when an attacker has the following knowledge (illustrated in Figure 3(b)), privacy leakage will also occur: (1) users u1 and u2 are always at the same location during working hours (e.g., they are colleagues); (2) users u2 and u3 are always at the same location during leisure time (e.g., they are newlyweds). We refer to such geographical similarities between users as spatial correlations. In Figure 3(c) and (d), it is easy to see the resulting patterns, which are illustrated as dashed lines. Therefore, because of the high sensitivity induced by the highly correlated nature of the data, the addition of Lap(1/ε) noise achieves only 2ε-DP.

² Lap(b) denotes a Laplace distribution with variance 2b².
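The 2ε figure in Example 1 can be checked with a small calculation. The sketch below is our own illustration (not from the paper; the helper name is ours): when the known pattern "loc4 then loc5" ties the victim's presence to two released counts, the joint release behaves like a query whose outputs differ by an L1 distance of 2 between neighboring worlds, so independent Lap(1/ε) noise per release only bounds the total privacy loss by 2ε.

```python
import numpy as np

def worst_case_privacy_loss(delta, scale):
    """Upper bound on |log p(output | world A) / p(output | world B)| when
    independent Laplace(scale) noise is added to answers that differ
    coordinate-wise by `delta`; for Laplace noise this is sum(|delta|)/scale."""
    return np.sum(np.abs(delta)) / scale

epsilon = 1.0
scale = 1.0 / epsilon  # Lap(1/epsilon) noise on each released count

# Victim present vs. absent with the known pattern: loc4's count at time t and
# loc5's count at time t+1 both change by 1.
print(worst_case_privacy_loss(np.array([1.0, 1.0]), scale))  # 2.0 -> only 2*epsilon-DP

# Without the known pattern, a single release is affected in at most one count.
print(worst_case_privacy_loss(np.array([1.0]), scale))       # 1.0 -> epsilon-DP
```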
4.2 Realistic Adversaries

To formalize realistic adversaries (RAs) and their impact on privacy leakage, we first study commonly used models for describing temporal and spatial correlations as means of expressing the prior knowledge of RAs. We then define a new privacy model, ST-Differential Privacy (ST-DP), to account for RAs.

Definition 5 (Temporal Correlations). The temporal correlation of user i is described by a transition matrix P_i ∈ R^(|loc|×|loc|), which represents Pr(l_i^(t-1) | l_i^t), where l_i^t is the location of user i at time t.

Definition 6 (Spatial Correlations). Let x_i denote Gaussian random variables that represent the possible locations of each user i ∈ [1, n]. A GMRF G, which we call the correlation graph, describes the spatial correlations among x_1, ..., x_n.

We define realistic adversaries in a comprehensive and flexible manner such that various scenarios can be represented.

Definition 7 (Realistic Adversaries). The adversaries targeting user i who possess various levels of prior knowledge, denoted by R_i(P_i, G_i, m), are called the realistic adversaries of that user. There are three basic types of realistic adversaries: (1) R_i(P_i, ∅, m), (2) R_i(∅, G_i, m), and (3) R_i(P_i, G_i, m).

Attack by realistic adversaries is Bayesian in nature [17]. Let D_K^t ⊂ D^t represent the prior knowledge of such adversaries, and let D_U^t be the unknown tuples, i.e., D^t = D_K^t ∪ D_U^t ∪ {l_i^t}. The adversaries attempt to infer the location of user i at time t. To this end, they first infer D_U^t based on D_K^t and a guessed value of l_i^t (or l_i^t′, to represent a different guess) and then attempt to distinguish the difference between the two neighboring databases D^t and D^t′, where D^t′ = D_K^t ∪ D_U^t ∪ {l_i^t′}.

Definition 8 (ε-DP_ST). At each time t, a DP mechanism M^t takes D^t as input and produces r^t as output. The privacy leakage of M^t against R_i(P_i, G_i, m) is defined as follows:

    PL(R_i, M^t) := sup_{|D_K^t| = m} log [ Pr(r^1, ..., r^t | l_i^t, D_K^t) / Pr(r^1, ..., r^t | l_i^t′, D_K^t) ]
                  = sup_{|D_K^t| = m} log [ Σ_{D_U^t} Pr(r^1, ..., r^t | D^t) Pr(D_U^t | l_i^t, D_K^t) / Σ_{D_U^t} Pr(r^1, ..., r^t | D^t′) Pr(D_U^t | l_i^t′, D_K^t) ]    (4)

M^t satisfies ε-DP_ST (i.e., DP under spatiotemporal correlations) if, for all R_i, i ∈ [1, n], it holds that PL(R_i, M^t) ≤ ε.

4.3 Quantifying Privacy Leakage

The primary challenge of protecting against realistic adversaries is to finely quantify the privacy leakage, i.e., Equation (4). Because of space limitations, we briefly describe a concept for how this can be achieved and present a preliminary result. We analytically divide Equation (4) into two parts, namely, the privacy leakages w.r.t. adversaries of type (1) (in Definition 7) and those of type (2). The corresponding problems of quantification can be translated into inference on the GMRF (for type (1)) and a linear-fractional programming problem (for type (2)). We then analyze the combination of these problems, i.e., the privacy leakage w.r.t. type (3).

The result reveals that the privacy leakage of a DP mechanism w.r.t. realistic adversaries may increase over time, as shown in Example 2.

Example 2. According to the Laplace mechanism, traditionally, adding Lap(1/0.1) noise to published counts can achieve 0.1-DP at each time point. However, in an extreme case, if a user's data at any two successive time points are correlated with probability 1, say, P_a = (1 0; 0 1), then continual data publishing is equivalent to releasing the same data multiple times. Hence, the privacy leakage at each time point will accumulate with respect to previous time points, exhibiting a linear increase (Figure 4(a)). In another extreme case, if there is no temporal correlation, or a uniform correlation such as P_c = (0 1; 0 1), then the privacy leakage at each time point does not increase, as shown in Figure 4(c). Figure 4(b) depicts the privacy leakage induced by P_b = (0.8 0.2; 0 1), which can be finely calculated using our algorithm.

[Figure 4: Privacy leakage may increase over time. Temporal privacy leakage (TPL) vs. t = 1, ..., 10 for (a) strong, (b) moderate, and (c) no temporal correlation.]
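The two extreme cases in Example 2 follow from a simple bound rather than from the full quantification algorithm (which is needed for intermediate matrices such as P_b). Under perfect correlation (P_a), every release re-exposes the same underlying value, so the per-release worst-case losses of ε add up to tε; with no temporal correlation (P_c), releases at other timestamps carry no information about the current location, so the leakage stays at ε. The sketch below is our back-of-the-envelope bookkeeping of that bound, not the paper's algorithm.

```python
def cumulative_leakage_bound(t, epsilon, perfectly_correlated):
    """Worst-case temporal privacy leakage after t releases of Lap(1/epsilon)
    noise in the two extreme cases of Example 2 (illustrative bound only).

    perfectly_correlated=True  -> P_a-like case: per-release losses add up.
    perfectly_correlated=False -> P_c-like case: leakage stays at epsilon.
    """
    return t * epsilon if perfectly_correlated else epsilon

epsilon = 0.1
for t in (1, 2, 3, 10):
    print(t, cumulative_leakage_bound(t, epsilon, perfectly_correlated=True))
# grows linearly in t (cf. Figure 4(a)); with no correlation it stays at 0.1 (Figure 4(c))
```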
5. CONCLUSION AND FUTURE WORK

In this research, we addressed two fundamental limitations of differential privacy by developing rigorous and flexible privacy models for the use of personal spatiotemporal data. An interesting future direction of research will be to extend our privacy models to other data types.

Acknowledgments

We appreciate the comments of the anonymous reviewers. This research is partially supported by the Center of Innovation Program of the Japan Science and Technology Agency, JST.

6. REFERENCES

[1] Benjamin C. M. Fung, Ke Wang, Rui Chen, and Philip S. Yu. Privacy-preserving data publishing: A survey of recent developments. ACM Comput. Surv., 42(4):14:1–14:53, 2010.
[2] Latanya Sweeney. k-anonymity: A model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl.-Based Syst., 10(5):557–570, 2002.
[3] Lars Backstrom, Cynthia Dwork, and Jon Kleinberg. Wherefore art thou r3579x?: Anonymized social networks, hidden patterns, and structural steganography. In WWW, pages 181–190, 2007.
[4] A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In S&P '08, pages 111–125, 2008.
[5] Yves-Alexandre de Montjoye, César A. Hidalgo, Michel Verleysen, and Vincent D. Blondel. Unique in the crowd: The privacy bounds of human mobility. Sci. Rep., 3, 2013.
[6] Yves-Alexandre de Montjoye, Laura Radaelli, Vivek Kumar Singh, and Alex "Sandy" Pentland. Unique in the shopping mall: On the reidentifiability of credit card metadata. Science, 347(6221):536–539, 2015.
[7] Yan Michalevsky, Gabi Nakibly, Gunaa Arumugam Veerapandian, and Dan Boneh. PowerSpy: Location tracking using mobile device power analysis. In SEC, pages 785–800, 2015.
[8] Cynthia Dwork. Differential privacy. In ICALP, pages 1–12, 2006.
[9] Ashwin Machanavajjhala, Daniel Kifer, Johannes Gehrke, and Muthuramakrishnan Venkitasubramaniam. ℓ-diversity: Privacy beyond k-anonymity. ACM Transactions on Knowledge Discovery from Data, 1(1):3–es, 2007.
[10] Ninghui Li, Tiancheng Li, and S. Venkatasubramanian. t-closeness: Privacy beyond k-anonymity and l-diversity. In ICDE, pages 106–115, 2007.
[11] Xiaokui Xiao and Yufei Tao. m-invariance: Towards privacy preserving re-publication of dynamic datasets. In SIGMOD, pages 689–700, 2007.
[12] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In TCC, pages 265–284, 2006.
[13] Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In FOCS, pages 94–103, 2007.
[14] Yang Cao and Masatoshi Yoshikawa. Differentially private real-time data publishing over infinite trajectory streams. IEICE Trans. Inf. & Syst., E99-D(1):163–175, 2016.
[15] Georgios Kellaris, Stavros Papadopoulos, Xiaokui Xiao, and Dimitris Papadias. Differentially private event sequences over infinite streams. PVLDB, 7(12):1155–1166, 2014.
[16] Liyue Fan, Li Xiong, and Vaidy Sunderam. FAST: Differentially private real-time aggregate monitor with filtering and adaptive sampling. In SIGMOD, pages 1065–1068, 2013.
[17] Daniel Kifer and Ashwin Machanavajjhala. Pufferfish: A framework for mathematical privacy definitions. ACM Trans. Database Syst., 39(1):3:1–3:36, 2014.