=Paper=
{{Paper
|id=Vol-3318/short3
|storemode=property
|title=Privacy Amplification for Episodic Training Methods
|pdfUrl=https://ceur-ws.org/Vol-3318/short3.pdf
|volume=Vol-3318
|authors=Vandy Tombs,Olivera Kotevska,Steven Young
|dblpUrl=https://dblp.org/rec/conf/cikm/TombsKY22
}}
==Privacy Amplification for Episodic Training Methods==
Vandy Tombs, Olivera Kotevska and Steven Young
Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN 37830

This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The publisher acknowledges the US government license to provide public access under the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). Research sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy.

Abstract

It has been shown that differential privacy bounds improve when subsampling within a randomized mechanism. Episodic training, utilized in many standard machine learning techniques, uses a multistage subsampling procedure that has not previously been analyzed for privacy bound amplification. In this paper, we focus on improving the calculation of privacy bounds in episodic training by thoroughly analyzing the privacy amplification due to subsampling under a multistage subsampling procedure. The newly developed bound can be incorporated into existing privacy accounting methods.

1. Introduction

As more data is utilized by algorithms and machine learning techniques, rigorously maintaining the privacy of this data has become important. Cyber security, health, and census data collection are all examples of fields seeing increased scrutiny over ensuring the privacy of data, and it is well known that merely anonymizing data by removing features such as names is not sufficient to guarantee privacy, owing to vulnerabilities such as re-identification attacks, especially when an adversary has access to auxiliary knowledge or data (see e.g. [1, 2]).

Differential privacy, first introduced by Dwork, is one technical definition of privacy that has been studied widely in the literature [3, 4]. This definition provides rigorous guarantees for the privacy of data utilized by an algorithm and has several desirable properties, such as robustness to post-processing and strong composition theorems.

Machine learning practitioners initially integrated differential privacy by naively applying these composition theorems, assuming that the algorithm accessed the entire training set on each step of training. Abadi et al. [5] observed that the data is subsampled into batches, so only a subset of the data is utilized at each step of training. This allowed for improved privacy bounds; however, they assumed that batches were created using Poisson sampling. Later authors showed improved bounds for batches created using simple random sampling without replacement [6]. Most recently, Balle et al. [7] provided a fully unified theory for determining the privacy amplification due to subsampling, together with a complete analysis of Poisson subsampling and of simple random sampling both with and without replacement.

The subsampling methods analyzed previously cover many of the subsampling methods utilized in machine learning; however, they do not capture the batches formed by algorithms that use episodic training. Episodic training methods are utilized by a variety of machine learning algorithms, such as meta-learning (e.g., [8, 9]) and metric learning (e.g., [10, 11]) algorithms. Domain generalization algorithms have also frequently utilized episodic training [12].

In this paper, we analyze the privacy amplification due to the subsampling method utilized in an episodic training regime. Specifically, we observe that forming batches in episodic training is a multistage subsampling method, and we provide a complete analysis of the improved differential privacy bounds obtained when a mechanism is applied to a sample drawn using multistage subsampling. The resulting theorem can easily be applied to episodic training methods and integrated with privacy accounting methods such as the moments accountant [5]. The bound can also be utilized by practitioners in other domains that use multistage subsampling within their algorithms.
2. Background and Related Works

2.1. Multistage Subsampling

In a multistage sampling procedure, the universe from which samples are drawn is partitioned. These partitions may contain the examples we are ultimately interested in sampling, or they may contain one or several further levels of partitions. The subsampling procedure samples partitions at each level until examples are sampled. For example, if we are interested in the demographics of students at a school, we could partition students by teacher, sample some number of teachers, and then sample students from each sampled teacher.

To see that episodic training is a multistage subsampling procedure, consider how training batches are formed in Algorithm 2 of [8]. In that work, a subset of tasks is sampled from a collection of tasks, and then examples are sampled from each selected task and provided to the training algorithm. This is a 2-stage sampling procedure since the training data is partitioned into two levels: tasks and examples. In multistage subsampling, the first level of partitions are called the primary sampling units and the final level the ultimate sampling units; this final level contains the examples we are ultimately interested in sampling. For more details on multistage subsampling, see e.g. [13].

2.2. Differential Privacy

Since our analysis utilizes the tools of Balle et al. [7], we introduce the necessary notation and definitions from that work. Let $\mathcal{U}$ be an input space equipped with a binary symmetric relation $\simeq_{\mathcal{U}}$ that describes the concept of neighboring inputs. For our purposes, $\mathcal{U}$ is a universe from which the training data is drawn, and the relation is the add-one/remove-one relation: two training sets are related if they differ by the addition or removal of one element.

Given a randomized algorithm or mechanism $\mathcal{M}$ that maps an input dataset to an element of $\mathcal{P}(Z)$, the set of probability measures on the output space $Z$, we say $\mathcal{M}$ is $(\epsilon, \delta)$-differentially private w.r.t. $\simeq_{\mathcal{U}}$ if for every pair $\mathcal{T} \simeq_{\mathcal{U}} \mathcal{T}'$ and every measurable subset $E \subseteq Z$,

$$\Pr[\mathcal{M}(\mathcal{T}) \in E] \leq e^{\epsilon} \Pr[\mathcal{M}(\mathcal{T}') \in E] + \delta.$$

Utilizing the tools from [7] requires expressing differential privacy in terms of the $\alpha$-divergence

$$D_{\alpha}(\mu \| \mu') := \sup_{E} \big( \mu(E) - \alpha\, \mu'(E) \big)$$

of two probability measures $\mu, \mu' \in \mathcal{P}(Z)$, where $E$ ranges over all measurable subsets of $Z$. Differential privacy can then be stated in terms of the $\alpha$-divergence; specifically, a mechanism $\mathcal{M}$ is $(\epsilon, \delta)$-differentially private if and only if $D_{e^{\epsilon}}(\mathcal{M}(\mathcal{T}) \| \mathcal{M}(\mathcal{T}')) \leq \delta$ for every pair of adjacent datasets $\mathcal{T} \simeq_{\mathcal{U}} \mathcal{T}'$.

We can now define the privacy profile of a mechanism $\mathcal{M}$ as $\delta_{\mathcal{M}}(\epsilon) = \sup_{\mathcal{T} \simeq_{\mathcal{U}} \mathcal{T}'} D_{e^{\epsilon}}(\mathcal{M}(\mathcal{T}) \| \mathcal{M}(\mathcal{T}'))$, which associates with each privacy parameter $\alpha = e^{\epsilon}$ a bound on the $\alpha$-divergence between the results of the mechanism on two adjacent datasets.

Two theorems from [7] are important in our analysis. The first is Advanced Joint Convexity, which we restate in terms of $\alpha = e^{\epsilon}$ since we are interested in applying this theorem to improve the privacy bounds due to multistage subsampling.

Theorem 1. ([7], Advanced Joint Convexity of $D_{\alpha}$) Let $\mu, \mu' \in \mathcal{P}(Z)$ be measures satisfying $\mu = (1-\eta)\mu_0 + \eta\mu_1$ and $\mu' = (1-\eta)\mu_0 + \eta\mu_1'$ for some $\eta, \mu_0, \mu_1, \mu_1'$. Given $\epsilon > 0$, let $\epsilon' = \log(1 + \eta(e^{\epsilon} - 1))$ and $\beta = e^{\epsilon'}/e^{\epsilon}$. Then the following holds:

$$D_{e^{\epsilon'}}(\mu \| \mu') = \eta\, D_{e^{\epsilon}}\big(\mu_1 \,\|\, (1-\beta)\mu_0 + \beta\mu_1'\big).$$

The final theorem provides the concrete privacy amplification that we need for our analysis. Before presenting it, we need to define when two distributions $\mu, \mu' \in \mathcal{P}(Z)$ are $d_{\simeq}$-compatible. Let $\pi$ be a coupling of $\mu, \mu'$ and define $d_{\simeq}(y, y') = d_{\simeq}(y, \mathrm{supp}(\mu'))$ for $(y, y') \in \mathrm{supp}(\pi)$, where the distance between a point $y$ and $\mathrm{supp}(\mu')$ is defined to be the distance between $y$ and the closest point in $\mathrm{supp}(\mu')$.

Theorem 2. Let $C(\mu, \mu')$ be the set of all couplings between $\mu$ and $\mu'$, and for $i \geq 1$ let $\Delta_i = \{y \in \mathrm{supp}(\mu) : d_{\simeq}(y, \mathrm{supp}(\mu')) = i\}$. If $\mu$ and $\mu'$ are $d_{\simeq}$-compatible, then the following holds:

$$\min_{\pi \in C(\mu, \mu')} \sum_{y, y'} \pi_{y, y'}\, \delta_{\mathcal{M}, d_{\simeq}(y, y')}(\epsilon) = \sum_{i \geq 1} \mu(\Delta_i)\, \delta_{\mathcal{M}, i}(\epsilon),$$

where $\delta_{\mathcal{M}, i}$ denotes the privacy profile of $\mathcal{M}$ for inputs at distance $i$ (the group privacy profile).

We are now equipped to begin an analysis of the privacy amplification due to multistage subsampling.
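To make the amplification in Theorem 1 concrete before proceeding, the following minimal Python sketch (ours, not from the paper) evaluates the amplified parameter $\epsilon' = \log(1 + \eta(e^{\epsilon} - 1))$ for a base guarantee $\epsilon = 1$ at several mixing weights $\eta$:

    import math

    def amplified_epsilon(eps: float, eta: float) -> float:
        """eps' = log(1 + eta * (exp(eps) - 1)), the amplified privacy
        parameter from advanced joint convexity (Theorem 1 / [7])."""
        return math.log(1.0 + eta * (math.exp(eps) - 1.0))

    # Smaller mixing weights eta give smaller (stronger) eps'.
    for eta in (1.0, 0.5, 0.1, 0.01):
        print(f"eta = {eta:5.2f} -> eps' = {amplified_epsilon(1.0, eta):.4f}")

For small $\eta$, $\epsilon' \approx \eta(e^{\epsilon} - 1)$, recovering the familiar rule of thumb that subsampling scales the privacy parameter roughly linearly in the sampling rate.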
3. Our Approach: Privacy Bounds for Multistage Sampling Analysis

We begin the analysis with an example; through it, we introduce the notation needed for the general analysis.

Example 3.1. Let $\mathcal{U}$ be a universe of 18 examples from which the database or training data is drawn. Suppose we can categorize the data of the universe at 3 different levels, so we will perform a 3-stage sampling. Let

$$\begin{aligned}
\mathcal{U} &= S_1 \cup S_2 \\
&= (S_{11} \cup S_{12} \cup S_{13}) \cup (S_{21} \cup S_{22}) \\
&= \big( \{u_{111}, u_{112}, u_{113}, u_{114}\} \cup \{u_{121}, u_{122}\} \cup \{u_{131}, u_{132}, u_{133}\} \big) \\
&\quad \cup \big( \{u_{211}, u_{212}, u_{213}, u_{214}\} \cup \{u_{221}, u_{222}, u_{223}, u_{224}, u_{225}\} \big).
\end{aligned}$$

In this example, the $S_{i_1}$ for $i_1 \in \{1, 2\}$ are the primary sampling units, the $S_{i_1 i_2}$ are the ultimate sampling units, and the $u_{i_1 i_2 i_3}$ are the examples that would be provided to a training algorithm.

In general, let $\mathcal{U}$ be a universe from which the training data is drawn and suppose a finite number of levels, $n_L$, partition this universe. Define the $S_{i_1}$ to be the primary sampling units and let the $S_{i_1 i_2 \cdots i_{n_L-1}}$ be the sampling units of the $S_{i_1 i_2 \cdots i_{n_L-2}}$ unit. $S_{i_1 i_2 \cdots i_{n_L-1}}$ is an ultimate sampling unit, which contains the examples we are interested in sampling. Note that we require each sampling unit to be of finite size except the ultimate sampling units, which may be infinite. The multistage sampling procedure is described by Algorithm 1 below. Most episodic training procedures use only 2- or 3-stage sampling, but we analyze the general case, which may have applications in other scientific domains (e.g. medical domains) where multistage sampling may have more levels.

Algorithm 1: Multistage Sampling

    Given $n_i$: the number of units to be sampled at each level ($1 \leq i \leq n_L$)
    Set PrevLevel := $\{\mathcal{U}\}$
    Set CurrentLevel := $\emptyset$
    for $i \in \{1, \ldots, n_L\}$ do
        for $S \in$ PrevLevel do
            sample $n_i$ elements without replacement from $S$
            add the sampled elements to CurrentLevel
        end
        PrevLevel := CurrentLevel
        CurrentLevel := $\emptyset$
    end
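The pseudocode above translates directly into the following runnable Python sketch (our rendering; the nested-dictionary encoding of the universe and the function name are illustrative assumptions, not the paper's notation). Stage $i$ draws $n_i$ sub-units without replacement from every unit retained at the previous stage:

    import random

    def multistage_sample(universe, counts, rng=random):
        """Algorithm 1 (sketch): `universe` is a nested dict whose leaves
        are lists of examples; counts gives n_1, ..., n_{n_L}, the number
        of units drawn without replacement at each level."""
        prev_level = [universe]          # PrevLevel := {U}
        for n_i in counts:
            current_level = []
            for unit in prev_level:
                children = list(unit.values()) if isinstance(unit, dict) else unit
                current_level.extend(rng.sample(children, n_i))
            prev_level = current_level   # descend one level
        return prev_level                # the sampled ultimate-level examples

    # One possible encoding of the 18-example, 3-level universe of Example 3.1.
    U = {"S1": {"S11": ["u111", "u112", "u113", "u114"],
                "S12": ["u121", "u122"],
                "S13": ["u131", "u132", "u133"]},
         "S2": {"S21": ["u211", "u212", "u213", "u214"],
                "S22": ["u221", "u222", "u223", "u224", "u225"]}}

    # One episode: 1 primary unit, then 2 ultimate units, then 2 examples each.
    print(multistage_sample(U, counts=[1, 2, 2]))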
Now, let $\mathcal{T} \subseteq \mathcal{U}$ be the training data or database we are analyzing. We require that the training data contain at least one element from each sampling unit described above. Thus the ultimate sampling units of the training data, $\mathcal{T}_{i_1 i_2 \cdots i_{n_L-1}} \subseteq S_{i_1 i_2 \cdots i_{n_L-1}}$, must be non-empty finite subsets of the ultimate sampling units with at least $n_{n_L}$ elements (i.e. at least the number of units that will be sampled from the ultimate sampling units). All other sampling units defined for the universe $\mathcal{U}$ remain the same for the training set.

We want to analyze the privacy bound of algorithms that use a multistage subsampling procedure on $\mathcal{T}$. To do this, we apply the theorems from [7] and analyze this sampling procedure under the add-one/remove-one relation. We begin by defining a probability measure for this sampling procedure, simply by setting

$$\mu(t_{i_1 i_2 \cdots i_{n_L}}) = \frac{\prod_{i=1}^{n_L} n_i}{|S_{i_1}|\, |S_{i_1 i_2}| \cdots |S_{i_1 i_2 \cdots i_{n_L-1}}|}$$

where $t_{i_1 i_2 \cdots i_{n_L}}$ is in the ultimate unit $S_{i_1 i_2 \cdots i_{n_L-1}}$.

Now consider $\mathcal{T}'$ created by removing one element from $\mathcal{T}$, say, without loss of generality, $t_{i_1 i_2 \cdots i_{n_L-1} 1}$ for some $i_1, i_2, \ldots, i_{n_L-1}$. The probability measure $\mu'$ for sampling from $\mathcal{T}'$ can be defined similarly. We wish to compute the total variation distance between these two measures so that we can apply the Advanced Coupling Theorem from [7]. We just need to compute

$$d_{TV}(\mu, \mu') = 1 - \sum_{u} \min\big(\mu(u), \mu'(u)\big).$$

Note we can easily extend our probability measures $\mu, \mu'$ to the entire universe by setting the inclusion probability to 0 for any element not in $\mathcal{T}$ or $\mathcal{T}'$. For all elements $t \in \mathcal{T}' \setminus S_{i_1 i_2 \cdots i_{n_L-1}}$, we have $\min(\mu(t), \mu'(t)) = \mu(t) = \mu'(t)$. Since $t_{i_1 i_2 \cdots i_{n_L-1} 1} \notin \mathcal{T}'$, we also have $\min(\mu(t_{i_1 i_2 \cdots i_{n_L-1} 1}), \mu'(t_{i_1 i_2 \cdots i_{n_L-1} 1})) = 0$. So we just need to consider the elements of the ultimate unit from which we removed an element. Since we removed an element from this unit, we have $\mu'(t) > \mu(t)$, because $\mathcal{T}'_{i_1 i_2 \cdots i_{n_L-1}}$ (the ultimate unit missing an element in $\mathcal{T}'$) has fewer elements than $\mathcal{T}_{i_1 i_2 \cdots i_{n_L-1}}$. Therefore, for all $t_{i_1 i_2 \cdots i_{n_L-1} j} \in \mathcal{T}'_{i_1 i_2 \cdots i_{n_L-1}}$ with $j \neq 1$, we have $\mu(t_{i_1 i_2 \cdots i_{n_L-1} j}) < \mu'(t_{i_1 i_2 \cdots i_{n_L-1} j})$, where

$$\mu(t_{i_1 i_2 \cdots i_{n_L-1} j}) = \frac{\prod_{i=1}^{n_L} n_i}{|S_{i_1}|\, |S_{i_1 i_2}| \cdots |S_{i_1 i_2 \cdots i_{n_L-1}}|}, \qquad
\mu'(t_{i_1 i_2 \cdots i_{n_L-1} j}) = \frac{\prod_{i=1}^{n_L} n_i}{|S_{i_1}|\, |S_{i_1 i_2}| \cdots |S'_{i_1 i_2 \cdots i_{n_L-1}}|}.$$

Thus

$$\sum_{u \in \mathcal{U}} \min\big(\mu(u), \mu'(u)\big) = \sum_{t \in \mathcal{T}'} \mu(t) = 1 - \mu(t_{i_1 i_2 \cdots i_{n_L-1} 1}).$$

Hence the total variation distance is just the inclusion probability of the element we removed. Determining the total variation distance when adding an element from $\mathcal{U}$ to $\mathcal{T}$ follows by a similar argument.

We can now provide an amplified privacy bound for multistage subsampling.

Theorem 3. Let $\mathcal{M}'$ be the subsampled mechanism obtained by applying a base mechanism $\mathcal{M}$ to a sample of $\mathcal{T}$ drawn as described by Algorithm 1, and let $j_1 j_2 \ldots j_{n_L-1}$ be the index of the penultimate sampling unit that attains

$$\min_{i_1, i_2, \ldots, i_{n_L-1}} \big( |S_{i_1}|\, |S_{i_1 i_2}| \cdots |S_{i_1 i_2 \cdots i_{n_L-1}}| \big).$$

Then, for any $\epsilon \geq 0$, we have $\delta_{\mathcal{M}'}(\epsilon') \leq \eta\, \delta_{\mathcal{M}}(\epsilon)$ for

$$\eta = \frac{\prod_{i=1}^{n_L} n_i}{|S_{j_1}|\, |S_{j_1 j_2}| \cdots |S_{j_1 j_2 \cdots j_{n_L-1}}|}$$

and $\epsilon' = \log(1 + \eta(e^{\epsilon} - 1))$ under the add-one/remove-one relation.
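Before completing the proof, here is a small numeric illustration of Theorem 3 (our sketch; the per-path unit sizes are read off Example 3.1 and the base $\epsilon$ is hypothetical). Per the derivation above, $\eta$ is the largest inclusion probability of any single element, attained along the path minimizing the product of unit sizes:

    import math

    def inclusion_probability(counts, sizes):
        """Product over stages of n_i / (units available at that stage
        along the element's path): the probability that Algorithm 1
        includes a fixed element, per the derivation above."""
        p = 1.0
        for n_i, size in zip(counts, sizes):
            p *= n_i / size
        return p

    # Example 3.1 with sampling counts n = (1, 2, 2). Each path lists
    # (primary units, sub-units of S_{i1}, examples in S_{i1 i2}).
    paths = {"S11": (2, 3, 4), "S12": (2, 3, 2), "S13": (2, 3, 3),
             "S21": (2, 2, 4), "S22": (2, 2, 5)}
    counts = (1, 2, 2)

    eta = max(inclusion_probability(counts, s) for s in paths.values())
    eps = 1.0
    eps_amplified = math.log(1.0 + eta * (math.exp(eps) - 1.0))
    print(f"eta = {eta:.4f}, eps' = {eps_amplified:.4f}")

The worst case is the path through $S_{12}$, the smallest ultimate unit, giving $\eta = 1/3$ and $\epsilon' \approx 0.45$ for $\epsilon = 1$.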
To fully complete the proof, let $\mathcal{T}, \mathcal{T}'$ be training sets drawn from $\mathcal{U}$ with $\mathcal{T} \simeq \mathcal{T}'$ under the add-one/remove-one relation $\simeq$, and let $S_{\gamma}(\mathcal{T})$ denote the subsampling mechanism described by Algorithm 1 for $\gamma = d_{TV}(\mu, \mu')$. Let $\mathcal{T}_0 = \mathcal{T} \cap \mathcal{T}'$; then by definition of $\simeq$, $\mathcal{T}_0 = \mathcal{T}$ or $\mathcal{T}_0 = \mathcal{T}'$. Let $\mu_0 = S_{\gamma}(\mathcal{T}_0)$, $\mu = S_{\gamma}(\mathcal{T})$, and $\mu' = S_{\gamma}(\mathcal{T}')$. Then the decompositions of $\mu$ and $\mu'$ induced by their maximal coupling satisfy $\mu_1 = \mu_0$ when $\mathcal{T}_0 = \mathcal{T}$, or $\mu_1' = \mu_0$ when $\mathcal{T}_0 = \mathcal{T}'$. We only need to consider $\mathcal{T}_0 = \mathcal{T}'$, since this is when the maximum is obtained in applying advanced joint convexity. Finally, we note that one can easily create a $d_{\simeq}$-compatible pair according to the definition provided in [7] by first sampling $y$ from $\mu$ and building $y'$ by adding $v$ (which may be empty) to $y$. Thus, for each dataset pair, by Theorem 7 of [7] we have $\delta_{\mathcal{M}'}(\epsilon') \leq \gamma\, \delta_{\mathcal{M}}(\epsilon)$. To obtain a bound over all possible training set pairs, we take $\eta = \max_{(\mathcal{T}, \mathcal{T}')} \gamma_{\mathcal{T} \simeq \mathcal{T}'}$. This maximum occurs exactly when we remove an element from the penultimate unit with index $j_1 j_2 \cdots j_{n_L-1}$, which completes the proof.

We briefly mention how one might incorporate this new bound into a privacy accounting method. Many accounting methods, like the moments accountant [5], use the moment generating function in conjunction with the Gaussian mechanism to calculate privacy bounds while a machine learning algorithm is training. Using Theorem 4 of [7] together with our new bound, one can easily derive a subsampled Gaussian mechanism that can be utilized in algorithms like those described in [5, 14].
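As one way of seeing what such accounting looks like end to end (our illustration with hypothetical step parameters; we use loose basic composition rather than the tighter moments-accountant analysis of [5, 14]), each training step first applies the Theorem 3 amplification and the per-step guarantees are then composed across steps:

    import math

    def amplified_step(eps, delta, eta):
        """Theorem 3: a base (eps, delta) step, subsampled with worst-case
        inclusion probability eta, satisfies (eps', eta * delta)."""
        eps_p = math.log(1.0 + eta * (math.exp(eps) - 1.0))
        return eps_p, eta * delta

    def basic_composition(eps_step, delta_step, steps):
        """Always-valid (but loose) composition; an RDP/moments accountant
        would track the subsampled Gaussian much more tightly."""
        return steps * eps_step, steps * delta_step

    # Hypothetical run: Gaussian-mechanism steps at (2.0, 1e-6), multistage
    # inclusion probability eta = 0.01, 1000 training steps.
    eps_p, delta_p = amplified_step(2.0, 1e-6, 0.01)
    print(basic_composition(eps_p, delta_p, 1000))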
4. Conclusion

This paper completely analyzes the privacy amplification due to multistage subsampling. This provides the correct privacy bounds for any algorithm that utilizes multistage subsampling, such as machine learning algorithms that use episodic training. Our future goal is to perform experiments to better understand privacy in machine learning algorithms that use episodic training, such as meta-learning algorithms. We hope our presented approach and discussion will prove useful to other researchers wanting to apply privacy bounds to multistage sampling in other studies and applications.

References

[1] A. Narayanan, V. Shmatikov, Robust de-anonymization of large sparse datasets, in: 2008 IEEE Symposium on Security and Privacy (SP 2008), 2008, pp. 111–125. doi:10.1109/SP.2008.33.

[2] L. Rocher, J. M. Hendrickx, Y.-A. de Montjoye, Estimating the success of re-identifications in incomplete datasets using generative models, Nature Communications 10 (2019) 3069. URL: https://doi.org/10.1038/s41467-019-10933-3. doi:10.1038/s41467-019-10933-3.

[3] C. Dwork, A. Roth, The algorithmic foundations of differential privacy, Foundations and Trends in Theoretical Computer Science 9 (2014) 211–407. URL: https://doi.org/10.1561/0400000042. doi:10.1561/0400000042.

[4] C. Dwork, Differential privacy: A survey of results, in: M. Agrawal, D. Du, Z. Duan, A. Li (Eds.), Theory and Applications of Models of Computation, Springer Berlin Heidelberg, Berlin, Heidelberg, 2008, pp. 1–19.

[5] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, L. Zhang, Deep learning with differential privacy, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS '16, Association for Computing Machinery, New York, NY, USA, 2016, pp. 308–318. URL: https://doi.org/10.1145/2976749.2978318. doi:10.1145/2976749.2978318.

[6] Y.-X. Wang, B. Balle, S. Kasiviswanathan, Subsampled Rényi differential privacy and analytical moments accountant, Journal of Privacy and Confidentiality 10 (2021). URL: https://journalprivacyconfidentiality.org/index.php/jpc/article/view/723. doi:10.29012/jpc.723.

[7] B. Balle, G. Barthe, M. Gaboardi, Privacy amplification by subsampling: Tight analyses via couplings and divergences, in: Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS '18, Curran Associates Inc., Red Hook, NY, USA, 2018, pp. 6280–6290.

[8] C. Finn, P. Abbeel, S. Levine, Model-agnostic meta-learning for fast adaptation of deep networks, in: D. Precup, Y. W. Teh (Eds.), Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, PMLR, 2017, pp. 1126–1135. URL: https://proceedings.mlr.press/v70/finn17a.html.

[9] S. Ravi, H. Larochelle, Optimization as a model for few-shot learning, in: ICLR, 2017.

[10] O. Vinyals, C. Blundell, T. Lillicrap, K. Kavukcuoglu, D. Wierstra, Matching networks for one shot learning, in: D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, R. Garnett (Eds.), Advances in Neural Information Processing Systems, volume 29, Curran Associates, Inc., 2016. URL: https://proceedings.neurips.cc/paper/2016/file/90e1357833654983612fb05e3ec9148c-Paper.pdf.

[11] J. Snell, K. Swersky, R. Zemel, Prototypical networks for few-shot learning, in: I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Eds.), Advances in Neural Information Processing Systems, volume 30, Curran Associates, Inc., 2017. URL: https://proceedings.neurips.cc/paper/2017/file/cb8da6767461f2812ae4290eac7cbc42-Paper.pdf.

[12] D. Li, J. Zhang, Y. Yang, C. Liu, Y.-Z. Song, T. Hospedales, Episodic training for domain generalization, in: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1446–1455. doi:10.1109/ICCV.2019.00153.

[13] C.-E. Särndal, B. Swensson, J. Wretman, Model Assisted Survey Sampling, Springer-Verlag, 2003, pp. 124–162.

[14] I. Mironov, K. Talwar, L. Zhang, Rényi differential privacy of the sampled Gaussian mechanism, 2019. URL: http://arxiv.org/abs/1908.10530. arXiv:1908.10530.