Metric Pseudoentropy: Characterizations and Applications

Maciej Skorski
Cryptology and Data Security Group, University of Warsaw
maciej.skorski@gmail.com

Abstract. Metric entropy is a computational variant of entropy, often used as a convenient substitute for HILL entropy, which is the slightly stronger, standard notion of entropy in cryptographic applications. In this paper we develop a general method to characterize metric-type computational variants of entropy, in a way that depends only on the properties of a chosen class of test functions (adversaries). As a consequence, we obtain an elegant geometric interpretation of metric entropy. We apply these characterizations to simplify and modularize the proofs of some important results, in particular: (a) the computational dense model theorem, (b) the derivation of the improved version of the Leftover Hash Lemma, and (c) the equivalence between unpredictability entropy and HILL entropy for short strings.

1 Introduction

1.1 Computational Entropy

Entropy. Entropy, as a measure of uncertainty or randomness, is a fundamental notion in information theory. The best-known measure of entropy is Shannon entropy [15]. For cryptographic applications such as extracting randomness, it is more convenient to work with the so-called min-entropy, which bounds the probability that a computationally unbounded adversary guesses a value sampled according to a given distribution. A slightly weaker but also very useful notion, especially in the context of hashing, is collision entropy, which upper-bounds the probability that two independent samples of a given distribution collide.

Defining computational variants of entropy. Computational analogues of entropy can be defined in different ways. In any case, we need to formalize that a distribution has, from a computational point of view, the same or almost the same properties as a distribution having "true" information-theoretic entropy. This might be based on the hardness of compressing and decompressing, the hardness of prediction, or the hardness of distinguishing. In this paper we follow the last approach, which is the most widely used. A good survey of different entropy notions and their properties can be found in [3] and [12]. We stress that, contrary to the information-theoretic case, for computational entropy it is not only the amount of entropy that matters: its quality is important as well.

Computational Indistinguishability. Indistinguishability is a fundamental concept in computational complexity and cryptography. For two distributions $X, Y$ taking values in the same space, a class $\mathcal{D}$ of $[0,1]$-valued functions (referred to as the "attacker class") and a parameter $\epsilon$ (referred to as the "distinguishing advantage"), we say that $X$ and $Y$ are $(\mathcal{D},\epsilon)$-indistinguishable if for all $D\in\mathcal{D}$ we have $|\mathbb{E}\,D(X) - \mathbb{E}\,D(Y)| \leq \epsilon$. An attacker $D$ distinguishes $X$ and $Y$ if $\mathbb{E}\,D(X) - \mathbb{E}\,D(Y) > 0$ or $\mathbb{E}\,D(X) - \mathbb{E}\,D(Y) < 0$, and the farther from $0$ this difference is, the better the "advantage" he achieves. Sometimes we want to define indistinguishability between two sets $\mathcal{X}$ and $\mathcal{Y}$ of probability distributions. We can formalize this by saying that no single adversary $D$ achieves advantage bigger than $\epsilon$ for every pair $(X,Y)$ where $X$ comes from $\mathcal{X}$ and $Y$ comes from $\mathcal{Y}$. Since the expectation $\mathbb{E}\,D(X)$ can be thought of as the scalar product of the vectors representing $D$ and the distribution of $X$, the concept of indistinguishability is exactly the idea of separating hyperplanes.
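To make the vector picture concrete, here is a minimal sketch (Python; the distributions and the attacker class are hypothetical toy data) that computes the advantage $\mathbb{E}\,D(X)-\mathbb{E}\,D(Y)$ as the inner product $\langle D, P_X - P_Y\rangle$ and tests $(\mathcal{D},\epsilon)$-indistinguishability over a small class of boolean test functions.

```python
import numpy as np

def advantage(D, pX, pY):
    """Advantage of distinguisher D: E D(X) - E D(Y) = <D, pX - pY>."""
    return float(np.dot(D, pX - pY))

def indistinguishable(Dclass, pX, pY, eps):
    """(D, eps)-indistinguishability: |E D(X) - E D(Y)| <= eps for all D in the class."""
    return all(abs(advantage(D, pX, pY)) <= eps for D in Dclass)

# Toy example over a 4-point space; the values are illustrative only.
pX = np.array([0.40, 0.30, 0.20, 0.10])   # distribution of X
pY = np.array([0.25, 0.25, 0.25, 0.25])   # uniform distribution
# Hypothetical attacker class: all 16 boolean functions on 4 points.
Dclass = [np.array([(i >> j) & 1 for j in range(4)], dtype=float)
          for i in range(16)]
print(max(abs(advantage(D, pX, pY)) for D in Dclass))  # 0.20
print(indistinguishable(Dclass, pX, pY, eps=0.3))      # True
```

Maximizing the advantage over all boolean tests recovers the statistical distance, the information-theoretic limit of distinguishing; bounded attacker classes can only do worse.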
Computational Entropy. Having formalized the concept of "computational closeness", one can define the "computational" entropy (also called pseudoentropy) of a distribution $X$ in one of the following ways:

(a) (stronger) $X$ is computationally indistinguishable from a single distribution having the required amount of information-theoretic entropy (min-entropy, Shannon entropy, etc.);
(b) (weaker) $X$ is computationally indistinguishable from the set of all distributions having the required amount of information-theoretic entropy.

Both approaches turn out to be useful. Setting the underlying information-theoretic entropy measure to be min-entropy, in case (a) we obtain the notion of HILL entropy [9], which directly generalizes the notion of pseudorandomness, whereas in case (b) we get the notion of the so-called metric entropy [3]. Roughly speaking, with HILL entropy one generalizes most information-theoretic facts about entropy into the computational setting. Metric entropy is commonly thought of as a less intuitive and less well understood notion than HILL entropy. Quite surprisingly, it has proven to be technically more convenient in many problems. The typical approach is to work with metric entropy and then convert it to HILL entropy (which is possible with some loss in quality [3]). For example, the use of metric entropy simplifies and improves the proof of the computational variant of the dense model theorem [2], applicable in leakage-resilient cryptography [7]. Notions of pseudoentropy have also found important applications in general complexity theory; for example, in [18] a HILL-like variant of Shannon entropy is used to simplify the construction of a PRG from a one-way function. These two examples show that the notion of pseudoentropy is a key ingredient of important, even breakthrough, results and as such is worth studying.

Worst Case Distributions. In problems involving computational indistinguishability it is often convenient to know the distribution which makes the attacker's advantage maximal. This distribution is typically subject to some entropy restrictions. In particular, one might ask the following question:

  Given $D$ and $X$, what is the best (minimal) attacker advantage $|\Delta_D| = |\mathbb{E}\,D(X) - \mathbb{E}\,D(Y)|$ over all distributions $Y$ of entropy at least $k$?

An answer to this question yields a bound on how (computationally) close $X$ is to the set of all distributions of entropy $k$. Such problems arise naturally wherever one uses HILL and metric entropy, see for instance [3,5,18].

1.2 Our Results

Summary of our contribution. As mentioned, the idea of characterizing the "worst case" distribution which optimizes the attacker's advantage is very common, though not always explicitly stated [2,3,5,14]. In this paper we give a uniform treatment of this idea and use it to obtain characterizations of pseudoentropy and other interesting corollaries.

Characterizing Metric Pseudoentropy via Optimizing the Attacker's Advantage. Using standard constrained optimization techniques, we develop a general method to characterize metric-type pseudoentropy. A characterization is based on explicitly calculating the distribution which minimizes the attacker's advantage, subject to entropy constraints. These characterizations could be used in studying properties of variants of pseudoentropy based on entropy notions other than min-entropy.
In particular, they could be applied in studying the problem of comparing the amount of metric pseudoentropy against deterministic and randomized adversaries, or in verifying the so-called "chain rule". We also unify the definitions of metric and HILL entropy in a nice geometric way.

Applications: the power of pseudoentropy characterizations. Our technique leads to interesting corollaries beyond the basic properties of pseudoentropy. From the characterization of metric pseudoentropy we immediately obtain the computational Dense Model Theorem [2,7,14]. Extending our characterization to the conditional case, where side information is available to the attacker, we reprove the equivalence between the unpredictability-based and the indistinguishability-based definitions of pseudoentropy for short strings [18]. Finally, from the characterization of collision pseudoentropy we derive the improved Leftover Hash Lemma [1]. Our results show that metric entropy is a powerful tool which deserves systematic study.

2 Preliminaries

Entropy notions. The min-entropy of a distribution $X$ equals $H_\infty(X) = -\log(\max_x \Pr[X=x])$. The collision entropy of $X$ is $H_2(X) = -\log\big(\sum_x \Pr[X=x]^2\big)$. If there is side information $Z$, we define the average conditional min-entropy [6] of $X$ given $Z$ by $\widetilde{H}_\infty(X|Z) = -\log\big(\mathbb{E}_{z\leftarrow Z}\max_x \Pr[X=x\,|\,Z=z]\big)$.

Computational advantage. The advantage of an attacker $D$ in distinguishing random variables $X$ and $Y$, which take values in the same space, is defined to be $\Delta_D(X;Y) = \mathbb{E}\,D(X) - \mathbb{E}\,D(Y)$.

Computational Entropy. There are many ways to define computational analogues of entropy. We follow the most popular approach, which is based on the concept of computational indistinguishability.

Definition 1 (HILL Pseudoentropy [9]). Let $X$ be a distribution with the following property: there exists $Y$ of min-entropy at least $k$ such that for all circuits $D$ of size at most $s$ we have $|\Delta_D(X;Y)| \leq \epsilon$. Then we say that $X$ has $k$ bits of HILL min-entropy of quality $(s,\epsilon)$ and write $H_\infty^{\mathrm{HILL},(s,\epsilon)}(X) \geq k$.

Remark 1 (HILL entropy against different circuit classes). It is known that for HILL entropy all kinds of circuits, deterministic boolean, deterministic real-valued and randomized boolean, are equivalent (for the same size $s$). That is why we can abbreviate the notation and omit declaring the circuit type in Definition 1.

Definition 2 (Metric Pseudoentropy [3]). Let $X$ be a distribution with the following property: for every deterministic boolean (respectively: deterministic real-valued or randomized boolean) circuit $D$ of size at most $s$ there exists $Y$ of min-entropy at least $k$ such that $|\Delta_D(X;Y)| \leq \epsilon$. Then we say that $X$ has $k$ bits of deterministic boolean (respectively: deterministic real-valued or randomized boolean) metric min-entropy of quality $(s,\epsilon)$ and write $H_\infty^{\mathrm{M,det}\{0,1\},(s,\epsilon)}(X) \geq k$ (respectively $H_\infty^{\mathrm{M,det}[0,1],(s,\epsilon)}(X) \geq k$ and $H_\infty^{\mathrm{M,rand}\{0,1\},(s,\epsilon)}(X) \geq k$).

Definitions of HILL and metric entropy for entropy notions other than min-entropy, for instance collision entropy, are obtained by replacing min-entropy with the chosen notion in Definition 1 and Definition 2.

Remark 2 (Metric entropy against different circuit classes). For metric min-entropy it does not matter whether the deterministic circuits are boolean or real-valued (see [12] and the errata of [3]). However, this is not true in the conditional case and does not extend to other entropy notions.
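For concreteness, a minimal sketch (Python; the probability tables are hypothetical toy data) of the three information-theoretic notions above, computed directly from probability vectors.

```python
import numpy as np

def min_entropy(p):
    """H_inf(X) = -log2(max_x Pr[X=x])."""
    return -np.log2(np.max(p))

def collision_entropy(p):
    """H_2(X) = -log2(sum_x Pr[X=x]^2)."""
    return -np.log2(np.sum(p ** 2))

def avg_cond_min_entropy(joint):
    """Average conditional min-entropy of X given Z for a table joint[z, x] = Pr[Z=z, X=x]:
       -log2( E_{z<-Z} max_x Pr[X=x|Z=z] ) = -log2( sum_z max_x Pr[Z=z, X=x] )."""
    return -np.log2(np.sum(np.max(joint, axis=1)))

# Hypothetical toy distributions over 4 points.
p = np.array([0.4, 0.3, 0.2, 0.1])
print(min_entropy(p), collision_entropy(p))      # ~1.32 and ~1.74 bits
joint = np.array([[0.30, 0.10, 0.05, 0.05],      # Pr[Z=0, X=x]
                  [0.05, 0.05, 0.10, 0.30]])     # Pr[Z=1, X=x]
print(avg_cond_min_entropy(joint))               # -log2(0.3 + 0.3) ~ 0.74 bits
```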
Computational Entropy with Side Information. Sometimes we assume that information $Z$ correlated with $X$ might be available to the adversary.

Definition 3 (Conditional HILL Pseudoentropy [10]). Let $X, Z$ be a joint distribution with the following property: there exists $Y$ of average conditional min-entropy at least $k$ given $Z$ such that for all circuits $D$ of size at most $s$ we have $|\Delta_D(X,Z;\,Y,Z)| \leq \epsilon$. Then we say that $X$ given $Z$ has $k$ bits of HILL min-entropy of quality $(s,\epsilon)$ and write $H_\infty^{\mathrm{HILL},(s,\epsilon)}(X|Z) \geq k$.

Remark 3 (HILL entropy against different circuit classes). Similarly to Remark 1, all kinds of circuits, deterministic boolean, deterministic real-valued and randomized boolean, are equivalent here (for the same size $s$).

Definition 4 (Conditional Metric Pseudoentropy [2]). Let $X, Z$ be a joint distribution with the following property: for every deterministic boolean (respectively: deterministic real-valued or randomized boolean) circuit $D$ of size at most $s$ there exists $Y$ of average conditional min-entropy at least $k$ given $Z$ such that $|\Delta_D(X,Z;\,Y,Z)| \leq \epsilon$. Then we say that $X$ given $Z$ has $k$ bits of deterministic boolean (respectively: deterministic real-valued or randomized boolean) metric min-entropy of quality $(s,\epsilon)$ and write $H_\infty^{\mathrm{M,det}\{0,1\},(s,\epsilon)}(X|Z) \geq k$ (respectively $H_\infty^{\mathrm{M,det}[0,1],(s,\epsilon)}(X|Z) \geq k$ and $H_\infty^{\mathrm{M,rand}\{0,1\},(s,\epsilon)}(X|Z) \geq k$).

There is a variant of conditional pseudoentropy where $(X,Z)$ is required to be computationally close to $(Y,Z')$ but $Z'$ is not necessarily the same as $Z$. This notion is called "relaxed" HILL entropy [12] and is denoted by $H^{\mathrm{HILL\text{-}rlx},(s,\epsilon)}(X|Z)$ (for the metric variants, $H^{\mathrm{M\text{-}rlx,det}\{0,1\},(s,\epsilon)}(X|Z)$ and $H^{\mathrm{M\text{-}rlx,det}[0,1],(s,\epsilon)}(X|Z)$). Typically we want $Z'$ to be the same as $Z$, for instance when $Z$ represents information that the adversary might have learned, but the relaxed notion is also useful [8,12]. It satisfies the so-called chain rule, a property desired in leakage-resilient cryptography, which does not hold for HILL entropy [11].

Relations between HILL and Metric Pseudoentropy. For any "reasonable" notion of (information-theoretic) entropy, the metric and HILL variants are equivalent up to some loss in the quality parameters $s, \epsilon$.

Lemma 1 (HILL vs Metric Pseudoentropy, [3]). Let $H$ be an entropy notion which is concave, that is, a convex combination of distributions with entropy at least $k$ is a distribution with entropy at least $k$ (this assumption is fulfilled for most notions, for example for all Rényi entropies, which include min-entropy and collision entropy). Then for any $n$-bit random variable $X$ we have
$$H^{\mathrm{HILL},(s',\epsilon')}(X) \;\geq\; H^{\mathrm{M,det}[0,1],(s,\epsilon)}(X),$$
where $\delta\in(0,1)$ is arbitrary, $s' = O\big(s\cdot\delta^2/n\big)$ and $\epsilon' = \epsilon + \delta$. The same is true for conditional pseudoentropy and relaxed pseudoentropy, with $s' = O\big(s\cdot\delta^2/(n+m)\big)$, where $m$ is the length of $Z$.

3 Characterizing Metric Pseudoentropy

In what follows we assume that $H$ is a concave entropy notion (like min-entropy or collision entropy), and that all distributions and distinguishers are over $\{0,1\}^n$.

3.1 Connections to Separating Hyperplanes

We start with the following simple observation, which gives a nice geometric formulation of the definition of pseudoentropy. We say that two sets $\mathcal{X}$ and $\mathcal{Y}$ of probability distributions are $(\mathcal{D},\epsilon)$-indistinguishable if there exists no adversary $D\in\mathcal{D}$ such that $|\mathbb{E}\,D(X) - \mathbb{E}\,D(Y)| > \epsilon$ for all $X\in\mathcal{X}$ and all $Y\in\mathcal{Y}$. It is easy to see that if $\mathcal{X}$ and $\mathcal{Y}$ are convex and if $\mathcal{D}$ is closed under complements (that is, $D\in\mathcal{D}$ implies $1-D\in\mathcal{D}$), then this is equivalent to:

  There is no $D\in\mathcal{D}$ such that $\mathbb{E}\,D(X) - \mathbb{E}\,D(Y) > \epsilon$ for all $X\in\mathcal{X}$, $Y\in\mathcal{Y}$.
We can interpret the expectation $\mathbb{E}\,D(X)$ as the scalar product $\langle D, P_X\rangle$, identifying $D$ and the distribution of $X$ with vectors in $\mathbb{R}^{2^n}$. Hence we can write the above condition as:

  There is no $D\in\mathcal{D}$ such that $\langle D, P_X - P_Y\rangle > \epsilon$ for all $X\in\mathcal{X}$, $Y\in\mathcal{Y}$,

which means that the distinguisher $D$ is precisely a separating hyperplane. If $\mathcal{D}$ is a circuit class, $\mathcal{X} = \{X\}$ and $\mathcal{Y} = \{Y : H(Y)\geq k\}$, we obtain the following (we may assume that the class of circuits of size at most $s$ is closed under complements, because every complement is of size at most $s+1$; formally we would need to start from size $s_0 = s+1$, but we omit this negligible difference).

Corollary 1 (Alternative definitions of metric and HILL entropy). Let $X$ be an $n$-bit random variable and let $H$ be a concave entropy notion. Then
(a) $H^{\mathrm{HILL},(s,\epsilon)}(X) \geq k$ iff $X$ is $(\mathcal{D},\epsilon)$-indistinguishable from some $Y$ of entropy $H$ at least $k$, where $\mathcal{D}$ is the class of boolean circuits of size $s$ with $n$ inputs (randomized or deterministic, it makes no difference);
(b) $H^{\mathrm{M,det}\{0,1\},(s,\epsilon)}(X) \geq k$ iff $X$ is $(\mathcal{D},\epsilon)$-indistinguishable from the set of all $Y$ of entropy $H$ at least $k$, where $\mathcal{D}$ is the class of all deterministic boolean circuits of size $s$ with $n$ inputs (analogously for randomized and deterministic real-valued circuits).

3.2 Reduction to Constrained Optimization

From the "geometric" view of pseudoentropy given in Corollary 1, we obtain the following characterization of pseudoentropy.

Lemma 2 (Characterization of metric pseudoentropy). Let $X$ and $H$ be as in Corollary 1. Then $H^{\mathrm{M,det}\{0,1\},(s,\epsilon)}(X) \geq k$ (respectively $H^{\mathrm{M,det}[0,1],(s,\epsilon)}(X) \geq k$) if and only if for every boolean (respectively real-valued) deterministic circuit $D$ of size at most $s$ we have $\mathbb{E}\,D(X) \leq \mathbb{E}\,D(Y^*) + \epsilon$, where $Y^*$ is optimal for the following optimization problem:
$$\max_{Y}\ \mathbb{E}\,D(Y) \quad\text{s.t.}\quad H(Y)\geq k. \tag{1}$$

This result is useful if we can solve the optimization problem in Equation (1). In the next subsections we explain how to solve it in general and discuss two concrete and simple cases: min-entropy and collision entropy.

3.3 Maximizing Expectations Under Convex Constraints

We can characterize the optimal solutions of (1) in terms of Lagrange multipliers. Due to convexity, the characterization is both necessary and sufficient.

Lemma 3 (Maximizing expectation under convex constraints). Let $f$ be a differentiable convex real-valued function on $\mathbb{R}^d$. Assume that $a$ is a number such that $\min_p f(p) < a$, where the minimum is over all probability vectors, and consider the following optimization program:
$$\max_{(p_i)_i}\ \sum_i D_i\,p_i \quad\text{s.t.}\quad f(p)\leq a,\quad -p_i\leq 0,\quad \sum_i p_i = 1. \tag{2}$$
Then a feasible point $p = p^*$ is optimal for (2) if and only if there exist $\lambda_1 \geq 0$, $\lambda_2\in\mathbb{R}$ and $\lambda_{3,i} \geq 0$ for $i=1,\dots,d$ such that the following relations hold:
$$D_i = \lambda_1\,(\nabla f(p^*))_i - \lambda_{3,i} + \lambda_2 \quad\text{for } i=1,\dots,d, \tag{3}$$
and the following complementarity condition is satisfied:
$$p^*_i\cdot\lambda_{3,i} = 0. \tag{4}$$

Proof. The Slater constraint qualification holds, by the assumption on $a$, and we have strong duality. In other words, the first-order Karush-Kuhn-Tucker conditions are sufficient and necessary [4]. The numbers $\lambda_1, \lambda_2, \lambda_{3,i}$ are exactly the KKT multipliers for the convex program in Equation (2), and Equation (3) states that the gradient of the objective function is a combination of the gradients of the constraints. The condition in Equation (4) means that we take only active constraints into account. Finally, to the inequality constraints we assign non-negative multipliers, which explains the requirements $\lambda_1 \geq 0$ and $\lambda_{3,i} \geq 0$. □
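Before solving (1) in closed form, a minimal numerical sketch may help (Python; the distinguisher values and parameters are hypothetical toy data). For $H = H_\infty$ the feasible set is $\{Y : \max_x \Pr[Y=x]\leq 2^{-k}\}$ and (1) is a linear program; the sketch builds the flat distribution on the points where $D$ is largest, the closed form derived in the next subsection, and checks it against random feasible distributions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 2                      # distributions over 2^n points, constraint H_inf >= k
N, cap = 2 ** n, 2.0 ** (-k)     # feasibility: every probability <= 2^-k
D = rng.uniform(0.0, 1.0, N)     # hypothetical [0,1]-valued distinguisher

# Candidate optimizer of (1) for H = H_inf (cf. Theorem 1 below):
# uniform over the 2^k coordinates where D is largest.
top = np.argsort(D)[-2 ** k:]
y_star = np.zeros(N)
y_star[top] = cap
best = float(D @ y_star)

# Sanity check against randomly sampled feasible distributions.
for _ in range(10000):
    y = rng.dirichlet(np.ones(N))
    if y.max() <= cap:           # keep only feasible samples
        assert D @ y <= best + 1e-12
print("E D(Y*) =", best)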
Remark 4. If $f$ is not differentiable, we replace the gradient of $f$ in the optimality conditions by the subdifferential of $f$, which always exists for a convex function.

3.4 Characterization of Metric Min-Entropy

For $H = H_\infty$ we obtain from Lemma 3 the following simple characterization of pseudoentropy based on min-entropy (see [3] for a restricted variant).

Theorem 1 (Characterization of metric min-entropy). Let $X$ be an $n$-bit random variable. Then $H_\infty^{\mathrm{M,det}\{0,1\},(s,\epsilon)}(X) \geq k$ (respectively $H_\infty^{\mathrm{M,det}[0,1],(s,\epsilon)}(X) \geq k$) if and only if for every boolean (respectively real-valued) deterministic circuit $D$ of size at most $s$ with $n$ inputs we have $\mathbb{E}\,D(X) \leq \mathbb{E}\,D(Y^*) + \epsilon$, where $Y^*$ is uniform over the set of $2^k$ values of $x$ which correspond to the biggest values of $D(x)$.

Extending Lemma 3 by adding additional constraints, to cover the case of side information, we obtain the characterization of conditional metric entropy.

Theorem 2 (Characterization of conditional metric min-entropy). Let $X$ and $Z$ be, respectively, $n$- and $m$-bit random variables. Then $H_\infty^{\mathrm{M,det}\{0,1\},(s,\epsilon)}(X|Z) \geq k$ (respectively $H_\infty^{\mathrm{M,det}[0,1],(s,\epsilon)}(X|Z) \geq k$) iff for every boolean (respectively real-valued) deterministic circuit $D$ of size at most $s$ on $\{0,1\}^{n+m}$ we have $\mathbb{E}\,D(X,Z) \leq \mathbb{E}\,D(Y^*,Z) + \epsilon$, for $Y^*$ such that $Y^*|_{Z=z}$ is uniform over the set $\{x : D(x,z) > t(z)\}$ for every $z$, where the thresholds $t(z)$ satisfy the following two conditions:
$$\mathbb{E}_{x\leftarrow U_n}\max(D(x,z) - t(z),\,0) = \mathrm{const} \quad\text{for all } z,$$
$$\mathbb{E}_{z\leftarrow Z}\big[1/\#\{x : D(x,z) > t(z)\}\big] \;\leq\; 2^{-k} \;\leq\; \mathbb{E}_{z\leftarrow Z}\big[1/\#\{x : D(x,z) \geq t(z)\}\big].$$

3.5 Characterization of Metric Collision Entropy

The characterization of the worst-case collision-entropy distribution is slightly different: it is proportional to the distinguisher, after applying a threshold.

Theorem 3 (Characterization of metric collision entropy). Let $X$ be an $n$-bit random variable. Then $H_2^{\mathrm{M,det}\{0,1\},(s,\epsilon)}(X) \geq k$ (respectively $H_2^{\mathrm{M,det}[0,1],(s,\epsilon)}(X) \geq k$) if and only if for every boolean (respectively real-valued) deterministic circuit $D$ of size at most $s$ with $n$ inputs we have $\mathbb{E}\,D(X) \leq \mathbb{E}\,D(Y^*) + \epsilon$, where $Y^*$ satisfies
$$\lambda\cdot P_{Y^*}(x) = \max(D(x) - t,\,0)$$
for some $t\in\mathbb{R}$ and $\lambda > 0$.

Remark 5. Note that $t$ is a solution of $\mathbb{E}\,D'(U)^2 = 2^{n-k}\,\big(\mathbb{E}\,D'(U)\big)^2$, where $D'(x) = \max(D(x)-t,\,0)$, and $\lambda = 2^n\,\mathbb{E}\,D'(U)$. It follows that
$$\mathbb{E}\,D'(Y^*) = \sqrt{2^{n-k}\,\mathbb{E}\,D'(U)^2} = \mathbb{E}\,D'(U) + \sqrt{\mathrm{Var}\,D'(U)\cdot\big(2^{n-k}-1\big)}.$$
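A minimal numerical sketch of Theorem 3 and Remark 5 (Python; the distinguisher $D$ is hypothetical toy data): the threshold $t$ is found by binary search so that the distribution proportional to $\max(D-t,0)$ has collision entropy $k$, and Remark 5's expression for $\mathbb{E}\,D'(Y^*)$ is then checked numerically.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 3
N = 2 ** n
D = rng.uniform(0.0, 1.0, N)           # hypothetical [0,1]-valued distinguisher

def collision_entropy(p):
    return -np.log2(np.sum(p ** 2))

def y_star(t):
    """Distribution proportional to max(D - t, 0), as in Theorem 3."""
    w = np.maximum(D - t, 0.0)
    return w / w.sum()

# Binary search for t: raising t concentrates Y* and lowers H_2(Y*).
lo, hi = 0.0, D.max()
for _ in range(100):
    mid = (lo + hi) / 2
    if collision_entropy(y_star(mid)) >= k:
        lo = mid
    else:
        hi = mid
t = lo
Y = y_star(t)
Dp = np.maximum(D - t, 0.0)            # D'(x) = max(D(x) - t, 0)

# Remark 5: E D'(Y*) = E D'(U) + sqrt(Var D'(U) * (2^(n-k) - 1)).
lhs = float(Dp @ Y)
rhs = Dp.mean() + np.sqrt(Dp.var() * (2 ** (n - k) - 1))
print(collision_entropy(Y), lhs, rhs)  # H_2(Y*) ~ k and lhs ~ rhs
```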
4 Applications

4.1 Computational Dense Model Theorem

We say that a distribution $A$ is $\gamma$-dense in $B$ if $\Pr[A=x] \leq \Pr[B=x]/\gamma$ for every $x$. The Dense Model Theorem is a statement of the following form: if $X$ is $(s,\epsilon)$-indistinguishable from the uniform distribution $R$ and $X'$ is $\gamma$-dense in $X$, then there exists a distribution $R'$ which is $\gamma$-dense in $R$ and is $(s',\epsilon')$-indistinguishable from $X'$, where $s'$ and $\epsilon'$ depend as explicit functions on $s$ and $\epsilon$. In this sense, $R'$ is a dense "model" of $X'$. The dense model theorem was first proved by Tao and Ziegler [17]. Its efficient versions (with a loss of at most $\mathrm{poly}(1/\delta)$ in $s$ and $\epsilon$; in the original proof the loss is $\exp(1/\delta)$) have found important applications in complexity theory and cryptography [2,7,14], see also [16]. Below we recall a version with improved parameters, stated in the language of pseudoentropy and called the "leakage lemma".

Theorem 4 (Leakage Lemma [2,7]). Let $X$ be an $n$-bit random variable such that $H_\infty^{\mathrm{HILL},(s,\epsilon)}(X) \geq k$ and let $Z$ be correlated with $X$. Then we have $H_\infty^{\mathrm{HILL},(s',\epsilon')}(X|_{Z=z}) \geq k'$, where $k' = k - \log(1/\Pr[Z=z])$, $s' = O\big(s\cdot\delta^2/n\big)$ and $\epsilon' = \epsilon/\Pr[Z=z] + \delta$, for any $\delta\in(0,1)$.

The lemma states that the amount of pseudoentropy decreases by roughly $t$ bits due to leakage of $t$ bits of information, hence its name. The original proof was simplified by the use of metric entropy [2]. We show how it can be simplified even further: just a few lines using the basic facts about metric entropy!

Proof. If we can prove that
$$H_\infty^{\mathrm{M,det}\{0,1\},(s,\,\epsilon/\Pr[Z=z])}(X|_{Z=z}) \;\geq\; H_\infty^{\mathrm{M,det}\{0,1\},(s,\epsilon)}(X) - \log(1/\Pr[Z=z]),$$
then the result will follow by Lemma 1 and Remark 2. Note that by Theorem 1, for any $X$ we have $H_\infty^{\mathrm{M,det}\{0,1\},(s,\epsilon)}(X) \geq k$ if and only if $\mathbb{E}\,D(X) \leq |D|/2^k + \epsilon$ for all boolean $D$ of size at most $s$, where $|D| = \#\{x: D(x)=1\}$ counts the inputs accepted by $D$, so that $\mathbb{E}\,D(Y^*)\leq |D|/2^k$. From this we get, for any such $D$,
$$\mathbb{E}\,D(X|_{Z=z}) \;\leq\; \mathbb{E}\,D(X)/\Pr[Z=z] \;\leq\; \frac{|D|}{2^k\,\Pr[Z=z]} + \frac{\epsilon}{\Pr[Z=z]} \;=\; \frac{|D|}{2^{k-\log(1/\Pr[Z=z])}} + \frac{\epsilon}{\Pr[Z=z]}.$$
Since the characterization is also sufficient, the result follows. □

4.2 Equivalence of HILL Entropy and Unpredictability Entropy for Short Strings

Unpredictability entropy. The notion of unpredictability entropy is based on the (assumed) hardness of guessing $X$ given auxiliary information $Z$. More formally, we have $H^{\mathrm{Unp},s}(X|Z) \geq k$ if and only if no adversary of size at most $s$ can predict $X$ given $Z$ with probability better than $2^{-k}$. For $Z$ independent of $X$, or of relatively short length, this reduces to the min-entropy of $X$ (provided that $s > 2^m n$, so that the adversary can hardcode his best guess for every $z$).

Separation from HILL entropy. If $f$ is a one-way function, $U$ is the uniform distribution and $X = U$, $Z = f(U)$, then $X|Z$ has a large amount of unpredictability entropy. It is also easy to see that $X|Z$ has almost no HILL entropy.

Equivalence for short strings. On the positive side, using metric entropy and the characterization in Theorem 2, we reprove the following result of Vadhan and Zheng, who established the equivalence when $X$ is short (logarithmic in the security parameter).

Theorem 5 ([18]). Suppose that $X$ and $Z$ are, respectively, $n$- and $m$-bit random variables. Then $H_\infty^{\mathrm{HILL},(s',\epsilon)}(X|Z) \gtrsim H^{\mathrm{Unp},s}(X|Z)$ with $s' = s/\mathrm{poly}(2^n, 1/\epsilon)$.

The original proof is based on a result similar to Theorem 2, proved in a much more complicated way. We note that this part is a simple consequence of the KKT optimality conditions, and we also simplify the rest of the proof.

Proof (Sketch). We prove that $H_\infty^{\mathrm{M,det}[0,1],(s',\epsilon)}(X|Z) < k$ implies $H^{\mathrm{Unp},s}(X|Z) < k$. Assume then that the metric entropy is smaller than $k$: there is a real-valued circuit $D$ of size at most $s'$ such that $\mathbb{E}\,D(X,Z) - \mathbb{E}\,D(Y,Z) > \epsilon$ for all $Y$ such that $\widetilde{H}_\infty(Y|Z) \geq k$. Let $Y^*$ be the distribution which minimizes this expression, that is, which maximizes $\mathbb{E}\,D(Y,Z)$. Let $t(z)$ be as in Theorem 2, denote $D'(x,z) = \max(D(x,z) - t(z),\,0)$ and let $\lambda = \sum_x D'(x,z)$ (according to Theorem 2 this sum does not depend on $z$). Consider the following predictor $A$: on input $z$, sample $x$ according to the probability $\Pr[A(z) = x] = D'(x,z)/\lambda$.

Note that $Y^*|_{Z=z}$ is uniform over the set $\{x : D'(x,z) > 0\}$. By Theorem 2 (the sufficiency part) it follows that $Y^*$ is also maximal for $D'$. For every $z$ we have $\mathbb{E}\,D'(Y^*|_{Z=z}, z) = \mathbb{E}\,D(Y^*|_{Z=z}, z) - t(z)$. We also have $\mathbb{E}\,D'(X|_{Z=z}, z) \geq \mathbb{E}\,D(X|_{Z=z}, z) - t(z)$ by the definition of $D'$. This proves $\mathbb{E}\,D'(X,Z) - \mathbb{E}\,D'(Y,Z) > \epsilon$ for all $Y$ such that $\widetilde{H}_\infty(Y|Z) \geq k$. It is easy to observe that
$$\Pr[A(Z)=X] \;=\; \frac{\mathbb{E}\,D'(X,Z)}{\lambda} \;\geq\; \mathbb{E}_{z\leftarrow Z}\left[\frac{\mathbb{E}\,D'(Y^*|_{Z=z},z)}{\sum_x D'(x,z)}\right] \;\geq\; \mathbb{E}_{z\leftarrow Z}\, 2^{-H_\infty(Y^*|_{Z=z})},$$
which is at least $2^{-k}$.

The circuit $D'(x,z)$ has complexity $2^m\cdot\mathrm{size}(D)$, which is too big. However, if the domain of $x$ is small, we can approximate the numbers $t(z)$ given $\lambda$ from the relations in Theorem 2 (and even $\lambda$ itself, from the second relation, in the uniform setting). Indeed, knowing the common value $\mathbb{E}_{x\leftarrow U_n}\max(D(x,z)-t(z),\,0) = \lambda/2^n$, we estimate $\mathbb{E}_{x\leftarrow U_n}\max(D(x,z)-t,\,0)$ for a fixed $t$ and then find the "right" value $t = t(z)$ by binary search. This way, for every $z$, we can approximate $D'(\cdot,z)$, and hence the distribution $\Pr[A(z)=x]$, up to a maximal error $\delta \ll 2^{-k}$ and with overwhelming probability $1-\exp(-\mathrm{poly}(1/\delta))$, using $\mathrm{poly}(1/\delta)$ samples of $D$. On average over $z$ we predict $X$ with probability $2^{-k} - \delta \approx 2^{-k}$. □
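A rough sketch of the predictor $A$ from the proof above (Python; the joint distribution, the distinguisher table and the normalization $\lambda$ are hypothetical toy data). It enforces only the first condition of Theorem 2, a common value of $\sum_x \max(D(x,z)-t(z),0)$ across $z$, and computes $\Pr[A(Z)=X] = \mathbb{E}\,D'(X,Z)/\lambda$ exactly from the tables.

```python
import numpy as np

rng = np.random.default_rng(2)
nX, nZ = 16, 4                         # toy sizes of the X-space and Z-space
D = rng.uniform(0.0, 1.0, (nZ, nX))    # hypothetical distinguisher D(x, z)
joint = rng.dirichlet(np.ones(nX * nZ)).reshape(nZ, nX)  # Pr[Z=z, X=x]

def t_of_z(row, lam):
    """Binary search for t with sum_x max(row - t, 0) = lam (first condition of Theorem 2)."""
    lo, hi = 0.0, row.max()
    for _ in range(80):
        mid = (lo + hi) / 2
        if np.maximum(row - mid, 0.0).sum() >= lam:
            lo = mid
        else:
            hi = mid
    return lo

lam = 1.0                              # hypothetical common normalization across all z
Dp = np.stack([np.maximum(D[z] - t_of_z(D[z], lam), 0.0) for z in range(nZ)])

# Predictor A: on input z, output x with probability D'(x, z) / lambda.
def predictor_success(joint, Dp, lam):
    """Pr[A(Z) = X] = E D'(X, Z) / lambda, computed exactly from the tables."""
    return float((joint * Dp).sum() / lam)

print(predictor_success(joint, Dp, lam))
```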
4.3 Improved Leftover Hash Lemma for Square-Secure Applications

In the key derivation problem we want to derive a secure $m$-bit key for some application $P$ from an imperfect source of randomness $X$. The generic approach is to use a randomness extractor. However, as implied by the RT-bounds [13], the min-entropy of $X$ needs to be at least $m + 2\log(1/\epsilon)$ if we want the derived key to be $\epsilon$-secure. Fortunately, as shown by Barak et al. [1], for many cryptographic applications one can reduce this entropy loss by half, that is, to $L = \log(1/\epsilon)$. To this end, they introduce the class of square-secure applications, where the squared advantage of every bounded attacker, over the uniform choice of keys, is small (which essentially means that the probability that an attacker breaks the key is concentrated over the choice of keys). This class contains, for example, all unpredictability applications, stateless chosen-plaintext-attack secure encryption, and weak pseudorandom functions. The reduction of the entropy loss follows by combining universal hashing with the following lemma.

Lemma 4 ([1]). For a function $D:\{0,1\}^\ell\to[-1,1]$ and $X\in\{0,1\}^\ell$ of collision entropy $k$ we have
$$\mathbb{E}\,D(X) \;\leq\; \mathbb{E}\,D(U_\ell) + \sqrt{\mathrm{Var}\,D(U_\ell)\cdot\big(2^{\ell-k}-1\big)}.$$

To see this, let $\mathrm{Win}_A(r,h)$, for an arbitrary attacker $A\in\mathcal{A}$, be the probability that $A$ breaks the key $r$ when given $h$ in addition (for a uniformly chosen key this does not help the adversary, at least in the non-uniform model), and let $D_A(r,h) = \mathrm{Win}_A(r,h) - \frac{1}{2}$ be its advantage. Let $X$ be any $n$-bit random variable of min-entropy $m + \log(1/\epsilon)$. We apply a randomly chosen universal hash function $H$ from $n$ to $m$ bits (a family $\mathcal{H}$ of functions from $n$ to $m$ bits is universal if $\Pr_{h\leftarrow\mathcal{H}}[h(x)=h(x')] = 2^{-m}$ for all $x\neq x'$). It is easy to see that $(H(X), H)$ is a distribution with collision entropy $m + \log|\mathcal{H}| - \log(1+\epsilon)$. From the lemma it now follows that
$$\mathbb{E}\,D_A(H(X), H) \;\leq\; \mathbb{E}\,D_A(U, H) + \sqrt{\mathrm{Var}\,D_A(U,H)\cdot\epsilon}.$$
If we assume that $\max_h \mathbb{E}\,D_A(U,h) \leq \epsilon$ (which means $\epsilon$-security against $A$ with the uniform key) and that $\max_h \mathbb{E}\,D_A(U,h)^2 \leq \sigma$ with $\sigma = O(\epsilon)$ (which means $\sigma$-square-security against $A$ with the uniform key), then we achieve $O(\epsilon)$-security for the extracted key, with entropy loss only $\log(1/\epsilon)$.

An alternative proof. We show that Theorem 3 implies Lemma 4. Indeed, set $n = \ell$ and $\epsilon = 0$ in Theorem 3. Let $Y^*$ be the distribution of collision entropy at least $k$ which maximizes $\mathbb{E}\,D(Y)$, and let $t$, $\lambda$ and $D'$ be as in the characterization. Denote $S = \{x : D(x) > t\}$ and let $D|_S$ be the restriction of $D$ to the set $S$. Note that $Y^*$ conditioned on $S$ is distributed as $Y^*$ itself (it is supported on $S$), that $Y^*$ maximizes $D|_S$, and that $D|_S(x) = D'|_S(x) + t$ for every $x\in S$. By Remark 5 we get
$$\mathbb{E}\,D(X) \;\leq\; \mathbb{E}\,D(Y^*) \;=\; \mathbb{E}\,D|_S(Y^*) \;=\; \mathbb{E}\,D|_S(U_S) + \sqrt{\mathrm{Var}\,D|_S(U_S)\cdot\big(|S|\,2^{-k}-1\big)}.$$
We show that one can replace $S$ by $\{0,1\}^\ell$ on the right-hand side. This follows from the following general lemma.

Lemma 5. Let $X$ be a random variable, $c \geq 1$ a constant and $S$ an event of probability $\mathbb{P}(S) \geq c^{-1}$. Then
$$\mathbb{E}[X|S] + \sqrt{\mathrm{Var}[X|S]\cdot\big(c\,\mathbb{P}(S)-1\big)} \;\leq\; \mathbb{E}[X] + \sqrt{\mathrm{Var}[X]\cdot(c-1)}. \tag{5}$$
The proof follows by a few algebraic manipulations and is omitted.
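A small numerical check of Lemma 4 (Python; the function $D$ and the sampled distributions are hypothetical toy data): for random distributions $X$ over $\{0,1\}^\ell$ we verify $\mathbb{E}\,D(X)\leq \mathbb{E}\,D(U_\ell)+\sqrt{\mathrm{Var}\,D(U_\ell)\,(2^{\ell-k}-1)}$ with $k = H_2(X)$.

```python
import numpy as np

rng = np.random.default_rng(3)
ell = 8
N = 2 ** ell
D = rng.uniform(-1.0, 1.0, N)          # hypothetical D: {0,1}^ell -> [-1, 1]

def collision_entropy(p):
    return -np.log2(np.sum(p ** 2))

for _ in range(1000):
    # Random distribution of X with varying spread, hence varying collision entropy.
    p = rng.dirichlet(np.ones(N) * rng.uniform(0.05, 5.0))
    k = collision_entropy(p)
    lhs = float(D @ p)                                        # E D(X)
    rhs = D.mean() + np.sqrt(D.var() * (2 ** (ell - k) - 1))  # Lemma 4 bound
    assert lhs <= rhs + 1e-9
print("Lemma 4 bound holds on all sampled distributions")
```

The bound is tight exactly for the worst-case distribution of Theorem 3, which is why the entropy loss $\log(1/\epsilon)$ cannot be improved within this argument.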
4.4 Some Further Applications

Lower bounds on square security. Using the characterization from Theorem 3 one can derive some non-trivial lower bounds on the square-security needed for key derivation. We discuss this problem in a separate paper.

References

1. Barak, B., Dodis, Y., Krawczyk, H., Pereira, O., Pietrzak, K., Standaert, F.-X., Yu, Y.: Leftover hash lemma, revisited. Cryptology ePrint Archive, Report 2011/088, http://eprint.iacr.org/ (2011)
2. Fuller, B., Reyzin, L.: A unified approach to deterministic encryption: New constructions and a connection to computational entropy. In: TCC 2012, Lecture Notes in Computer Science, vol. 7194, pp. 582-599. Springer (2012)
3. Barak, B., Shaltiel, R., Wigderson, A.: Computational analogues of entropy. In: Arora, S., et al. (eds.) RANDOM-APPROX 2003, Lecture Notes in Computer Science, vol. 2764, pp. 200-215. Springer (2003)
4. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, New York, NY, USA (2004)
5. Chung, K.-M., Kalai, Y.T., Liu, F.-H., Raz, R.: Memory delegation. Cryptology ePrint Archive, Report 2011/273, http://eprint.iacr.org/ (2011)
6. Dodis, Y., Ostrovsky, R., Reyzin, L., Smith, A.: Fuzzy extractors: How to generate strong keys from biometrics and other noisy data. SIAM J. Comput. 38(1), 97-139 (2008)
7. Dziembowski, S., Pietrzak, K.: Leakage-resilient cryptography in the standard model. IACR Cryptology ePrint Archive, Report 2008/240 (2008)
8. Gentry, C., Wichs, D.: Separating succinct non-interactive arguments from all falsifiable assumptions. Cryptology ePrint Archive, Report 2010/610, http://eprint.iacr.org/ (2010)
9. Hastad, J., Impagliazzo, R., Levin, L.A., Luby, M.: A pseudorandom generator from any one-way function. SIAM J. Comput. 28(4), 1364-1396 (1999)
10. Hsiao, C.-Y., Lu, C.-J., Reyzin, L.: Conditional computational entropy, or toward separating pseudoentropy from compressibility. In: EUROCRYPT 2007, pp. 169-186. Springer, Berlin, Heidelberg (2007)
11. Krenn, S., Pietrzak, K., Wadia, A.: A counterexample to the chain rule for conditional HILL entropy. In: Sahai, A. (ed.) Theory of Cryptography, Lecture Notes in Computer Science, vol. 7785, pp. 23-39. Springer, Berlin, Heidelberg (2013)
12. Reyzin, L.: Some notions of entropy for cryptography. In: Fehr, S. (ed.) Information Theoretic Security, Lecture Notes in Computer Science, vol. 6673, pp. 138-142. Springer, Berlin, Heidelberg (2011)
13. Radhakrishnan, J., Ta-Shma, A.: Bounds for dispersers, extractors, and depth-two superconcentrators. SIAM Journal on Discrete Mathematics 13 (2000)
14. Reingold, O., Trevisan, L., Tulsiani, M., Vadhan, S.: Dense subsets of pseudorandom sets. In: Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS '08), pp. 76-85. IEEE Computer Society, Washington, DC (2008)
15. Shannon, C.E.: A mathematical theory of communication. Bell System Technical Journal 27 (1948)
16. Trevisan, L., Tulsiani, M., Vadhan, S.: Regularity, boosting, and efficiently simulating every high-entropy distribution. In: Proceedings of the 24th Annual IEEE Conference on Computational Complexity (CCC '09), pp. 126-136. IEEE Computer Society, Washington, DC (2009)
17. Tao, T., Ziegler, T.: The primes contain arbitrarily long polynomial progressions. Acta Mathematica 201(2), 213-305 (2008)
18. Vadhan, S., Zheng, C.J.: Characterizing pseudoentropy and simplifying pseudorandom generator constructions. In: Proceedings of the 44th Symposium on Theory of Computing (STOC '12), pp. 817-836. ACM, New York (2012)