=Paper=
{{Paper
|id=Vol-1885/112
|storemode=property
|title=The Role of
Information in the Two Envelope Problem
|pdfUrl=https://ceur-ws.org/Vol-1885/112.pdf
|volume=Vol-1885
|authors=Mirko Navara,Jiří Šindelář
|dblpUrl=https://dblp.org/rec/conf/itat/NavaraS17
}}
==The Role of
Information in the Two Envelope Problem==
J. Hlaváčová (Ed.): ITAT 2017 Proceedings, pp. 112–119 CEUR Workshop Proceedings Vol. 1885, ISSN 1613-0073, c 2017 M. Navara, J. Šindelář The Role of Information in the Two Envelope Problem Mirko Navara, Jiří Šindelář Center for Machine Perception, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, 166 27 Prague, Czech Republic, navara@cmp.felk.cvut.cz, sindeji3@fel.cvut.cz, WWW home page: http://cmp.felk.cvut.cz/~navara/ Abstract: We offer a new view on the two envelope prob- Argument 3. The smaller amount is x, the bigger is 2x. lem (also called the exchange paradox). We describe it as They are assigned randomly (with probabilities 1/2) to a zero-sum game of two players, having only partial in- players A and B. The mean values for both players are formation. We first explain a standard situation and show that the mean gain—when defined—is really zero. How- 1 1 3 x − 2x = x ever, there are even more paradoxical situations in which 2 2 2 the information obtained by the players supports the ex- and there is no reason for (or against) switching. change of envelopes. We explain that this does not lead to a contradiction and we demonstrate it also by computer We presented several arguments; each of them seems simulation. The reason for this paradox is that the mean correct, but their conclusions are contradictory. gain does not exist and that the players have different in- Surprisingly, the debate about this paradox is still not formation, supporting their contradictory decisions. finished (cf. [3]). “Currently, there is no consensus on a demon- 1 Formulation of the Problem stration, since most people generally reject each other’s demonstrations." [5] The two envelope problem (also called the exchange para- dox) is a famous logical puzzle demonstrating a paradox in One reason is that many authors merely defended their so- logic and probability. We adopt its formulation from [14], lution (mostly correct), cf. [6, 13]. However, to resolve expressed here as a game of two players: the paradox, it is necessary to explain the errors in the contradicting arguments.1 The topic was studied not only There are two indistinguishable envelopes, each by mathematicians and logicians but also by philosophers containing money, one contains twice as much (e.g., [4, 11]). For some of them, a sufficient explana- as the other. Player A picks one envelope of his tion is that a in Argument 2 denotes different amounts; choice; player B receives the second envelope. the smaller one in the first case and the bigger one in the They can keep the money contained in their en- second case [4, 13]. However, this is not forbidden, this is velopes or switch the envelopes (if both agree on just what a random variable means. Thus a more advanced it). Should they switch? analysis is needed. There is an easy answer: The paradox has more variants (cf. [11]). The method of choice of the amounts was not specified. (This is usual Argument 1. The situation is symmetric. Thus there is no in such puzzles. They rarely start with a precise definition reason for (or against) switching. of a random experiment generating the data. Instead of that, it was said that two amounts are given, one of them However, there are other interpretations suggesting twice greater than the other.) Here we assume that they something else: were drawn as realizations of some random variable with Argument 2. The situation is symmetric. Thus the prob- a given distribution (known or unknown to players). Nev- ability of having the envelope with the higher or lower ertheless, the formulation of the problem does not specify amount is 1/2. If the envelope of player A contains the this at all, and some authors (e.g. [9]) consider this amount amount a, then the other envelope contains 2a (and the ex- as given (without any randomness); such formulation ex- change results in a gain of a) or a/2 (and the exchange cludes a probabilistic analysis. Thus we do not consider it results in a loss of a/2). In average, the mean gain is here. Besides, it is not specified whether we first draw the (realization of) random variable X (the smaller amount) or 1 1a a the contents of the envelope given to player A, described a− = , 2 22 4 thus switching is always recommended. 1 We experienced this misunderstanding also during the reviewing process of this paper: “Argument 3 is correct, so there is no paradox.” Another point of view is the following: However, what is wrong on Argument 2? The Role of Information in the Two Envelope Problem 113 by random variable A. It is natural to assume that the bi- Player B receives the other envelope and knows only its nary choice of envelopes is made with equal probabilities contents, specified by realization b of random variable B, and independently of all other random events (or param- ( eters) of the experiment. Here we apply the probabilis- x if X = x and U = 1 , B= tic approach to the problem in its original form: We first 2x if X = x and U = 0 , draw a positive amount x from some distribution. We put this amount into one envelope and 2x into the other enve- B = (2 −U) X . lope. Both envelopes have probability 1/2 to be chosen by If the players exchange the envelopes, the gain of A is G = player A. The remaining envelope is given to player B. B − A. For player B, G is the loss and −G is the gain. It is also not specified whether the players know the Random variables X and U are independent. If X has an amounts in their envelopes. This knowledge is useless if expectation EX, then the distribution is unknown. On the other hand, knowing the distribution, the amounts bring useful information for 3 EA = (1 + EU) EX = EX , the decision. We suppose that the players know the dis- 2 tribution from which x was drawn. We discuss this case 3 EB = (2 − EU) EX = EX , in detail. Some of its consequences seem to determine the 2 strategy also without looking inside the envelope, but—as EG = EB − EA = 0 . we shall show—this need not correspond to conclusions made in the former case. This is in accordance with Arguments 1 and 3. It remains to find an error in Argument 2. We present an explanation of the paradox (based As U, X are independent, on [12]) using results of probability and information the- ory. The standard explanation of the exchange paradox 1 P(U = 0|X = x) = P(U = 0) = , (following [7]) is presented in Section 2 and demonstrated 2 by an example in Section 3. As a new contribution, we 1 P(U = 1|X = x) = P(U = 1) = modify this example to two even more surprising and 2 counterintuitive versions of the paradox, which we explain for all x. However, this does not apply to conditional prob- in detail in Sections 4 and 5. In Section 6, we verify the abilities P(U = 0|A = a), P(U = 1|A = a) because U, A are results by a computer simulation. dependent: P(U = 0, A = a) P(U = 0|A = a) = P(A = a) 2 Exchange Paradox: First Level P(U = 0, X = a) P(X = a) = = , P(A = a) 2 P(A = a) P(U = 1, A = a) P(U = 1|A = a) = P(A = a) We introduce a third member of the experiment, the P(U = 1, X = a2 ) P(X = a2 ) banker C, who controls the game and puts (his) money = = , in the envelopes. We assume that the smaller amount, x, P(A = a) 2 P(A = a) was chosen by a realization of a random variable X. (The where larger amount in the second envelope is 2x.) The distribu- tion of random variable X is known to the banker and also P(A = a) = P(U = 0, A = a) + P(U = 1, A = a) to the players. For simplicity, we assume that the distribu- = P(U = 0, X = a) + P(U = 1, X = a2 ) tion is discrete and the amounts are positive. Then we do 1 an independent random experiment (e.g., tossing a coin) = P(X = a) + P(X = 2a ) . with two equally probable results, expressed by a random 2 variable U, whose possible values are 0 and 1 and expec- Notice that P(U = 0|A = a) is the conditional probability tation EU = 1/2. If U = 0, player A receives the smaller of gain and P(U = 1|A = a) is the conditional probability amount, x; if U = 1, player A receives the bigger amount, of loss given A = a. Their ratio is 2x. He does not know x, only the contents of his envelope, specified by realization a of random variable A, P(X = a) . P(X = a2 ) ( x if X = x and U = 0 , As there is no uniform distribution on an infinite count- A= able set (a fact ignored even in [10]), P(X = a), P(X = a2 ) 2x if X = x and U = 1 , cannot be equal for all a. Typically, the conditional prob- A = (1 +U) X . ability of gain, P(U = 0|A = a), is higher for “small” 114 M. Navara, J. Šindelář values of a and smaller for “high” values, although the (after substitution a := 2b). This explains the error in Ar- notions “small” and “high” are relative. In any case, gument 2 provided that the expectation of G is defined. these probabilities converge to 0 when a goes to infin- ity, hence there must be values “sufficiently large” so that Remark 1. There is another arrangement suggested in [8, P(X = a) < P(X = a2 ) and the conditional probability of 11]: First, the amount a in the envelope of player A is gain P(U = 0|A = a) < 12 . This can also lead to an effec- drawn from some distribution. Then the random variable tive strategy based on random switching [9, 10]. U (as before) decides whether the second envelope will Given A = a, switching brings a gain with conditional contain 2a or 2a . In this arrangement, random variables probability distribution U and A are independent and Argument 2 is valid. Argu- ments 1 and 3 fail because of an intervention of the banker; P(G = g, A = a) it is him who puts additional money in the second enve- P(G = g|A = a) = , P(A = a) lope, so that the total amount may be 3a or 23 a. This is not where a zero-sum game, and it is not symmetric. P(U = 0, X = a) if g = a , P(G = g, A = a) = P(U = 1, X = a2 ) if g = − a2 , 3 Example of the First Level of Paradox 0 otherwise. 1 Let T be a random variable with geometrical distribution 2 P(X = a) if g = a , = 21 P(X = a2 ) if g = − 2a , with quotient q ∈ (0, 1): 0 otherwise. qt P(T = t) = , t ∈ {0, 1, 2, . . .} . We obtain 1−q P(X = a) a if g = a , Let X = 2T , thus X attains values 1, 2, 4, 8, . . . with proba- P(X = a) + P(X = 2 ) a P(X = 2 ) bilities P(G = g|A = a) = if g = − a2 , P(X = a) + P(X = a2 ) qt 0 otherwise. P(X = 2t ) = , t ∈ {0, 1, 2, . . .} . 1−q The conditional expectation of the gain is always defined and it is In this arrangement, a player can deduce the contents of both envelopes only if he holds 1. For any other value, a P(X = a) − 2a P(X = a2 ) E(G|A = a) = . (1) both cases are possible—switching may bring a gain or a P(X = a) + P(X = a2 ) loss. These values may differ from 0. Suppose that the expectation of X exists; this happens The (unconditional) distribution of the gain is iff q < 21 . Then 1 2 P(X = g) if g > 0 , ∞ qt 1 1 P(G = g) = 2 P(X = −g) if g < 0 , EX = ∑ 2t = . t=0 1 − q (1 − q) (1 − 2q) 0 otherwise and its expectation is The joint distribution of U and A is given (for a = 2s , s ∈ {0, 1, 2 . . .}) by EG = ∑ g P(G = g) (2) g qs 1 P(U = 0, A = a) = P(U = 0, X = a) = , 1−q = 2 ∑ g P(X = g) + ∑ g P(X = −g) (3) s−1 g>0 g<0 q if s ≥ 1 , 1 P(U = 1, A = a) = P(U = 1, X = a2 ) = 1 − q = 2 ∑ g P(X = g) − ∑ h P(X = h) =0 0 if s = 0 . g>0 h>0 (after substitution g := −h), provided that the sum (2) is For q = 0.25, it is shown in Fig. 1. absolutely convergent. In this case If player A has a = 2s , the conditional probability of his EG = ∑ P(A = a) E(G|A = a) (4) gain by switching is a 1 P(X = a) q =∑ a P(X = a) − 2a P(X = a2 ) P(U = 0|A = 2s ) = = (5) a 2 P(X = a) + P(X = a2 ) q + 1 1 2 ∑ = a P(X = a) − ∑ b P(X = b) = 0 for all s ∈ {1, 2, . . .} (and 1 for s = 0). As q < 1/2, this a b probability is less than 1/3. The conditional expectation The Role of Information in the Two Envelope Problem 115 Figure 1: Joint distribution of U and A for q = 0.25. Values Figure 2: Conditional expectation of gain given A = a for a of A are on the horizontal axis; blue dots denote P(U = q = 0.25 (orange circles). It is a mixture (=convex combi- 0, A = a), orange circles P(U = 1, A = a). nation) of the cases U = 0 (blue dots) and U = 1 (yellow dots). of his gain is 2s P(T = s) − 2s−1 P(T = s − 1) E(G|A = 2s ) = P(T = s) + P(T = s − 1) 2s−1 · 2q − 1 if s ∈ {1, 2, . . .} , = q+1 (6) 1 if s = 0 . For q = 0.25, it is shown in Fig. 2. The contributions of conditional expectations E(G|A = a) to the unconditional expected gain EG are E(G|A = a) · P(A = a), see Fig. 3. The conditional expectation is positive only for s = 0 (i.e., Figure 3: The contributions of conditional expectations of a = 1), negative otherwise. This determines the right strat- gain given A = a to the unconditional expected gain for egy of switching: Switch only if you hold 1. The two play- q = 0.25. ers never both agree on switching the envelopes because at most one of them holds 1. The unconditional expectation of the gain is If player A does not know the contents of his envelope, ∞ he may use only its distribution EG = ∑ P(A = 2s ) E(G|A = 2s ) (7) s=0 ∞ 1 1 s−1 2q − 1 s 1 q qs−1 = + ∑ (2q) + if s ∈ {1, 2, . . .} , 2 1 − q s=1 1−q 2 1−q 1−q 1 1 1 2q − 1 P(A = 2 ) = 1 1 s if s = 0 , = + · = 0, 2 1−q 2 1 − q 1 − 2q 1 − q 0 otherwise , in accordance with our arguments. 1 s−1 q + 1 q if s ∈ {1, 2, . . .} , 2 1−q 4 Exchange Paradox: Second Level = 1 1 if s = 0 , 2 1−q In Section 2, we presented a standard explanation of the 0 otherwise . exchange paradox. It is based on the assumption that the 116 M. Navara, J. Šindelář amount in the envelopes has an expectation. This can fail even for some common distributions. (This fact is ignored, e.g., in [9].) We discovered that this leads to a more ad- vanced paradox. We have found out that Nalebuff [8] pro- posed the same example, and similar ones can be found in [2]. However, Nalebuff only noticed that both players might be convinced that switching brings gain to them and that the above arguments are not applicable if the expecta- tion does not exist. It seems that no detailed analysis was published since, and this is what we do here. Let us consider the situation from Section 3 if q > 12 . Then the expectation EX does not exist. (The respective sum of a geometric series with quotient 2q > 1 is +∞.) For q = 0.75, the joint distribution of U and A is shown in Fig. 4. Figure 5: Conditional expectation of gain given A = a for q = 0.75 (orange circles). It is a mixture of the cases U = 0 (blue dots) and U = 1 (yellow dots). Figure 4: Joint distribution of U and A for q = 0.75. Values a of A are on the horizontal axis; blue dots denote P(U = 0, A = a), orange circles P(U = 1, A = a). The expectation of the gain, EG, does not exist because Figure 6: The contributions of conditional expectations of the sum (2) is not absolutely convergent; it is a difference gain given A = a to the unconditional expected gain for of two infinite sums in (3). q = 0.75. Still formula (6) for the conditional expectation of the gain is valid, probabilistic analysis. The only thing which does not work 2s−1 · 2q − 1 if s ∈ {1, 2, . . .} , as in Section 3 is formula (7) for unconditional gain; the E(G|A = 2s ) = q+1 sum is not absolutely convergent. However, the uncondi- 1 if s = 0 . tional gain is not needed for decision if the conditional one is always positive. How can we now defend Argument 1? Thus E(G|A = 2s ) > 0 for all s ∈ {1, 2, . . .}. For q = 0.75, First of all, we refuse the possibility (considered in [2, see Fig. 5. (Notice that formula (5) for the conditional 4, 6]) that players A and B will change the envelopes there probability of gain P(U = 0|A = 2s ) for s 6= 0 still holds and back forever. After one exchange and looking inside, and gives a constant value from the interval ( 13 , 12 ).) Player they would know the contents of both envelopes and de- A has a strong argument for switching the envelopes, in- cide deterministically with full information. One of the dependently of the amount in his envelope. (Thus he may players has the larger amount (and he knows that), so he “rationally” decide for switching without looking inside would not agree to switch again. the envelope.) Such distributions are called paradoxical If the players know only the amount in the envelope they in [2]. received first, they have different information, and this is The same argument applies to player B. Although he the key difference. holds a different amount in his envelope, he also prefers switching. We have again a paradox, now supported by a Example 1. Suppose that A has 4 and B has 8. Then The Role of Information in the Two Envelope Problem 117 A knows that B can have 2 or 8, while B knows that A 5 Exchange Paradox: Third Level can have 4 or 16. Using Argument 2, A might expect that switching brings him a mean gain of 1. Using the model Another modification of the example of Section 3 with q = described in this section (formula (5) remains valid), he 1/2 is also of particular interest. The joint distribution of knows that his chance of gain is lower, U and A is shown in Fig. 7. Formula (5) for the conditional q 1 1 ∈ , . q+1 3 2 . For q = 0.75, this chance is 3/7 = 0.43, still sufficient to give a positive conditional mean gain of 8q + 2 4 −4 = . q+1 7 Player B may apply the same arguments, leading to twice higher estimates of his gain. Example 2. Suppose now that A has 4 and B has 2. Then A knows that B can have 2 or 8, while B knows that A can have 1 or 4. From the point of view of player A, the situation is the same as in Ex. 1. Player B may apply the same arguments, leading to twice lower estimates of his gain, still supporting the decision to switch. Figure 7: Joint distribution of U and A for q = 0.5. Values a of A are on the horizontal axis; blue dots denote P(U = In Exs. 1 and 2, we saw that probabilistic analysis sug- 0, A = a), orange circles P(U = 1, A = a). gests switching to both players. This apparently brings a gain to only one of them, but their arguments overestimate probability of gain gives P(U = 0|A = 2s ) = 1/3 if s 6= 0. their chances. This explains why they may have contra- The loss is twice more probable but twice smaller. Thus dictory views on the effect of the switching of envelopes the conditional expectation of the gain simplifies to (both thinking that the other envelope is “better”). ( To understand this paradox better, imagine the reverse s 0 if s ∈ {1, 2, . . .} , game: Suppose that the players see the contents of the E(G|A = 2 ) = 1 if s = 0 , other player’s envelope (and not of their own). Then the same reasoning (based on the information received) would see Fig. 8 for the conditional expectations and Fig. 9 for support keeping the envelopes (and no switching). This their contributions to the unconditional expectation EG. shows that it is the different incomplete information which supports their paradoxical behavior. (The role of incom- plete information and other arrangements of the experi- ment are discussed in [11] for the “first level” of the para- dox.) This situation is not so counterintuitive. Imagine for instance a poker game where two players hold a poker in their hands. They both evaluate their chances of win- ning as very high, although it is clear that only one is in the winning position. In the reverse game, where they see the cards of the opponent (and not their own), each player would estimate the chances of his opponent as very high, and he would surrender. The two envelope problem in this setting possesses the same feature: the partial information given to players is overly optimistic. Thus looking inside the envelopes is not so helpful as it seems. Therefore, if rational players do not look inside any of the envelopes, the latter argument Figure 8: Conditional expectation of gain given A = a for makes their choice ambivalent, and they would accept Ar- q = 0.5 (orange circles). It is a mixture of the cases U = 0 gument 1. Even if they look in the envelopes, they will not (blue dots) and U = 1 (yellow dots). accept Argument 2, knowing (from the above analysis of the model) that its prediction is biased and too optimistic. It seems that player A (as well as B) may only gain by 118 M. Navara, J. Šindelář of q = 0.5, 0.75, where the expectation does not exist. The values are distributed approximately symmetrical with re- spect to zero, verifying that no envelope appears “better” and Argument 2 does not apply in practice, as predicted by the theoretical analysis in previous sections. Figure 9: The contributions of conditional expectations of gain given A = a to the unconditional expected gain for q = 0.5. switching (if he holds 1), in all other cases the risk of loss is compensated by the same expected gain. In the sum Figure 10: Histogram of average gains of 1000 samples in (7), only the first summand is nonzero, and it is positive. for q = 0.25. So the sum exists and evaluates to 1 1 > 0. 2 1−q The arguments from Section 4 are applicable. Moreover, switching is supported by a computation which results in a positive unconditional gain (of a player not looking in his envelope). However, this argument is wrong. As in Section 4, the expectation of the gain, EG, does not ex- ist because the sum (2), as a difference of two infinite sums in (3), is not absolutely convergent. Formula (7) uses only one possible arrangement of the summands, leading to an invalid conclusion. If player B applies the same argu- ment, he uses another arrangement of the summands and gets a positive expected gain for himself, loss for A. Thus the sum (2) does not exist and the discussion from Sec- tion 4 fully applies, despite the seemingly trivial (wrong) sum (7). Figure 11: Histogram of average gains of 1000 samples 6 Simulations for q = 0.5. To verify the results, we also used computer simulations. We computed the average gain from 1000 samples and 7 Conclusions repeated this 5000 times. The results were displayed as histograms of the averages, see Figs. 10, 11, 12 for quo- We explained the two envelope paradox in its classical tients q = 0.25, 0.5, 0.75, respectively. For q = 0.75, the form, as well as in two advanced instances in which the linear scale could not be used for the horizontal axis. The players find rather convincing (and still insufficient) prob- semilogarithmic scale would not allow negative values. abilistic arguments for switching the envelopes. The latter Therefore, we used the 31st root as a compromise which is our novel contribution to the discussion of the paradox. combines non-linearity similar to the logarithm and possi- We confirmed the following conclusion bility of displaying negative values. As expected, the histograms show relatively frequent “a perfectly rational player would simply recog- occurrences of averages with high absolute value in cases nize that his subjective probabilities provide a The Role of Information in the Two Envelope Problem 119 [8] Nalebuff, B.: Puzzles: The Other Person’s Envelope is Al- ways Greener. Journal of Economic Perspectives 3 (1988): 171–81 doi:10.1257/jep.3.1.171 [9] Martinian, E.: The Two Envelope Problem. https: //web.archive.org/web/20071114230748/http: //www.mit.edu/~emin/writings/envelopes.html, 2017-06-03, archive version of an original from 2007-11- 14 [10] McDonnell, M. D., Abbott, D.: Randomized switching in the two-envelope problem. Proceedings of the Royal Soci- ety A 465 (2111): 3309–3322 doi:10.1098/rspa.2009.0312 [11] Priest, G., Restall, G.: Envelopes and Indifference, In: Di- alogues, Logics and Other Strange Things, College Publi- cations, 2007, 135–140 [12] Šindelář, J.: Two Envelopes Problem. Preprint of Bachelor Thesis, CTU, Prague, 2017-05-04 [13] Schwitzgebe, E., Dever, J.: The Two Envelope Paradox and Figure 12: Histogram of average gains of 1000 samples for Using Variables Within the Expectation Formula, Sorites q = 0.75. The values on the horizontal axis are mapped by (2008) 135–140 the 31st root. [14] Two envelopes problem. Wikipedia, 2017-05-10, https://en.wikipedia.org/wiki/Two_envelopes_ misleading account using Bayesian decision the- problem ory and would therefore ignore those results” [1] also in the case of nonexisting expectation, which was not considered in the cited source. This topic has consequences in psychology, but it is also important in economics because it explains behavior at a market which is seemingly well-motivated, but in fact wrong. Besides, this paradox can be a good example and motivation for the study of statistics and information the- ory. Acknowledgments. The first author was supported by the Ministry of Education of the Czech Republic under Project RVO13000. References [1] Bliss, E.: A Concise Resolution to the Two Envelope Para- dox. 2012, arXiv:1202.4669 [2] Broome, J.: The Two-envelope Paradox. Analysis 55 (1995): 6–11 doi:10.1093/analys/55.1.6 [3] Cover, T.M.: Pick the largest number. In: Cover, T, Gopinath, B., Open Problems in Communication and Com- putation. Springer-Verlag, 1987 [4] Falk, R.: The Unrelenting Exchange Paradox. Teach- ing Statistics 30 (2008), 86–88 doi:10.1111/j.1467- 9639.2008.00318.x [5] Gerville-Réache, L.: Why do we change whatever amount we found in the first envelope: the Wikipedia "two en- velopes problem" commented. University of Bordeaux - IMB UMR 52-51, 2014, arXiv:1402.3311 [6] Green, L.: The Two Envelope Paradox. 2012, http://www.aplusclick.com/pdf/ LeslieGreenTwoEnvelopes.pdf, 2017-06-03 [7] Jackson, F., Menzies, P., Oppy, G.: The two envelope “paradox”, Analysis 54 (1994) 43-45.