Posted by: tesla September 28, 2006
Probability and Statistics
Some good links:
http://mathforum.org/probstat/probstat.html
http://www.mathpages.com/home/iprobabi.htm

Here is an interesting example:

-----------------------------------------------------------------
Optimizing Your Wife
-----------------------------------------------------------------

If a man can expect to meet exactly N eligible women in his life, what strategy should he use to maximize his chances of choosing the very best one? We typically make some idealized assumptions when solving this type of problem, namely, that he meets the women in random order of "goodness", that there is an unambiguous ordering of the women, and that the man is concerned ONLY to maximize his chances of choosing the VERY best of the N candidates, placing no value at all on choosing the second-best versus the worst.

Under these conditions the man's strategy can obviously never be to select a candidate that isn't the best he has met so far. Also, at each stage his only information is (1) how many women he has evaluated so far, and (2) whether the current woman is the best so far. At some stage, his strategy must be to select the current woman if she is the best so far, and clearly if he has reached this stage the optimum strategy can't subsequently be to pass on a "best so far" candidate. Thus, his optimum strategy for selecting the very best of a sequence of N candidates is to pass on the first j-1 candidates (where j is a number to be determined), and then select the next "best so far" that he encounters.

This strategy will result in choosing the very best woman if and only if there is one and only one "best so far" in the sequence from j to N. Of course, the probability that the overall best woman is the kth woman in the sequence is just 1/N. Also, the probability that the best of the first k-1 women is one of the first j-1 women is (j-1)/(k-1); this is exactly the condition under which no candidate in positions j through k-1 is selected before the kth woman is reached.
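The pass-then-commit strategy described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names `choose` and `success_rate`, the use of distinct integer scores (higher is better), and the seeded random shuffles are not from the original article.

```python
import random


def choose(scores, j):
    """Pass on the first j-1 candidates, then take the next 'best so far'.

    scores: distinct numbers in meeting order, higher is better.
    Returns the 0-based index of the chosen candidate, or None if no
    'best so far' shows up from position j onward.
    """
    # Best candidate among the j-1 who were passed over (none if j == 1).
    best_seen = max(scores[:j - 1], default=float("-inf"))
    for i in range(j - 1, len(scores)):
        # The first score beating best_seen is necessarily a 'best so far'.
        if scores[i] > best_seen:
            return i
    return None


def success_rate(n, j, trials=20000, seed=0):
    """Estimate how often the strategy picks the overall best of n."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    scores = list(range(n))    # n - 1 is the very best candidate
    wins = 0
    for _ in range(trials):
        rng.shuffle(scores)
        picked = choose(scores, j)
        if picked is not None and scores[picked] == n - 1:
            wins += 1
    return wins / trials
```

For example, `success_rate(10, 4)` estimates the chance of ending up with the best of 10 candidates when the first 3 are passed over.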
Hence, the probability of selecting the optimum woman at the kth round, for k in the range j to N based on some particular value of j, equals the product

$$\frac{1}{N}\cdot\frac{j-1}{k-1}$$

So, for this value of j, the probability of selecting the optimum woman out of all N candidates based on this strategy (pass on the first j-1, then choose the next "best so far") is the sum of the above expression for k ranging from j to N. Thus we have

$$P\{N,j\} = \frac{j-1}{N}\sum_{k=j}^{N}\frac{1}{k-1}$$

If N is fairly large, the summation on the right is a large span of the simple harmonic series, i.e., the sum of the inverses of consecutive integers. Recall that the sum of 1/m for m=1 to s is asymptotically equal to ln(s) + gamma (where gamma ≈ 0.577 is Euler's constant). The summation is the difference of two such partial sums, up to N-1 and up to j-2, so the gamma terms cancel, and treating N-1 as N for large N it follows that the above probability approaches

$$P\{N,j\} \approx \frac{j-1}{N}\ln\left(\frac{N}{j-2}\right)$$

as N and j increase. To maximize this probability, we differentiate with respect to j and set the result to zero, which gives

$$\frac{j-1}{j-2} = \ln\left(\frac{N}{j-2}\right)$$

Thus, as j increases, the left side approaches 1, and we can take the exponential of both sides to give

$$e = \frac{N}{j-2}$$

Consequently, for a large number of candidates, the optimum strategy for maximizing the chances of selecting the very best woman from N sequential candidates is to pass on the first N/e women (approximately) and then select the next "best so far" that is encountered. With this strategy the probability of success approaches 1/e.

(Incidentally, it's been called to my attention that the above formula could, in certain circumstances, be used in a Bayesian way to estimate the number N of candidates that a man could have expected to encounter over his entire life, based on knowledge of having already met the very best woman.
The "Amanda Rule" states that if a man knows, by some means, that the kth woman he has encountered is actually the very best, then a Bayesian estimate for the number N of women he would have expected to meet overall is roughly e*k. Of course, if he has already determined that the kth woman is the very best, he presumably has no interest in the remaining k(e-1) candidates, so the formula is of only academic interest.)

The above is the traditional answer, and it gives a surprisingly good probability of finding the single best woman from N candidates even as N increases without limit. However, as noted previously, this solution essentially treats the second best woman as no more suitable than the least suitable. In other words, the strategy is entirely focused on maximizing the probability of selecting the very best candidate, without distinguishing between the utilities of any other outcomes.

A more pragmatic criterion for choosing a strategy might be to maximize the expected "goodness" of the selection based on some weighting of the outcomes. This certainly affects the answer, as can be seen in the case of N=5 with a linear weighting where 0 is the worst and 4 is the best. The results for a strategy of "passing" on the first k candidates and then choosing the next "best so far" (or the last candidate if necessary) are as shown below. (Here k counts the candidates passed over, so k = j-1 in the earlier notation.)

        probability     expected
        of selecting    goodness of
        the best        the selected
   k    candidate       candidate
 ----   ------------    ---------------
   0      24/120        240/120 = 2.000
   1      50/120        348/120 = 2.900
   2      52/120        336/120 = 2.800
   3      42/120        294/120 = 2.450
   4      24/120        240/120 = 2.000

This shows that to maximize your chances of selecting the very best candidate you should "pass" on the first two, whereas to maximize the expected goodness of the selected candidate you should only "pass" on the first one.

By the way, the numerators of the exact probabilities of selecting the best of N candidates using a strategy of k "passes" are as shown below, based on denominators of N!.

  N    k = 0      1      2      3      4      5      6
 ---   -----------------------------------------------
  3        2      3      2
  4        6     11     10      6
  5       24     50     52     42     24
  6      120    274    308    282    216    120
  7      720   1764   2088   2052   1776   1320    720

Is there a simple way of generating more rows of this table (aside from just counting permutations)? Do these coefficients have any other applications?
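On generating more rows: one possibility, offered as a sketch rather than a definitive answer, is to multiply the sum formula derived earlier by N!. With k = j-1 passes, that gives a numerator of k * (N-1)! * (1/k + 1/(k+1) + ... + 1/(N-1)) for k >= 1, and (N-1)! for k = 0. The Python below computes rows this way and cross-checks them by counting permutations, which also yields the expected goodness totals. The function names and the 0..N-1 goodness scale (taken from the N=5 example) are my own assumptions.

```python
from itertools import permutations
from math import factorial


def numerator_by_formula(n, k):
    """N! times the chance of picking the best of n after k passes."""
    if k == 0:
        # Taking the first candidate succeeds with probability 1/n.
        return factorial(n - 1)
    # (n-1)!/i is an integer for every i <= n-1, so this stays exact.
    return k * sum(factorial(n - 1) // i for i in range(k, n))


def play(order, k):
    """Goodness of the candidate chosen after passing on the first k."""
    best_seen = max(order[:k], default=-1)
    for goodness in order[k:]:
        if goodness > best_seen:
            return goodness
    return order[-1]  # no 'best so far' appeared: settle for the last one


def brute_force(n, k):
    """Over all n! orders: (times the best was chosen, total goodness)."""
    wins = total = 0
    for order in permutations(range(n)):
        goodness = play(order, k)
        wins += goodness == n - 1
        total += goodness
    return wins, total


def row(n):
    """One row of the table of numerators, for k = 0 .. n-1."""
    return [numerator_by_formula(n, k) for k in range(n)]
```

Calling `row(n)` for n = 3, 4, 5, ... produces successive rows, and `brute_force(5, k)` gives the two numerators over 120 for the N=5 case. The brute-force count grows as N!, so it is only practical as a check for small N, while the formula stays cheap.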