Some good links:
http://mathforum.org/probstat/probstat.html
http://www.mathpages.com/home/iprobabi.htm
Here is an interesting example:
-----------------------------------------------------------------
Optimizing Your Wife -----------------------------------------------------------------
If a man can expect to meet exactly N eligible women in his life,
what strategy should he use to maximize his chances of choosing
the very best one? We typically make some idealized assumptions
when solving this type of problem, namely, that he meets the women
in random order of "goodness", that there is an unambiguous
ordering of the women, and that the man is concerned ONLY to
maximize his chances of choosing the VERY best of the N candidates,
placing no value at all on choosing the second-best versus the
worst.
Under these conditions the man's strategy can obviously never be
to select a candidate that isn't the best he has met so far. Also,
at each stage his only information is (1) how many women he has
evaluated so far, and (2) whether the current women is the best
so far. At some stage, his strategy must be to select the current
woman if she is the best so far, and clearly if he has reached this
stage the optimum strategy can't subsequently be to pass on a
"best so far" candidate. Thus, his optimum strategy for selecting
the very best of a sequence of N candidates is to pass on the first
j-1 candidates (where j is a number to be determined), and then
select the next "best so far" that he encounters. This strategy
will result in choosing the very best woman if and only if there
is one and only one "best so far" in the sequence from j to N.
Of course, the probability that the overall best woman is the kth
woman in the sequence is just 1/N. Also, the probability that that
best woman of the first k-1 women would be one of the first j-1
women is (j-1)/(k-1). Hence, the probability of selecting the
optimum woman at the kth round, for k in the range j to N based on
some particular value of j, equals the product
/ 1 \ / j-1 P{N,j} = ( --- )( ----- )
\ N / \ k-1 /
So, for this value of j, the probability of selecting the optimum
woman out of all N candidates based on this strategy (pass on the
first j-1, then choose the next "best so far") is the sum of the
above expression for k ranging from j to N. Thus we have
/j-1\ N / 1 P{N,j} = ( --- ) SUM ( ----- )
\ N / k=j \ k-1 /
If N is fairly large, the summation on the right is a large span of
the simple harmonic series, i.e., sum of the inverses of consecutive
integers. Recalling that the sum of 1/m for m=1 to s is asymptotically
equal to ln(s) + gamma (where gamma=0.57 is Euler constant) it follows
that the above probability approaches
/j-1\ / N P{N,j} = ( --- ) ln( ----- )
\ N / \ j-2 /
as N and j increase. To maximize this probability, we differentiate
with respect to j and set the result to zero, which gives
j-1 / N --- = ln( ----- )
j-2 \ j-2 /
Thus, as j increases, the left side approaches 1, and we can take
the exponential of both sides to give
N
e = ---
j-2
Consequently, for a large number of candidates, the optimum strategy
for maximizing the chances of selecting the very best woman from N
sequential candidates is to pass on the first N/e women (approximately)
and then select the next "best so far" that is encountered. With
this strategy the probability of success approaches 1/e.
(Incidentally, it's been called to my attention that the above formula
could, in certain circumstances, be used in a Bayesian way to estimate
the number N of candidates that a man could have expected to encounter
over his entire life, based on knowledge of having already met the
very best woman. The "Amanda Rule" states that if a man knows, by some
means, that the kth woman he has encountered is actually the very
best, then a Bayesian estimate for the number N of women he would have
expected to meet overall is roughly e*k. Of course, if he has already
determined that the kth woman is the very best, he presumably has no
interest in the remaining k(e-1) candidates, so the formula is of
only academic interest.)
The above is the traditional answer, and it gives a surprisingly good
probability of finding the single best woman from N candidates even
as N increases without limit. However, as noted previously, this
solution essentially treats the second best woman as no more suitable
than the least suitable. In other words, the strategy is entirely
focused on maximizing the probability of selecting the very best
candidate, without distinguishing between the utilities of any other
outcomes. A more pragmatic criterion for choosing a strategy might
be to maximize the expected "goodness" of the selection based on
some weighting of the outcomes. This certainly affects the answer,
as can be seen in the case of N=5 with a linear weighting where 0
is the worst and 4 is the best. The results for a strategy of
"passing" on the first k candidates and then choosing the next
"best so far" (or the last candidate if necessary) are as shown
below
probability expected
of selecting goodness of
the best the selected
k candidate candidate
---- ----------- --------------
0 24/120 240/120 = 2.000
1 50/120 348/120 = 2.900
2 52/120 336/120 = 2.800
3 41/120 297/120 = 2.475
4 24/120 240/120 = 2.000
This shows that to maximize your chances of selecting the very
best candidate you should "pass" on the first two, whereas to
maximize the expected goodness of the selected candidate you
should only "pass" on the first one.
By the way, the numerators of the exact probabilities of selecting
the best of N candidates using a strategy of k "passes" are as
shown below, based on denominators of N!.
N k
--- --------------------------------
3 2 3 2
4 6 11 10 6
5 24 50 52 41 24
6 120 274 308 271 206 120
7 720 1764 2088 1950 1640 1237 720
Is there is simple way of generating more rows of this table (aside
from just counting permutations)? Do these coefficients have any
other applications?