Virtual Laboratories > Bernoulli Trials > 1 2 3 4 [5] 6 7
Suppose again that our random experiment is to perform Bernoulli trials I1, I2, ... In this section we will study the random variable Yk that gives the trial number of the k'th success. Note that Y1 is the number of trials needed to get the first success, which we now know has the geometric distribution. Recall also that Xn, the number of successes in the first n trials, has the binomial distribution with parameters n and p.
1. Show
that Yk = n if and only if In = 1 and Xn-1
= k - 1.
2. Use
Exercise 1, independence, and the binomial distribution to
show that
P(Yk = n) = C(n - 1, k - 1)pk(1 - p)n - k for n = k, k + 1, k + 2, ...
The distribution defined by the density function in Exercise 2 is known as the negative binomial distribution; it has two parameters, the number of successes k and the success probability p.
3. In the
negative binomial experiment, vary k and p with the scroll bars and note the
shape of the density function. Now set k = 2 and p = 0.4 and run the
experiment with an update frequency of 10. Watch the apparent convergence of the relative
frequency function to the density function.
4. Show that the
binomial and negative binomial sequences are inverse to each other in the sense that
Xn
k if and only
if Yk
n
Thus, argue that any event that can be expressed in terms of the negative binomial variables can also be expressed in terms of the binomial variables.
5. Show that
P(Yk = n) > P(Yk = n - 1) if and only if n < (k - 1 + p) / p.
Thus, the density function at first increases and then decreases, reaching its largest value at the greatest integer in (k - 1 + p) / p. This integer is the mode of the distribution and hence the negative binomial distribution is unimodal.
6. A fair die is
rolled until 3 aces occur. Find the probability that at least 15 rolls are required.
We will define random variables that give the number of trials between successive successes:
Z1 = Y1 and Zk = Yk - Yk-1 for k = 2, 3, ...
7. Argue that
these variables are independent and each has the geometric distribution with parameter p.
Moreover,
Yk = Z1 + Z2 + ··· + Zk.
The mean, variance and probability generating function of Yk now follow easily from results for the geometric distribution.
8. Show that
E(Yk) = k / p.
9. Show that var(Yk)
= k(1 - p) / p2.
10. Show that E(tYk)
= [pt / (1 - t + tp)]k for |t|
< 1 / (1 - p).
11. Suppose that U
and V are independent random variables for an experiment, and that U has the
negative binomial distribution with parameters j and p and V has the
negative binomial distribution with parameters k and p. Show that U + V
has the negative binomial distribution with parameters j + k and p.
12. In the
negative binomial experiment, vary k and p with the scroll bars and note the
location and size of the mean/standard deviation bar. Now set k = 3 and p =
0.25 and run the experiment with an update frequency of 10. Watch the apparent convergence
of the sample mean and standard deviation to the distribution mean and standard deviation.
13. A certain type
of missile has failure probability 0.02. Find the mean and standard deviation of the
launch number of the fourth failure.
14. In the
negative binomial experiment, start with p = 0.5 and k =
1. Successively increase k by 1, noting the shape of the density function each
time. Repeat with p = 0.3 and p = 0.8.
Even though you are limited to k = 5, you can still see the characteristic bell shape. This is a consequence of the central limit theorem because the negative binomial variable can be written as a sum of k independent, identically distributed (geometric) random variables.
15. Show that the
distribution of the standardized variable below converges to the standard normal distribution as k increases.
(Yk - k / p) / [k(1 - p) / p]1/2 = (pYk - k) / [k(1 - p)]1/2.
16. In the
negative binomial experiment, set p = 0.5 and k = 5. Run
the experiment 1000 times with an update frequency of 100. Compute and compare each of the
following:
17. A coin is
tossed until the 50th head occurs.
Suppose that an absent-minded professor (is there any other kind?) has N matches in his right pocket and N matches in his left pocket. When he needs a match to light his pipe, he is equally likely to choose a match from either pocket. We want to compute the density function of the random variable W that gives the number of matches remaining when the professor first discovers that one of the pockets is empty. This is known as the Banach match problem, named for the mathematician Stefan Banach, who evidently behaved in the manner described.
We can recast the problem in terms of the negative binomial distribution. Clearly the match choices form a sequence of Bernoulli trials with parameter p = 1/2. Specifically, we can consider a match from the right pocket as a win for player R, and a match from the left pocket as a win for player L. In a hypothetical infinite sequence of trials, let Y denote the number of trials necessary for R to win N + 1 trials, and Z the number of trials necessary for L to win N + 1 trials. Note that Y and Z each have the negative binomial distribution with parameters N + 1 and p.
18. For k = 0, 1, ..., N, show that
19. For k = 0, 1, ..., N, show that
20. Combine the results of the previous two exercises to conclude that
P(W = k) = C(2N - k, N) (1/2)2N - k for k = 0, 1, ..., N.
We can also solve the non-symmetric Banach match problem, using the same methods as above. Thus, suppose that the professor reaches for a match in his right pocket with probability p and in his left pocket with probability 1 - p, where 0 < p < 1..The essential change in the analysis is that Y has the negative binomial distribution with parameters N + 1 and p, while Z has the negative binomial distribution with parameters N + 1 and 1 - p.
P(W = k) = C(2N - k, N)[pN + 1 (1 - p)N - k + (1 - p)N pN - k] for k = 0, 1, ..., N.
Suppose that two teams (or individuals) A and B play a sequence of Bernoulli trials, where p is the probability that player A wins a trial. For nonnegative integers n and m, let Fn,m(p) denote the probability that A wins n points before B wins m points. Computing Fn,m(p) is an historically famous problem, known as the problem of points, that was solved by Pierre de Fermat and by Blaise Pascal.
22. Comment on the
validity of the Bernoulli trial assumptions (independence of trials and constant
probability of success) for games of sport that have a skill component as well as
a random component.
There is an easy solution to the problem of points using the binomial distribution;
this was essentially Pascal's solution. Let us pretend that n + m - 1
trials are played, regardless of the outcomes, and let
23. Show that A
wins n trials before B wins m trials if and only if
Xn + m - 1
n.
24. Use the result
of the previous exercise to show that
Fn,m(p) =
k
= n, ..., n + m -1 C(n + m -
1, k) pk(1 - p)n + m - 1 - k.
25. In the
problem of
points experiment, vary the parameters n, m, and p,
and note how the probability changes. Now with n = 10, m = 5, and p
= 0.5, run the problem 1000 times with an update frequency of 10. Note the apparent
convergence of the relative frequency to the probability.
There is also an easy solution to the problem of points using the negative binomial distribution In a sense, this has to be the case, given the equivalence between the binomial and negative binomial processes. First, let us pretend that the games go on forever, regardless of the outcomes, and let Yn denote the number of games needed for A to win n games. By definition, Yn has the negative binomial distribution with parameters n and p.
26. Show that A
wins n trials before B wins m trials if and only if
Yn
n + m -1
27. Use the result
of the previous exercise to show that
Fn,m(p) =
j
= n, ..., n + m - 1 C(j - 1, n
- 1) pn(1 - p)j - n.
28. In the
problem of points experiments, vary the parameters j, k, and p,
and note how the probability changes. Now with n = 10, m = 10, and p
= 0.7, run the problem 1000 times with an update frequency of 10. Note the apparent
convergence of the relative frequency to the probability.
29. Show that for
fixed n and m, Fn,m(p) increases from
0 to 1 as p increases from 0 to 1.
30. In the
problem of points experiment, vary the parameters n, m, and p,
and note how the probability changes. Now with n = 5, m = 10, and p
= 0.3, run the problem 1000 times with an update frequency of 10. Note the apparent
convergence of the relative frequency to the probability.
31. Show that Fn,m(p)
decreases as n increases for fixed m and p, and that Fn,m(p)
increases as m increases for fixed n and p.
32. In the problem of
points experiments, vary the parameters n, m, and p,
and note how the probability changes. Now with n = 10, m = 15, and p
= 0.3, run the problem 1000 times with an update frequency of 10. Note the apparent
convergence of the relative frequency to the probability.
33. Condition on
the outcome of the first trial to derive the following recurrence relation and boundary
conditions (this was essentially Fermat's solution):
The special case n = m is important, because Fn,n(p) is the probability that A wins a best n of 2n - 1 series. Such series, especially when n = 2, 3, or 4, are frequently used in championship tournaments.
34. Suppose
that p = 0.6. Explicitly find the probability that team A wins
35. In
the problem of
points experiments, vary the parameters n, m and p
(keeping n = m), and note how the probability changes. Now simulate a
best 3 of 5 series by selecting n = m = 3, p = 0.6. Run the experiment 1000
times, updating every 10 runs. Note the apparent convergence of the relative frequency to
the true probability.
36. Show that Fn,n(1
- p) = 1 - Fn,n(p) for any n and p.
37. In
the problem of
points experiments, vary the parameters n, m and p
(keeping n = m), and note how the probability changes. Now simulate a
best 4 of 7 series by selecting n = m = 4, p = 0.45. Run the experiment 1000
times, updating every 10 runs. Note the apparent convergence of the relative frequency to
the true probability.
38. Suppose
that if n > m. Show that Fn,n(p) > Fm,m(p)
if and only if and p > 1/2. Interpret the result.
The problem of points originated from a question posed by Chevalier de Mere, who was interested in the fair division of stakes when a game is interrupted. Specifically, suppose that players A and B each put up C monetary units, and then play Bernoulli trials until one of them wins a specified number of trials. The winner then takes the entire 2C fortune.
39. If the game is
interrupted when A needs to win n more trials and B
needs to win m
more trials, argue that the fortune should be divided between A and B,
respectively, as follows:
40. Suppose that
players A and B bet $50 each. The players toss a fair coin
until one of them has 10 wins; the winner takes the entire fortune. Suppose that the game
is interrupted by the gambling police when A has 5 wins and B has 3 wins. How
should the stakes be divided?