5. Conditional Probability


Definition

As usual, suppose that we have a random experiment with sample space S and probability measure P. Suppose also that we know that an event B has occurred. In general, this information should clearly alter the probabilities that we assign to other events. In particular, if A is another event, then A occurs if and only if A and B both occur; effectively, the sample space has been reduced to B. Thus, the probability of A, given that we know B has occurred, should be proportional to P(A intersect B). However, conditional probability, given that B has occurred, should still be a probability measure; that is, it must satisfy the axioms of probability. In particular, the conditional probability of the whole sample space S must be 1, and S intersect B = B, which forces the proportionality constant to be 1 / P(B). Thus, we are led inexorably to the following definition:

Let A and B be events in a random experiment with P(B) > 0. The conditional probability of A given B is defined to be

P(A | B) = P(A intersect B) / P(B).

This argument was based on the axiomatic definition of probability. Let’s explore the idea of conditional probability from the less formal and more intuitive notion of relative frequency. Thus, suppose that we replicate the experiment repeatedly. For an arbitrary event C, let Nn(C) denote the number of times C occurs in the first n runs.

If Nn(B) is large, the conditional probability that A has occurred given that B has occurred should be close to the conditional relative frequency of A given B, namely the relative frequency of A for the runs on which B occurred:

Nn(A intersect B) / Nn(B).

But by another application of the relative frequency idea,

Nn(A intersect B) / Nn(B) = [Nn(A intersect B) / n] / [Nn(B) / n] converges to P(A intersect B) / P(B) as n converges to infinity,

so again we are led to the same definition.

In some cases, conditional probabilities can be computed directly, by effectively reducing the sample space to the given event. In other cases, the formula above is better.
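
The relative frequency argument can be checked numerically. The following is a minimal sketch, assuming the experiment of rolling two fair dice with A = {the sum is 8} and B = {the first score is even}; these events, the function names, and the seed are illustrative choices and do not come from the text. The exact value is found by reducing the sample space to B, and the estimate is the conditional relative frequency Nn(A intersect B) / Nn(B).

```python
import random
from fractions import Fraction
from itertools import product

# Sample space for two fair dice: all 36 ordered pairs, equally likely.
S = list(product(range(1, 7), repeat=2))

def in_A(s):
    """Illustrative event A: the sum of the scores is 8."""
    return s[0] + s[1] == 8

def in_B(s):
    """Illustrative event B: the first score is even."""
    return s[0] % 2 == 0

def exact_conditional(in_a, in_b, space):
    """P(A | B) for equally likely outcomes, by counting inside B."""
    n_b = sum(1 for s in space if in_b(s))
    n_ab = sum(1 for s in space if in_a(s) and in_b(s))
    return Fraction(n_ab, n_b)

def empirical_conditional(in_a, in_b, n, rng):
    """Conditional relative frequency of A given B over n simulated runs."""
    n_b = n_ab = 0
    for _ in range(n):
        s = (rng.randint(1, 6), rng.randint(1, 6))
        if in_b(s):
            n_b += 1
            if in_a(s):
                n_ab += 1
    return n_ab / n_b if n_b else float("nan")

rng = random.Random(0)  # fixed seed so that runs are repeatable
exact = exact_conditional(in_A, in_B, S)
estimate = empirical_conditional(in_A, in_B, 100_000, rng)
print(f"exact P(A | B) = {exact}, empirical = {estimate:.4f}")
```

With a large number of runs the estimate should be close to the exact value, although any single run will differ slightly.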

Properties

Mathematical Exercise 1. Show that as a function of A, for fixed B, P(A | B) is a probability measure.

Exercise 1 is the most important property of conditional probability because it means that any result that holds for probability measures in general holds for conditional probability, as long as the conditioning event remains fixed.

Mathematical Exercise 2. Suppose that A and B are events in a random experiment with P(B) > 0. Prove each of the following:

  1. If B is a subset of A then P(A | B) = 1.
  2. If A is a subset of B then P(A | B) = P(A) / P(B).
  3. If A and B are disjoint then P(A | B) = 0.

Mathematical Exercise 3. Suppose that A and B are events in a random experiment, each having positive probability. Show that

  1. P(A | B) > P(A) if and only if P(B | A) > P(B) if and only if P(A intersect B) > P(A)P(B)
  2. P(A | B) < P(A) if and only if P(B | A) < P(B) if and only if P(A intersect B) < P(A)P(B)
  3. P(A | B) = P(A) if and only if P(B | A) = P(B) if and only if P(A intersect B) = P(A)P(B)

In case 1, A and B are said to be positively correlated. Intuitively, the occurrence of either event means that the other event is more likely. In case 2, A and B are said to be negatively correlated. Intuitively, the occurrence of either event means that the other event is less likely. In case 3, A and B are said to be independent. Intuitively, the occurrence of either event does not change the probability of the other event.

Sometimes conditional probabilities are known and can be used to find the probabilities of other events.

Mathematical Exercise 4. Suppose that A1, A2, ..., An are events in a random experiment whose intersection has positive probability. Prove the multiplication rule of probability:

P(A1 intersect A2 intersect ··· intersect An) = P(A1)P(A2 | A1)P(A3 | A1 intersect A2) ··· P(An | A1 intersect A2 intersect ··· intersect An-1).

The multiplication rule is particularly useful for experiments that consist of dependent stages, where Ai is an event in stage i. Compare the multiplication rule of probability with the multiplication rule of combinatorics.
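
As a small illustration of the comparison between the two multiplication rules, the sketch below assumes a hypothetical urn with 3 red and 2 green balls, from which 3 balls are drawn without replacement. The urn and the ball labels are not from the text. The probability that all three draws are red is computed once with the multiplication rule of probability and once by direct counting of equally likely ordered draws.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical urn: 3 red and 2 green balls, drawn without replacement.
urn = ["R1", "R2", "R3", "G1", "G2"]

def is_red(ball):
    return ball.startswith("R")

# Multiplication rule: P(A1) P(A2 | A1) P(A3 | A1 intersect A2),
# where Ai is the event that draw i is red.
by_rule = Fraction(3, 5) * Fraction(2, 4) * Fraction(1, 3)

# Direct count: all ordered draws of 3 distinct balls are equally likely.
draws = list(permutations(urn, 3))
favorable = [d for d in draws if all(is_red(b) for b in d)]
by_counting = Fraction(len(favorable), len(draws))

print(by_rule, by_counting)
```

Each factor in the probability rule is a ratio of counts at one stage, which is why the product agrees with the count over the whole sequence.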

Exercises

Mathematical Exercise 5. Suppose that A and B are events in an experiment with P(A) = 1 / 3, P(B) = 1 / 4, P(A intersect B) = 1 / 10. Find each of the following:

  1. P(A | B)
  2. P(B | A)
  3. P(A^c | B)
  4. P(B^c | A)
  5. P(A^c | B^c)

Mathematical Exercise 6. Consider the experiment that consists of rolling 2 fair dice and recording the sequence of scores (X1, X2). Let Y denote the sum of the scores. For each of the following pairs of events, find the probability of each event and the conditional probability of each event given the other. Determine whether the events are positively correlated, negatively correlated, or independent.

  1. {X1 = 3}, {Y = 5}
  2. {X1 = 3}, {Y = 7}
  3. {X1 = 2}, {Y = 5}
  4. {X1 = 2}, {X1 = 3}

Correlation is not transitive. From the previous exercise, for example, note that {X1 = 3} and {Y = 5} are positively correlated, {Y = 5} and {X1 = 2} are positively correlated, but {X1 = 3} and {X1 = 2} are negatively correlated.
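
The three correlations just mentioned can be confirmed by enumeration. The sketch below classifies a pair of events by comparing P(A intersect B) with P(A)P(B), the third characterization in Exercise 3; the helper names prob and correlation are illustrative and are not part of the text.

```python
from fractions import Fraction
from itertools import product

# Two fair dice: 36 equally likely outcomes (x1, x2).
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

def correlation(a, b):
    """Classify two events by comparing P(A intersect B) with P(A)P(B)."""
    joint = prob(lambda s: a(s) and b(s))
    baseline = prob(a) * prob(b)
    if joint > baseline:
        return "positive"
    if joint < baseline:
        return "negative"
    return "independent"

def x1_is_3(s):
    return s[0] == 3

def x1_is_2(s):
    return s[0] == 2

def y_is_5(s):
    return s[0] + s[1] == 5

print(correlation(x1_is_3, y_is_5))   # positive
print(correlation(y_is_5, x1_is_2))   # positive
print(correlation(x1_is_3, x1_is_2))  # negative
```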

Simulation Exercise 7. In the dice experiment, set n = 2. Run the experiment 500 times. Compute the empirical conditional probabilities corresponding to the conditional probabilities in the last exercise.

Mathematical Exercise 8. Consider the card experiment that consists of dealing 2 cards from a standard deck and recording the sequence of cards dealt. For i = 1, 2, let Qi be the event that card i is a queen and Hi the event that card i is a heart. For each of the following pairs of events, compute the probability of each event, and the conditional probability of each event given the other. Determine whether the events are positively correlated, negatively correlated, or independent.

  1. Q1, H1.
  2. Q1, Q2.
  3. Q2, H2.
  4. Q1, H2.

Simulation Exercise 9. In the card experiment, set n = 2. Run the experiment 500 times. Compute the conditional relative frequencies corresponding to the conditional probabilities in the last exercise.

Mathematical Exercise 10. Consider the card experiment with n = 3 cards. Find the probability of the following events:

  1. All three cards are hearts.
  2. The first two cards are hearts and the third is a spade.
  3. The first and third cards are hearts and the second is a spade.

Simulation Exercise 11. In the card experiment, set n = 3 and run the simulation 1000 times. Compute the empirical probability of each event in the previous exercise and compare with the true probability.

Mathematical Exercise 12. In a certain population, 30% of the persons smoke and 8% have a certain type of heart disease. Moreover, 12% of the persons who smoke have the disease.

  1. What percentage of the population smoke and have the disease?
  2. What percentage of the population with the disease also smoke?
  3. Are smoking and the disease positively correlated, negatively correlated, or independent?

Mathematical Exercise 13. Suppose that A, B, and C are events in a random experiment with P(A | C) = 1 / 2, P(B | C) = 1 / 3, and P(A intersect B | C) = 1 / 4. Find each of the following:

  1. P(A intersect B^c | C)
  2. P(A union B | C)
  3. P(A^c intersect B^c | C).

Mathematical Exercise 14. Suppose that A and B are events in a random experiment with P(A) = 1 / 2, P(B) = 1 / 3, P(A | B) = 3 / 4. Find each of the following:

  1. P(A intersect B).
  2. P(A union B).
  3. P(B union A^c).
  4. P(B | A).

Data Analysis Exercise 15. For the M&M data set, find the empirical probability that a bag has at least 10 reds, given that the weight of the bag is at least 48 grams.

Data Analysis Exercise 16. For the Cicada data,

  1. Find the empirical probability that a cicada weighs at least 0.25 grams given that the cicada is male.
  2. Find the empirical probability that a cicada weighs at least 0.25 grams given that the cicada is the tredecula species.

Conditional Distributions

Again, suppose that we have an experiment with sample space S and probability measure P. Suppose that X is a random variable for the experiment that takes values in a set T. Recall that the probability distribution of X is the probability measure on T given by

P(X in B) for B a subset of T.

Analogously, if A is an event with positive probability, the conditional distribution of X given A is the probability measure on T given by

P(X in B | A) for B a subset of T.
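
When X takes only finitely many values and the outcomes are equally likely, the conditional distribution is determined by the values P(X = x | A), which can be found by counting inside A. The sketch below assumes the two-dice experiment, with X the first score and A the event that the sum is 4; this choice of X and A, and the function name conditional_distribution, are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Two fair dice: 36 equally likely outcomes (x1, x2).
S = list(product(range(1, 7), repeat=2))

def conditional_distribution(X, A, space):
    """Return {x: P(X = x | A)} for equally likely outcomes in space."""
    given = [s for s in space if A(s)]
    if not given:
        raise ValueError("conditioning event has probability 0")
    counts = Counter(X(s) for s in given)
    return {x: Fraction(c, len(given)) for x, c in sorted(counts.items())}

# Illustrative choice: X is the first score, A is the event that the sum is 4.
dist = conditional_distribution(lambda s: s[0], lambda s: s[0] + s[1] == 4, S)
print(dist)
```

The returned probabilities sum to 1, as they must for a probability measure on T.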

Mathematical Exercise 17. Consider the experiment that consists of rolling 2 fair dice and recording the sequence of scores (X1, X2). Let Y denote the sum of the scores. Find the conditional distribution of (X1, X2) given that Y = 7.

Mathematical Exercise 18. Suppose that the time X required to perform a certain job (in minutes) is uniformly distributed on the interval (15, 60).

  1. Find the probability that the job requires more than 30 minutes.
  2. Given that the job is not finished after 30 minutes, find the probability that the job will require more than 15 additional minutes.
  3. Find the conditional distribution of X given X > 30.

Mathematical Exercise 19. Recall that Buffon's coin experiment consists of tossing a coin with radius r ≤ 1/2 randomly on a floor covered with square tiles of side length 1. The coordinates (X, Y) of the center of the coin are recorded relative to axes through the center of the square, parallel to the sides.

  1. Find P(Y > 0 | X < Y)
  2. Find the conditional distribution of (X, Y) given that the coin does not touch the sides of the square.

Simulation Exercise 20. Run Buffon's coin experiment 500 times. Compute the empirical probability that Y > 0 given X < Y, and compare with the probability in the last exercise.

The Law of Total Probability and Bayes' Theorem

Suppose that {Aj: j in J} is a countable collection of events that partition the sample space S. Let B be another event and suppose that we know P(Aj) and P(B | Aj) for each j in J.

Mathematical Exercise 21. Prove the law of total probability:

P(B) = sum over j in J of P(Aj) P(B | Aj).

Mathematical Exercise 22. Prove Bayes' Theorem, named after Thomas Bayes: for k in J,

P(Ak | B) = P(Ak)P(B | Ak) / [sum over j in J of P(Aj) P(B | Aj)].

In the context of Bayes' theorem, P(Aj) is the prior probability of Aj and P(Aj | B) is the posterior probability of Aj. We will study more general versions of the law of total probability and Bayes' theorem in the chapter on Distributions.
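
Both results translate directly into a short computation. The following is a minimal sketch for a finite partition, using exact fractions; the function names and the numbers for the partition {A1, A2, A3} are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def total_probability(prior, likelihood):
    """P(B) as the sum over j of P(Aj) P(B | Aj)."""
    return sum(prior[j] * likelihood[j] for j in prior)

def posterior(prior, likelihood):
    """Bayes' theorem: {k: P(Ak | B)} from the priors and the P(B | Ak)."""
    p_b = total_probability(prior, likelihood)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return {k: prior[k] * likelihood[k] / p_b for k in prior}

# Hypothetical numbers for a partition {A1, A2, A3}.
prior = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
likelihood = {1: Fraction(1, 10), 2: Fraction(1, 5), 3: Fraction(2, 5)}

p_b = total_probability(prior, likelihood)
post = posterior(prior, likelihood)
print(p_b, post)
```

In this example A1 has the largest prior probability but A3 has the largest posterior probability, because B is much more likely under A3.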

Mathematical Exercise 23. In the die-coin experiment, a fair die is rolled and then a fair coin is tossed the number of times showing on the die.

  1. Find the probability that all the coins show heads.
  2. Given that all coins are heads, find the probability that the die score was i for each i = 1, 2, 3, 4, 5, 6.

Simulation Exercise 24. Run the die-coin experiment 200 times.

  1. Compute the empirical probability of the event that all coins are heads and compare with the probability in the previous exercise.
  2. For i = 1, 2, ..., 6, compute the empirical probability of the event that the die score was i given that all coins were heads. Compare with the probability in the previous exercise.

Mathematical Exercise 25. Suppose that a bag contains 12 coins: 5 are fair, 4 are biased with probability of heads 1/3, and 3 are two-headed. A coin is chosen at random from the bag and tossed.

  1. Find the probability that the coin is heads.
  2. Given that the coin is heads, find the conditional probability of each coin type.

Compare Exercises 23 and 25. In Exercise 23, we toss a coin with a fixed probability of heads a random number of times. In Exercise 25, we effectively toss a coin with a random probability of heads a fixed number of times.

Mathematical Exercise 26. In the coin-die experiment, a fair coin is tossed. If the coin lands tails, a fair die is rolled. If the coin lands heads, an ace-six flat die is tossed (1 and 6 have probability 1/4 each, while 2, 3, 4, 5 have probability 1/8 each).

  1. Find the probability that the die score is i, for i = 1, 2, ..., 6.
  2. Given that the die score is 4, find the conditional probability that the coin landed heads and the conditional probability that the coin landed tails.

Simulation Exercise 27. Run the coin-die experiment 500 times.

  1. Compute the empirical probability of the event that the die score is i, for each i, and compare with the probability in the previous exercise.
  2. Compute the empirical probability of the event that the coin landed heads, given that the die score is 4 and compare with the probability in the previous exercise.

Mathematical Exercise 28. A plant has 3 assembly lines that produce memory chips. Line 1 produces 50% of the chips and has a defective rate of 4%; line 2 produces 30% of the chips and has a defective rate of 5%; line 3 produces 20% of the chips and has a defective rate of 1%. A chip is chosen at random from the plant.

  1. Find the probability that the chip is defective.
  2. Given that the chip is defective, find the conditional probability for each line.

Mathematical Exercise 29. The most common form of colorblindness (dichromatism) is a sex-linked hereditary condition caused by a defect on the X chromosome. Thus, it is much more common in males than females; 7% of males are colorblind but only 0.5% of females are colorblind. (For more on sex-linked hereditary disorders, see the discussion of hemophilia.) In a certain population, 50% are male and 50% are female.

  1. Find the percentage of colorblind persons in the population.
  2. Find the percentage of colorblind persons that are male.

Mathematical Exercise 30. An urn initially contains 6 red and 4 green balls. A ball is chosen at random and then replaced along with 2 additional balls of the same color; the process is repeated. This is an example of Pólya's urn scheme, named after George Pólya.

  1. Find the probability that the first 2 balls are red and the third ball is green.
  2. Find the probability that the second ball is red.
  3. Find the probability that the first ball is red given that the second ball is red.

Mathematical Exercise 31. Urn 1 contains 4 red and 6 green balls while urn 2 contains 7 red and 3 green balls. An urn is chosen at random and then a ball is chosen from the selected urn.

  1. Find the probability that the ball is green.
  2. Given that the ball is green, find the conditional probability that urn 1 was selected.

Mathematical Exercise 32. Urn 1 contains 4 red and 6 green balls while urn 2 contains 6 red and 3 green balls. A ball is selected at random from urn 1 and transferred to urn 2. Then a ball is selected at random from urn 2.

  1. Find the probability that the ball from urn 2 is green.
  2. Given that the ball from urn 2 is green, find the conditional probability that the ball from urn 1 was green.

Diagnostic Testing

Suppose that we have a random experiment with an event A of interest. When we run the experiment, of course, event A will either occur or not occur. However, we are not able to observe the occurrence or non-occurrence of A directly. Instead we have a test designed to indicate the occurrence of event A; thus the test can be either positive for A or negative for A. The test also has an element of randomness, and in particular can be in error. Here are some typical examples of the type of situation we have in mind:

Let T be the event that the test is positive for the occurrence of A. The conditional probability P(T | A) is called the sensitivity of the test. The complementary probability

P(T^c | A) = 1 - P(T | A)

is the false negative probability. The conditional probability P(T^c | A^c) is called the specificity of the test. The complementary probability

P(T | A^c) = 1 - P(T^c | A^c)

is the false positive probability. In many cases, the sensitivity and specificity of the test are known, as a result of the development of the test. However, the user of the test is interested in the opposite conditional probabilities:

P(A | T), P(A^c | T^c).

Mathematical Exercise 33. Use Bayes' Theorem to show that

P(A | T) = P(T | A)P(A) / [P(T | A)P(A) + P(T | A^c)P(A^c)].

For a concrete example, suppose that the sensitivity of the test is 0.99 and the specificity of the test is 0.95. Superficially, the test looks good.

Mathematical Exercise 34. Find P(A | T) as a function of p = P(A). Show that the graph has the following shape:

[Figure: graph of P(A | T) as a function of p = P(A)]

Mathematical Exercise 35. Show that P(A | T) as a function of P(A) has the values given in the following table:

P(A)     P(A | T)
0.001    0.019
0.01     0.167
0.1      0.688
0.2      0.832
0.3      0.895

The small value of P(A | T) for small values of P(A) is striking. The moral, of course, is that P(A | T) depends critically on P(A), not just on the sensitivity and specificity of the test. Moreover, the correct comparison is P(A | T) with P(A), as in the table, not P(A | T) with P(T | A).
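
The values in the table can be reproduced from Bayes' theorem. The following is a minimal sketch using the sensitivity 0.99 and specificity 0.95 of the concrete example; the function name is an illustrative choice, and exact fractions are used so that floating-point error does not affect the third decimal place.

```python
from fractions import Fraction

def prob_event_given_positive(p, sensitivity, specificity):
    """P(A | T) from p = P(A), the sensitivity, and the specificity."""
    false_positive = 1 - specificity
    p_t = sensitivity * p + false_positive * (1 - p)  # law of total probability
    return sensitivity * p / p_t

SENSITIVITY = Fraction(99, 100)
SPECIFICITY = Fraction(95, 100)

table = {}
for p in ("0.001", "0.01", "0.1", "0.2", "0.3"):
    table[p] = prob_event_given_positive(Fraction(p), SENSITIVITY, SPECIFICITY)
    print(p, f"{float(table[p]):.3f}")
```

The printed values should agree with the table up to rounding in the third decimal place.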