
2. Sets and Events


Set theory is the foundation of probability, as it is for almost every branch of mathematics. In probability, set theory is used to provide a language for modeling and describing random experiments.

Sets and subsets

First, a set is simply a collection of objects; the objects are referred to as elements of the set. The statement that s is an element of set S is written s in S. (In this project, for notational convenience, we sometimes simply use the word in.)

If A and B are sets then A is a subset of B if every element of A is also an element of B:

A subset B if and only if s in A implies s in B.

By definition, a set is completely determined by its elements. Thus sets A and B are equal if they have the same elements:

A = B if and only if A subset B and B subset A.
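As a small illustration of these two definitions, the following Python sketch uses the built-in set type. The helper name is_subset and the particular sets A and B are chosen here only for illustration.

```python
def is_subset(A, B):
    """Return True if every element of A is also an element of B."""
    return all(s in B for s in A)

A = {1, 2}
B = {1, 2, 3}

print(is_subset(A, B))   # is A a subset of B?
print(is_subset(B, A))   # is B a subset of A?

# Equality of sets is the same as inclusion in both directions.
C = {2, 1}
print((A == C) == (is_subset(A, C) and is_subset(C, A)))
```

Python's own operators A <= B and A == B express the same relations; the explicit loop is written out only to mirror the definition.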

In most applications of set theory, all sets under discussion are subsets of a certain universal set. By contrast, the empty set, denoted Ø, is the set with no elements.

Mathematical Exercise 1. Use the formal definition of implication to show that the empty set is a subset of any set A.

A set is said to be countable if it can be put into one-to-one correspondence with a subset of the integers. Thus, a countable set is either finite or an infinite set that can be "counted" with the integers. By contrast, the set of real numbers is not countable. As we will see, countable sets play a special role in probability. The term one-to-one correspondence is defined formally in the next section on Functions and Random Variables.

Sample Space and Events

The sample space of a random experiment is a set S that includes all possible outcomes of the experiment; the sample space plays the role of the universal set when modeling the experiment. For simple experiments, the sample space may be precisely the set of possible outcomes. More often, for complex experiments, the sample space is a mathematically convenient set that includes the possible outcomes and perhaps other elements as well. For example, if the experiment is to throw a standard die and record the outcome, the sample space is S = {1, 2, 3, 4, 5, 6}, the set of possible outcomes. On the other hand, if the experiment is to capture a cicada and measure its body weight (in milligrams), we might conveniently take the sample space to be S = [0, infinity), even though most elements of this set are practically impossible.

Certain subsets of the sample space of an experiment are referred to as events. Thus, an event is a set of outcomes of the experiment. Each time the experiment is run, a given event A either occurs, if the outcome of the experiment is an element of A, or does not occur, if the outcome of the experiment is not an element of A. Intuitively, you should think of an event as a meaningful statement about the experiment.

The sample space S itself is an event; by definition it always occurs. At the other extreme, the empty set Ø is also an event; by definition it never occurs. More generally, if A and B are events in the experiment and A is a subset of B, then the occurrence of A implies the occurrence of B.
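The idea that an event occurs exactly when the outcome belongs to it can be sketched in Python for the die experiment. The events A and B below, the helper occurs, and the fixed seed are illustrative assumptions rather than part of the text.

```python
import random

S = {1, 2, 3, 4, 5, 6}   # sample space for one throw of a standard die
A = {2, 4, 6}            # illustrative event: the score is even
B = {2, 3, 4, 5, 6}      # illustrative event: the score is at least 2 (A is a subset of B)

def occurs(event, outcome):
    """An event occurs exactly when the outcome is one of its elements."""
    return outcome in event

rng = random.Random(0)   # fixed seed only so that repeated runs agree
for _ in range(5):
    outcome = rng.choice(sorted(S))
    print(outcome, occurs(A, outcome), occurs(B, outcome))
```

In every run, S occurs, the empty set does not, and B occurs whenever A does, since A is a subset of B.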

Product sets

Usually, the outcome of a random experiment consists of one or more (perhaps infinitely many) real measurements, and thus, the sample space consists of all possible measurement sequences. Therefore, we need good notation for constructing sets of sequences.

Suppose first that we have n sets S1, S2, ..., Sn. The Cartesian product (named for René Descartes) of S1, S2, ..., Sn, denoted

S1 × S2 × ··· × Sn

is the set of all (ordered) sequences (s1, s2, ..., sn) where si is an element of Si for each i. Recall that two ordered sequences are the same if and only if their corresponding coordinates agree:

(s1, s2, ..., sn) = (t1, t2, ..., tn) if and only if si = ti for i = 1, 2, ..., n.

If we have n experiments with sample spaces S1, S2, ..., Sn, then S1 × S2 × ··· × Sn is the natural sample space for the compound experiment that consists of performing the n experiments in sequence. If Si = S for each i, then the product set can be written compactly as

S^n = S × S × ··· × S (n factors).

Thus if we have a basic experiment with sample space S, then S^n is the natural sample space for the compound experiment that consists of n replications of the basic experiment. In particular, R will denote the set of real numbers so that R^n is n-dimensional Euclidean space. In many cases, the sample space of a random experiment, and hence the events of the experiment, are subsets of R^n for some n.
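For finite sets, the product sets described above can be listed explicitly. The following is a minimal Python sketch using itertools.product from the standard library; the die and coin sets are illustrative choices.

```python
from itertools import product

S = [1, 2, 3, 4, 5, 6]                # sample space for one throw of a die
S2 = list(product(S, repeat=2))       # S x S: two throws performed in sequence
print(len(S2))                        # 36 ordered pairs
print((1, 2) in S2, (1, 2) == (2, 1)) # order matters: True False

coin = ["H", "T"]
compound = list(product(S, coin))     # a product of two different sets
print(len(compound))                  # 12
```

Each element is a tuple, and two tuples are equal only when all corresponding coordinates agree, which matches the definition of equality for ordered sequences.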

Next, suppose that we have an infinite collection of sets S1, S2, .... The Cartesian product of S1, S2, ..., denoted

S1 × S2 × ···

is the set of all (ordered) sequences (s1, s2, ...) where si is an element of Si for each i. Again, two ordered sequences are the same if and only if their corresponding coordinates agree. If we have an infinite sequence of experiments with sample spaces S1, S2, ..., then S1 × S2 × ··· is the natural sample space for the compound experiment that consists of performing the given experiments in sequence. In particular, the sample space for the compound experiment that consists of indefinite replications of a basic experiment is S × S × ···. This is an essential special case, because probability theory is based on the idea of replicating a given experiment.

Set Operations

We are now ready to review the basic operations of set theory. For a random experiment, these operations can be used to construct new events from given events. For the following definitions, suppose that A and B are subsets of the universal set, which we will denote by S.

The union of A and B is the set obtained by combining the elements of A and B.

A union B = {s in S: s in A or s in B}.

If A and B are events in an experiment with sample space S, then the union of A and B is the event that occurs if and only if A occurs or B occurs.

The intersection of A and B is the set of elements common to both A and B:

A intersect B = {s in S: s in A and s in B}.

If A and B are events in an experiment with sample space S, then the intersection of A and B is the event that occurs if and only if A occurs and B occurs. If the intersection of sets A and B is empty, then A and B are said to be disjoint:

A intersect B = Ø.

If A and B are disjoint events in an experiment, then they are mutually exclusive; they cannot both occur on the same run of the experiment.

The complement of A is the set of elements that are not in A and is denoted Ac:

Ac = {s in S: s not in A}.

If A is an event in an experiment with sample space S, then the complement of A is the event that occurs if and only if A does not occur.
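The three operations correspond directly to operators on Python's built-in sets. In this sketch the sample space is that of a single die, and the events A and B are illustrative choices.

```python
S = {1, 2, 3, 4, 5, 6}   # universal set (sample space of one die)
A = {2, 4, 6}            # illustrative event: the score is even
B = {4, 5, 6}            # illustrative event: the score is at least 4

union = A | B            # occurs if A occurs or B occurs
intersection = A & B     # occurs if A occurs and B occurs
complement_A = S - A     # occurs if A does not occur

print(sorted(union))
print(sorted(intersection))
print(sorted(complement_A))
print(A.isdisjoint(complement_A))   # an event and its complement are disjoint
```

Python has no built-in notion of a universal set, so the complement is computed as a difference from S.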

Simulation Exercise 2. The set operations are often illustrated with small, schematic sketches known as Venn diagrams, named for John Venn. In the Venn diagram applet, select each of the following and note the shaded area in the diagram.

  1. A
  2. B
  3. Ac
  4. Bc
  5. A union B
  6. A intersect B

Basic Rules

In the following problems, A, B, and C are subsets of a universal set S.

Mathematical Exercise 3. Show that A intersect B subset A subset A union B.

Mathematical Exercise 4. Prove the commutative laws:

  1. A union B = B union A
  2. A intersect B = B intersect A

Mathematical Exercise 5. Prove the associative laws:

  1. A union (B union C) = (A union B) union C
  2. A intersect (B intersect C) = (A intersect B) intersect C

Mathematical Exercise 6. Prove the distributive laws:

  1. A intersect (B union C) = (A intersect B) union (A intersect C)
  2. A union (B intersect C) = (A union B) intersect (A union C)

Mathematical Exercise 7. Prove De Morgan's laws (named after Augustus De Morgan):

  1. (A union B)c = Ac intersect Bc.
  2. (A intersect B)c = Ac union Bc.
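A numerical check is not a proof, but it can build confidence before attempting one. The following Python sketch tests the distributive laws and De Morgan's laws on every choice of subsets of a small universal set; the set S and the helper names are illustrative assumptions.

```python
from itertools import combinations

def all_subsets(S):
    """Return every subset of S as a frozenset."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

S = frozenset({1, 2, 3, 4})
subsets = all_subsets(S)

def complement(A):
    return S - A

de_morgan_holds = all(
    complement(A | B) == complement(A) & complement(B)
    and complement(A & B) == complement(A) | complement(B)
    for A in subsets for B in subsets
)
distributive_holds = all(
    A & (B | C) == (A & B) | (A & C)
    and A | (B & C) == (A | B) & (A | C)
    for A in subsets for B in subsets for C in subsets
)
print(de_morgan_holds, distributive_holds)   # True True
```

The check covers only this one finite S; the exercises ask for arguments that work for arbitrary sets.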

Mathematical Exercise 8. Show that B intersect Ac is the event that occurs if and only if B occurs, but A does not.

When A subset B, B intersect Ac is sometimes written B - A. Thus, S - A is the same as Ac.

Mathematical Exercise 9. Show that (A intersect Bc) union (B intersect Ac) is the event that occurs if and only if one, but not both, of the given events occurs. This event is called the symmetric difference and corresponds to exclusive or.

Mathematical Exercise 10. Show that (A intersect B) union (Ac intersect Bc) is the event that occurs if and only if both of the given events occur or neither occurs.

Mathematical Exercise 11. Prove that there are 16 different (in general) events that can be constructed from two given events A and B.

Simulation Exercise 12. In the Venn diagram applet, observe the diagram of each of the 16 events that can be constructed from A and B. Note in particular the diagram of the events in Exercises 8, 9, and 10.

Computational Exercises

Mathematical Exercise 13. Consider the experiment of rolling a die twice and recording the two scores. Let A denote the event that the first die score is 1 and B the event that the sum of the scores is 7.

  1. Define the sample space S mathematically.
  2. Describe A as a subset of S.
  3. Describe B as a subset of S.
  4. Describe A union B as a subset of S.
  5. Describe A intersect B as a subset of S.
  6. Describe Ac intersect Bc as a subset of S.

Simulation Exercise 14. In the simulation of the dice experiment, select fair dice and set n = 2. Run the experiment 100 times and count the number of times each event in the previous exercise occurs.
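If the dice applet is not at hand, a simulation of this kind can be approximated in a few lines of Python. This is only a sketch, not the applet itself: the function name simulate_dice_events, the dictionary keys, and the seed are assumptions made here, and the events are written as conditions on the two scores.

```python
import random

def simulate_dice_events(runs, seed=None):
    """Roll two fair dice `runs` times and count how often each event occurs."""
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0, "A union B": 0,
              "A intersect B": 0, "Ac intersect Bc": 0}
    for _ in range(runs):
        x, y = rng.randint(1, 6), rng.randint(1, 6)
        a = (x == 1)        # A: the first score is 1
        b = (x + y == 7)    # B: the sum of the scores is 7
        counts["A"] += a
        counts["B"] += b
        counts["A union B"] += (a or b)
        counts["A intersect B"] += (a and b)
        counts["Ac intersect Bc"] += (not a and not b)
    return counts

counts = simulate_dice_events(100, seed=2024)
for name, count in counts.items():
    print(name, count)
```

Whatever the seed, the counts for A union B and Ac intersect Bc add up to the number of runs, because these two events are complements of each other.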

Mathematical Exercise 15. Consider the experiment of dealing a card from a standard deck. The outcome is recorded by giving the denomination and suit of the selected card. Let Q denote the event that the card is a queen and H the event that the card is a heart.

  1. Define the sample space S mathematically.
  2. Express Q as a subset of S.
  3. Express H as a subset of S.
  4. Express Q union H as a subset of S.
  5. Express Q intersect H as a subset of S.
  6. Express Q intersect Hc as a subset of S.

Simulation Exercise 16. In the card experiment, set n = 1. Run the experiment 100 times and count the number of times each event in the previous exercise occurs.

Mathematical Exercise 17. Recall that Buffon's coin experiment consists of tossing a coin with radius r ≤ 1/2 on a floor covered with square tiles of side length 1. The coordinates of the center of the coin are recorded relative to axes through the center of the square in which the coin falls. Let A denote the event that the coin does not touch the sides of the square.

  1. Define the sample space S mathematically.
  2. Describe A as a subset of S.
  3. Describe Ac as a subset of S.

Simulation Exercise 18. In Buffon's coin experiment, set r = 1/4. Run the simulation 100 times and count the number of times event A in the last exercise occurs.

Mathematical Exercise 19. An experiment consists of rolling a pair of dice until the sum of the two scores is either 5 or 7. The number of rolls is recorded. Give the sample space of this experiment.

Mathematical Exercise 20. An experiment consists of rolling a pair of dice until the sum of the two scores is either 5 or 7. The scores of the dice on the final roll are recorded. Let A denote the event that the sum is 5 rather than 7.

  1. Define the sample space S mathematically.
  2. Describe A as a subset of S.

Mathematical Exercise 21. The die-coin experiment consists of rolling a die and then tossing a coin the number of times shown on the die. The sequence of coin scores is recorded. Let A denote the event that there are exactly two heads.

  1. Define the sample space S mathematically.
  2. Express A as a subset of S.

Simulation Exercise 22. Run the simulation of the die-coin experiment, with the default settings, 100 times. Count the number of times event A in the last exercise occurs.

Mathematical Exercise 23. In the coin-die experiment, we have a coin and two dice, one red and one green. First the coin is tossed, and then if the result is heads the red die is rolled, while if the result is tails the green die is rolled. The coin score and the score of the chosen die are recorded. Let A denote the event that the die score is at least 4.

  1. Define the sample space S mathematically.
  2. Express A as a subset of S.

Simulation Exercise 24. Run the coin-die experiment, with the default settings, 100 times. Count the number of times that event A in the last exercise occurs.

Mathematical Exercise 25. In a certain district, candidates 1, 2, and 3 are running for congress. A political consultant samples 100 registered voters from the district and records the age (in years), gender, and candidate preference of each person in the sample. Assume that a registered voter must be at least 18 years old. Define a sample space for the experiment.

Data Analysis Exercise 26. In the basic cicada experiment, a cicada in the Middle Tennessee area is captured and the following measurements recorded: body weight (in grams), wing length, wing width, and body length (in millimeters), species type, and gender. The cicada data set gives the results of 104 repetitions of this experiment.

  1. Define a sample space for the basic experiment.
  2. Let F be the event that a cicada is female. Describe F as a subset of the sample space.
  3. Determine whether F occurs for each cicada in the data set.
  4. Give the sample space for the compound experiment that consists of 104 repetitions of the basic experiment.

Data Analysis Exercise 27. In the basic M&M experiment, a bag of M&Ms (of a specified size) is purchased and the following measurements recorded: the number of red, green, blue, yellow, orange, and brown candies, and the net weight (in grams). The M&M data set gives the results of 30 repetitions of this experiment.

  1. Define a sample space for the basic experiment.
  2. Let A be the event that a bag contains at least 57 candies. Describe A as a subset of the sample space.
  3. Determine whether A occurs for each bag in the data set.
  4. Give the sample space for the compound experiment that consists of 30 repetitions of the basic experiment.

Mathematical Exercise 28. A system consists of 5 components, labeled 1, 2, 3, 4, 5. Each component is either failed (encoded by 0) or working (encoded by 1). The sequence of component states is recorded. Let A be the event that a majority of components are working.

  1. Define the sample space S mathematically.
  2. Express A as a subset of S.

Mathematical Exercise 29. Two components, labeled 1 and 2, are operated until failure, and the sequence of failure times (in hours) is recorded. Let A be the event that component 1 lasts longer than 1000 hours and let B be the event that component 1 lasts longer than component 2.

  1. Define the sample space S mathematically.
  2. Describe A as a subset of S.
  3. Describe B as a subset of S.
  4. Describe A union B as a subset of S.
  5. Describe A intersect B as a subset of S.
  6. Describe A intersect Bc as a subset of S.

General Operations

The operations of union and intersection can easily be extended to a finite or even an infinite collection of sets. Thus, suppose that Aj is a subset of a universal set S for each j in a nonempty index set J.

The union of the sets Aj, j in J is the set obtained by combining the elements of the given sets:

unionj Aj = {s in S: s in Aj for some j}.

If Aj, j in J are events in an experiment with sample space S, then the union is the event that occurs if and only if at least one of the given events occurs.

The intersection of the sets Aj, j in J is the set of elements common to all of the given sets:

intersectj Aj = {s in S: s in Aj for every j}.

If Aj, j in J are events in an experiment with sample space S, then the intersection is the event that occurs if and only if every event in the collection occurs.

The sets Aj, j in J are pairwise disjoint if the intersection of any two sets is empty:

Ai intersect Aj = Ø for i ≠ j.

If Aj, j in J are events in a random experiment, this means that they are mutually exclusive; at most one of the events could occur on a given run of the experiment.

The sets Aj, j in J are said to partition a set B if Aj, j in J are pairwise disjoint and

unionj Aj = B.
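For a finite collection of finite sets, the two conditions in the definition of a partition can be checked mechanically. The following Python sketch does so; the function names pairwise_disjoint and is_partition and the example sets are assumptions made for illustration.

```python
from itertools import combinations

def pairwise_disjoint(sets):
    """True if the intersection of any two distinct sets in the list is empty."""
    return all(A.isdisjoint(B) for A, B in combinations(sets, 2))

def is_partition(sets, B):
    """True if the sets are pairwise disjoint and their union is B."""
    return pairwise_disjoint(sets) and set().union(*sets) == set(B)

B = {1, 2, 3, 4, 5, 6}
print(is_partition([{1, 2}, {3, 4}, {5, 6}], B))   # True
print(is_partition([{1, 2, 3}, {3, 4, 5, 6}], B))  # False: the sets overlap
print(is_partition([{1, 2}, {3, 4}], B))           # False: the union is not B
```

Both conditions are needed: the second example covers B but is not disjoint, and the third is disjoint but does not cover B.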

Basic Rules

In the following problems, Aj, j in J and B are subsets of a universal set S.

Mathematical Exercise 30. Prove the general distributive laws:

  1. [unionj Aj] intersect B = unionj (Aj intersect B)
  2. [intersectj Aj] union B = intersectj (Aj union B)

Mathematical Exercise 31. Prove the general De Morgan’s laws:

  1. [unionj Aj]c = intersectj Ajc.
  2. [intersectj Aj]c = unionj Ajc.

Mathematical Exercise 32. Suppose that the sets Aj, j in J partition S. Show that for any subset B, the sets Aj intersect B, j in J, partition B.

Rules for Product Sets

We will now see how the set operations relate to the Cartesian product operation. Suppose that S1 and S2 are sets and that A1, B1 are subsets of S1 while A2, B2 are subsets of S2. The sets in the exercises below are subsets of S1 × S2.

Mathematical Exercise 33. Show that (A1 × A2) intersect (B1 × B2) = (A1 intersect B1) × (A2 intersect B2).

Mathematical Exercise 34. Show that

  1. (A1 × A2) union (B1 × B2) subset (A1 union B1) × (A2 union B2),
  2. Equality does not hold in general in part 1.
  3. (A1 × A2) union (B1 × B2) can be written as a disjoint union of product sets.
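The relationship between products and the operations of intersection and union can be explored on a small example before attempting the proofs. In this Python sketch the helper cartesian and the particular sets A1, B1, A2, B2 are illustrative assumptions.

```python
from itertools import product

def cartesian(A, B):
    """Return the product set A x B as a set of ordered pairs."""
    return set(product(A, B))

A1, B1 = {1, 2, 3}, {2, 3, 4}        # subsets of S1
A2, B2 = {"a", "b"}, {"b", "c"}      # subsets of S2

# Intersection of products versus product of intersections.
lhs = cartesian(A1, A2) & cartesian(B1, B2)
rhs = cartesian(A1 & B1, A2 & B2)
print(lhs == rhs)                    # True

# Union of products versus product of unions.
union_of_products = cartesian(A1, A2) | cartesian(B1, B2)
product_of_unions = cartesian(A1 | B1, A2 | B2)
print(union_of_products <= product_of_unions)   # True
print(len(union_of_products), len(product_of_unions))   # 10 12
```

In this example the pairs (1, "c") and (4, "a") belong to the product of the unions but not to the union of the products, which is one way to see that the inclusion can be strict.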

Mathematical Exercise 35. Show that

  1. (A1c × A2c) subset (A1 × A2)c.
  2. Equality does not hold in general in part 1.
  3. (A1 × A2)c can be written as a disjoint union of product sets.

The last three subsections explore advanced topics and can be omitted on a first reading.

Sigma Algebras

In probability theory, and in most other mathematical theories, it is sometimes impossible to include all subsets of the universal set S in the theory. There are many strange, pathological subsets of R, for instance, that play no essential role in applied mathematics. However, we naturally want our collection of admissible subsets to be closed under the set operations listed above. Specifically, we usually need the following property to hold:

Any set that can be constructed from a countable number of admissible sets (using the set operations) should itself be admissible.

This leads to a key definition. Suppose that A is a collection of subsets of S. Then A is said to be a sigma algebra if

  1. S in A.
  2. If A in A then Ac in A.
  3. If Aj in A for each j in a countable index set J, then unionj Aj in A.
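When S is finite, a collection of subsets is also finite, so closure under countable unions reduces to closure under unions of two sets. Under that assumption, the three conditions can be checked directly. The following Python sketch is one way to do this; the function name is_sigma_algebra and the example collections are illustrative.

```python
def is_sigma_algebra(collection, S):
    """Check the three defining conditions for a collection of subsets of a finite set S."""
    S = frozenset(S)
    sets = {frozenset(A) for A in collection}
    if any(not A <= S for A in sets):
        return False                      # every member must be a subset of S
    if S not in sets:
        return False                      # condition 1: S is in the collection
    if any(S - A not in sets for A in sets):
        return False                      # condition 2: closed under complements
    # Condition 3: for a finite collection, pairwise unions suffice.
    return all(A | B in sets for A in sets for B in sets)

S = {1, 2, 3, 4}
A = {1, 2}
print(is_sigma_algebra([set(), A, S - A, S], S))   # True
print(is_sigma_algebra([set(), A, S], S))          # False: the complement of A is missing
```

For an infinite S this check does not apply as written, since the collection cannot be enumerated.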

Mathematical Exercise 36. Show that Ø in A.

Mathematical Exercise 37. Show that if Aj in A for each j in a countable index set J, then intersectj Aj in A. Hint: Use De Morgan's law.

In any random experiment, we assume that the collection of events forms a sigma algebra.

General Constructions

Let {0, 1}^S denote the collection of all subsets of S, called the power set of S. Trivially, {0, 1}^S is the largest sigma algebra of subsets of S and, as discussed above, is sometimes too large to be useful. The rather strange notation will be explained in the next section on Functions and Random Variables.

At the other extreme, the smallest sigma algebra of S is given in the following exercise.

Mathematical Exercise 38. Show that {Ø, S} is a sigma algebra.

In many cases, we want to construct a sigma algebra that contains certain basic sets. The following exercises show how to do this.

Mathematical Exercise 39. Suppose that Aj is a sigma algebra of subsets of S for each j in a nonempty index set J. Show that the intersection A below is also a sigma algebra of subsets of S.

A = intersectj Aj.

Suppose now that B is a collection of subsets of S. Think of the sets in B as basic sets; but in general B will not be a sigma algebra. The sigma algebra generated by B is the intersection of all sigma algebras that contain B, which by the previous exercise, really is a sigma algebra:

sigma(B) = intersect{A: A is a sigma algebra of subsets of S and B subset A}.

Mathematical Exercise 40. Show that sigma(B) is the smallest sigma algebra containing B:

  1. B subset sigma(B)
  2. If A is a sigma algebra of subsets of S and B subset A then sigma(B) subset A.

Mathematical Exercise 41. Suppose that A is a subset of S. Show that

sigma({A}) = {Ø, A, Ac, S}.

Mathematical Exercise 42. Suppose that A and B are subsets of S. List the 16 (in general distinct) sets in sigma({A, B}).

Mathematical Exercise 43. Suppose that A1, A2, ..., An are subsets of S. Show that there are 2^(2^n) (in general distinct) sets in the sigma algebra generated by the given sets.
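For a finite set S, the generated sigma algebra can be computed by starting from the basic sets and repeatedly adding complements and unions until nothing new appears. The following Python sketch takes that approach, which is a computational shortcut for finite S rather than the intersection construction used in the definition. The function names and the choice of basic sets (the integers whose i-th binary digit is 1) are assumptions made for illustration.

```python
def generate_sigma_algebra(basic_sets, S):
    """Close the basic sets under complement and union (finite S only)."""
    S = frozenset(S)
    sets = {S} | {frozenset(B) for B in basic_sets}
    while True:
        new = {S - A for A in sets} | {A | B for A in sets for B in sets}
        if new <= sets:          # nothing new: the collection is closed
            return sets
        sets |= new

def general_position(n):
    """Return n subsets of {0, ..., 2^n - 1}; set i holds the integers whose i-th bit is 1."""
    S = range(2 ** n)
    return [{s for s in S if (s >> i) & 1} for i in range(n)], S

for n in (1, 2, 3):
    basic, S = general_position(n)
    print(n, len(generate_sigma_algebra(basic, S)))
```

The basic sets are chosen so that all of their intersections with one another and with complements are nonempty, which is the "in general distinct" situation; with other choices some of the generated sets may coincide and the count is smaller.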

Special Cases

We will now discuss the natural sigma algebras that we will use for various sample spaces and other sets in this project.

As noted previously, product sets play a crucial role in probability theory. Thus, suppose that S1, S2, ..., Sn are sets and that Ai is a sigma algebra of subsets of Si for each i. For the product set

S = S1 × S2 × ··· × Sn,

we use the sigma algebra A generated by the collection of all product sets of the form

A1 × A2 × ··· × An where Ai in Ai for each i.

We extend this idea to an infinite product. Thus, suppose that S1, S2, ... are sets and that Ai is a sigma algebra of subsets of Si for each i. For the product set

S = S1 × S2 × ··· ,

we use the sigma algebra A generated by the collection of all product sets of the form

A1 × A2 × ··· × An × Sn+1 × Sn+2 × ··· where n is a positive integer and Ai in Ai for each i.

Combining the product construction with our earlier remarks about R, note that for R^n, we use the sigma algebra generated by the collection of all products of intervals. This is the Borel sigma algebra for R^n.