6. Sufficient, Complete, and Ancillary Statistics


Consider again the basic statistical model, in which we have a random experiment with an observable random variable X taking values in a set S. Once again, the experiment is typically to sample n objects from a population and record a vector of measurements for each item. In this case, X has the form

X = (X1, X2, ..., Xn)

where Xi is the vector of measurements for the i'th item.

Suppose that the distribution of X depends on a parameter a taking values in a parameter space A. Typically, a is a vector of real parameters.

Sufficient Statistics

Intuitively, a statistic U = h(X) is sufficient for a if U contains all of the information about a that is available in the entire data variable X. Formally, U is sufficient for a if the conditional distribution of X given U does not depend on a.

Sufficiency is related to the concept of data reduction. Suppose that X takes values in R^n. If we can find a sufficient statistic U that takes values in R^j, then we can reduce the original data vector X (whose dimension n is usually large) to the vector of statistics U (whose dimension j is usually much smaller) with no loss of information about the parameter a.

The following result gives a condition for sufficiency that is equivalent to this definition.

Mathematical Exercise 1. Let U = h(X) and let f(x | a) and g(u | a) denote the probability density functions of X and U respectively. Show that U is sufficient for a if and only if

f(x | a) / g(h(x) | a)

is independent of a for any x in S. Hint: The joint distribution of (X, U) is concentrated on the set {(x, h(x)): x in S}.

Mathematical Exercise 2. Suppose that I1, I2, ..., In is a random sample of size n from the Bernoulli distribution with parameter p in (0, 1). Show that Xn = I1 + I2 + ··· + In is sufficient for p.

The result in Exercise 2 is intuitively appealing: in a sequence of Bernoulli trials, all of the information about the probability of success p is contained in the number of successes Xn. The particular order of the successes and failures provides no additional information.
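The intuition can be checked exactly for a small sample. The following is a minimal sketch, not a proof: it enumerates all 0/1 sequences of length n = 4 (an arbitrary choice) and computes the conditional distribution of the sequence given its sum for two arbitrary values of p. The function name conditional_given_sum is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def conditional_given_sum(n, p):
    """Return {x: P(X = x | sum(X) = sum(x))} for every 0/1 sequence x of length n."""
    # Joint probability of each sequence of Bernoulli(p) trials.
    joint = {x: p ** sum(x) * (1 - p) ** (n - sum(x))
             for x in product((0, 1), repeat=n)}
    # Probability of each value of the sum.
    total = {}
    for x, prob in joint.items():
        total[sum(x)] = total.get(sum(x), 0) + prob
    return {x: prob / total[sum(x)] for x, prob in joint.items()}

n = 4
cond_a = conditional_given_sum(n, Fraction(1, 3))
cond_b = conditional_given_sum(n, Fraction(3, 4))

# The conditional distribution is the same for both values of p ...
print(cond_a == cond_b)
# ... and is uniform over the sequences that share a given sum.
print(all(cond_a[x] == Fraction(1, comb(n, sum(x))) for x in cond_a))
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.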

The Factorization Theorem

The definition precisely captures the intuitive notion of sufficiency given above, but can be difficult to apply. We must know in advance a candidate statistic U, and then we must be able to compute the conditional distribution of X given U. The factorization theorem given in the next exercise frequently allows the identification of a sufficient statistic from the form of the density function of X.

Mathematical Exercise 3. Let f(x | a) denote the density function of X. Show that U = h(X) is sufficient for a if and only if there exist functions G(u | a) and r(x) such that

f(x | a) = G[h(x) | a] r(x) for x in S and a in A.

As the notation indicates, r depends only on the data x and not on the parameter a.
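As an illustration of what such a factorization looks like, the following sketch checks one numerically for a Poisson sample, with h(x) the sum of the observations. The particular choice of G and r here, the data vector, and the parameter values are assumptions made for the example. Since the sample size n is fixed, G is allowed to depend on it.

```python
from math import exp, factorial, isclose, prod

def joint_pmf(x, a):
    """Joint density of an independent Poisson(a) sample x."""
    return prod(exp(-a) * a ** xi / factorial(xi) for xi in x)

def G(u, n, a):
    """Factor that depends on the data only through u = h(x) = sum(x)."""
    return exp(-n * a) * a ** u

def r(x):
    """Factor that depends on the data but not on the parameter."""
    return 1 / prod(factorial(xi) for xi in x)

x = (2, 0, 3, 1)
checks = [isclose(joint_pmf(x, a), G(sum(x), len(x), a) * r(x))
          for a in (0.5, 1.7, 4.0)]
print(all(checks))
```

A numerical check at a few parameter values only illustrates the identity; the exercise asks for the algebraic argument.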

Mathematical Exercise 4. Show that if U and V are equivalent statistics and U is sufficient for a then V is sufficient for a.

Mathematical Exercise 5. Suppose that the distribution of X is a k-parameter exponential family with the natural statistic h(X). Show that h(X) is sufficient for a.

Because of this result, h(X) is referred to as the natural sufficient statistic for the exponential family.

Mathematical Exercise 6. Suppose that X1, X2, ..., Xn is a random sample of size n from the normal distribution with mean µ in R and variance d^2 > 0.

  1. Show that (X1 + X2 + ··· + Xn, X1^2 + X2^2 + ··· + Xn^2) is sufficient for (µ, d^2).
  2. Show that (M, S^2) is sufficient for (µ, d^2) where M is the sample mean and S^2 is the sample variance. Hint: Use part 1 and equivalence.

Mathematical Exercise 7. Suppose that X1, X2, ..., Xn is a random sample of size n from the Poisson distribution with parameter a > 0. Show that X1 + X2 + ··· + Xn is sufficient for a.

Mathematical Exercise 8. Suppose that X1, X2, ..., Xn is a random sample from the gamma distribution with shape parameter k > 0 and scale parameter b > 0.

  1. Show that (X1 + X2 + ··· + Xn, X1X2 ··· Xn) is sufficient for (k, b).
  2. Show that (M, U) is sufficient for (k, b) where M is the (arithmetic) sample mean and U is the geometric sample mean. Hint: Use part 1 and equivalence.

Mathematical Exercise 9. Suppose that X1, X2, ..., Xn is a random sample from the beta distribution with parameters a > 0 and b > 0. Show that (U, V) is sufficient for (a, b) where

U = X1X2 ··· Xn, V = (1 - X1)(1 - X2) ··· (1 - Xn).

Mathematical Exercise 10. Suppose that X1, X2, ..., Xn is a random sample from the uniform distribution on the interval [0, a] where a > 0. Show that X(n) (the n'th order statistic) is sufficient for a.

Minimal Sufficient Statistics

The entire data variable X is trivially sufficient for a. However, as noted above, there usually exists a statistic U that is sufficient for a and has smaller dimension, so that we can achieve real data reduction. Naturally, we would like to find the statistic U that has the smallest dimension possible. In many cases, this smallest dimension j will be the same as the dimension k of the parameter vector a. However, as we will see, this is not necessarily the case; j can be smaller or larger than k.

Formally, suppose that a statistic U is sufficient for a. Then U is minimally sufficient if U is a function of any other statistic V that is sufficient for a.

Once again, the definition precisely captures the notion of minimal sufficiency, but is hard to apply. The following exercise gives an equivalent condition.

Mathematical Exercise 11. Let f(x | a) denote the density function of X and suppose that U = h(X). Show that U is minimally sufficient for a if the following condition holds:

f(x | a) / f(y | a) does not depend on a if and only if h(x) = h(y).

Hint: If V = g(X) is another sufficient statistic, use the factorization theorem and the condition above to show that g(x) = g(y) implies h(x) = h(y). Then conclude that U is a function of V.
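The likelihood-ratio condition can be explored numerically. The sketch below does so for a Bernoulli sample with h(x) equal to the sum of the observations, an assumption made for this example. It evaluates the ratio on a finite grid of parameter values, so it provides evidence rather than a proof; the function names are hypothetical.

```python
from fractions import Fraction

def likelihood(x, p):
    """Joint density of a 0/1 sample x from the Bernoulli(p) distribution."""
    s = sum(x)
    return p ** s * (1 - p) ** (len(x) - s)

def ratio_constant_in_p(x, y, grid):
    """True if f(x | p) / f(y | p) takes a single value over the grid."""
    ratios = {likelihood(x, p) / likelihood(y, p) for p in grid}
    return len(ratios) == 1

grid = [Fraction(k, 10) for k in range(1, 10)]

# Same sum: the ratio does not change with p.
same_sum = ratio_constant_in_p((1, 0, 1), (0, 1, 1), grid)
# Different sums: the ratio changes with p.
diff_sum = ratio_constant_in_p((1, 0, 1), (1, 1, 1), grid)
print(same_sum, diff_sum)
```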

Mathematical Exercise 12. Show that if U and V are equivalent statistics and U is minimally sufficient for a then V is minimally sufficient for a.

Mathematical Exercise 13. Suppose that the distribution of X is a k-parameter exponential family with natural sufficient statistic U = h(X). Show that U is minimally sufficient for a. Hint: Recall that k is the smallest integer such that X is a k-parameter exponential family.

Mathematical Exercise 14. Show that the sufficient statistics given above for the Bernoulli, Poisson, normal, gamma, and beta families are minimally sufficient for the given parameters.

Mathematical Exercise 15. Suppose that X1, X2, ..., Xn is a random sample from the uniform distribution on the interval [a, a + 1] where a > 0. Show that (X(1), X(n)) is minimally sufficient for a.

In the last exercise, note that we have a single parameter, but the minimally sufficient statistic is a vector of dimension 2.

Properties of Sufficient Statistics

Sufficiency is related to several of the methods of constructing estimators that we have studied.

Mathematical Exercise 16. Suppose that U is sufficient for a and that there exists a maximum likelihood estimator of a. Show that there exists a MLE V that is a function of U. Hint: Use the factorization theorem.

In particular, suppose that V is the unique MLE of a and that V is sufficient for a. If U is sufficient for a then V is a function of U by the previous exercise. Hence it follows that V is minimally sufficient for a.

Mathematical Exercise 17. Suppose that the statistic U is sufficient for the parameter a and that V is a Bayes' estimator of a. Show that V is a function of U. Hint: Use the factorization theorem.

The following exercise gives the Rao-Blackwell theorem. The theorem shows how a sufficient statistic can be used to improve an unbiased estimator.

Mathematical Exercise 18. Suppose that U is sufficient for a and that V is an unbiased estimator of a real parameter b = b(a). Use sufficiency and properties of conditional expectation and conditional variance to show that

  1. E(V | U) is a valid statistic (does not depend on a) and is a function of U.
  2. E(V | U) is an unbiased estimator of b.
  3. var[E(V | U)] <= var(V) for any a so E(V | U) is uniformly better than V.
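The three parts of the exercise can be seen in a small exact computation. The sketch below assumes a Bernoulli sample with V = I1 (the first trial, an unbiased estimator of p) and U = I1 + ··· + In; the sample size 5 and the value p = 1/3 are arbitrary, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def rao_blackwell_bernoulli(n, p):
    outcomes = list(product((0, 1), repeat=n))
    prob = {x: p ** sum(x) * (1 - p) ** (n - sum(x)) for x in outcomes}

    # E(V | U = u) with V = x[0] and U = sum(x), computed by enumeration.
    num, den = {}, {}
    for x in outcomes:
        u = sum(x)
        num[u] = num.get(u, 0) + x[0] * prob[x]
        den[u] = den.get(u, 0) + prob[x]
    cond_mean = {u: num[u] / den[u] for u in den}

    def mean_var(stat):
        m = sum(stat(x) * prob[x] for x in outcomes)
        v = sum((stat(x) - m) ** 2 * prob[x] for x in outcomes)
        return m, v

    original = mean_var(lambda x: x[0])                # V
    improved = mean_var(lambda x: cond_mean[sum(x)])   # E(V | U)
    return cond_mean, original, improved

cond_mean, (mean_v, var_v), (mean_rb, var_rb) = rao_blackwell_bernoulli(5, Fraction(1, 3))
print(cond_mean)         # E(V | U = u) for u = 0, ..., 5
print(mean_v, mean_rb)   # both estimators have mean p
print(var_v, var_rb)     # the conditioned estimator has smaller variance
```

In this example E(V | U) works out to U / n, the sample proportion, and the table cond_mean is the same for every p.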

Complete Statistics

Suppose that U = h(X) is a statistic. Then U is complete if

E[g(U) | a] = 0 for all a in A implies P[g(U) = 0 | a] = 1 for all a in A.

Mathematical Exercise 19. Show that if U and V are equivalent statistics and U is complete for a then V is complete for a.

Mathematical Exercise 20. Suppose that I1, I2, ..., In is a random sample of size n from the Bernoulli distribution with parameter p in (0, 1). Show that the sum is complete for p:

Y = I1 + I2 + ··· + In.

Hint: Note that Ep[g(Y)] can be written as (1 - p)^n times a polynomial in t = p / (1 - p). If this polynomial is 0 for all t > 0, then the coefficients must be 0.

Mathematical Exercise 21. Suppose that X1, X2, ..., Xn is a random sample of size n from the Poisson distribution with parameter a > 0. Show that the sum is complete for a:

Y = X1 + X2 + ··· + Xn.

Hint: Note that Ea[g(Y)] can be written as e^(-na) times a power series in a. If this series is 0 for all a > 0, then the coefficients must be 0.

Mathematical Exercise 22. Suppose that X1, X2, ..., Xn is a random sample of size n from the exponential distribution with scale parameter b > 0. Show that the sum is complete for b:

Y = X1 + X2 + ··· + Xn.

Hint: Show that Eb[g(Y)] is the Laplace transform of a certain function. If this transform is 0 for all b > 0, then the function must be identically 0.

The result in the previous exercise generalizes to exponential families, although the general proof is complicated. Specifically, if the distribution of X is a j-parameter exponential family with the natural sufficient vector of statistics U = h(X) then U is complete for a (as well as minimally sufficient for a). This applies to random samples from the Bernoulli, Poisson, normal, gamma, and beta distributions discussed above.

The notion of completeness depends very much on the parameter space.

Mathematical Exercise 23. Suppose that I1, I2, I3 is a random sample of size 3 from the Bernoulli distribution with parameter p in {1/3, 1/2}. Show that Y = I1 + I2 + I3 is not complete for p.
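To see how the restricted parameter space matters, the following sketch evaluates Ep[g(Y)] exactly for one candidate function g on {0, 1, 2, 3}. The values of g are an assumption chosen for this illustration (other choices work as well), and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def expected_g(g, n, p):
    """E[g(Y)] when Y has the binomial distribution with parameters n and p."""
    return sum(g[y] * comb(n, y) * p ** y * (1 - p) ** (n - y)
               for y in range(n + 1))

# Candidate function: g(0) = 0, g(1) = 1, g(2) = -3, g(3) = 6.
g = [0, 1, -3, 6]

# Expected value at the two points of the parameter space {1/3, 1/2}.
values = {p: expected_g(g, 3, p) for p in (Fraction(1, 3), Fraction(1, 2))}
# Expected value at a point outside the parameter space, for comparison.
other = expected_g(g, 3, Fraction(1, 4))
print(values, other)
```

The expected value vanishes at both points of the parameter space even though g is not the zero function, while it is nonzero at p = 1/4. With only two parameter values there are two linear constraints on the four values of g, which leaves room for nonzero solutions.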

The next exercise shows the importance of complete sufficient statistics; it is known as the Lehmann-Scheffé theorem.

Mathematical Exercise 24. Suppose that U is sufficient and complete for a and that T = r(U) is an unbiased estimator of a real parameter b(a). Show that T is a uniformly minimum variance unbiased estimator of b(a). The proof is based on the following steps:

  1. Suppose that V is an unbiased estimator of b(a). By the Rao-Blackwell theorem, E(V | U) is also an unbiased estimator of b(a) and is uniformly better than V.
  2. Since E(V | U) is a function of U, use completeness to conclude that T = E(V | U) (with probability 1).

Mathematical Exercise 25. Suppose that (I1, I2, ..., In) is a random sample of size n from the Bernoulli distribution with parameter p in (0, 1). Show that a UMVUE for p(1 - p), the variance of the sampling distribution, is

X / (n - 1) - X^2 / [n(n - 1)] where X = I1 + I2 + ··· + In.
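The unbiasedness part of this claim can be checked exactly for particular values. The sketch below sums the estimator against the binomial density of X; the choices n = 6 and p = 2/7 are arbitrary and the function name is hypothetical. It does not address the minimum variance property, which comes from the Lehmann-Scheffé argument.

```python
from fractions import Fraction
from math import comb

def expected_variance_estimator(n, p):
    """E[X / (n - 1) - X^2 / (n (n - 1))] when X is binomial with parameters n and p."""
    total = Fraction(0)
    for x in range(n + 1):
        estimate = Fraction(x, n - 1) - Fraction(x * x, n * (n - 1))
        total += estimate * comb(n, x) * p ** x * (1 - p) ** (n - x)
    return total

p = Fraction(2, 7)
print(expected_variance_estimator(6, p), p * (1 - p))
```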

Mathematical Exercise 26. Suppose that X1, X2, ..., Xn is a random sample of size n from the Poisson distribution with parameter a. Show that a UMVUE for P(X = 0) = e^(-a) is

[(n - 1) / n]^Y where Y = X1 + X2 + ··· + Xn.

Hint: Use the probability generating function of Y.
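A numerical check of unbiasedness is sketched below. It uses the fact that Y has the Poisson distribution with parameter na and approximates the expected value of [(n - 1) / n]^Y by a truncated series; the truncation point, the values n = 5 and a = 2, and the function name are assumptions made for the example.

```python
from math import exp, isclose

def expected_poisson_estimator(n, a, terms=200):
    """Approximate E[((n - 1) / n) ** Y] when Y is Poisson with parameter n * a."""
    lam = n * a
    t = (n - 1) / n
    pmf = exp(-lam)            # P(Y = 0)
    total = 0.0
    for y in range(terms):
        total += t ** y * pmf
        pmf *= lam / (y + 1)   # P(Y = y + 1) from P(Y = y)
    return total

approx = expected_poisson_estimator(5, 2.0)
print(approx, exp(-2.0))
```

The two printed numbers agree to within rounding, which is consistent with evaluating the probability generating function of Y at t = (n - 1) / n.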

Ancillary Statistics

Suppose that V = r(X) is a statistic. If the distribution of V does not depend on a, then V is called an ancillary statistic for a. Thus, the notion of an ancillary statistic is complementary to the notion of a sufficient statistic (which contains all of the information about the parameter that is contained in the sample). The result in the following exercise, known as Basu's theorem, makes this point more precise.

Mathematical Exercise 27. Suppose that U is complete and sufficient for a and that V is an ancillary statistic. Show that U and V are independent. The following steps sketch the proof:

  1. Suppose that V takes values in T. Let g denote the density function of V and let g(· | U) denote the conditional density of V given U.
  2. Use properties of conditional expected value to show that E[g(v | U)] = g(v) for v in T.
  3. Use completeness to conclude that g(v | U) = g(v) with probability 1.

Mathematical Exercise 28. Show that if U and V are equivalent statistics and U is ancillary for a then V is ancillary for a.

Mathematical Exercise 29. Suppose that X1, X2, ..., Xn is a random sample from a scale family with scale parameter b > 0. Show that if V is a function of X_1 / X_n, X_2 / X_n, ..., X_{n-1} / X_n then V is an ancillary statistic for b.

Mathematical Exercise 30. Suppose that X1, X2, ..., Xn is a random sample of size n from the gamma distribution with shape parameter k > 0 and scale parameter b > 0. Let M denote the ordinary sample mean and U the geometric sample mean. Show that M / U is ancillary for b, and thus conclude that M and M / U are independent. Hint: Use the previous exercise together with the result of Exercise 27, with the shape parameter k held fixed.
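The key step behind ancillarity here is that M / U is unchanged when every observation is multiplied by the same positive constant. The sketch below illustrates this with a simulated gamma sample; the shape 2.5, the scale factor 3, the seed, and the function name mean_ratio are arbitrary choices for the illustration.

```python
import random
from math import exp, isclose, log

def mean_ratio(xs):
    """Arithmetic sample mean divided by geometric sample mean."""
    m = sum(xs) / len(xs)
    u = exp(sum(log(x) for x in xs) / len(xs))
    return m / u

rng = random.Random(0)
# Sample with shape k = 2.5 and scale 1 (gammavariate takes shape, then scale).
base = [rng.gammavariate(2.5, 1.0) for _ in range(10)]
# The same sample rescaled, which corresponds to scale parameter b = 3.
scaled = [3.0 * x for x in base]

print(mean_ratio(base), mean_ratio(scaled))
```

Since a sample with scale parameter b can be written as b times a sample with scale parameter 1, this invariance is what makes the distribution of M / U free of b.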