Consider again the basic statistical model, in which we have a random experiment with an observable random variable X taking values in a set S. Once again, the experiment is typically to sample n objects from a population and record a vector of measurements for each item. In this case, X has the form
X = (X1, X2, ..., Xn)
where Xi is the vector of measurements for the i'th item.
Suppose that the distribution of X depends on a parameter a taking values in a parameter space A. Typically, a is a vector of real parameters.
Intuitively, a statistic U = h(X) is sufficient for a if U contains all of the information about a that is available in the entire data variable X. Formally, U is sufficient for a if the conditional distribution of X given U does not depend on a.
Sufficiency is related to the concept of data reduction. Suppose that X takes values in R^n. If we can find a sufficient statistic U that takes values in R^j, then we can reduce the original data vector X (whose dimension n is usually large) to the vector of statistics U (whose dimension j is usually much smaller) with no loss of information about the parameter a.
The following result gives a condition for sufficiency that is equivalent to this definition.
1.
Let U = h(X)
and let f(x | a)
and g(u | a) denote
the probability density functions of X
and U respectively. Show that U is
sufficient for a if and only if
f(x | a) / g(h(x) | a)
is independent of a for any x in S.
Hint: The joint distribution of (X, U)
is concentrated on the set {(x, h(x)): x in S}.
2.
Suppose that I1, I2, ..., In is a
random sample of size n from the Bernoulli
distribution with parameter p in (0, 1). Show that Xn
= I1 + I2 +
··· + In is sufficient for p.
The result in Exercise 2 is intuitively appealing: in a sequence of Bernoulli trials, all of the information about the probability of success p is contained in the number of successes Xn. The particular order of the successes and failures provides no additional information.
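The claim can be checked directly from the definition for a small sample. The following is a minimal sketch, not part of the original exercises: the function name conditional_probs and the values n = 4, k = 2 are illustrative assumptions. It computes the conditional probability of each success/failure sequence given the number of successes, using exact rational arithmetic, for two different values of p.

```python
from fractions import Fraction
from itertools import product
from math import comb

def conditional_probs(n, k, p):
    """P(I = i | X_n = k) for every 0/1 sequence i of length n with k successes."""
    # Probability of one particular sequence containing k successes.
    joint = p**k * (1 - p)**(n - k)
    # Probability of exactly k successes (binomial distribution).
    marginal = comb(n, k) * p**k * (1 - p)**(n - k)
    return {seq: joint / marginal
            for seq in product((0, 1), repeat=n) if sum(seq) == k}

# Hypothetical values chosen only for illustration.
n, k = 4, 2
for p in (Fraction(1, 5), Fraction(7, 10)):
    distinct = set(conditional_probs(n, k, p).values())
    print(p, distinct)
```

For both values of p, every sequence with k successes has conditional probability 1 / C(n, k), so the conditional distribution is uniform and does not involve p.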
The definition precisely captures the intuitive notion of sufficiency given above, but can be difficult to apply. We must know in advance a candidate statistic U, and then we must be able to compute the conditional distribution of X given U. The factorization theorem given in the next exercise frequently allows the identification of a sufficient statistic from the form of the density function of X.
3.
Let f(x | a)
denote the density function of X. Show that U
= h(X) is sufficient for a
if and only if there exist functions G(u | a)
and r(x) such that
f(x | a) = G[h(x) | a] r(x) for x in S and a in A.
As the notation indicates, r depends only on the data x and not on the parameter a.
4.
Show that if U and V are equivalent
statistics and U is sufficient for a
then V is sufficient for a.
5.
Suppose that the distribution of X is a k-parameter exponential family with the natural statistic
h(X). Show that h(X)
is sufficient for a.
Because of this result, h(X) is referred to as the natural sufficient statistic for the exponential family.
6.
Suppose that X1, X2, ..., Xn is a
random sample of size n from the normal
distribution with mean µ in R and variance d^2
> 0.
7.
Suppose that X1, X2, ..., Xn is a
random sample of size n from the Poisson
distribution with parameter a > 0. Show that X1 + X2 +
··· + Xn is sufficient for a.
8.
Suppose that X1, X2, ..., Xn is a
random sample from the gamma distribution with
shape parameter k > 0 and scale parameter b > 0.
9.
Suppose that X1, X2, ..., Xn is a
random sample from the beta distribution with
parameters a > 0 and b > 0. Show that (U, V)
is sufficient for (a, b) where
U = X1X2 ··· Xn, V = (1 - X1)(1 - X2) ··· (1 - Xn).
10.
Suppose that X1, X2, ..., Xn
is a random sample from the uniform distribution on the
interval [0, a] where a > 0. Show that X(n)
(the n'th order statistic) is sufficient for
a.
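For the uniform distribution on [0, a], the joint density is a^(-n) when 0 ≤ x(1) and x(n) ≤ a, and 0 otherwise, so the factorization theorem applies with the maximum as the statistic. The sketch below illustrates this numerically; the function names and the sample values are assumptions made only for this example. Two different samples with the same maximum produce the same likelihood at every a.

```python
def uniform_likelihood(xs, a):
    """Joint density of a sample xs from the uniform distribution on [0, a]."""
    n = len(xs)
    inside = min(xs) >= 0 and max(xs) <= a
    return a ** (-n) if inside else 0.0

def via_max(n, x_max, a):
    """The factor G(x_(n) | a): the same density written through the maximum only."""
    return a ** (-n) if x_max <= a else 0.0

# Hypothetical data: two different samples that share the maximum 1.7.
x = [0.3, 1.7, 0.9, 1.2]
y = [1.7, 0.1, 0.2, 1.6]
for a in (0.5, 1.0, 1.7, 2.0, 3.5):
    print(a, uniform_likelihood(x, a), uniform_likelihood(y, a), via_max(4, 1.7, a))
```

Here the factor r(x) is the indicator that all observations are nonnegative, which does not involve a.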
The entire data variable X is trivially sufficient for a. However, as noted above, there usually exists a statistic U that is sufficient for a and has smaller dimension, so that we can achieve real data reduction. Naturally, we would like to find the statistic U that has the smallest dimension possible. In many cases, this smallest dimension j will be the same as the dimension k of the parameter vector a. However, as we will see, this is not necessarily the case; j can be smaller or larger than k.
Formally, suppose that a statistic U is sufficient for a. Then U is minimally sufficient if U is a function of any other statistic V that is sufficient for a.
Once again, the definition precisely captures the notion of minimal sufficiency, but is hard to apply. The following exercise gives an equivalent condition.
11.
Let f(x | a)
denote the density function of X and suppose that U
= h(X). Show that U is minimally
sufficient for a if the following condition holds:
f(x | a) / f(y | a) does not depend on a if and only if h(x) = h(y).
Hint: If V = g(X) is another sufficient statistic, use the factorization theorem and the condition above to show that g(x) = g(y) implies h(x) = h(y). Then conclude that U is a function of V.
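The ratio condition can be explored numerically for the Bernoulli model of Exercise 2, where h(x) is the number of successes. The sketch below is only a finite check over a few values of p (an assumption of the example, not a proof): it tests whether f(x | p) / f(y | p) takes a single value across the chosen p exactly when x and y have the same sum. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def bernoulli_density(seq, p):
    """Joint density of a 0/1 sequence under independent Bernoulli(p) trials."""
    k = sum(seq)
    return p**k * (1 - p)**(len(seq) - k)

def ratio_is_constant(x, y, ps):
    """True if f(x | p) / f(y | p) takes the same value for every p in ps."""
    ratios = {bernoulli_density(x, p) / bernoulli_density(y, p) for p in ps}
    return len(ratios) == 1

# Hypothetical grid of parameter values and sample size.
ps = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 4)]
seqs = list(product((0, 1), repeat=3))
condition_holds = all(
    ratio_is_constant(x, y, ps) == (sum(x) == sum(y))
    for x in seqs for y in seqs
)
print(condition_holds)
```

The check succeeds because the ratio equals [p / (1 - p)] raised to the power sum(x) - sum(y), which is free of p only when the exponent is 0.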
12.
Show that if U and V are equivalent
statistics and U is minimally sufficient for a
then V is minimally sufficient for a.
13.
Suppose that the distribution of X is a k-parameter
exponential family with natural sufficient statistic U = h(X).
Show that U is minimally sufficient for a.
Hint: Recall that k is the smallest integer such that the distribution of X
is a k-parameter exponential family.
14.
Show that the sufficient statistics given above for the Bernoulli, Poisson, normal, gamma,
and beta families are minimally sufficient for the given parameters.
15.
Suppose that X1, X2, ..., Xn
is a random sample from the uniform distribution on the
interval [a, a + 1] where a > 0. Show that (X(1), X(n))
is minimally sufficient for a.
In the last exercise, note that we have a single parameter, but the minimally sufficient statistic is a vector of dimension 2.
Sufficiency is related to several of the methods of constructing estimators that we have studied.
16.
Suppose that U is sufficient for a and
that there exists a maximum likelihood estimator of a.
Show that there exists a MLE V that is a function of U.
Hint: Use the factorization theorem.
In particular, suppose that V is the unique MLE of a and that V is sufficient for a. If U is sufficient for a then V is a function of U by the previous exercise. Hence it follows that V is minimally sufficient for a.
17.
Suppose that the statistic U is sufficient for the parameter a and that V
is a Bayes' estimator of a. Show that V is a
function of U. Hint: Use the factorization theorem.
The following exercise gives the Rao-Blackwell theorem. The theorem shows how a sufficient statistic can be used to improve an unbiased estimator.
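As a small illustration of the improvement, consider a Bernoulli sample of size n with the crude unbiased estimator V = I1 of p. By symmetry, the conditional expected value of I1 given the number of successes X is X / n. The sketch below computes the exact mean and variance of both estimators by enumerating all samples; the function names and the values n = 5, p = 3/10 are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

def moments(estimator, n, p):
    """Exact mean and variance of estimator(seq) over Bernoulli(p) samples of size n."""
    mean = Fraction(0)
    second = Fraction(0)
    for seq in product((0, 1), repeat=n):
        k = sum(seq)
        prob = p**k * (1 - p)**(n - k)
        value = Fraction(estimator(seq))
        mean += prob * value
        second += prob * value**2
    return mean, second - mean**2

def crude(seq):
    """V = I1, the first observation only."""
    return seq[0]

def rao_blackwellized(seq):
    """E(V | X) = X / n, the sample proportion."""
    return Fraction(sum(seq), len(seq))

# Hypothetical sample size and parameter value.
n, p = 5, Fraction(3, 10)
print(moments(crude, n, p))
print(moments(rao_blackwellized, n, p))
```

Both estimators have mean p, but conditioning on the sufficient statistic reduces the variance from p(1 - p) to p(1 - p) / n.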
18.
Suppose that U is sufficient for a and
that V is an unbiased estimator of a real parameter b = b(a).
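Use sufficiency and properties of conditional expectation
and conditional variance to show that E(V | U) is a statistic (it does not depend on a),
that E(V | U) is an unbiased estimator of b, and that
var[E(V | U) | a] ≤ var(V | a) for every a in A.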
Suppose that U = h(X) is a statistic. Then U is complete if
E[g(U) | a] = 0 for all a in A implies P[g(U) = 0 | a] = 1 for all a in A.
19.
Show that if U and V are equivalent
statistics and U is complete for a
then V is complete for a.
20.
Suppose that I1, I2, ..., In is a
random sample of size n from the Bernoulli distribution with parameter p
in (0, 1). Show that the sum is complete for p:
Y = I1 + I2 + ··· + In.
Hint: Note that Ep[g(Y)] can be written as a polynomial in t = p / (1 - p). If this polynomial is 0 for all t > 0, then the coefficients must be 0.
21.
Suppose that X1, X2, ..., Xn is a
random sample of size n from the Poisson distribution with parameter a
> 0. Show that the sum is complete for a:
Y = X1 + X2 + ··· + Xn.
Hint: Note that Ea[g(Y)] can be written as a power series in a. If this series is 0 for all a > 0, then the coefficients must be 0.
22.
Suppose that X1, X2, ..., Xn is a
random sample of size n from the exponential distribution with scale parameter b
> 0. Show that the sum is complete for b:
Y = X1 + X2 + ··· + Xn.
Hint: Show that Eb[g(Y)] is the Laplace transform of a certain function. If this transform is 0 for all b > 0, then the function must be identically 0.
The result in the previous exercise generalizes to exponential families, although the general proof is complicated. Specifically, if the distribution of X is a j-parameter exponential family with the natural sufficient vector of statistics U = h(X) then U is complete for a (as well as minimally sufficient for a). This applies to random samples from the Bernoulli, Poisson, normal, gamma, and beta distributions discussed above.
The notion of completeness depends very much on the parameter space.
23.
Suppose that I1, I2, I3 is a random
sample of size 3 from the Bernoulli distribution with parameter p in {1/3, 1/2}.
Show that Y = I1 + I2 + I3
is not complete for p.
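To see why completeness fails when the parameter space has only two points, it is enough to exhibit one nonzero function g with E[g(Y) | p] = 0 for both p = 1/3 and p = 1/2. The sketch below checks one such choice, g = (9, -7, 0, 12), with exact arithmetic; this particular g and the function name expected_value are illustrative, and many other choices work.

```python
from fractions import Fraction
from math import comb

def expected_value(g, n, p):
    """E[g(Y)] when Y has the binomial distribution with parameters n and p."""
    return sum(g[y] * comb(n, y) * p**y * (1 - p)**(n - y) for y in range(n + 1))

# One nonzero function on {0, 1, 2, 3}, chosen for illustration.
g = [9, -7, 0, 12]
for p in (Fraction(1, 3), Fraction(1, 2), Fraction(1, 4)):
    print(p, expected_value(g, 3, p))
```

The expected value is 0 at both points of the parameter space {1/3, 1/2}, although g(Y) is not 0 with probability 1. At p = 1/4, a value outside that space, the expected value is no longer 0, which is consistent with Y being complete when p ranges over all of (0, 1).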
The next exercise shows the importance of complete sufficient statistics; it is known as the Lehmann-Scheffe theorem.
24.
Suppose that U is sufficient and complete for a
and that T = r(U) is an unbiased estimator of a
real parameter b(a). Show that T is a uniformly
minimum variance unbiased estimator of b(a).
The proof is based on the following steps:
25.
Suppose that (I1, I2, ..., In)
is a random sample of size n from the Bernoulli
distribution with parameter p in (0, 1). Show that a UMVUE for p(1
- p), the variance of the sampling distribution, is
X / (n - 1) - X^2 / [n(n - 1)] where X = I1 + I2 + ··· + In.
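The unbiasedness part of this exercise can be verified exactly for small n. The sketch below, with hypothetical function names, sums the estimator X / (n - 1) - X^2 / [n(n - 1)] against the binomial distribution of X and compares the result with p(1 - p). It checks unbiasedness only; the minimum variance property follows from the Lehmann-Scheffe theorem, not from this computation.

```python
from fractions import Fraction
from math import comb

def variance_estimate(x, n):
    """The estimator X / (n - 1) - X^2 / [n (n - 1)] evaluated at X = x."""
    return Fraction(x, n - 1) - Fraction(x * x, n * (n - 1))

def expected_variance_estimate(n, p):
    """Exact expected value of the estimator when X is binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) * variance_estimate(x, n)
               for x in range(n + 1))

# Hypothetical sample sizes and parameter values.
for n in (2, 5, 8):
    for p in (Fraction(1, 10), Fraction(1, 2), Fraction(4, 5)):
        print(n, p, expected_variance_estimate(n, p), p * (1 - p))
```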
26.
Suppose that X1, X2, ..., Xn is a
random sample of size n from the Poisson
distribution with parameter a. Show that a UMVUE for P(X = 0)
= e^(-a) is
[(n - 1) / n]^Y where Y = X1 + X2 + ··· + Xn.
Hint: Use the probability generating function of Y.
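The hint can be checked numerically. Since Y has the Poisson distribution with parameter na, its probability generating function evaluated at t = (n - 1) / n should equal e^(-a). The sketch below approximates the expected value of [(n - 1) / n]^Y by a truncated series; the function name, the truncation at 200 terms, and the test values are assumptions of the example.

```python
from math import exp

def poisson_expected_estimate(n, a, terms=200):
    """Approximate E[((n - 1) / n)^Y] for Y ~ Poisson(n * a) by a truncated series."""
    lam = n * a
    r = (n - 1) / n
    pmf = exp(-lam)              # P(Y = 0)
    total = 0.0
    for y in range(terms):
        total += pmf * r**y
        pmf *= lam / (y + 1)     # P(Y = y + 1) obtained from P(Y = y)
    return total

# Hypothetical sample size and parameter value.
n, a = 5, 1.3
print(poisson_expected_estimate(n, a), exp(-a))
```

The two printed values agree up to floating point error, which is consistent with the estimator being unbiased for e^(-a).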
Suppose that V = r(X) is a statistic. If the distribution of V does not depend on a, then V is called an ancillary statistic for a. Thus, the notion of an ancillary statistic is complementary to the notion of a sufficient statistic (which contains all of the information about the parameter that is available in the sample). The following theorem, due to Basu, makes this point more precise.
27.
Suppose that U is complete and sufficient for a
and that V is an ancillary statistic. Show that U
and V are independent. The following steps sketch the
proof:
28.
Show that if U and V are equivalent
statistics and U is ancillary for a then V
is ancillary for a.
29.
Suppose that X1, X2, ..., Xn is a
random sample from a scale family with scale
parameter b > 0. Show that if V is a function of X1 / Xn,
X2 / Xn, ..., X_{n-1} / Xn
then V is an ancillary statistic for b.
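The idea behind this exercise is that a sample from a scale family can be written as Xi = b Zi, where the Zi have a fixed distribution that does not involve b, so the scale parameter cancels in every ratio. The sketch below illustrates the cancellation; the standard exponential distribution, the seed, and the function name ratios are assumptions chosen only for the example.

```python
import random
from math import isclose

def ratios(xs):
    """The statistics X1 / Xn, X2 / Xn, ..., X_{n-1} / Xn."""
    return [x / xs[-1] for x in xs[:-1]]

# A standard (scale 1) sample; the exponential distribution is an assumption.
rng = random.Random(0)
z = [rng.expovariate(1.0) for _ in range(6)]

for b in (0.5, 3.0, 40.0):
    x = [b * zi for zi in z]     # the same sample rescaled to scale parameter b
    same = all(isclose(u, v) for u, v in zip(ratios(x), ratios(z)))
    print(b, same)
```

Since the ratios computed from the rescaled sample coincide with those of the standard sample, their joint distribution cannot depend on b, and the same holds for any function of them.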
30.
Suppose that X1, X2, ..., Xn is a
random sample of size n from the gamma
distribution with shape parameter k > 0 and scale parameter b
> 0. Let M denote the ordinary sample mean and U the geometric sample
mean. Show that M / U is ancillary for b, and thus
conclude that M and M / U are independent. Hint:
Use the previous exercise.