Investigation 10: Sampling without Replacement (due Monday, October 18)

You may work with one other person on this assignment, handing in one report with both names.  Word-processed reports are preferred to hand-written ones. 

Suppose that in a population of 10 items, 3 are defective and 7 are not.  Suppose that two items are chosen at random, without replacement, for inspection.  Let X be the number of defective items inspected.  The random variable X has a hypergeometric distribution, with parameters N=10 (population size), M=3 (number of defectives in the population), and n=2 (sample size).  [Note that since we are counting the number of defectives in the sample, “success” means “defective” here.]

 

(a) Explain why X does not have a binomial distribution.  [Hint: Which of the binomial conditions is not satisfied here?]

 

(b) Determine the probabilities that X=0, that X=1, and that X=2.  [You may do this by hand or with Minitab.  To use Minitab, first put the values 0, 1, and 2 into c1, and then use Calc> Probability Distributions> Hypergeometric.  Click on “probability,” enter the appropriate parameters, and select “input column” with c1 as that column and c2 for optional storage.]
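If Minitab is not available, the same probabilities can be computed from the counting formula P(X=k) = C(M,k) C(N-M,n-k) / C(N,n).  The Python sketch below is one possible way to do this; the function name hypergeom_pmf is an illustrative choice, and the code assumes Python 3.8 or later (for math.comb).

```python
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(X = k) when n items are drawn without replacement from a
    population of N items, M of which are "successes" (defectives).
    The name and argument order are illustrative assumptions."""
    if k < 0 or k > n:
        return 0.0
    # comb returns 0 when asked for more items than are available,
    # so impossible outcomes get probability 0 automatically.
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

# Parameters from the setup above: N=10, M=3, n=2.
for k in (0, 1, 2):
    print(k, hypergeom_pmf(k, 10, 3, 2))
```

Changing the arguments N and M lets the same function be reused for the larger populations in the parts that follow.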

 

(c) Repeat (b) assuming that the population size is N=100, with 30 defective and 70 not.

 

(d) Repeat (b) assuming that the population size is N=1000, with 300 defective and 700 not.

 

(e) Repeat (b) assuming that the population size is N=10,000, with 3000 defective and 7000 not.

 

If the sampling had been with replacement, then X would have followed a binomial distribution with n = 2 and p = M/N = 0.3.

 

(f) Determine the probabilities that a random variable with this binomial distribution equals 0, 1, and 2.

 

(g) Does the binomial distribution do a good job of approximating the hypergeometric distribution for any of these values of the population size N?  Explain why this makes sense.

 

When the population size is large relative to the sample size, sampling without replacement is not much different from sampling with replacement (because the probability of “success” changes very little even as successes and failures are drawn from the population).  In this case the binomial distribution with p = M/N provides an effective approximation to the hypergeometric distribution.  Our rule-of-thumb will be that if the population size is at least 20 times the sample size (N ≥ 20n), then the binomial distribution closely approximates the hypergeometric.
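One way to see this effect numerically is to measure the largest gap between the two distributions as N grows while M/N stays fixed.  The Python sketch below is a minimal illustration under that assumption; the helper names (hypergeom_pmf, binom_pmf, max_abs_gap) are illustrative choices, and the code assumes Python 3.8 or later.

```python
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(X = k) for sampling n items without replacement from N items,
    M of which are successes.  (Illustrative helper name.)"""
    if k < 0 or k > n:
        return 0.0
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def max_abs_gap(N, M, n):
    """Largest absolute difference between the hypergeometric
    probabilities and the binomial probabilities with p = M/N."""
    p = M / N
    return max(abs(hypergeom_pmf(k, N, M, n) - binom_pmf(k, n, p))
               for k in range(n + 1))

# Keep the success proportion at 30% and the sample size at 2,
# and let the population size grow.
for N in (10, 100, 1000, 10000):
    M = 3 * N // 10
    print(N, max_abs_gap(N, M, 2))
```

The printed gap should shrink as N increases, which is the behavior the rule-of-thumb describes.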

 

Consider the population of roughly 200 million adult Americans.  Suppose that 105 million favor candidate A in an upcoming election and 95 million favor candidate B.  Suppose that a sample of 1000 is to be chosen at random.  Let X be the number in the sample who favor candidate A.

 

(h) Identify the (exact) probability distribution of X (its name and its parameter values).  Then identify a probability distribution that could be used to approximate X (also its name and parameter values).  Do you expect the approximation to be accurate here?  Explain.