Investigation 18: Vegetarians and Time Travel (due Thursday, December 2)

You may work with one other person on this assignment, handing in one report with both names.  Word-processed reports are preferred to hand-written ones.  Please copy/paste graphs from the Java applet into a Word file (using the “print screen” key on the keyboard) as appropriate.

The traditional formula for a 95% confidence interval for a population proportion p based on a sample proportion p-hat is: p-hat +/- 1.96*sqrt[(p-hat)*(1-p-hat)/n].  This procedure, known as the Wald procedure, is generally considered to be valid when n*(p-hat) >= 10 and n*(1-p-hat) >= 10.
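As an optional check on hand calculations, the formula and the sample size conditions above can be sketched in Python.  This is only an illustration: the function names wald_interval and wald_conditions_met are made up for this sketch, and the counts used (30 successes in 80 trials) are hypothetical, not taken from the class survey.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Return (lower, upper) for p-hat +/- z*sqrt(p-hat*(1-p-hat)/n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def wald_conditions_met(successes, n, threshold=10):
    """Check n*p-hat >= 10 and n*(1-p-hat) >= 10.

    Since n*p-hat is the number of successes and n*(1-p-hat) is the
    number of failures, the check is done on the raw counts.
    """
    return successes >= threshold and (n - successes) >= threshold

# Hypothetical sample (not the class data): 30 successes in 80 trials.
low, high = wald_interval(30, 80)
print("conditions met:", wald_conditions_met(30, 80))
print("95% Wald interval:", round(low, 3), "to", round(high, 3))
```

The default z=1.96 corresponds to 95% confidence; a different confidence level would need a different multiplier.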

a) Recall the class survey on the question of whether you would prefer to travel to the future or to the past if time travel were possible (14 answered “future” and 21 answered “past”).  Check whether the sample size conditions for the Wald procedure are satisfied in this sample.  If they are, use the Wald procedure to construct a 95% confidence interval for p1, the proportion of all Cal Poly students who would choose to travel to the future.

b) Now recall the class survey on the “vegetarian” question (1 of 43 answered that he/she is vegetarian).  Check whether the sample size conditions for the Wald procedure are satisfied in this sample.  If they are, use the Wald procedure to construct a 95% confidence interval for p2, the proportion of all Cal Poly students who are vegetarians.

To investigate how well the Wald procedure performs, we can simulate taking a large number of samples from a population and see whether the resulting intervals succeed in capturing the actual value of the population parameter about 95% of the time.  The Java applet available at http://www.rossmanchance.com/applets/Confsim/Confsim.html can be used for this purpose.  Make sure that the top window is set to “proportions.”
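The idea behind this kind of simulation can be sketched in a few lines of Python.  This is a rough illustration of the general approach, not the applet's actual code; the function names, the seed, and the settings (p=.3, n=40, 2000 intervals) are assumptions chosen only for the example.

```python
import math
import random

def wald_covers(p, n, rng, z=1.96):
    """Draw one sample of size n from a population with proportion p
    and report whether its Wald interval captures p."""
    successes = sum(rng.random() < p for _ in range(n))
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin <= p <= p_hat + margin

def coverage_rate(p, n, num_intervals, seed=0):
    """Fraction of simulated Wald intervals that capture p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(wald_covers(p, n, rng) for _ in range(num_intervals))
    return hits / num_intervals

# Hypothetical settings, for illustration only.
print(coverage_rate(p=0.3, n=40, num_intervals=2000))
```

A procedure that performs well should give a coverage rate close to the stated confidence level when the number of simulated intervals is large.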

c) Suppose that the actual value of the population proportion were p=.5.  Use the Java applet to simulate 200 samples of size 50 and to calculate a 95% Wald interval from each sample.  [Hints: Make sure that the second window is set to “Wald,” and then enter .5 for p and 50 for n.  Also enter 200 for the number of intervals, and 95 for the confidence level.  Then click on “sample.”]  How many and what percentage of your 200 simulated samples produce a confidence interval that succeeds in capturing the actual value of p?  [Note that successful intervals appear in green and unsuccessful ones in red.]

d) Click on “sort,” and comment on what the intervals that fail to capture the actual value of p have in common.

e) Click repeatedly on “sample” until you have generated a total of 10,000 intervals.  How many and what percentage of them produce a confidence interval that succeeds in capturing the actual value of p?

f) Is this percentage of successful intervals fairly close to the confidence level (95%)?  Would you say that the Wald procedure performs well in this case?  Explain briefly.

g) Click on “reset” and then keep the actual value of the population proportion at p=.5, but change the sample size to n=10.  Again ask for 200 intervals at a time until you have generated a total of 10,000 of them.  How many and what percentage of these simulated samples produce a confidence interval that succeeds in capturing the actual value of p?  Is this percentage of successful intervals fairly close to the confidence level (95%)?  Would you say that the Wald procedure performs well in this case?  Explain briefly.

h) Repeat g) with n=10 and p=.2.

A procedure that has recently been proposed as an alternative to the Wald procedure is to add two (imaginary) successes and two (imaginary) failures to the sample and then apply the ordinary Wald method to that adjusted sample.  The adjusted Wald procedure therefore is: p-star +/- 1.96*sqrt[(p-star)*(1-p-star)/(n+4)], where p-star = (# of successes + 2)/(n+4).
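The adjusted formula can be sketched in Python in the same way as the ordinary one.  The function name adjusted_wald_interval and the counts used (3 successes in 20 trials) are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def adjusted_wald_interval(successes, n, z=1.96):
    """Add 2 successes and 2 failures, then apply the Wald formula.

    Returns (lower, upper) for p-star +/- z*sqrt(p-star*(1-p-star)/(n+4)),
    where p-star = (successes + 2) / (n + 4).
    """
    p_star = (successes + 2) / (n + 4)
    margin = z * math.sqrt(p_star * (1 - p_star) / (n + 4))
    return p_star - margin, p_star + margin

# Hypothetical sample (not the class data): 3 successes in 20 trials.
low, high = adjusted_wald_interval(3, 20)
print("p-star:", round(5 / 24, 3))
print("95% adjusted Wald interval:", round(low, 3), "to", round(high, 3))
```

Note that this sketch does not truncate the endpoints, so with very few successes the lower endpoint can fall below 0.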

i) Change the second window on the applet to “adjusted Wald.”  Set the actual value of the population proportion to p=.5 and the sample size to n=10.  Again ask for 200 intervals at a time until you have generated a total of 10,000 of them.  How many and what percentage of these simulated samples produce a confidence interval that succeeds in capturing the actual value of p?  Is this percentage of successful intervals fairly close to the confidence level (95%)?  Would you say that the adjusted Wald procedure performs well in this case?  Does the adjusted Wald method perform better than the (ordinary) Wald method?  Explain briefly.

j) Repeat i) with n=10 and p=.2.

k) Reconsider our class survey on time travel.  Use the adjusted Wald method to construct a 95% confidence interval for p1, the proportion of all Cal Poly students who would choose to travel to the future.  How does this interval compare to the one from the (ordinary) Wald method in a)?

l) Reconsider our class survey on the “vegetarian” question.  Use the adjusted Wald method to construct a 95% confidence interval for p2, the proportion of all Cal Poly students who are vegetarians.  How does this interval compare to the one from the (ordinary) Wald method in b)?