Stat 217 - Review Problems Solutions
Activity 12-15
a. z = (13.9 - 12.5)/1.5 = .93. At 3 months, Benjamin Chance’s weight was about 9/10 of a standard deviation above the average.
b. Pr(weight > 13.9) = Pr(Z > .93) ≈ .18
We must assume that the weights of 3-month-old American babies are normally distributed.
c. By the empirical rule, to be in the middle 68% of weights, his weight would need to be within one standard deviation of the mean, so within 17.25 ± 2 lbs, that is, between 15.25 lbs and 19.25 lbs.
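As a quick numeric check of parts (a) to (c), here is a minimal Python sketch. It assumes, as the solution does, that the weights are normally distributed; the variable names are illustrative and do not come from the activity.

```python
from statistics import NormalDist

# Values taken from the solution above.
mean_3mo, sd_3mo = 12.5, 1.5   # mean and SD used in part (a)
weight = 13.9                  # observed weight in part (a)

# (a) z-score: how many SDs the observation lies above the mean
z = (weight - mean_3mo) / sd_3mo

# (b) upper-tail probability, valid only under the normality assumption
upper_tail = 1 - NormalDist().cdf(z)

# (c) empirical rule: the middle 68% lies within one SD of the mean
mean_c, sd_c = 17.25, 2.0      # mean and SD used in part (c)
middle_68 = (mean_c - sd_c, mean_c + sd_c)

print(f"z = {z:.2f}, Pr(Z > z) = {upper_tail:.3f}, middle 68% = {middle_68}")
```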
Activity 13-15 (p. 272):
a. Axis label: sample proportion that are bird owners
The CLT says this sampling distribution will be approximately normal, centered at .05, with a standard deviation of √(.05(.95)/80000) = .000771.
b. This standard deviation is much
smaller because we are assuming π is smaller (.05 rather than .333,
further away from .5, see Activity 13-7).
c. z = (.046-.05) / .000771 = -5.19. This is a very unusual z-score. Pr(Z< -5.19) ≈ 0 – so the survey
does provide convincing evidence that the population proportion who own a pet
bird is not 5% (we observed a sample result that pretty much never happens when
π = .05, so we are convinced π
≠ .05).
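The standard deviation in part (a) and the z-score in part (c) can be reproduced with a short Python sketch. The numbers come from the solution above; the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

pi_0 = 0.05     # hypothesized population proportion of bird owners
n = 80000       # sample size used in part (a)
p_hat = 0.046   # observed sample proportion used in part (c)

# SD of the sample proportion under the CLT: sqrt(pi(1 - pi)/n)
sd_p_hat = sqrt(pi_0 * (1 - pi_0) / n)

# Standardize the observed proportion and find the lower-tail probability
z = (p_hat - pi_0) / sd_p_hat
lower_tail = NormalDist().cdf(z)

print(f"SD = {sd_p_hat:.6f}, z = {z:.2f}, Pr(Z < z) = {lower_tail:.2e}")
```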
Activity 14-11abe (p. 293)
a. No – it would not be surprising to obtain
30.4 mpg for one tankful. This value is
well within one standard deviation of the mean (z = (30.4 -31)/3 = -.2)
b. The CLT predicts the sampling distribution of the sample means would be (approximately) normal with mean 31 mpg and standard deviation 3/√30 = .5477 mpg. According to our sketch, 30.4 would not be an unusual mean value for a sample of 30 cars to obtain.
c. The CLT predicts a sampling distribution that would be (approximately) normal with mean 31 mpg and standard deviation 3/√60 = .3873 mpg. According to our sketch, 30.4 would not be a very unusual mean value for a sample of 60 cars to obtain (it is not in the extreme tail of the distribution).
d. The CLT predicts a sampling distribution that would be normal with mean 31 mpg and standard deviation 3/√150 = .2449 mpg. According to our sketch, 30.4 would be a somewhat unusual mean value for a sample of 150 cars to obtain.

e. Only part (a): we cannot calculate a probability there unless we know that the population distribution is normal. In the other parts the population shape does not matter: because all of these sample sizes are at least 30, none of these responses depends on knowing the shape of the population distribution.
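To see why the same 30.4 mpg becomes more unusual as the sample grows, the following Python sketch recomputes the standard deviation of the sample mean and the corresponding z-score for each sample size. It uses only the values quoted above; treating the single tankful as n = 1 is my framing.

```python
from math import sqrt

mu, sigma = 31.0, 3.0   # population mean and SD in mpg, from the solution
observed = 30.4         # observed (mean) mpg

results = {}
for n in (1, 30, 60, 150):        # n = 1 is the single tankful in part (a)
    se = sigma / sqrt(n)          # SD of the sample mean
    z = (observed - mu) / se      # standardized distance from 31 mpg
    results[n] = (se, z)
    print(f"n = {n:3d}: SD = {se:.4f} mpg, z = {z:.2f}")
```

The z-score moves farther from 0 as n increases, which matches the progression from "not surprising" to "somewhat unusual" in parts (a) through (d).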
Activity 15-7ab (p. 303)
a. With n = 75 and p = .45, because 75(.45) = 33.75 and 75(.55) = 41.25 both exceed 10 and we are considering random samples, the CLT says the distribution of the sample proportions will be approximately normally distributed with mean = .45 and standard deviation = √(.45(.55)/75) ≈ .0574.
b. Axis label: sample proportion of candies that
are orange
Reasonable
guesses would be 10-20%
Activity 15-8abd
a. Now with n = 175, the CLT will still apply and claims that the distribution of sample proportions will be approximately normally distributed with mean = .45 and standard deviation = √(.45(.55)/175) ≈ .0376. The only change from when the sample size was 75 is the standard deviation – which is smaller now.
b.

Reasonable guesses would be 5-15%.
d. This probability is smaller than when the
sample size is 75. This makes sense because
the standard deviation (spread) has decreased and thus there are fewer sample
proportions as far from the center of .45.
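The condition check and the two standard deviations from Activities 15-7 and 15-8 can be sketched in Python as follows. The function names are hypothetical helpers, and the threshold of 10 is the one used in the solution to 15-7(a).

```python
from math import sqrt

def clt_conditions_met(pi, n, threshold=10):
    """Check that the expected successes and failures both reach the threshold."""
    return n * pi >= threshold and n * (1 - pi) >= threshold

def sd_of_sample_proportion(pi, n):
    """SD of the sample proportion under the CLT: sqrt(pi(1 - pi)/n)."""
    return sqrt(pi * (1 - pi) / n)

pi = 0.45  # proportion of orange candies used in the solution
sd_by_n = {n: sd_of_sample_proportion(pi, n) for n in (75, 175)}

for n, sd in sd_by_n.items():
    print(f"n = {n}: conditions met = {clt_conditions_met(pi, n)}, SD = {sd:.4f}")
```

The larger sample gives the smaller standard deviation, which is why fewer sample proportions fall far from .45 when n = 175.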
Activity 16-14 (p. 327-8)
a. A 95% confidence interval for the proportion of all viewers who favored
Informal: .54 ± 2(standard error) = .54 ± 2(.01434) = .54 ± .0287 = (.511, .569)
Applet: (output not reproduced)
b. Yes – since all the values in this interval are greater than .50, this suggests that more than ½ the population favored
Activity 16-15 (d)-(f)
d. If we were to repeat this procedure many,
many times, always using random samples of 116 pages of Sports Illustrated (and 130 pages of Soap Opera Digest), 95% of
the time we would create intervals that would contain the population proportion
of the magazine’s pages that contain ads.
e. Yes – each interval contains the sample proportion of pages with ads.
f. This question was silly because this value (the sample proportion, p̂) is the center of the interval. We cannot create the interval without it.
Activity 17-13 ***
a. We can determine this: in the sample, p̂ = 497/1309 = .38 considered themselves to be political moderates. This is clearly greater than 1/3.
b. We want to find the probability that more than 497 in a sample of 1309 consider themselves moderates, in other words Pr(p̂ > 497/1309 = .380), where we are told π = .333.
The CLT applies here because 1309(.333) = 435.9 > 10, 1309(.667) = 873.1 > 10, and we have a random sample.
z = (.380 - .333)/√(.333(.667)/1309) = 3.61, so p-value = Pr(Z > 3.61) ≈ .0002
(you should realize this z-value corresponds to a very small probability even without using Table II or technology)

c. You expect the p-value to increase since the observed sample proportion (124/327 = .38) is the same but the sample size is smaller, so there will be more random sampling variability. This implies the observed sample result is less surprising, which corresponds to a larger p-value.
d.

The p-value did indeed increase.
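A short Python sketch of the one-sided test in parts (b) through (d) is given below. It uses the rounded values from the solution (p̂ = .380, π = .333) for both sample sizes; the function name is a hypothetical helper, and the p-value for n = 327 is computed here rather than taken from the original.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_test(p_hat, pi_0, n):
    """Return (z, p-value) for the alternative that the proportion exceeds pi_0."""
    sd = sqrt(pi_0 * (1 - pi_0) / n)   # SD of p-hat when the null is true
    z = (p_hat - pi_0) / sd
    return z, 1 - NormalDist().cdf(z)

z_large, p_large = one_sided_test(0.380, 0.333, 1309)  # part (b)
z_small, p_small = one_sided_test(0.380, 0.333, 327)   # parts (c)-(d)

print(f"n = 1309: z = {z_large:.2f}, p-value = {p_large:.4f}")
print(f"n =  327: z = {z_small:.2f}, p-value = {p_small:.4f}")
```

With the same sample proportion, the smaller sample gives a z-score closer to 0 and therefore a larger p-value.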
Activity 18-8 (p. 365)
a. z = (.16 - .50)/√(.50(.50)/100) = -6.8
b.

c. No – this significance test has no meaning
because we know that women constitute
less than half of the entire 2007 U.S. Senate since we have taken a census of
the population of interest. We know they
make up exactly 16% of the 2007 Senate.
Activity 19-17 (p. 389)
a. The observational units are the adult Americans who were interviewed by the GSS. The variable is the number of close friends that the adult American has. This variable is quantitative.
b. A t-interval
is valid in spite of the strong right skew because the sample size is very
large (1467).
c. 95% confidence interval for μ, the mean number of close friends in the population of all adult Americans.
Informal: 1.987 ± 2(1.7708/√1467) = 1.987 ± .0924 = (1.894, 2.079)
Applet: (output not reproduced)
We are 95% confident that the mean number of close
friends in the population of American adults is between 1.894 and 2.079
friends.
d. The reasonable interpretations of this interval are:
· You can be 90% confident that the mean number of close friends in the population is between the endpoints of this interval.
· If you repeatedly took random samples of 1467 people and constructed t-intervals in this same manner, 90% of the intervals in the long run would include the population mean number of close friends.
e. “Ninety
percent of all people in this sample reported a number of close friends within
this interval” is incorrect because that is not what 90% confidence refers
to. See either of the statements above
for a correct interpretation.
“If
you took another sample of 1467 people, there is a 90% chance that its sample
mean would fall within this interval” is incorrect because we are not
trying to capture sample means – we are trying to capture the population
mean.
“If
you repeatedly took random samples of 1467 people, this interval would contain
90% of your sample means in the long run” is not a correct interpretation
because the interval would definitely contain the one sample mean you used to
create the interval. We cannot predict
how many of the other sample means it would contain – the interval procedure is
estimating the population mean. We are
not saying other sample means should be within 2 standard deviations of the one
we observed, but that sample means in general should fall within 2 standard
deviations of the actual population mean.
It is incorrect to say “this interval captures the number of close friends for 90% of the people in the population” because this interval estimates the mean number of friends – not the number of friends for any individual person.
f. If the sample size were larger, the interval would have the same midpoint, but would be narrower.
If the sample mean were larger the
interval would have a larger midpoint, but would have the same width.
If the sample values were less spread out, the standard deviation would be smaller, so the margin-of-error would be smaller, so the interval would be narrower (but would have the same midpoint).
If every person in the sample reported one
more close friend, the sample mean would be larger (by 1), so the midpoint of
the interval would increase by 1, but the width would be unchanged.
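The informal interval from part (c) and the what-if claims in part (f) can be checked with a small Python sketch. The summary statistics are those quoted in part (c); the alternative scenarios (four times the sample size, a standard deviation of 1.0) are hypothetical values chosen only for illustration.

```python
from math import sqrt

def informal_interval(mean, sd, n):
    """Informal 95% interval: sample mean plus or minus 2 standard errors."""
    margin = 2 * sd / sqrt(n)
    return mean - margin, mean + margin

def width(interval):
    return interval[1] - interval[0]

base = informal_interval(1.987, 1.7708, 1467)          # values from part (c)
bigger_n = informal_interval(1.987, 1.7708, 4 * 1467)  # hypothetical larger sample
less_spread = informal_interval(1.987, 1.0, 1467)      # hypothetical smaller SD
plus_one = informal_interval(2.987, 1.7708, 1467)      # everyone reports one more friend

for label, ci in [("base", base), ("bigger n", bigger_n),
                  ("less spread", less_spread), ("plus one", plus_one)]:
    print(f"{label:12s} ({ci[0]:.3f}, {ci[1]:.3f})  width = {width(ci):.3f}")
```

The base endpoints agree with (1.894, 2.079) up to rounding. Quadrupling n halves the width, a smaller standard deviation narrows the interval, and adding one friend per person shifts the interval without changing its width.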
Activity 20-8 (p. 404)
a. In this context μ represents the average IQ of all people who claim to have had an intense experience with a UFO.
b. This is a one-sided test, because we wish
to test whether or not the average IQ of this group is greater than 100.
c. In order for this procedure to be valid, we
need to know that the population of IQ scores for this group is approximately
normally distributed, because the sample size is small (n = 25 < 30). We also
need to be willing to believe that this sample is representative of all
individuals who claim to have had such an experience.
d. t = (101.6 - 100)/(s/√25) = 0.90
e. From the graph, we can ballpark the p-value
between .1 and .33 (actual value
.19).
Based on the t value being less than 1 and the large amount shaded on
the graph, this is not a surprising outcome when the population mean equals
100.
f. If the average IQ of this group is really 100, then we could expect to see a random sample of 25 people from this group with an average IQ of at least 101.6 in about 19% of samples by random chance alone. Since this would be a fairly common occurrence, we have no reason to doubt that the mean IQ of the population (those who claim to have had an intense experience with a UFO) is 100.
Activity 21-13
a. This bar graph indicates that mothers given the placebo were about 3 times as likely to have babies that were HIV positive as were the mothers given AZT.
b.
i) The data are from randomly
assigning subjects to two treatment groups.
This condition is met.
ii) The number of successes and failures
in each group should be at least 5. This
condition is also met.
c.
The null hypothesis is that AZT and a
placebo are equally effective in reducing mother-to-infant transmission of
AIDS. Specifically, the proportion of
HIV-positive babies born to mothers who could potentially take AZT is the same
as the proportion of HIV-positive babies born to mothers who could potentially take
a placebo. In symbols, the null hypothesis is H0: πAZT
= πplacebo.
The alternative hypothesis is that AZT
is more effective than a placebo for reducing mother-to-infant transmission of
AIDS, or that the proportion of HIV-positive babies born to mothers who could potentially
take AZT is smaller than the proportion of HIV-positive babies born to mothers
who could potentially take a placebo. In
symbols, the alternative hypothesis is Ha: πAZT <
πplacebo.

With such a small p-value, reject H0 at the α = .01 significance
level.
We have very strong statistical
evidence that AZT is more effective than a placebo for reducing
mother-to-infant transmission of AIDS.
d. We are 99% confident that the difference in HIV transmission rates (AZT minus placebo) is between -23.95 and -5.33 percentage points. Since the values in our interval are all negative, we know that the AZT transmission rate is lower than the placebo transmission rate by somewhere between 5.33 and 23.95 percentage points.
e. Since this was a well-designed experiment, we can conclude that AZT caused the observed difference in HIV transmission rates. If AZT and a placebo were equally effective in reducing mother-to-infant transmission of AIDS, we would virtually never see sample results as extreme as or more extreme than those we saw in this experiment by random assignment alone. We are 99% confident in concluding that AZT lowers the HIV transmission rate by somewhere between 5.33 and 23.95 percentage points relative to a placebo.
Activity 22-5 (p. 451)
From Activity 22-1: male n = 654, x̄ = 1.861, s = 1.777; female n = 813, x̄ = 2.089, s = 1.760
Note:
the difference in sample means (female – male) = 2.089 - 1.861 = .228
a. If all of the women sampled had one more
friend, the mean for the women would increase by one, and all other statistics
would remain the same. This would
increase the size of the difference in the two means (new difference = 1.228)
and therefore would also increase the absolute value of the test statistic (the
difference between the sample means would increase but the denominator would
not change), and thus would decrease
the p-value.
Note: Since we are using a two-sided alternative, it doesn’t matter which direction we subtract in to find the p-value.
b. If all of the men sampled had one more friend, the mean for the men would increase by one, and all other statistics would remain the same. The absolute value of the new difference in means would actually be larger (2.089 – 2.861 = -.772). This would increase the absolute value of the test statistic (the absolute difference between the sample means would increase, but the denominator would not change), and thus would decrease the p-value.
c. If every man and every woman sampled had one
more close friend than they originally reported, then the sample means for the
men and for the women would increase by one.
The sample standard deviations and sample sizes would not change. The difference between the sample means would
not change, so the test statistic value would not change, and thus the p-value would not change.
d. If both sample standard deviations were larger, the denominator of the test statistic would be larger, and thus the test statistic would be smaller (in absolute value). This would make the p-value larger.
e. If both sample sizes were larger, the
denominator of the test statistic would be smaller, and thus the test statistic
would be larger (in absolute value).
This would make the p-value smaller.
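The reasoning in parts (a) through (e) can be checked numerically with the sketch below. It assumes the unpooled two-sample t statistic, (x̄_female − x̄_male)/√(s_f²/n_f + s_m²/n_m); the activity does not state which form it uses, so treat that as an assumption. The function name and the scenario values for parts (d) and (e) are hypothetical.

```python
from math import sqrt

def two_sample_t(mean_f, sd_f, n_f, mean_m, sd_m, n_m):
    """Unpooled two-sample t statistic for (female mean - male mean)."""
    se = sqrt(sd_f**2 / n_f + sd_m**2 / n_m)
    return (mean_f - mean_m) / se

# Summary statistics from Activity 22-1, as quoted above
t_original = two_sample_t(2.089, 1.760, 813, 1.861, 1.777, 654)

t_women_plus_one = two_sample_t(3.089, 1.760, 813, 1.861, 1.777, 654)  # part (a)
t_men_plus_one = two_sample_t(2.089, 1.760, 813, 2.861, 1.777, 654)    # part (b)
t_both_plus_one = two_sample_t(3.089, 1.760, 813, 2.861, 1.777, 654)   # part (c)
t_larger_sds = two_sample_t(2.089, 2.5, 813, 1.861, 2.5, 654)          # part (d), hypothetical SDs
t_larger_ns = two_sample_t(2.089, 1.760, 1626, 1.861, 1.777, 1308)     # part (e), doubled sample sizes

for label, t in [("original", t_original), ("women +1", t_women_plus_one),
                 ("men +1", t_men_plus_one), ("both +1", t_both_plus_one),
                 ("larger SDs", t_larger_sds), ("larger n", t_larger_ns)]:
    print(f"{label:11s} t = {t:.3f}")
```

A larger absolute t corresponds to a smaller two-sided p-value, so the direction of each change in |t| matches the direction of the p-value change described in the corresponding part.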
Activity 22-8 (p. 453)
a.
age - comparison of means
weight – comparison of means
gender – comparison of proportions
number of cigarettes smoked – comparison of means
whether the person made a previous attempt to quit
smoking – comparison of proportions
b. The researchers would hope to fail to
reject the null hypotheses in these tests because the null hypotheses would be
that the two groups (those that use the nicotine lozenge and those that don’t)
are identical with regard to each of these background variables. The researchers would be hoping that they
would not find a statistically significant difference between the groups on any
of these variables so that a difference between the groups on the response
variable could be attributed to the nicotine lozenge.