Stat 322 – HW 3
Due Friday, Jan. 26
1) Researchers studied
the behavior of drivers on a rural interstate highway in
(a) Specify the null and alternative hypotheses. Is this a one-sided or a two-sided test?
(b) The researchers found that 5690 of 12,931 vehicles in their sample were exceeding the speed limit. Calculate an appropriate test statistic and p-value. Show your work/include output.
(c) Are the technical conditions for this procedure satisfied?
(d) Would you reject H0 at the α = .01 significance level? How about at the α = .0001 significance
level? Would you say that the data provide
very strong evidence against H0? Explain.
(e) Does the test result say anything about
how much the proportion of speeders in the
population differs from one-half?
(f) Determine and interpret a 99%
confidence interval for the proportion of speeders in the population.
(g) Explain why it was important for the
design of the study that the device measuring speed was
hidden.
(h) Would you generalize the results of
this study to all drivers on all roads in the
briefly.
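As a rough cross-check of a hand or Minitab calculation for parts (b) and (f), the one-proportion z statistic and the 99% interval can be sketched in Python from the counts given in (b). The null value of one-half is taken from part (e); the variable names are illustrative, and the two-sided p-value is shown only as one possibility, since choosing the direction of the test is the subject of part (a).

```python
from math import erfc, sqrt
from statistics import NormalDist

# Counts from part (b); p0 = 0.5 is the null value referred to in part (e).
x, n, p0 = 5690, 12931, 0.5
p_hat = x / n

# The z statistic uses the standard error computed under the null hypothesis.
se_null = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null

# Two-sided p-value, 2 * P(Z >= |z|), written with erfc so that a very small
# tail area does not round to zero. (Assumption: for a one-sided alternative
# in the observed direction, halve this value.)
p_two_sided = erfc(abs(z) / sqrt(2))

# The 99% interval uses the standard error based on the sample proportion.
z_star = NormalDist().inv_cdf(0.995)
se_hat = sqrt(p_hat * (1 - p_hat) / n)
ci_99 = (p_hat - z_star * se_hat, p_hat + z_star * se_hat)

print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, two-sided p = {p_two_sided:.3g}")
print(f"99% CI: ({ci_99[0]:.4f}, {ci_99[1]:.4f})")
```

Note that the test and the interval use different standard errors: the test plugs in the hypothesized proportion, while the interval plugs in the sample proportion.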
2) Problem 38 (p. 343-4), parts a-c
3) Consider four samples of
hypothetical sleeping times. 
| Sample number | Sample size | Sample mean | Sample std. dev. |
| 1 | 10 | 6.6 | .825 |
| 2 | 10 | 6.6 | 1.597 |
| 3 | 30 | 6.6 | .825 |
| 4 | 30 | 6.6 | 1.597 |
(a) Between samples 1 and 2, which do you think supplies stronger evidence that μ ≠ 7 (that the
population mean sleep time differs from 7 hours)? In other words, which sample (1 or 2) would
produce a smaller p-value of the appropriate test of significance? Explain.
(b) Between samples 1 and 3, which do you think supplies stronger evidence that μ ≠ 7 (that the
population mean sleep time differs from 7 hours)? In other words, which sample (1 or 3) would
produce a smaller p-value of the appropriate test of significance? Explain.
(c) For each of these four samples, use Minitab to calculate the p-value for
testing that the population mean differs from 7 hours.
(d) With which of the samples do you have enough evidence to reject the null
hypothesis at the .05 level and conclude that the mean sleeping time is in fact
different from seven hours?
(e) Comment on whether your conjectures in (a) and (b) are confirmed by the test
results.
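For readers who want to check their Minitab output in part (c), the one-sample t statistic t = (x̄ − 7) / (s / √n) and its two-sided p-value can be sketched directly from the summary statistics in the table. This is a minimal sketch using only the Python standard library: the helper names `t_pdf` and `two_sided_p` are made up for illustration, and the tail area is approximated by numerically integrating the t density (Simpson's rule), so it may differ from Minitab's value in the last few digits.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 - 2 * (area under the density from 0 to |t|)."""
    b = abs(t_stat)
    h = b / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * total * h / 3

mu0 = 7  # hypothesized population mean sleep time (hours)

# Sample number: (sample size, sample mean, sample std. dev.), from the table.
samples = {
    1: (10, 6.6, 0.825),
    2: (10, 6.6, 1.597),
    3: (30, 6.6, 0.825),
    4: (30, 6.6, 1.597),
}

results = {}
for k, (n, xbar, s) in samples.items():
    t_stat = (xbar - mu0) / (s / sqrt(n))
    results[k] = (t_stat, two_sided_p(t_stat, n - 1))
    print(f"sample {k}: t = {results[k][0]:.3f}, df = {n - 1}, p = {results[k][1]:.4f}")
```

Because all four samples share the same mean, any differences among the p-values come only from the sample size and the standard deviation, which is the comparison parts (a) and (b) ask about.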
4) The 2001-2002 National Health and Nutrition Examination Surveys
(NHANES) study asked subjects about their smoking habits. One of the questions was whether the person
has smoked at least 100 cigarettes in his/her life. The 2328 people who answered “yes” were asked
to report the age at which they started smoking. The responses are in SmokingStart.mtw. Suppose we
want to test whether the mean age at which smokers begin to smoke differs from
18 years.
(a) Produce and describe a
dotplot or histogram of these data. In
particular, do they appear to follow a normal distribution?
(b) One way to visually
assess whether a normal model can be reasonably applied to a sample of data is
through a probability plot. Choose Graph
> Probability Plot, leave it selected to “Single” and click OK, enter C1 in
the Graph variables box and click OK. If
the data behave like a normal distribution, this will produce a straight
line. It can be visually easier to
assess the fit of a straight line rather than of a curve. Do the red dots follow a linear pattern?
(c) Are the technical
conditions met for a one-sample t-test for these data?
(d) Carry out a one-sample t-test
by stating the hypotheses in symbols and in words and calculating the test
statistic and p-value. Include a
well-labeled sketch of the sampling distribution for the test statistic, and
indicate the area represented by the p-value.
Also indicate whether the sample mean differs significantly from 18 at
the .10 level.
(e) Summarize what you
learned in this study. Your summary should touch on describing the sample data,
whether the technical conditions are met, and if so, the conclusion you would
draw, in English, from this inference procedure.
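The idea behind the probability plot in part (b) can be sketched outside Minitab: sort the data, pair each value with the normal quantile expected at its rank, and see how close the pairs fall to a straight line. The sketch below is only an illustration under stated assumptions. It uses plotting positions (i − 0.5)/n, which may not match Minitab's exact choice; it summarizes straightness with a correlation rather than drawing the plot; and the two samples are hypothetical constructions, not the SmokingStart.mtw data.

```python
from math import exp, sqrt
from statistics import NormalDist, mean

def normal_scores(data):
    """Pair each sorted observation with the standard normal quantile at (i - 0.5)/n."""
    n = len(data)
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z, sorted(data)))

def plot_correlation(data):
    """Correlation between normal scores and sorted data; near 1 means nearly straight."""
    pairs = normal_scores(data)
    zs, xs = zip(*pairs)
    mz, mx = mean(zs), mean(xs)
    sxy = sum((a - mz) * (b - mx) for a, b in pairs)
    szz = sum((a - mz) ** 2 for a in zs)
    sxx = sum((b - mx) ** 2 for b in xs)
    return sxy / sqrt(szz * sxx)

# Hypothetical "ages" (NOT the NHANES data): one sample built to be exactly
# bell-shaped, one built to be strongly right-skewed.
base = [NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
bell_shaped = [18 + 3 * z for z in base]
right_skewed = [12 + 5 * exp(z) for z in base]

print(f"bell-shaped:  r = {plot_correlation(bell_shaped):.3f}")
print(f"right-skewed: r = {plot_correlation(right_skewed):.3f}")
```

A skewed sample shows up as a curved pattern in the plot, which pulls this correlation noticeably below 1.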
5) To consider whether there was evidence of sex
discrimination in the starting salaries offered to men and women, the beginning
salaries for all 32 male and all 61 female skilled, entry-level clerical employees
hired by the Harris Trust and Savings Bank between 1969 and 1977 were obtained (BankSalary.mtw).
(a) Produce a graphical
summary to compare the two salary distributions and comment on what these
reveal.
(b) In this context, what are
the type I and type II errors? Which do
you consider more serious?
(c) Can the difference in the
average starting salaries for males and females be reasonably attributed to
random chance? (Hint: Conduct a test of significance.)
(d) Comment on the technical
conditions for the procedure used in (c).
(e) Remove the male with the
highest salary and repeat the analysis in (c).
Do your conclusions change?
(f) Sometimes when we have
skewed data, the inference procedure can instead be applied to transformed
data. The most useful transformation is
the log transformation, particularly applicable to positively skewed data. Take the natural log of each group (let c3=loge(c1)). Would a
two-sample t-test be appropriate for these transformed data?
(g) Carry out the two-sample t-test
on the transformed data. Do your
conclusions about the statistical significance of the difference between the
two groups change?
(h) Does this study provide
evidence of gender discrimination? (Hint:
Even if you have eliminated random chance as an explanation, does this study
establish a cause-and-effect relationship? If not, suggest another explanation
for the tendency for higher salaries among the males.)
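The comparison in parts (c), (f), and (g) amounts to running the same two-sample t procedure twice, once on the raw salaries and once on their natural logs. The sketch below shows that workflow with an unpooled (Welch) t statistic and its approximate degrees of freedom. It assumes the unpooled version is the one wanted, the function name `welch_t` is illustrative, and the salary lists are small hypothetical examples, not the BankSalary.mtw data.

```python
from math import log, sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Unpooled two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the first mean
    vb = variance(b) / len(b)  # squared standard error of the second mean
    t_stat = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, df

# Hypothetical starting salaries (made up for illustration only).
males = [5400, 5700, 6000, 6000, 6300, 6600, 8100]
females = [4600, 4800, 5100, 5100, 5400, 5700, 6000, 6300]

raw = welch_t(males, females)
logged = welch_t([log(x) for x in males], [log(x) for x in females])

print(f"raw salaries: t = {raw[0]:.3f}, df = {raw[1]:.1f}")
print(f"log salaries: t = {logged[0]:.3f}, df = {logged[1]:.1f}")
```

On the log scale, a difference in means corresponds to a ratio of typical salaries on the original scale, which is worth keeping in mind when wording the conclusion in part (g).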