Stat 322 – HW 3 Solutions

 

1) Researchers studied the behavior of drivers on a rural interstate highway in Maryland where the speed limit was 55 miles per hour. They measured speed with an electronic device hidden in the pavement and, to eliminate large trucks, considered only vehicles less than 20 feet long. Suppose that the researchers want to test whether their sample data suggest that the proportion of speeders in the population differs from one-half.

(a) Specify the null and alternative hypotheses. Is this a one-sided or a two-sided test?

Let p represent the proportion of all drivers on rural interstate highways in Maryland that speed.

H0: p = .5 (the population proportion equals one-half)

Ha: p ≠ .5 (the population proportion differs from one-half) – this is two-sided

 

(b) The researchers found that 5690 of 12,931 vehicles in their sample were exceeding the speed limit. Calculate an appropriate test statistic and p-value.  Show your work/include output.

Using the one-sample z-test in Minitab

 

where z = (.440 - .5)/sqrt(.5(.5)/12931) ≈ -13.64
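As a cross-check on the Minitab result, the same statistic can be computed by hand. The following is a minimal Python sketch, not part of the original Minitab workflow; the variable names are illustrative, and it assumes the standard error is computed under H0 as sqrt(p0(1 - p0)/n).

```python
from math import sqrt
from statistics import NormalDist

x, n, p0 = 5690, 12931, 0.5          # speeders, sample size, hypothesized proportion

p_hat = x / n                         # sample proportion
se_null = sqrt(p0 * (1 - p0) / n)     # standard error assuming H0 is true
z = (p_hat - p0) / se_null            # one-sample z statistic

# Two-sided p-value: area in both tails beyond |z|
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}")
print("two-sided p-value below .0001:", p_value < 0.0001)
```

The z-value lands far out in the lower tail, so the two-sided p-value is essentially zero.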

 

(c) Are the technical conditions for this procedure satisfied?

Since the sample size is large (np0 = 12931(.5) = 6465.5 = n(1-p0) ≥ 10), the normality condition is met.  This probably was not a simple random sample, but we can plausibly consider it representative of the population of all drivers on rural interstate highways in Maryland.

 

(d) Would you reject H0 at the α = .01 significance level? How about at the α = .0001 significance level? Would you say that the data provide very strong evidence against H0? Explain.

We have a very small p-value (0.000 < .001), so we would reject at the .01 level.  Minitab doesn’t give us enough information to know if the p-value < .0001.  We can look up .0001 (in Minitab using the inverse cumulative distribution function) to find that it corresponds to z = -3.719; because the test is two-sided, the cutoff is actually the slightly more extreme z = -3.891 (lower-tail area .00005).  Since our z-value is far more extreme than either, we would reject at the .0001 level as well.

Thus we have very strong evidence against H0.
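The cutoff lookup described above can also be done outside Minitab. Here is a small Python sketch, assuming the standard library's NormalDist as a stand-in for Minitab's inverse cumulative distribution function. The second value splits .0001 across both tails, which is the relevant cutoff for a two-sided alternative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# z-value with lower-tail area .0001
z_lower_tail = std_normal.inv_cdf(0.0001)

# For a two-sided test at the .0001 level, each tail gets .00005
z_two_sided = std_normal.inv_cdf(0.0001 / 2)

print(f"lower-tail cutoff: {z_lower_tail:.3f}")   # -3.719
print(f"two-sided cutoff:  {z_two_sided:.3f}")    # -3.891
```

Either way, the observed test statistic is well beyond the cutoff.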

 

(e) Does the test result say anything about how much the proportion of speeders in the population differs from one-half?

No, only that we have strong evidence that p differs from .5.

 

(f) Determine and interpret a 99% confidence interval for the proportion of speeders in the population.

Using Minitab with the confidence level set to 99%:

We are 99% confident that the proportion of the population that speeds is between .429 and .451.
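The interval can be reproduced by hand. This is a hedged sketch that assumes Minitab was using the normal-approximation (Wald) interval, p-hat ± z* sqrt(p-hat(1 - p-hat)/n); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n, conf_level = 5690, 12931, 0.99

p_hat = x / n
# Critical value leaving (1 - conf_level)/2 in each tail (about 2.576 for 99%)
z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
# For an interval, the standard error uses p_hat rather than p0
se = sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - z_star * se, p_hat + z_star * se
print(f"99% CI: ({lower:.3f}, {upper:.3f})")   # (0.429, 0.451)
```

Note that the interval excludes .5, which agrees with the two-sided test rejecting H0.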

 

(g) Explain why it was important for the design of the study that the device measuring speed was hidden.

If the drivers knew a device was determining whether or not they were speeding, this would very likely alter their behavior instead of allowing us to observe their “natural” habits.

 

(h) Would you generalize the results of this study to all drivers on all roads in the U.S.? Explain briefly.

No, the sample was taken only from rural interstate highways in Maryland.  Drivers on these roads and in this area may behave differently from drivers on other roads across the US.

 

2) Problem 38 (p. 343-4), parts a-c

Given: n = 1000, want to see if there is evidence that p < .02 where p is the true proportion of misshelved or unlocatable books in this library.

(a) H0: p = .02 (assume at least 2% of books are misshelved or unlocatable)

Ha: p < .02 (want to know if there is evidence of a smaller “mistake” rate)

 

Since the sample size is large (1000(.02) = 20 > 10 and 1000(.98) > 10) and we have a random sample, we can use the one-sample z-test for the population proportion.

 

In Minitab:

We do not have a small p-value (.129 > .05), so we will fail to reject the null hypothesis.  We do not have convincing evidence that p < .02, so she should not postpone the inventory.  Even though the sample proportion (.015) was less than .02, we can attribute this difference to random chance. It is still plausible that p ≥ .02 in the population.
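The p-value quoted above can be checked with a short calculation. This Python sketch uses only the values stated in the solution (n = 1000, p0 = .02, sample proportion .015); it is an illustration, not the Minitab output itself.

```python
from math import sqrt
from statistics import NormalDist

n, p0, p_hat = 1000, 0.02, 0.015

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
# Ha: p < .02 is one-sided, so the p-value is the lower-tail area only
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.3f}")   # z = -1.13, p-value = 0.129
```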

 

With Normal Probability Calculator applet.


(b) The inventory will be taken if p-value > .05 (we fail to reject H0 at the 5% level of significance).

This will occur if z > -1.645, which means p̂ > .02 - 1.645(sqrt(.02(.98)/1000)) = .0127.

Note, this calculation is consistent with our decision to fail to reject with p̂ = .015 in (a).

 

If p = .01, then P(p̂ > .0127) = P(Z > (.0127 - .01)/sqrt(.01(.99)/1000)) = P(Z > .858) = 1 - .805 = .195. It’s pretty likely that if p = .01 we will postpone the inventory.

Note, lots of rounding discrepancies here.

 

So about 20% of the time (in repeated samples of 1000 books) we would fail to reject the null hypothesis even though p = .01, meaning we would take an inventory unnecessarily – a Type II Error.

 

(c) If p = .05, we will postpone the inventory if p̂ < .0127.

            z = (.0127 - .05)/sqrt(.05(.95)/1000) = -5.41.

            P(Z < -5.41) ≈ 0.

So with the “high” mistake rate of .05, there is a very low probability that we would reject the null hypothesis and postpone the inventory (a Type I Error).
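The error probabilities in (b) and (c) follow the same recipe: find the rejection cutoff under H0, then ask how often the sample proportion falls on either side of it for some other true p. A minimal Python sketch of that recipe follows; the function name `prob_postpone` is hypothetical, and because it keeps the unrounded cutoff its answer for (b) comes out near .194 rather than the .195 obtained with rounded intermediate values.

```python
from math import sqrt
from statistics import NormalDist

n, p0, alpha = 1000, 0.02, 0.05
std_normal = NormalDist()

# Reject H0 (postpone the inventory) when p_hat falls below this cutoff
z_crit = std_normal.inv_cdf(alpha)                  # about -1.645
cutoff = p0 + z_crit * sqrt(p0 * (1 - p0) / n)      # about .0127

def prob_postpone(p_true):
    """P(p_hat < cutoff), i.e. P(reject H0), when the true proportion is p_true."""
    se_true = sqrt(p_true * (1 - p_true) / n)
    return std_normal.cdf((cutoff - p_true) / se_true)

type2_at_01 = 1 - prob_postpone(0.01)   # (b): inventory taken although p = .01
postpone_at_05 = prob_postpone(0.05)    # (c): inventory postponed although p = .05

print(f"cutoff = {cutoff:.4f}")
print(f"P(take inventory | p = .01) = {type2_at_01:.3f}")
print(f"P(postpone | p = .05) = {postpone_at_05:.2e}")
```

Calling `prob_postpone` over a range of true proportions would trace out the power curve of the test.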

 

 

3) Consider four samples of hypothetical sleeping times.  

Sample number    Sample size    Sample mean    Sample std. dev.

1                10             6.6            .825

2                10             6.6            1.597

3                30             6.6            .825

4                30             6.6            1.597

(a) Between samples 1 and 2, which do you think supplies stronger evidence that μ ≠ 7 (that the population mean sleep time differs from 7 hours)?  In other words, which sample (1 or 2) would produce a smaller p-value of the appropriate test of significance? Explain.

With the smaller sample standard deviation, sample 1 will provide more evidence that μ ≠ 7, that the sample mean x̄ of 6.6 didn’t happen by “random chance” alone. The results are more consistently around 6.6 and more convincing that they did not come from a population with μ = 7.

 

(b) Between samples 1 and 3, which do you think supplies stronger evidence that μ ≠ 7 (that the population mean sleep time differs from 7 hours)?  In other words, which sample (1 or 3) would produce a smaller p-value of the appropriate test of significance? Explain.

With the larger sample size, sample 3 will provide more evidence that μ ≠ 7, that the sample mean x̄ of 6.6 didn’t happen by “random chance” (sampling variability) alone.

 

(c) For each of these four samples, use Minitab to calculate the p-value for testing that the population mean differs from 7 hours.
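For readers without Minitab, the four p-values can be approximated from the summary statistics in the table. The Python sketch below is an assumption-laden stand-in for Minitab's one-sample t procedure: it uses only the standard library and obtains the t-distribution tail area by numerically integrating the density with Simpson's rule. The helper names `t_pdf` and `two_sided_p` are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value, 1 - P(-|t| < T < |t|), via Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # total * h / 3 is the area from 0 to |t|; double it by symmetry
    return 1 - 2 * (total * h / 3)

mu0 = 7  # hypothesized mean sleep time in hours
# sample number -> (sample size, sample mean, sample std. dev.)
samples = {1: (10, 6.6, 0.825), 2: (10, 6.6, 1.597),
           3: (30, 6.6, 0.825), 4: (30, 6.6, 1.597)}

p_values = {}
for label, (n, mean, sd) in samples.items():
    t_stat = (mean - mu0) / (sd / sqrt(n))
    p_values[label] = two_sided_p(t_stat, df=n - 1)
    print(f"sample {label}: t = {t_stat:.2f}, p-value = {p_values[label]:.3f}")
```

In practice a statistics package would supply the t-distribution directly; the numerical integration here only avoids an external dependency.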

 

(d) With which of the samples do you have enough evidence to reject the null hypothesis at the .05 level and conclude that the mean sleeping time is in fact different than seven hours?

Just sample 3, that’s the only one with p-value < .05.

 

(e) Comment on whether your conjectures in (a) and (b) are confirmed by the test results.

p-value is smaller for the larger sample size (everything else constant)

p-value is smaller with the smaller sample standard deviation (everything else the same)

4) Suppose we want to test whether the mean age at which smokers begin to smoke differs from 18 years.

(a) Produce and describe a dotplot or histogram of these data.  In particular, do they appear to follow a normal distribution?

The distribution of ages appears skewed to the right overall.  The long right tail precludes this distribution from following a normal distribution.

 

(b)  Do the red dots follow a linear pattern?

No, the red dots do not follow a linear pattern!

 

(c) Are the technical conditions met for a one-sample t-test for these data?

Even though we do not believe the population of ages follows a normal distribution, the sample size is large enough that we can still apply the one-sample t-test.  We will assume the NHANES folks have done a lot of work to ensure that the sample is representative of the population of interest.

 

(d) Carry out a one-sample t-test by stating the hypotheses in symbols and in words and calculate the test statistic and p-value.  Include a well-labeled sketch of the sampling distribution for the test statistic, and indicate the area represented by the p-value.  Also indicate whether the sample mean differs significantly from 18 at the .10 level.

Let μ = mean age at which the population started smoking.

H0: μ = 18

Ha: μ ≠ 18 (want to know if mean age differs from 18 years)

 

x̄ = 18.20 years, s = 5.388 years

 

t = (18.20 – 18)/(5.388/sqrt(2328)) = 1.79

 

With df = 2327, this corresponds to a p-value of 2(.0368) = .074.

This is conceptually similar to thinking about the sampling distribution of x̄ as having mean μ = 18 and standard deviation 5.3885/sqrt(2328) = .1117 (though we are using s instead of σ, so this is a bit of a lie), and finding the probability of a sample mean more than .2 from 18 on either side.
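The hand calculation can be sketched in a few lines of Python. One assumption to flag: instead of the t distribution with 2327 degrees of freedom, this sketch uses the standard normal as an approximation, which is very close at this sample size but is not the exact procedure used in the solution.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s, mu0 = 2328, 18.20, 5.388, 18

se = s / sqrt(n)                 # estimated standard deviation of the sample mean
t_stat = (x_bar - mu0) / se      # one-sample t statistic

# Approximation: with df = 2327 the t distribution is nearly standard normal
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"SE = {se:.4f}, t = {t_stat:.2f}, approximate two-sided p-value = {p_approx:.3f}")
```

The result falls between .05 and .10, matching the conclusion drawn at the 10% level.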

 

Using Minitab:

 

With a p-value of .078, we would say it’s a statistically significant difference at the 10% level (but not at the 5% level).

 

(e) Summarize what you learned in this study. Your summary should touch on describing the sample data, whether the technical conditions are met, and if so, the conclusion you would draw, in English, from this inference procedure.

The distribution of ages shows a very long right tail, with a sample mean age of 18.2 years and sample standard deviation 5.39 years.  Since the sample size is large, we can apply the one-sample t-test to these data to see if we have convincing evidence that the population mean age differs from 18 years. There is some evidence, though not especially strong (.05 < p-value < .10), that the mean age at which the population first began to smoke differs from 18.  This conclusion is valid as long as the sample selected is representative of the larger population of interest (which was not clearly defined).

 

5) To consider whether there was evidence of sex discrimination in the starting salaries offered to men and women, the beginning salaries for all 32 male and all 61 female skilled, entry-level clerical employees hired by the Harris Trust and Savings Bank between 1969 and 1977 were obtained (BankSalary.mtw).

(a) Produce a graphical summary to compare the two salary distributions and comment on what these reveal.

While both distributions show signs of granularity (especially the males), the male salaries tend to be higher than the female salaries, with one high outlier.

 

(b) In this context, what are the type I and type II errors?  Which do you consider more serious?

A type I error would be concluding that there is discrimination when there isn’t.

A type II error would be concluding there is not discrimination when there is.

Opinions may vary but many might find it more objectionable to continue to pay males a higher average salary and not realize that the female employees are being discriminated against.

 

(c) Can the difference in the average starting salaries for males and females be reasonably attributed to random chance? (Hint: Conduct a test of significance.)

Let μ1 = mean female salary at this company and μ2 = the mean male salary.

This is assuming a population larger than the 93 individuals in this sample.

H0: μ1 - μ2 = 0 (same average salary)

Ha: μ1 - μ2 ≠ 0 (there is a difference in the average starting salary between males and females)

(Note: you could do a one-sided test since salary discrimination is commonly assumed to be in favor of the males, but this problem didn’t specifically suggest that.)

 

In Minitab

 

With such a small p-value (note the one-sided p-value would have been even smaller), we easily reject the null hypothesis and conclude that there is a difference in the average starting salaries between men and women in this population.  So while we did not truly have random samples here, we can say that “random chance” is not a viable explanation for the large difference between x̄1 and x̄2 that was observed.

 

(d) Comment on the technical conditions for the procedure used in (c).

The distributions look reasonably symmetric and the sample sizes are reasonable (e.g., both above 30), so we will consider the “normality” condition met.

We do not really have random samples from a larger population or a randomized experiment, but it is reasonable to consider them independent.  So we need to hope these data are representative of the overall “salary assigning process” at this company.

 

(e) Remove the male with the highest salary and repeat the analysis in (c).  Do your conclusions change?

While this decreased the male average, it also decreased the sample standard deviation for the males and the test statistic became even larger.  We would say there is an even more statistically significant difference between the male and female average starting salaries.

 

(f) Sometimes when we have skewed data, the inference procedure can instead be applied to transformed data.  The most useful transformation is the log transformation, particularly applicable to positively skewed data.  Take the natural log of each group (let c3=loge(c1)).  Would a two-sample t-test be appropriate for these transformed data?

The distributions still appear to be symmetric so a t-procedure can still be applied.

 

(g) Carry out the two-sample t-test on the transformed data.  Do your conclusions about the statistical significance of the difference between the two groups change?

The test statistic has not changed much so our conclusion would be the same.

 

Note: If we had applied the transformation to the original data set (including the outlier), this would have helped the outlier be less extreme.

 

The t-test would have been more similar to the case without the outlier (t = -6.05, p-value = .000).

(h) Does this study provide evidence of gender discrimination?  (Hint: Even if you have eliminated random chance as an explanation, does this study establish a cause-and-effect relationship? If not, suggest another explanation for the tendency for higher salaries among the males.)

 

No, this was not a randomized experiment, so we would have to be cautious in drawing a cause-and-effect relationship even with the highly significant p-value.  A possible alternative explanation is that the males were more qualified than the females in the study.