Stat 324 – HW 2

Due at the beginning of class, Friday, April 10, by 2pm

 

Also notice that instructions for the project assignment have been posted.

 

1) In class we discussed minimizing the sum of the absolute errors (residuals) and the sum of squared residuals as criteria for obtaining a “best fitting” line.  Another method is to determine the “median-median” line. 

(a) Apply this method to the airfare data as follows: 

1.  Divide the data along the horizontal axis into 3 equally sized groups. Find the median x value and the median y value for each group (so get 3 points A, B, and C).

Hint: It might be helpful to sort the data by the distances:

Then press OK.

Then look through columns 5 and 6 to determine the 3 groups and (manually) determine the medians of the four x values and the four y values in each group.

2. Determine the equation for the line connecting points A and C:

            Hint: Slope = (y2 – y1)/(x2 – x1); Intercept = y1 – (slope)x1

3. Determine how far point B is from the line (the residual for point B using this line) 

4. To create the median-median line, use the slope from step 2 but shift the line one-third of the way toward point B by adjusting the intercept.

5. Write out the equation for your line. 
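Steps 1-5 can be sketched in Python as a check on hand calculations. This is a minimal sketch, not the Minitab procedure: the function name `median_median_line` and the twelve data points are hypothetical (they are not the airfare data), and the code assumes the number of points is divisible by 3.

```python
from statistics import median

def median_median_line(xs, ys):
    """Return (slope, intercept) of the median-median line.

    Assumes len(xs) is divisible by 3, so the three groups are equal in size.
    """
    pairs = sorted(zip(xs, ys))              # sort the points by x
    k = len(pairs) // 3
    groups = [pairs[:k], pairs[k:2 * k], pairs[2 * k:]]

    # Step 1: summary point (median x, median y) for each group -> A, B, C
    (xa, ya), (xb, yb), (xc, yc) = [
        (median(p[0] for p in g), median(p[1] for p in g)) for g in groups
    ]

    # Step 2: line through A and C
    slope = (yc - ya) / (xc - xa)
    intercept_ac = ya - slope * xa

    # Step 3: residual of B from the A-C line
    resid_b = yb - (slope * xb + intercept_ac)

    # Step 4: keep the slope, shift the intercept one-third of the way toward B
    intercept = intercept_ac + resid_b / 3
    return slope, intercept

# Hypothetical data: y = 2x + 1, with the middle group shifted up by 3
xs = list(range(1, 13))
ys = [2 * x + 1 + (3 if 5 <= x <= 8 else 0) for x in xs]

slope, intercept = median_median_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} x")
```

With these made-up points, A = (2.5, 6), B = (6.5, 17), and C = (10.5, 22), so the A-C line is y = 1 + 2x, B sits 3 units above it, and the shifted intercept is 1 + 3/3 = 2.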

 

(b) The following scatterplot shows the median-median line and the least squares regression line. Which is which? (Include a discussion of how the median-median method changes the model for the relationship in terms of the price per mile and the fixed cost.)

(c) Discuss any potential advantages in general to using a median-median line over a least-squares regression line.

(d) Often, we don’t know whether the relationship really is linear in the first place. One way to explore this graphically is through a smoother. For example, we could look at moving averages or moving medians to help explore the shape of the relationship. Minitab uses the “lowess method,” which fits successive linear regression functions in local neighborhoods. Don’t worry too much about the details, but apply it to the airfare data by creating a scatterplot of airfare vs. distance. Then right click on the scatterplot and choose Add > Smoother. Click OK in the next box. What interesting features does this smoother suggest about the behavior of the relationship? Describe what this says in this context and conjecture why such behavior could occur with these data.
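To make the idea of a smoother concrete, here is a minimal sketch of a moving median, one of the simpler smoothers mentioned above. It is not Minitab's lowess method (which fits local regressions); the function name `moving_median`, the window width, and the data are all hypothetical choices for illustration.

```python
from statistics import median

def moving_median(xs, ys, window=3):
    """Smooth y with a centered moving median after sorting by x.

    `window` should be odd. Points near the ends, where a full
    window does not fit, are skipped.
    """
    pairs = sorted(zip(xs, ys))
    half = window // 2
    smoothed = []
    for i in range(half, len(pairs) - half):
        block = [y for _, y in pairs[i - half:i + half + 1]]
        smoothed.append((pairs[i][0], median(block)))
    return smoothed

# Hypothetical data with one unusual y value at x = 3
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2, 4, 30, 8, 10, 12, 14]

smooth = moving_median(xs, ys, window=3)
for x, y in smooth:
    print(x, y)
```

Plotting the smoothed points over the scatterplot and looking for bends or flat stretches is the same kind of visual check the lowess curve provides.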

 

Note: You can access text files for the data in the book at http://www.biz.uiowa.edu/faculty/jledolter/RegressionModeling/ . Typically you can then select all and copy and paste into Minitab. If this doesn’t work, let me know. You should never be typing in data files by hand.  You are free to use packages other than Minitab to carry out these analyses.

 

2) Exercise 2.3 (p. 56)

 

3) Exercise 2.4 (p. 56), parts (a)-(d). Then answer:

(e) Provide interpretations of the slope and intercept coefficients in this context.  Also provide an interpretation of R² for this model.

Answer (f), but you may use the regression output; you do not need to calculate it by hand.

Answer (g), (h), and (i) as in the text.

Then answer

(j) Construct and interpret a 95% confidence interval for  when x = 5.

(k) Construct and interpret a 95% confidence interval for  when x = 5.

 

4) Do people of similar ages tend to marry each other?  A student went to the county courthouse and recorded the ages appearing on marriage licenses for a sample of 24 couples; the data can be found in marriage.mtw.

(a) Fit a regression line for predicting the age of husbands from the age of the wife.

(b) Is there statistically significant evidence that husbands’ ages are related to the wives’ ages in this population?  (State your hypotheses, report the test statistic and p-value, and state your conclusion in context.)

(c) Is there statistically significant evidence that a one-year increase in the wife’s age corresponds to a one-year increase in the husband’s age on average? (State your hypotheses, determine the test statistic and p-value, and state your conclusion in context.)

(d) Fit a regression line for predicting the wife’s age from the husband’s.  Report this equation as well as the test statistic and p-value for the slope.  Now solve this equation for the husband’s age. (Hint: So use algebra to get husband age = …. )  Do you get the same equation as in (a)?  Do you get the same test statistic and p-value for the slope?
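For part (c), note that standard regression output reports the t statistic for a null slope of 0, so a test against a slope of 1 has to be computed separately as t = (b1 – 1)/SE(b1). The sketch below shows that calculation under stated assumptions: the function name `slope_t_statistic` and the five data points are hypothetical (they are not the marriage data), and the p-value would still come from a t distribution with n – 2 degrees of freedom, looked up in Minitab or another package.

```python
from math import sqrt

def slope_t_statistic(xs, ys, slope_null=0.0):
    """Return (b1, se_b1, t) for testing H0: slope = slope_null
    in a simple least-squares regression of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    b1 = sxy / sxx                      # least-squares slope
    b0 = y_bar - b1 * x_bar             # least-squares intercept

    # Residual sum of squares and the standard error of the slope
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se_b1 = sqrt(sse / (n - 2) / sxx)

    t = (b1 - slope_null) / se_b1
    return b1, se_b1, t

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]

b1, se_b1, t_zero = slope_t_statistic(xs, ys, slope_null=0.0)
_, _, t_one = slope_t_statistic(xs, ys, slope_null=1.0)
print(f"slope = {b1:.3f}, SE = {se_b1:.3f}")
print(f"t for H0: slope = 0 -> {t_zero:.3f}")
print(f"t for H0: slope = 1 -> {t_one:.3f}")
```

The two t statistics share the same slope estimate and standard error; only the hypothesized value subtracted in the numerator changes.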

 

5) Exercise 2.7 part (b) (pp. 57-58)