Stat 322 – HW 6
Due Friday, Feb. 23
Remember to include your Minitab output.
1) Parents of
children who speak at a young age like to believe that this bodes well for the
child exhibiting high intelligence later in life. To investigate this
possibility, researchers collected data on the age of first speaking (in
months) and score on the Gesell aptitude test taken later in life for a sample
of 22 children. The data can be found in gesell.mtw.
(a) Produce and
describe (direction, strength, and form) a scatterplot of Gesell score vs. age
of first speaking.
(b) Determine the
regression equation for predicting a child’s Gesell score from the age at which
he/she first speaks. Report the equation, along with the value of R²,
and superimpose the line on the scatterplot. Provide an interpretation for the R²
value.
(c) Provide
interpretations in context of the estimated slope and intercept coefficients.
(d) Do any of the
children appear to be outliers in the age variable? If so, what is the
ID number for this child? How long did it take him/her to speak? Also report
the residual value for this child, and comment on whether it is exceptionally
large (in absolute value) compared to other residual values.
(e) Remove this
child from the analysis. Then reproduce a scatterplot and recalculate the
regression equation and value of R². Comment on how these
have changed.
(f) Now also remove
the child who took the next longest time to speak. Again produce a scatterplot
and recalculate the regression equation and value of R². Comment again on
how these have changed.
(g) Write a
paragraph explaining (as if to someone with no formal knowledge of statistics)
why these summary statistics have changed so much and summarizing what
these data reveal concerning the relationship between age of first speaking and
aptitude for children.
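For reference, the quantities requested in parts (b) through (f) can be sketched outside Minitab as follows. This is only an illustration of how a least-squares line and R² are computed and how they can shift when one point is dropped. The function name fit_line and the numbers are hypothetical (they are not the values in gesell.mtw), and your submitted answers should still come from Minitab output.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / syy
    return intercept, slope, r_squared


# Hypothetical illustration data, NOT the gesell.mtw values.
# The last point has an unusually large x value.
x_all = [9.0, 10.0, 11.0, 12.0, 15.0, 40.0]
y_all = [100.0, 95.0, 104.0, 98.0, 101.0, 60.0]

full_fit = fit_line(x_all, y_all)
reduced_fit = fit_line(x_all[:-1], y_all[:-1])  # refit without the extreme point

print("all points:      intercept=%.2f slope=%.3f R^2=%.3f" % full_fit)
print("extreme removed: intercept=%.2f slope=%.3f R^2=%.3f" % reduced_fit)
```

Comparing the two printed lines shows the kind of before/after comparison that parts (e) and (f) ask you to describe.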
2) problem 20 (p. 518)
3) The file TVlife.mtw lists the life expectancy and the number of people per television set in a sample of 22 countries.
(a) Produce and describe a scatterplot of life expectancy vs. people per television set.
(b) Take a log transformation of the people per TV variable. Would it be appropriate to use the regression model with life expectancy and log(people per television)? (Discuss the residual plots.)
(c) Is the relationship between life expectancy and log(people per television) statistically significant? (State the hypotheses and report the test statistic, p-value, decision, and conclusion.)
(d) Since the association is so strongly negative, one might conclude that simply sending television sets to the countries with lower life expectancies would cause their inhabitants to live longer. Comment on this argument.
4) problem 73 (p. 551)
5) The data in mammals.mtw report the average gestation period (in
days) and the average longevity (in years) for a variety of mammals.
(a) Produce and
discuss a scatterplot of gestation period vs. longevity.
(b) Determine the
regression equation for predicting gestation period from longevity (Fitted Line
Plot). Comment on how well the line seems to describe the relationship between
the variables.
(c) Conduct a
residual analysis to investigate whether the assumptions of the regression
model are satisfied here. Comment on
your findings.
(d) Take the
logarithm (base 10) of each variable, and examine a scatterplot of
log(gestation) vs. log(longevity). Does
the relationship appear to be roughly linear?
(e) Determine the
regression equation for predicting log(gestation) from log(longevity). Also report the value of R². Then conduct a residual analysis, and comment
on your findings.
(f) Use this model
to form a 95% prediction interval for the gestation period for a species of
mammal whose longevity is 12 years. [Hint: First find this prediction
interval for log(gestation) using “Options” under Stat > Regression > Regression, then “back-transform” to find the interval for
gestation.]
(g) Use this model
to form a 95% prediction interval for the gestation period for a mammal whose longevity is 20 years. How does this interval compare (center and
width) to the previous one?
(h) Use this model
to predict (with a point estimate and with a 95% prediction interval) the
gestation period for humans, whose longevity is about 75
years. Is your prediction
reasonable? Explain.
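The “back-transform” step in the hint for part (f) can be sketched as follows, assuming base-10 logarithms as in part (d). The fitted value and interval endpoints below are hypothetical placeholders, not Minitab output; substitute the values you obtain for log(gestation). The helper name back_transform_log10 is also just an illustration.

```python
import math


def back_transform_log10(value_on_log_scale):
    """Undo a base-10 log transformation: if v = log10(y), return y."""
    return 10 ** value_on_log_scale


# The predictor value entered in Minitab is on the log scale,
# e.g. log10(12) for a longevity of 12 years.
x_new = math.log10(12)

# Hypothetical fit and 95% prediction interval for log10(gestation).
# Replace these with the numbers from your own Minitab output.
fit_log = 2.5
lower_log, upper_log = 2.0, 3.0

fit_days = back_transform_log10(fit_log)
lower_days = back_transform_log10(lower_log)
upper_days = back_transform_log10(upper_log)

print("predictor on log scale: %.4f" % x_new)
print("predicted gestation (days): %.1f" % fit_days)
print("95%% prediction interval (days): (%.1f, %.1f)" % (lower_days, upper_days))
```

Note that an interval that is symmetric about the fit on the log scale is generally not symmetric about the back-transformed prediction, which is worth keeping in mind when comparing centers and widths in parts (g) and (h).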