Investigation 7: Speaking and intelligence (assigned Tues Feb 2, due Fri Feb 5)
You may work with one other person on this assignment, handing in one report with both names. Word-processed reports are much preferred to hand-written ones. Please copy/paste relevant, well-labeled Minitab output into a Word file as appropriate.
Parents of children who speak at a young age like to believe that this bodes well for the child exhibiting high intelligence later in life. To investigate this possibility, researchers collected data on the age of first speaking (in months) and score on the Gesell aptitude test taken later in life for a sample of 21 children. The data can be found in the Minitab worksheet gesell.mtw.
a) Which is the explanatory variable, and which is the response?
b) Examine a scatterplot of Gesell score vs. age of first speaking, with the response variable on the vertical axis. Comment on whether these two variables appear to be associated.
c) Determine the correlation coefficient between Gesell score and age of first speaking.
d) Do any of the children appear to be outliers in the age variable? If so, what is the ID number for this child? How long did it take him/her to speak?
e) Remove this child from the analysis. Then redraw the scatterplot and recalculate the correlation coefficient. Comment on how these have changed.
f) Now remove the child who took the next longest time to speak, again look at a scatterplot, and recalculate the correlation coefficient. Comment again on how these have changed.
g) Write a paragraph explaining (as if to someone with no formal knowledge of statistics) why the scatterplot and correlation change so much. Also summarize what your analysis of these data reveals concerning the relationship between age of first speaking and aptitude.
h) Now consider the entire dataset again (all 21 children). Convert the age variable into years rather than months by dividing by 12 (MTB> let c4 = c2/12). Recalculate the correlation coefficient between age (in years, rather than months) and Gesell score. How does it compare to the value you found in part c)? What does this suggest about the effect of a linear transformation on the correlation coefficient?
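If you want to cross-check your Minitab results outside Minitab, the calculations in parts c), e), f), and h) can be sketched in Python as below. This is only a minimal illustration, not part of the required Minitab work. The numbers are made-up placeholders and are NOT the values in gesell.mtw; the names pearson_r, drop_row, age_months, and score are hypothetical and were chosen for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_row(x, y, i):
    """Return copies of x and y with observation i removed."""
    return x[:i] + x[i + 1:], y[:i] + y[i + 1:]


# Placeholder values for illustration only (not the gesell.mtw data)
age_months = [10, 12, 9, 15, 11, 20, 8, 30]
score = [100, 95, 105, 90, 98, 88, 110, 92]

# Part c): correlation using every observation
r_all = pearson_r(age_months, score)

# Parts e) and f): remove the observation with the largest age, then recompute
i_max = age_months.index(max(age_months))
age_wo, score_wo = drop_row(age_months, score, i_max)
r_without = pearson_r(age_wo, score_wo)

# Part h): convert months to years, then recompute on the full list
age_years = [a / 12 for a in age_months]
r_years = pearson_r(age_years, score)

print("r, all observations:   ", round(r_all, 4))
print("r, largest age removed:", round(r_without, 4))
print("r, age in years:       ", round(r_years, 4))
```

To use this with the real data, replace the two placeholder lists with the age and Gesell score columns from the worksheet and compare the printed values with the Minitab output in your report.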