Review problems

 

Identify the error in the following analyses:

1)  A confidence interval for p1-p2, comparing the rates of developing a flu-like illness in a vaccinated group and an unvaccinated group, is determined to be (-.056, .016).

(a) This indicates that 90% of the sample proportions are contained in this interval.

The confidence interval aims to specify p1-p2, the unique difference in the population proportions.  It is not a confidence interval for a difference in sample proportions or for a single proportion.

(b) This indicates that we are 90% confident that between 1.6% and 5.6% of the population developed a flu-like illness.

The confidence interval here is for the difference in the population proportions of success, not for the overall proportion of success or for just one of the populations.  That is why this interval may contain both positive and negative values.  Negative values correspond to p2 > p1 and positive values correspond to p2 < p1.

 

2) A confidence interval comparing the average number of words remembered for two different ways of presenting the words (familiar chunks and unfamiliar chunks) is determined to be (1.65, 8.60) with a p-value of .003.

(a) Since the interval does not contain 0, we are 95% confident that the scores for the first group are larger than the scores for the second group.

We are 95% confident that the difference in true treatment means is greater than 0.  It is important that your interpretation does not sound like all values in the first population are larger than all values in the second population.  Of course, it is also best to spell out what you mean by “first group” and “second group.”  In particular, it should be clear that the confidence interval estimates the difference in population means or the true treatment effect; it does not merely pertain to the samples in this study.

(b) Those receiving the letters in familiar chunks will perform better 95% of the time than those receiving the letters in unfamiliar chunks.

The “95%” of the confidence interval does tell you that something will be true “95% of the time,” but the “of the time” refers to repeating the random process many times and observing the resulting differences in sample means, not to how often one group outperforms the other.  Your conclusion here is either that the average recall score with the familiar chunks is higher or that it is not, not that sometimes it is higher and sometimes it is not.

(c) The p-value of .003 indicates that there is a .003 probability that those receiving the letters in familiar chunks will perform better than those receiving the letters in unfamiliar chunks.

The interpretation reads too much like the p-value is a probability statement about whether the null hypothesis is true.  The p-value tells you the probability of observing a difference in sample means at least this large assuming the null hypothesis is true.  There is a .3% chance that the difference in mean recall scores between these sample groups would be larger than 5.12 if there was actually no treatment effect.

 

3) Another study of housing prices (in thousands of dollars) found the equation (for Bakersfield homes in April, 2003):

            predicted price = 30.15 + .0695 sq ft.   r² = 56.4%, p-value < .001

(a) The sample slope coefficient reveals that a house’s price goes up by $69.50 for each additional square foot of size.

The slope is .0695, and the response variable is “thousands of dollars,” so this is close to a correct interpretation, but it does not take the variability of the prices about the regression line into account.  A better interpretation would be to say that the price of a house increases by an average of $69.50 for each additional square foot of size, or at least that it’s the predicted price that is associated with such an increase.

(b) If the technical conditions are met, the very small p-value suggests that there is no linear association between a house’s size and its price.

The very small p-value would suggest strong evidence against the null hypothesis.  But the null hypothesis says that the population slope equals zero, which means that there is no linear association between house price and size in the population.  So this conclusion is backwards: The small p-value actually provides strong evidence that there is a linear association between house price and size in the population.

(c) If the technical conditions are met and if the p-value had been larger than .10, you could have concluded that the sample data provide strong evidence that there is no association between a house’s size and its price.

This conclusion commits the error of “accepting H0.” Remember that a test of significance assesses the strength of evidence against the null hypothesis, not in favor of it.  A large p-value would suggest no evidence against the null hypothesis of no association, but it would not suggest strong evidence of no association.

(d) Adding square footage to a house causes the price to increase by $69.50 on average.

The conclusion asserts a causal relationship between size and price.  Since this was an observational study, such a conclusion is not valid, no matter how small the p-value.

 

4) Suppose we want to predict hiking time from hiking distance for Day Hikes in San Luis Obispo County and find predicted time = -1.27 + 31.5 distance with r² = .838.  Identify at least one problem with each of the following interpretations.

(a) The slope shows that for each additional minute, we predict the hike is 31.5 miles longer.

The interpretation of the slope is incorrect.  The student has reversed the role of the x and y variables.  When interpreting the slope, you discuss how a unit increase in the explanatory variable is associated with a change in the predicted value of the response variable.

(b) The slope shows that a hiker’s time increases by 31.5 minutes for each additional mile.

This interpretation of the slope is too definitive; it makes it sound as if there is no variability present.  A better interpretation would be that the predicted time of a hike increases by 31.5 minutes for each additional mile, or that the hike time increases on average by 31.5 minutes for each additional mile.  Although the y-value of the corresponding position on the line changes by the exact same amount each time we move 1 unit in x, this is not exactly true for the data values themselves.  The interpretation of the slope applies to the line and not to individual hikers.

(c) The predicted time for a 5-mile hike is about 156.

The calculation is correct here, but what’s missing are the units: the predicted time for a 5-mile hike is 156 minutes.

(d) About 84% of hikes have times that are correctly predicted by the line.

This is a completely incorrect interpretation of r². It may well be that none of these hikes have a time that is perfectly predicted by the line.  That is not what r² means.

(e) About 84% of the variability in hikes is explained by time.

This interpretation of r² comes closer but is still lacking because it does not indicate what variable about the hikes is being explained.  A good interpretation is that about 84% of the variability in the times of the hikes is explained by the least squares line based on the hikes’ distances.
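The arithmetic behind parts (b) and (c) can be sketched by evaluating the fitted line directly. This is only an illustration: the function name `predicted_time` is chosen here, and the units (minutes and miles) are taken from the answers above.

```python
def predicted_time(distance_miles):
    """Predicted hiking time in minutes from the fitted line in problem 4."""
    return -1.27 + 31.5 * distance_miles


# Part (c): predicted time for a 5-mile hike, in minutes.
five_mile_time = predicted_time(5)

# Part (b): moving one mile along the line changes the *predicted* time
# by exactly the slope; individual hikes will vary around the line.
change_per_mile = predicted_time(6) - predicted_time(5)

print(round(five_mile_time, 2), "minutes")
print(round(change_per_mile, 2), "minutes per additional mile")
```

The constant step applies only to points on the line, which is why the interpretation is phrased in terms of predicted or average time.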

 

Sample Problems from a previous Stat 512 Final Exam (different instructor)

1.  The MINITAB output that follows resulted from taking observations on the percentage of body fat taken from teenagers defined as clinically obese.

 

T Confidence Intervals

 

Variable     N      Mean    StDev  SE Mean       99.0 % CI

Body Fat    18     28.95     4.53     1.07  (   25.86,   32.05)

 

a)     Define the parameter(s) of interest.

μ = mean percentage of body fat for an appropriate population of teenagers defined as clinically obese (from what population was this sample selected, all?)

b)     Is the value of 28.95 a parameter or a statistic?  Explain.

A statistic since it was calculated from the sample of 18 teenagers

c)     True or false:

i)      For this interval to be valid, it is necessary that the population of percentages of body fat be normally distributed.

Probably true.  Since the sample size is fairly small, we need the population distribution of values to be well behaved, or else the t-procedure will not be valid.

ii)    A larger sample would make the population distribution more normal.

NO! The size of the sample has no effect on the shape of the population (only on the shape of the sampling distribution of sample means).

iii)  For this interval to be valid, it is necessary that the population standard deviation of the percentages of body fat be known.

No, the point of a t-interval is that we can use s, the sample standard deviation, and we don’t need to know σ, the population standard deviation.

iv)   99% of the time, the mean percentage of body fat will fall between 25.86 and 32.05.

False (bad interpretation of confidence, μ is either in the interval or it’s not, see above)

v)     You can be absolutely sure that the mean percentage of body fat is between 25.86 and 32.05.

False, we are only 99% confident that it is

vi)   It is possible that the mean percentage of body fat of the population is not between 25.86 and 32.05.

True

vii)  99% of all intervals created in this fashion will contain the mean percentage of body fat.

True

viii)         A 95% confidence interval obtained from the same data would be wider.

False, if the confidence level is smaller, the width of the interval will be less (all else staying equal)

ix)   If a two-tailed hypothesis test of H0: μ = 30 were performed using these data, H0 would be rejected.

            False, since 30 is inside the 99% CI, the two-sided p-value > .01.
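The interval in the MINITAB output, and the answers to (viii) and (ix), can be checked from the summary statistics alone. The sketch below assumes critical values read from a standard t table for 17 degrees of freedom (2.898 for 99% and 2.110 for 95%); those two constants and the function name are not part of the original output.

```python
import math

# Summary statistics from the MINITAB output.
n, mean, sd = 18, 28.95, 4.53

# Assumed t-table critical values for df = n - 1 = 17.
T_STAR_99 = 2.898
T_STAR_95 = 2.110


def t_interval(mean, sd, n, t_star):
    """One-sample t interval: mean +/- t* times the standard error."""
    se = sd / math.sqrt(n)
    margin = t_star * se
    return mean - margin, mean + margin


ci99 = t_interval(mean, sd, n, T_STAR_99)
ci95 = t_interval(mean, sd, n, T_STAR_95)

# Test statistic for H0: mu = 30, compared against the 99% critical value.
t_stat = (mean - 30) / (sd / math.sqrt(n))

print("99% CI:", tuple(round(x, 2) for x in ci99))
print("95% CI:", tuple(round(x, 2) for x in ci95))
print("t statistic for mu = 30:", round(t_stat, 2))
```

Small differences in the last digit from the MINITAB interval are expected, since the printed mean and standard deviation are rounded.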

 

2.     “Photo-volume and weight tables for Central Coast hardwoods have not been available prior to this study.  Their importance to hardwood resource evaluation efforts is twofold.  First, a general reconnaissance survey can be rapidly conducted from aerial photos of 1:10,000 scale or greater in the comfort of an office.  Secondly, with a small number of field samples, relatively accurate volume and weight estimates can be obtained for stands of hardwoods.”  This quote is from a Master’s thesis Tree Photo Volume and Weight Tables for California’s Central Coast (Brockhaus, John A., Cal Poly).  As part of the research project described therein, data were collected from multiple forest stands.  Aerial photographs of the stands were taken and used to produce photo volume--an estimate of the volume of wood in a stand.  Then foresters traveled to the same stands and used standard procedures to determine the field volume of the stand--an accepted measure of the total volume of wood in a stand, but one that takes more work, time, and resources than aerial photography.  A lumber company, wanting to see if photo volume would produce adequate estimates of field volume, used these data to generate the following MINITAB output.

 

The regression equation is

Field = 34.4 + 1.14 Photo

 

Predictor        Coef     SE Coef          T        P

Constant        34.37       72.64       0.47    0.642

Photo         1.13710     0.08953      12.70    0.000

 

S = 209.8       R-Sq = 90.0%     R-Sq(adj) = 89.4%

 

Analysis of Variance

 

Source            DF          SS          MS         F        P

Regression         1     7099245     7099245    161.31    0.000

Residual Error    18      792170       44009

Total             19     7891416

 

Unusual Observations

Obs      Photo      Field         Fit      SE Fit    Residual    St Resid

 10       1944     2000.0      2244.9       127.5      -244.9       -1.47 X

 17        957     1876.0      1122.6        55.8       753.4        3.73R

 

R denotes an observation with a large standardized residual

X denotes an observation whose X value gives it large influence.

 

a)      From the output, read or calculate the values of the following.

i)      The y-intercept.   34.37

ii)              The estimate of the average change in field volume for an increase of one in photo volume. 1.137

iii)            The quantity that the least squares line minimizes.  SSE = 792170

iv)   The standard deviation in the estimate of the slope.  .08953

v)     The quantity that measures the proportion of error removed from the estimation of field volume by using a linear regression model with photo volume rather than using the sample mean field volume as the estimate.  90.0%

vi)   The sample correlation coefficient. sqrt(.900) = +.949 (positive since slope is positive)

b)     Test to see if there is a linear relationship between photo volume and field volume.

i)      Define the parameter of interest. β = population slope

ii)    What are the hypotheses? H0: β = 0, Ha: β ≠ 0

iii)  Give the values from the MINITAB output of the two test statistics that may be used to perform the test.  t = 12.70, p-value = .000 or F = 161.31, p-value = .000

iv)   Reach and justify a decision at α = .05.  Provide an interpretation of the decision. Since p-value < .05, we reject the null hypothesis.  We have convincing evidence that there is a linear relationship between photo volume and field volume in the population.
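Several of the values read from the output in parts (a) and (b) are linked by simple formulas, which the following sketch recomputes from the ANOVA table and the coefficient line. The variable names are chosen here for illustration; the numbers are copied from the MINITAB output above.

```python
import math

# Sums of squares and degrees of freedom from the Analysis of Variance table.
ss_regression, ss_error, ss_total = 7099245, 792170, 7891416
df_regression, df_error = 1, 18

# Slope estimate and its standard error from the Predictor table.
slope, se_slope = 1.13710, 0.08953

r_squared = ss_regression / ss_total                 # proportion of variability explained
r = math.copysign(math.sqrt(r_squared), slope)       # correlation takes the sign of the slope
mse = ss_error / df_error                            # mean square error
s = math.sqrt(mse)                                   # residual standard deviation "S"
f_stat = (ss_regression / df_regression) / mse       # overall F statistic
t_stat = slope / se_slope                            # t statistic for H0: slope = 0

print(round(r_squared, 3), round(r, 3), round(s, 1))
print(round(t_stat, 2), round(f_stat, 2), round(t_stat ** 2, 2))
```

With a single predictor, the square of the t statistic for the slope equals the F statistic (up to rounding), which is why either one may be used for the test in part (b).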

 

3.     For a class project, a student took a sample of students and determined their age, gender, whether they belonged to a fraternity or sorority, how many years they had been attending college, and the number of alcoholic drinks they had in a week.  A regression analysis by MINITAB resulted in the following output.

 

The regression equation is

num.drinks/week = 2.6 + 1.01 age - 0.01 sex - 9.67 frat/sor

           - 0.10 yr. in school

 

Predictor        Coef     SE Coef          T        P

Constant         2.59       12.48       0.21    0.838

age            1.0091      0.7192       1.40    0.181

sex            -0.013       1.671      -0.01    0.994

frat/sor       -9.666       1.811      -5.34    0.000

yr. in s       -0.102       1.164      -0.09    0.931

 

S = 3.571       R-Sq = 66.0%     R-Sq(adj) = 56.9%

 

Analysis of Variance

 

Source            DF          SS          MS         F        P

Regression         4      371.26       92.81      7.28    0.002

Residual Error    15      191.29       12.75

Total             19      562.55

 

a)      Perform a test to see if at least one of the predictors aids in the prediction of the response.

i)       What are the hypotheses? H0: β1 = β2 = β3 = β4 = 0  vs. Ha: at least one βi ≠ 0

ii)      Make and justify your decision at α = .05 and provide an interpretation. F = 7.28, p-value = .002; since p-value < .05, we reject the null hypothesis and conclude that at least one of the predictors (age, sex, frat/sor, year in school) aids in the prediction of the response (number of drinks per week).

b)     Perform a test to decide if sex is a significant part of the current model.

i)       Give the hypotheses H0: β2 = 0 vs. Ha: β2 ≠ 0

ii)      Make and justify your decision at α = .05 and provide an interpretation. t = -.01 and p-value = .994; with such a large p-value, we fail to reject the null hypothesis.  We conclude that sex is not a significant part of the current model (it does not aid in the prediction of number of drinks per week once we know the other explanatory variables).

c)      The student believed that whether a student belonged to a fraternity or sorority was the only variable that would predict alcoholic ingestion, and that the other variables would not be necessary components of the model.  Based on the t-tests and their associated p-values (use α = .05), does the student have sufficient justification for her claim?  Explain.  Frat/sor does appear to have the strongest effect, but we would need to remove the other variables one at a time (e.g., sex, then year in school), as perhaps another variable like age will become significant.
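The summary line and the t column of this output can be reconstructed from the ANOVA table and the coefficient table, which may help when reading similar output. This is a sketch only; the dictionary layout and variable names are choices made here, and the numbers are copied from the MINITAB output above.

```python
# Analysis of Variance table for the drinks-per-week model.
ss_regression, ss_error, ss_total = 371.26, 191.29, 562.55
df_regression, df_error, df_total = 4, 15, 19

f_stat = (ss_regression / df_regression) / (ss_error / df_error)
r_squared = ss_regression / ss_total
# Adjusted R-squared penalizes for the number of predictors in the model.
adj_r_squared = 1 - (ss_error / df_error) / (ss_total / df_total)

# (coefficient, standard error) pairs from the Predictor table.
coefficients = {
    "age": (1.0091, 0.7192),
    "sex": (-0.013, 1.671),
    "frat/sor": (-9.666, 1.811),
    "yr. in school": (-0.102, 1.164),
}
# Each t ratio tests whether that predictor adds to a model containing the others.
t_ratios = {name: coef / se for name, (coef, se) in coefficients.items()}

print(round(f_stat, 2), round(r_squared, 3), round(adj_r_squared, 3))
for name, t in t_ratios.items():
    print(name, round(t, 2))
```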

 


Additional Review Problems

1) Suppose that instructors A, B, and C are each teaching three large sections of a course, and each instructor wants to study whether the mean exam scores differ significantly across the three sections.  Suppose that each takes a random sample of ten students, and calculates the following descriptive statistics:

 

                  A1   A2   A3   B1   B2   B3   C1   C2   C3
Sample size       10   10   10   10   10   10   10   10   10
Sample mean       50   60   70   50   60   70   57   60   63
Sample std dev    24   24   24    5    5    5    5    5    5

(a) Based on these statistics, which instructor has the strongest evidence that the mean scores differ significantly across his/her three sections?  Which has the least evidence?  Explain your answers.

Since the sample sizes are the same, the question becomes: which instructor has the largest differences in means relative to the amount of variability in the data?  Instructors A and B have the same differences among the three sample means, but for instructor B these differences appear much more significant since there is so much more consistency in the exam scores (as shown by the smaller standard deviations).  Instructor B will also have stronger evidence than instructor C, since the standard deviations are the same but the differences in the group means are larger.  So instructor B has the strongest evidence overall.  It’s hard to tell between A and C since both the differences in means and the sample standard deviations differ; that’s why we need ANOVA!
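The comparison between A and C can be settled by computing the one-way ANOVA F statistic from the summary statistics. The sketch below assumes equal group sizes (true here, n = 10 per section), so the pooled variance is just the average of the group variances; the function name `anova_f` is chosen for illustration.

```python
def anova_f(means, sds, n):
    """One-way ANOVA F statistic from group means and SDs, equal group size n."""
    k = len(means)
    grand_mean = sum(means) / k
    # Between-group mean square: variability of the group means.
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    # Within-group mean square: pooled variance (simple average when n is equal).
    ms_within = sum(sd ** 2 for sd in sds) / k
    return ms_between / ms_within


f_values = {
    "A": anova_f([50, 60, 70], [24, 24, 24], 10),
    "B": anova_f([50, 60, 70], [5, 5, 5], 10),
    "C": anova_f([57, 60, 63], [5, 5, 5], 10),
}

for instructor, f in f_values.items():
    print(instructor, round(f, 2))
```

All three F statistics share the same degrees of freedom (2 and 27), so a larger F means stronger evidence; by this measure instructor A has the least evidence and instructor B the most.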

 

2) Consider the following four data sets, each consisting of four (x, y) data points:

            A: (1,3) (2,5) (3,6) (4,8)                         B: (1,4) (2,7) (3,2) (4,4)

            C: (1,8) (2,6) (3,2) (4,3)                         D: (1,5) (2,3) (3,5) (4,2)

Based on the changes in the x and y values, arrange these data sets in order from the most negative correlation to the most positive.  Explain your reasoning.

In Data Set A, as x increases from 1 to 4, y also increases by a similar amount each time (1 or 2), so r will be close to 1.

In Data Set B, as x increases, y increases and decreases.  The changes in y are 3, -5, and 2.  The overall tendency is negative but the association should be pretty weak.

In Data Set C, as x increases, y tends to decrease (-2, -4, 1).  This should result in a fairly strong negative correlation.

In Data Set D, as x increases, y increases and decreases (-2, 2, -3). This should result in a moderate negative correlation.

Ordering, from most negative to most positive: C, D, B, A.

Note: A: r = .992; B: r = -.313; C: r = -.891; D: r = -.602
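The correlation coefficients can be computed directly from the four data sets. The sketch below implements the usual Pearson formula by hand; the helper name `pearson_r` is chosen here.

```python
import math


def pearson_r(points):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


data_sets = {
    "A": [(1, 3), (2, 5), (3, 6), (4, 8)],
    "B": [(1, 4), (2, 7), (3, 2), (4, 4)],
    "C": [(1, 8), (2, 6), (3, 2), (4, 3)],
    "D": [(1, 5), (2, 3), (3, 5), (4, 2)],
}

correlations = {name: pearson_r(pts) for name, pts in data_sets.items()}
# Order the data sets from most negative to most positive correlation.
ordering = sorted(correlations, key=correlations.get)

for name in ordering:
    print(name, round(correlations[name], 3))
```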

 

3) An article in the May 24, 2004 issue of Sports Illustrated raised two separate questions about seven-game series in professional team sports.  One question concerns the proportion of seven-game series that have gone to the full length of seven games.  The article reported that through the year 2003, 44 of 131 (34%) series went to the full length in baseball, compared to 111 of 471 (24%) in hockey and 85 of 303 (28%) in basketball.

(a) Conduct a chi-square analysis of whether these percentages differ more than would be expected by random variation.  Begin with graphical displays and numerical summaries, and then proceed to a chi-square test.  Which type of chi-square test did you do? Summarize your conclusions.

Two-way table:

 

                        Baseball   Hockey   Basketball
Full length                   44      111           85
Less than full length         87      360          218
Total                        131      471          303

Segmented bar graph:

Baseball has the largest proportion of series that lasted for the full seven games (.336), followed by basketball (.281) and then hockey (.236).

Let’s pretend these results are independent random samples from the populations of games in each sport and focus on the proportion of each series in the population that lasts the full seven games (homogeneity of proportions).

Let pbaseball represent the probability that a baseball series lasts for the full seven games, and similarly define phockey and pbasketball. 

We can conduct a chi-square test of H0: pbaseball = phockey = pbasketball

vs. Ha: at least one p differs from the rest. 

The expected counts are all at least five (smallest = 34.74). If we regard the observed series as independent random samples from each sport’s process, we can perform the chi-square test.  Minitab output:

Expected counts are printed below observed counts

Chi-Square contributions are printed below expected counts

 

       baseball  hockey  basketball  Total

    1        44     111          85    240

          34.74  124.91       80.35

          2.468   1.548       0.269

 

    2        87     360         218    665

          96.26  346.09      222.65

          0.891   0.559       0.097

 

Total       131     471         303    905

 

Chi-Sq = 5.831, DF = 2, P-Value = 0.054

The p-value of .054 says that if the three sports all had the same probability that a series would last for the full seven games, then there’s about a 5.4% chance that randomness alone would produce sample proportions at least as different as the three reported here.  This p-value is somewhat small but not terribly small (.05 < p-value < .10), so the data provide some, but not strong, evidence that the sports differ with regard to the probability that a series lasts for the full seven games.

(b) Comment on whether these data come from random samples or from randomization to groups, or whether the randomness is hypothetical here.

We analyzed all playoff series in these sports, so there is no real randomness here.  We have to assume that these series can be regarded as a random sample from each sport’s process.

(c) The other question posed by the article compares the proportion of “game sevens” that are won by the home team across these sports.  The article reported that 23 of 44 (52%) were won by the home team in baseball, compared to 70 of 111 (63%) in hockey and 70 of 85 (82%) in basketball.  Analyze these data to assess whether they provide evidence that the three proportions differ significantly, and write a paragraph or two summarizing your conclusions.

Two-way table:

 

                        Baseball   Hockey   Basketball
Won by home team              23       70           70
Won by visiting team          21       41           15
Total                         44      111           85

Segmented bar graph:


These proportions appear to differ fairly considerably.  Basketball has the highest proportion of seven-game series won by the home team (.824), with hockey in the middle (.631) and baseball much lower (.523).  All sports do see the home team win more than half of the game sevens.  Conducting a chi-square test of the null hypothesis that all three sports have the same probability that a game seven is won by the home team produces the following Minitab output:

Expected counts are printed below observed counts

Chi-Square contributions are printed below expected counts

 

         baseball    hockey    basketball  Total

    1          23        70            70    163

            29.88     75.39         57.73

            1.586     0.385         2.608

 

    2          21        41            15     77

            14.12     35.61         27.27

            3.356     0.815         5.521

 

Total          44       111            85    240

 

Chi-Sq = 14.272, DF = 2, P-Value = 0.001

The p-value is quite small (.001), so the observed data would be very unlikely to occur by chance alone if the three probabilities were equal.  Thus, we have strong evidence that the three sports do not have the same probability that a game seven would be won by the home team.  This conclusion is valid as long as the data observed are representative of the overall process for each sport.  Note that the biggest discrepancy comes in the “basketball, won by visiting team” category, where we observed far fewer games (15) than we would have expected (27.27) if the sports were performing the same as each other.
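Both chi-square analyses in this problem can be reproduced with a few lines of code. The sketch below computes expected counts and the test statistic for any two-way table; for the p-value it relies on the fact that, with exactly 2 degrees of freedom, the chi-square upper-tail probability is exp(-x/2), so it should not be reused for other table sizes. The function names are chosen here for illustration.

```python
import math


def chi_square(table):
    """Return (statistic, df, expected counts) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


def p_value_df2(stat):
    """Upper-tail chi-square probability; valid only when df = 2."""
    return math.exp(-stat / 2)


# Rows: outcome; columns: baseball, hockey, basketball.
series_length = [[44, 111, 85], [87, 360, 218]]   # full length / less than full length
game_seven = [[23, 70, 70], [21, 41, 15]]         # won by home / won by visitor

length_stat, length_df, length_expected = chi_square(series_length)
seven_stat, seven_df, seven_expected = chi_square(game_seven)

print(round(length_stat, 3), length_df, round(p_value_df2(length_stat), 3))
print(round(seven_stat, 3), seven_df, round(p_value_df2(seven_stat), 4))
```

The smallest expected count, `min(min(row) for row in length_expected)`, can be used to check the "all expected counts at least five" condition before trusting the p-value.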

 

4) A student conducted a study to examine the ages of people who joined a local health club.  The participants were chosen by systematically sampling the men and the women who joined the health club in August and September 2004.  The data are in the Minitab worksheet GymMembership.mtw.

Analyze the data to compare the mean ages of the men and women and also the mean ages of those who joined the club in August and in September, conditional on the other variable.  Produce and comment on numerical and graphical summaries, state hypotheses, check the technical conditions for each procedure.  Also comment on whether there appears to be a statistically significant interaction between gender and month joined and which factor (gender or month) appears to be more strongly related to the ages of new members.

Response variable = age

Explanatory variable 1 = gender

Explanatory variable 2 = month joined

The sample male and female age distributions look very similar.  The sample means (35.17 years vs. 36.82 years) are very close and the variability is similar (s1 = 14.58 years and s2=16.58 years).  Both distributions have a skewness to the right and perhaps a bimodal shape with peaks around 24 years and 40 years.

 

The sample August and September age distributions look very similar.  The sample means (36.90 years vs. 35.02 years) are very close and the variability is similar (s1 = 15.32 years and s2=15.91 years).  Both distributions have a skewness to the right and a bimodal shape with peaks around 24 years and 50 years.

 

Treating the data as a random sample of new members, we can use two-way ANOVA to analyze the “effect” of each EV on the response, conditional on the other variable.

Since the sample sizes are unequal in this observational study, we will use the “General Linear Model” command in Minitab.  We saw above that the variability was similar in each group and the sample sizes are large enough (over 100 in each group) that we will not be concerned with the normal population condition (good thing since the samples provide strong evidence that these population distributions are not normal).

 

General Linear Model: age versus gender, month

 

Factor  Type   Levels  Values

gender  fixed       2  female, male

month   fixed       2  Aug, Sep

 

Analysis of Variance for age, using Adjusted SS for Tests

 

Source   DF   Seq SS   Adj SS  Adj MS     F      P

gender    1    170.0    155.2   155.2  0.64  0.426

month     1    208.4    208.4   208.4  0.85  0.356

Error   249  60720.5  60720.5   243.9

Total   251  61099.0

 

S = 15.6159   R-Sq = 0.62%   R-Sq(adj) = 0.00%

 

This output indicates that we would fail to reject H0: μf = μm due to a large p-value of .426.  There is not significant evidence of a difference in the mean age of male and female new members after adjusting for the month joined.  This output also indicates that we would fail to reject H0: μS = μA due to a large p-value of .356. There is not significant evidence of a difference in the mean age of new members who join in August or September, after adjusting for the gender of the member.

 

Considering an interaction between gender and month:

We do see some graphical evidence of an interaction.  It appears that the male and female ages are similar in August but much different in September.  However:

Analysis of Variance for age, using Adjusted SS for Tests

 

Source         DF   Seq SS   Adj SS  Adj MS     F      P

gender          1    170.0    169.9   169.9  0.70  0.405

month           1    208.4    207.1   207.1  0.85  0.358

gender*month    1    218.8    218.8   218.8  0.90  0.345

Error         248  60501.7  60501.7   244.0

Total         251  61099.0

We see that this interaction is not statistically significant (p-value = .345) at any reasonable level of significance, so we will fail to reject the null hypothesis of no interaction.  None of these factors seem related to age, but if we had to pick one, we would focus on the interaction since the p-value is smallest (and when we have an interaction it becomes more problematic to interpret the “main effects.”)

 

5) In a “matched pairs” experiment, each subject receives both treatments, in random order.  This allows us to see if the treatment is consistently effective, comparing each person to themselves, instead of across individuals, allowing a more direct comparison and “controlling for” the person to person variability.  To analyze the data, we just take the differences in the results for each person and see if the average difference is significantly different from zero.  A “repeated measures” design is just this idea extended to 2 or more treatments.  “Blocking” is the same logic, but we group experimental units that are very similar to each other instead of using the same unit more than once.  We have still minimized the variability for trying to detect the treatment effect itself.  In both these cases, we include “subjects” or “blocks” as one of the variables in an ANOVA analysis (assuming a quantitative response).

     Researchers who are studying a new shampoo formula plan to compare the condition of hair for people who use the new formula with the condition of hair for people who use the current formula.  Twelve volunteers are available to participate in this study.  Information on these volunteers (numbered 1-12) is shown in the table below.

Volunteer   Gender   Age
1           Male     21
2           Female   20
3           Male     47
4           Female   60
5           Female   62
6           Male     61
7           Male     58
8           Female   44
9           Male     44
10          Female   24
11          Male     23
12          Female   46

(a) The researchers want to conduct an experiment involving the two formulas (new and current) of shampoo.  They believe that the condition of hair changes with age but not gender.  Because researchers want the size of the blocks in an experiment to be equal to the number of treatments, they will use blocks of size 2 in their experiment.  Identify the volunteers (by number) that would be included in each of the six blocks and give the criteria you used to form the blocks.

Block   Volunteers   Ages
1       1, 2         20, 21
2       10, 11       23, 24
3       8, 9         44, 44
4       3, 12        46, 47
5       4, 7         58, 60
6       5, 6         61, 62

Since these researchers believe that the condition of hair changes with age but not gender, the volunteers are sorted from youngest to oldest. The volunteers in the sorted list are paired to form six blocks of size two. More specifically, the youngest two volunteers are placed in the first block. The next two volunteers in the sorted list are placed in the second block. This pairing continues until all six blocks of two are formed, with the oldest two volunteers in the sixth block.

 

(b) Other researchers believe that hair condition differs with both age and gender.  These researchers will also use blocks of size 2 in their experiment.  Identify the volunteers (by number) that would be included in each of the six blocks and give the criteria you used to form the blocks.

Block      Volunteers   Ages
Female 1   2, 10        20, 24
Female 2   8, 12        44, 46
Female 3   4, 5         60, 62
Male 1     1, 11        21, 23
Male 2     3, 9         47, 44
Male 3     6, 7         61, 58

Since these researchers believe that the condition of hair changes with both age and gender, the women are sorted from youngest to oldest and then the men are sorted from youngest to oldest. The women (men) in the sorted list are paired to form the blocks of size two. More specifically, the youngest two women (men) are placed in a block. The next two youngest women (men) are placed in another block. Finally, the oldest two women (men) are placed in another block.

 

(c) The researchers in (b) decide to select three of the six blocks to receive the new formula and to give the other three blocks the current formula.  Is this an appropriate way to assign treatments?  If so, describe a method for selecting the three blocks to receive the new formula.  If not, describe an appropriate method for assigning treatments.

No, the researchers in part (b) should not randomly select three blocks to receive the new formula and then give the current formula to the other three blocks. They blocked on both age and gender to form homogeneous groups because they believe hair condition differs with both age and gender. Giving the youngest or oldest women (men) the same formula defeats the purpose of blocking. In a block design, randomization should be carried out separately within each block. That is, for each block, two random numbers are generated (via a random number generator or a table of random digits) and assigned to the two volunteers. The volunteer with the smaller random number is given the new formula and the other volunteer is given the current formula.
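The blocking and randomization steps described in parts (a) through (c) can be sketched in code. This is one possible implementation, not the only valid one: ties in age are broken by volunteer number (an assumption made here), and randomly choosing which member of a pair gets the new formula stands in for comparing two random numbers. The helper names are hypothetical.

```python
import random

# Volunteer number -> (gender, age), from the table in the problem.
volunteers = {
    1: ("Male", 21), 2: ("Female", 20), 3: ("Male", 47), 4: ("Female", 60),
    5: ("Female", 62), 6: ("Male", 61), 7: ("Male", 58), 8: ("Female", 44),
    9: ("Male", 44), 10: ("Female", 24), 11: ("Male", 23), 12: ("Female", 46),
}


def pair_by_age(ids):
    """Sort volunteers by age and pair neighbors into blocks of size 2."""
    ordered = sorted(ids, key=lambda v: (volunteers[v][1], v))
    return [tuple(sorted(ordered[i:i + 2])) for i in range(0, len(ordered), 2)]


# Part (a): block on age only.
age_blocks = pair_by_age(volunteers)

# Part (b): block on gender first, then on age within gender.
women = [v for v, (gender, _) in volunteers.items() if gender == "Female"]
men = [v for v, (gender, _) in volunteers.items() if gender == "Male"]
gender_age_blocks = pair_by_age(women) + pair_by_age(men)


def randomize_within_blocks(blocks, rng):
    """Part (c): within each block, randomly pick who gets the new formula."""
    assignment = {}
    for block in blocks:
        new_formula, current_formula = rng.sample(block, 2)
        assignment[new_formula] = "new"
        assignment[current_formula] = "current"
    return assignment


# A seeded generator is used only so the illustration is reproducible.
assignment = randomize_within_blocks(gender_age_blocks, random.Random(0))
print(age_blocks)
print(gender_age_blocks)
print(assignment)
```

Every block ends up with exactly one volunteer on each formula, so the treatment comparison is always made between two similar people.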

 

Review problems from text

1) p. 236, problem 7 see back of book

p. 244, problem 43

The 95% confidence interval for the difference in the two population proportions (p82-p74) that would report watching no television:

.031 - .038 ±  1.96sqrt(.031(.969)/350 + .038(.962)/1965) = -.007 ±  .02 => (-.027, .013).

We are 95% confident that the proportion in 1982 is up to .013 higher but could also be .027 smaller than the proportion in 1974.  Since zero is included in this confidence interval, the difference is not statistically significant at the 5% level of significance.
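The interval for problem 43 can be checked numerically. The sketch below uses the sample proportions and sample sizes exactly as they appear in the formula above, treating the first pair (.031, 350) as the 1982 sample and the second (.038, 1965) as the 1974 sample, which is an assumption based on the order p82-p74; the function name is chosen here.

```python
import math


def two_proportion_ci(p1, n1, p2, n2, z=1.96):
    """Large-sample confidence interval for p1 - p2 (z = 1.96 gives 95%)."""
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se


low, high = two_proportion_ci(0.031, 350, 0.038, 1965)
contains_zero = low < 0 < high

print(round(low, 3), round(high, 3), contains_zero)
```

Because the interval contains zero, the difference is not statistically significant at the 5% level, matching the conclusion above.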

 

p. 244, problem 47 see back of book (don’t worry about knowing what McNemar’s test is, we would just say we don’t know a method for comparing dependent samples like this)

 

2) p. 288, problem 5 see back of book, there is definitely an association here: the probability of having young children differs greatly depending on whether the woman has gray hair. However, this is an observational study so no causal conclusions can be drawn, no matter how much you might believe it!  Perhaps the women with gray hair are older and past child-bearing age, while the other women living on the same block without gray hair are much younger.

 

3) p. 351, problem 33 see back of book

(a) The units of slope are dollars/x unit so if we convert the y variable units, the slope will go through the same conversion.

(b) The correlation coefficient does not change with changes in scale.

 

p. 351, problem 39 see back of book

(a) no constant linear trend

(b) not constant variance in the contributions with income level (megaphone effect)

(c) not constant linear trend

(d) not constant linear trend

 

4) p. 434, problem 45

 

5) p. 481, problem 17 see back of book