Stat 301 – Day 38 Solutions

Choice of Procedure

 

Investigation: The Biggest Loser

Dansinger, Griffith, Gleason, et al. (2005) report on a randomized, comparative experiment in which 160 subjects were randomly assigned to one of four diet plans: Atkins, Ornish, Weight Watchers, and Zone (40 subjects per diet). These subjects were recruited through newspaper and television advertisements in the greater Boston area; all were overweight or obese with body mass index values between 27 and 52.  Among the variables measured were

·         Which diet the subject was assigned to

·         Whether or not the subject completed the 12-month study (0 = yes)

·         The subject’s weight loss after 2 months, 6 months, and 12 months (in kilograms, with a negative value indicating weight gain)

·         The degree to which the subject adhered to the assigned diet, taken as the average of 12 monthly ratings, each on a 1-10 scale (with 1 indicating complete nonadherence and 10 indicating full adherence).

Data for the 80 subjects in the Atkins and Weight Watchers diets are in the Minitab worksheet AtkinsWeightwatchers.mtw (in Stat 301 Data Files page).

 

For each of the following research questions, identify an appropriate inference procedure and discuss the technical conditions. Some of these are worded vaguely, so explain your reasoning.  If you don’t have enough information to check the technical conditions, specify what you would do. If you are choosing between different procedures, discuss advantages and disadvantages of each.  In each case, you should examine graphical and numerical summaries and be prepared to discuss your results.

 

(a) Did a statistically significant majority of subjects complete the 12 month study?

Variable: whether or not the subject completed the 12-month study (categorical), so we are looking at one proportion

Let p represent the proportion of all such subjects who would complete the study

H0: p = .5  Ha: p > .5

One-sample z-test (n × .5 = 40 > 10 expected successes and failures, but we have to assume a representative sample)

Tally c4 reveals 47 of the 80 completed the study, p̂ = 47/80 = .5875 (remember 0 is yes)

In this sample, a slight majority completed the study.

 

Stat > Basic Statistics > 1 Proportion (with completion as the event): z = 1.57, p-value = .059; exact binomial p-value = .073

So there is weak evidence that most people will complete the study for whatever population this is representative of.
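As a rough check on the numbers in (a), the z statistic and both p-values can be recomputed from the counts alone. This is a minimal Python sketch using only the standard library; the variable names are illustrative and not part of the Minitab worksheet.

```python
from math import comb, sqrt
from statistics import NormalDist

n, completed, p0 = 80, 47, 0.5   # counts from the tally in (a)

p_hat = completed / n
# One-sample z statistic, with the standard error computed under H0: p = .5
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
# One-sided p-value for Ha: p > .5 from the normal approximation
p_normal = 1 - NormalDist().cdf(z)
# Exact binomial alternative: P(X >= 47) when X ~ Binomial(80, .5)
p_exact = sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
              for k in range(completed, n + 1))

print(round(z, 2), round(p_normal, 3), round(p_exact, 3))
```

The z statistic is positive here because the event is completing the study; treating noncompletion (33 of 80) as the event flips the sign but not the strength of evidence.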

 

(b) Estimate the probability of a subject completing the 12-month study based on these data.

 

Even though this just says “estimate” – if you are giving an estimate for a population parameter, convey the precision and reliability of your estimate through a confidence interval!

Technical conditions as in (a) (and have at least 10 successes and failures), now need the one-sample z-interval for p:

Variable      X   N  Sample p         95% CI         Z-Value  P-Value

completion?  33  80  0.412500  (0.304625, 0.520375)    -1.57    0.118

 

Using the normal approximation.

 

We are 95% confident that between 30% and 52% of people will not complete the program, or between 48% and 70% will.
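The interval in (b) can be reproduced by hand from the counts. The sketch below assumes the usual normal-approximation (Wald) formula, p̂ ± z*·sqrt(p̂(1 − p̂)/n); the function name wald_interval is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(x, n, conf=0.95):
    """Normal-approximation confidence interval for one proportion."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 33 of the 80 subjects did NOT complete the study (event = 1)
low, high = wald_interval(33, 80)
# Flip the endpoints to talk about the proportion who DO complete
comp_low, comp_high = 1 - high, 1 - low

print(round(low, 4), round(high, 4))
print(round(comp_low, 4), round(comp_high, 4))
```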

 

(c) Is there a statistically significant difference in the amount of weight lost between the two diets after 2 months?

Variables: diet (categorical) and weight loss after 2 months (quantitative), so we want to compare 2 means.

Both distributions are skewed to the right, including a large outlier in the Weight Watchers group. The WW group averaged slightly less weight loss (3.465 kilos vs. 3.627 kilos for Atkins) and had a bit more variability (std dev 3.83 kilos vs. 3.26 kilos).  The difference between the groups does not appear substantial.

 

Let d represent the true treatment effect from being on the Atkins diet instead of Weight Watchers.

H0: d = 0 (no treatment difference)

Ha: d ≠ 0 (one of the diets leads to more weight loss on average)

We have 40 people in each diet, so this just passes the technical condition for a two-sample t-test.  Subjects were randomly assigned to the diets so that technical condition is met.

 

Two-sample T for weight loss (2 mos)

 

diet              N  Mean  StDev  SE Mean

Atkins           40  3.63   3.25     0.51

Weight Watchers  40  3.46   3.83     0.61

 

 

Difference = mu (Atkins) - mu (Weight Watchers)

Estimate for difference:  0.162

95% CI for difference:  (-1.420, 1.745)

T-Test of difference = 0 (vs not =): T-Value = 0.20  P-Value = 0.839  DF = 76

 

We have a large p-value (.839 > .05) and no reason to believe that after two months the diets differ with respect to average weight loss.
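The t statistic and degrees of freedom in (c) can be approximated from the summary statistics. This sketch assumes the unpooled (Welch) form of the two-sample t-test, which appears consistent with the DF = 76 reported above; because the inputs are rounded, the results only approximately match the Minitab output.

```python
from math import sqrt

def welch_from_summary(m1, s1, n1, m2, s2, n2):
    """Unpooled two-sample t statistic, standard error, and Welch df."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = sqrt(v1 + v2)
    t = (m1 - m2) / se
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, se, df

# Atkins first, then Weight Watchers (rounded values from the output above)
t, se, df = welch_from_summary(3.627, 3.25, 40, 3.465, 3.83, 40)
print(round(t, 2), round(se, 3), round(df, 1))
```

Turning t into a p-value needs the t distribution, which the Python standard library does not provide; a statistics package or a t table would be needed for that step.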

 

(d) Is there a statistically significant difference in the completion rate between the two diets?

 

Variables: whether or not completed the study (categorical) and which diet (categorical), so want to compare two proportions.

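We have p̂1 = 19/40 = .475 (Atkins) and p̂2 = 14/40 = .35 (Weight Watchers) as the sample proportions who did not complete the study (the event is 1 = did not complete). This indicates a higher dropout rate, and so a lower completion rate, with the Atkins diet, but the difference does not seem large.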

 

Since we have at least 5 successes and at least 5 failures with each diet, and this was a randomized experiment, we can apply the two-sample z-test.

            H0: d (Atkins - WW) = 0

            Ha: d (Atkins - WW) ≠ 0 (one of the diets leads to a higher completion probability)

 

Event = 1

 

 

diet           X   N  Sample p

Atkins        19  40  0.475000

Weight Watch  14  40  0.350000

 

 

Difference = p (Atkins) - p (Weight Watch)

Estimate for difference:  0.125

95% CI for difference:  (-0.0890033, 0.339003)

Test for difference = 0 (vs not = 0):  Z = 1.14  P-Value = 0.256

 

Fisher's exact test: P-Value = 0.364

 

With the large p-value (.256 > .05) we fail to reject H0. We do not have convincing evidence that one of the diets leads to a higher completion probability.
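The two-proportion results in (d) can also be checked from the counts. The sketch below assumes a pooled standard error for the test statistic and an unpooled standard error for the confidence interval; that combination is an assumption that happens to reproduce the output above, not something the output itself states.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 19, 40   # Atkins: number who did not complete
x2, n2 = 14, 40   # Weight Watchers: number who did not complete

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Test statistic: pooled proportion, since H0 says the two proportions are equal
pooled = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = diff / se_pooled
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided

# Confidence interval: unpooled standard error
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_star = NormalDist().inv_cdf(0.975)
ci = (diff - z_star * se_unpooled, diff + z_star * se_unpooled)

print(round(z, 2), round(p_value, 3), tuple(round(e, 4) for e in ci))
```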

 

 (e) Variable = difference in amount of weight loss

Since we have measured the same individuals at 2 months and at 6 months, we want to do a paired t-test.  We can create a new column: c8-c9 to measure the additional weight lost in this four-month period.

 

There is some interesting clustering in this distribution.  The mean (-.172 kilos) is a bit misleading and the standard deviation is large (3.179 kilos).  With such a large sample size, we can still apply the one-sample t-test since we don’t have severe skewness or outliers. The large spike at zero is probably due to the people who have already dropped out of the study.

 

Let m represent the average additional weight lost between 2 and 6 months by the dieter population.

H0: m = 0 (on average, no change in weight change in this time period)

Ha: m < 0 (tend to lose more weight after 2 months compared to 6 months)

 

One-Sample T: 6mos-2mos

 

Test of mu = 0 vs < 0

                                       95% Upper

Variable    N    Mean  StDev  SE Mean      Bound      T      P

6mos-2mos  80  -0.172  3.179    0.355      0.419  -0.49  0.314

 

 

With the large p-value, we fail to reject H0 and conclude that there is not a significant decrease in weight loss, on average, between 2 and 6 months in this dieter population.

 

(f) What if the previous question had been: “Is there evidence that a majority of such dieters in the population would have lost less weight after 6 months than after 2 months?”

 

Now we could just count how many people had lost more weight after 2 months than after 6 months.  If we see how many observations in C12 (the column of differences) are strictly less than zero, we get 40.

MTB > let c13=c12<0

MTB > tally c13

 

Tally for Discrete Variables: C13

 

C13  Count

  0     40

  1     40

 N=     80

 

So if we wanted to test H0: p = .5 vs. Ha: p > .5, we know we will fail to reject since p̂ = .5!  One of the rare cases where I would agree you don’t need to carry out the details of the test.

 

Of course, if we had instead looked at the number that were less than or equal to zero:

MTB > let c13=c12<=0

MTB > tally c13

 

Tally for Discrete Variables: C13

 

C13  Count

  0     24

  1     56

 N=     80

 

and then carried out the test, the technical conditions for a one-sample z-test are met and we find:

Test and CI for One Proportion: C13

 

Test of p = 0.5 vs p > 0.5

 

Event = 1

 

 

                            95% Lower

Variable   X   N  Sample p      Bound  Z-Value  P-Value

C13       56  80  0.700000   0.615726     3.58    0.000

 

Using the normal approximation.

 

That is, significantly more than half of such dieters lost less weight (or showed no change) at 6 months compared to 2 months.

 

Part of the point here is if the technical conditions for the t-test had not been met, another alternative is to turn it into a yes/no variable.  The technical conditions are more easily met but we would expect to lose some power as we are throwing away information. This helps you examine whether they lost more weight, but ignores how much more or less weight.
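The yes/no approach described above can be sketched in a few lines. The toy_diffs list is a set of made-up differences used only to show how the indicator column is built; the test itself uses the actual tally of 56 out of 80.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical differences (6-month loss minus 2-month loss), illustration only
toy_diffs = [-1.2, 0.0, 2.5, -0.4, 0.0, -3.1]
indicator = [1 if d <= 0 else 0 for d in toy_diffs]   # mirrors: let c13 = c12 <= 0

# Actual tally from the worksheet: 56 of 80 differences were <= 0
x, n, p0 = 56, 80, 0.5
p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)                     # one-sided, Ha: p > .5

# One-sided 95% lower confidence bound, using the sample proportion in the SE
lower_bound = p_hat - NormalDist().inv_cdf(0.95) * sqrt(p_hat * (1 - p_hat) / n)

print(sum(indicator), round(z, 2), round(lower_bound, 4))
```

Only the sign of each difference enters the calculation, which is exactly the information loss that costs power relative to the t-test.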

 

Sort the data by whether or not the subjects completed the study, putting the results back in columns 1-9:

MTB > sort c1-c9 c1-c9;

SUBC> by c4.

Now manually delete the rows where the subjects did not complete the program (the 33 rows with completion? = 1 now at the end, highlight the row numbers and press Delete).

 

 (g)  Estimate the mean amount of weight loss by all participants in such a program after 12 months.

Variable = weight loss (after 12 months), quantitative, so we want to examine one mean

On average, subjects who completed the program lost 4.291 kilos with standard deviation 5.64 kilos.  The distribution is fairly symmetric, perhaps skewed to the right. (The histogram shows the skewness a bit more.)

 

Since we have more than 30 subjects and as long as we consider them a representative sample, we can calculate a one-sample t-interval.

 

One-Sample T: weight loss (12 mos)

 

Variable               N   Mean  StDev  SE Mean      95% CI

weight loss (12 mos)  47  4.291  5.640    0.823  (2.636, 5.947)

 

We are 95% confident that dieters that stay on the program lose an average of 2.636 to 5.947 kilos.

 

(h) Predict the amount of weight you would lose on the Atkins diet based on these data.

 

This is asking for a prediction interval instead of a confidence interval. We only know how to do one-sample t-prediction intervals.  For this to be valid, we need to believe the weight losses follow a normal distribution.  The sample data give us some suspicion but not overwhelmingly strong to doubt this.

So proceeding with caution we need x̄ = 3.919, s = 6.045 (for just those on the Atkins diet, easiest to just copy and paste those values into another column) and the t critical value with 20 degrees of freedom for 95% confidence:

 

x̄ ± t(n-1) × s × sqrt(1 + 1/n) = 3.919 ± 2.09(6.045)sqrt(1 + 1/21) = (-9.01, 16.85)

 

We are 95% confident that an individual dieter who stays on the Atkins diet will end up somewhere between gaining 9.01 kilos and losing 16.85 kilos in one year.
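The prediction interval arithmetic can be verified directly. This sketch takes the critical value 2.09 from the calculation above as given, since the Python standard library has no t distribution; the function name is illustrative.

```python
from math import sqrt

def t_prediction_interval(mean, sd, n, t_crit):
    """One-sample prediction interval: mean +/- t * s * sqrt(1 + 1/n)."""
    margin = t_crit * sd * sqrt(1 + 1 / n)
    return mean - margin, mean + margin

# Atkins completers: n = 21, so the critical value has 20 degrees of freedom
lo, hi = t_prediction_interval(mean=3.919, sd=6.045, n=21, t_crit=2.09)
print(round(lo, 2), round(hi, 2))
```

The extra 1 under the square root is what separates this from a confidence interval for the mean: it accounts for the variability of a single new observation, so the interval stays wide no matter how large n gets.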

 

(i) Is there significantly more variability in the adherence level for those on the Atkins diet?

(Use what you know from this course.)

Can assume after 12 months

Variables = diet (categorical) and adherence level (quantitative) but instead of comparing the means, want to compare the variability

 

Variable         diet              N  N*   Mean  SE Mean  StDev  Minimum     Q1

adherence level  Atkins           21   0  5.430    0.359  1.645    1.670  4.500

                 Weight Watchers  26   0  5.442    0.295  1.506    1.420  4.605

 

Variable         diet             Median     Q3  Maximum    IQR

adherence level  Atkins            5.500  6.750    8.000  2.250

                 Weight Watchers   5.875  6.373    7.830  1.768

 

 

The sample standard deviation and the sample IQR are both larger for the Atkins diet.

 

To assess statistical significance, we could do a randomization test looking at the difference or ratio of standard deviations, for example.

sample 47 c2 c12

unstack c6 c13 c14;

subs c12.

let c15(k1)=std(c13)/std(c14)  (Atkins/WW)

let c16(k1)=std(c13)-std(c14)  (Atkins – WW)

let k1=k1+1

Observed = 1.645/1.506 ≈ 1.09

MTB > let c17=c15>1.09

MTB > tally c17

C17  Count

  0    634

  1    366

 N=   1000

Empirical p-value = .366

 

Observed 1.645-1.506 = .139

MTB > let c18=c16>.139

MTB > tally c18

C18  Count

  0    637

  1    363

 N=   1000

 

Empirical p-value = .363

 

Neither statistic displays convincing evidence of a significant difference between the groups.
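The randomization idea behind the Minitab macro can be sketched in Python. The adherence values below are made up for illustration (the real worksheet has 21 Atkins and 26 Weight Watchers completers), and the sketch counts re-randomized ratios at least as large as the observed one, where the macro above used a strict inequality.

```python
import random
import statistics

def sd_ratio_randomization_test(group_a, group_b, reps=1000, seed=0):
    """Re-randomize group labels and compare the ratio of standard deviations."""
    rng = random.Random(seed)
    observed = statistics.stdev(group_a) / statistics.stdev(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(reps):
        rng.shuffle(pooled)   # mimics re-assigning subjects to diets at random
        ratio = statistics.stdev(pooled[:n_a]) / statistics.stdev(pooled[n_a:])
        if ratio >= observed:
            count += 1
    return observed, count / reps

# Hypothetical adherence ratings, NOT the study data
toy_atkins = [1.7, 4.5, 5.5, 6.8, 8.0, 5.0, 3.9]
toy_ww = [1.4, 4.6, 5.9, 6.4, 7.8, 5.2, 6.0, 5.5]

observed_ratio, p_value = sd_ratio_randomization_test(toy_atkins, toy_ww)
print(round(observed_ratio, 2), p_value)
```

Swapping the ratio for a difference of standard deviations only requires changing the two lines that compute the statistic.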

 

(j) To compare all 4 diets, what would be the main disadvantage to looking at 6 two-sample comparisons?

 

Inflation of overall Type I error rate
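To get a feel for the size of the inflation, the chance of at least one false rejection among the 6 comparisons can be computed. The sketch assumes each test uses a .05 level and, as a simplification, that the tests are independent; pairwise comparisons share data, so the true rate would differ somewhat.

```python
from math import comb

alpha = 0.05
n_diets = 4
n_comparisons = comb(n_diets, 2)          # 4 choose 2 = 6 pairwise tests

# P(at least one Type I error) if all 6 null hypotheses are true
# and the tests are treated as independent (a simplifying assumption)
familywise_rate = 1 - (1 - alpha) ** n_comparisons

print(n_comparisons, round(familywise_rate, 3))
```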

 

(k) For which analyses above would you be willing to draw cause and effect conclusions?  For what population(s)?

 

We need both statistical significance and a randomized experiment with respect to the explanatory variable.  So possibly (c), (d), and (h), but (c) and (d) were not statistically significant!  For the population, we should probably at least restrict ourselves to overweight individuals in the Boston area who are likely to volunteer for such a study.