Stat 301 – Final Review

 

Final Exam: The exam is Tuesday or Wednesday from 10:10am-1:00pm in the Statistics Studio. Check here for a reminder of when you are signed up to take the final. There will be a 30-45 minute closed-book multiple-choice portion. The rest will be open course notes, text, calculator, and Minitab/applets. Be ready to interpret output, explain processes, carry out analyses with Minitab/applets (know which), justify conclusions, and explain reasoning.

 

The final will focus on the material since the last exam (Ch. 5), but will definitely have a cumulative component, including a comparison of methods across chapters. You should focus on the entire statistical process: How do we design a study to achieve particular research goals?  How do we describe the data we have? How do we test claims about population parameters or processes and/or estimate parameters?  How do we state our final conclusions, especially after considering how the study was conducted?  You should also think about the reasoning behind the statistical methods, such as standardizing, significance, and confidence in general.

Advice: Understand and be able to apply the procedures first, then worry about more subtle issues and review the how and why behind the development of the procedure.  Also, for all the procedures, know what must be done by hand and what can be done on the computer, including Minitab macros.

 

From Chapter 5 you should know how to:

·       Create numerical and graphical summaries for comparing two samples and two treatment groups

·       Define parameters and set up the null and alternative hypotheses for two-sample comparisons

o  Comparing populations or treatment groups (and how they differ, e.g., H0: no treatment effect, δ = 0 vs. H0: no difference in population parameters, μ1 - μ2 = 0)

o  Difference between empirical randomization distributions and empirical sampling distributions

·       State and assess technical conditions of various procedures (including confidence intervals)

·       Carry out two-sample t-tests and z-tests for comparing two sample means and two sample proportions, respectively, using either Minitab or the Test of Significance Calculator applet

o   Consider alternatives if technical conditions are not met (e.g., FET, transformations)

·       Calculate and interpret two-sample confidence intervals (for the difference in population proportions, difference in population means, treatment effect on categorical and quantitative response variables, population odds ratio)

o   Consider alternatives if technical conditions are not met (e.g., Wilson adjustment, bootstrapping)

·       Interpret Minitab/applet output for two-sample procedures

·       Know what factors (e.g., sample size) affect test statistics, p-values, and confidence intervals (width and coverage properties)

·       Distinguish between matched pairs designs and two-independent “samples” designs

o   Consider benefits and disadvantages (including feasibility) of the two types of designs

·       Discuss how (and why we might want) to make inferences for other types of statistics (e.g., medians)

·       Consider how the scope of conclusions differs depending on two independent samples vs. random assignment
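
As a rough cross-check on the two-sample z procedures listed above, here is a minimal Python sketch. The course tools are Minitab and the applets, so this is only an illustration and not the course's method. The counts are made-up numbers and the function name two_proportion_z is hypothetical. It assumes the common convention of a pooled standard error for the test and an unpooled standard error for the interval.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2, conf=0.95):
    """Two-sided z-test of H0: no difference in proportions,
    plus a z-interval for the difference (group 1 minus group 2)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Test: under H0 the groups share one proportion, so pool the successes.
    p_pool = (x1 + x2) / (n1 + n2)
    se_test = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = diff / se_test  # (estimate - hypothesized) / standard error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Interval: no null assumption, so use the unpooled standard error.
    se_ci = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    interval = (diff - z_star * se_ci, diff + z_star * se_ci)
    return z, p_value, interval


# Hypothetical data: 30 of 50 successes in group 1, 20 of 50 in group 2.
z, p_value, interval = two_proportion_z(30, 50, 20, 50)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
print(f"95% CI for the difference: ({interval[0]:.3f}, {interval[1]:.3f})")
```

Rerunning the sketch with larger sample sizes but the same sample proportions is one way to see how sample size affects the test statistic, the p-value, and the interval width.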

 

Review Question #35  solution

 

Really from Chapter 4

·       How to create a bootstrap distribution, the philosophy behind bootstrapping, and how to get rough bootstrap confidence intervals (using t critical value)
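
The bootstrap recipe above can be sketched in Python as follows. This is a minimal sketch, not the course's Minitab macro: the data values are made up, the function name bootstrap_t_interval is hypothetical, and the t critical value is passed in by hand (2.262 is the table value for 95% confidence with 9 degrees of freedom). It resamples with replacement from the original sample, uses the standard deviation of the bootstrap means as the standard error, and forms estimate ± t* × SE.

```python
import random
from statistics import mean, stdev


def bootstrap_t_interval(sample, t_star, reps=5000, seed=0):
    """Rough bootstrap interval for a mean: estimate +/- t* x (bootstrap SE)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)

    # Each bootstrap sample: n draws WITH replacement from the original sample.
    boot_means = [mean(rng.choices(sample, k=n)) for _ in range(reps)]

    # The spread of the bootstrap distribution estimates the standard error.
    se_boot = stdev(boot_means)
    center = mean(sample)
    return center - t_star * se_boot, center + t_star * se_boot, se_boot


# Hypothetical sample of n = 10 measurements.
sample = [12.1, 9.8, 11.4, 10.6, 13.2, 8.9, 10.1, 11.9, 12.7, 9.5]
low, high, se_boot = bootstrap_t_interval(sample, t_star=2.262)
print(f"bootstrap SE = {se_boot:.3f}")
print(f"rough 95% bootstrap interval: ({low:.2f}, {high:.2f})")
```

Replacing mean with another statistic (for example, statistics.median) in both places gives the same kind of rough interval for that statistic, which is the main appeal of the method.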

 

The Cumulative Component (also see old Review handouts)

Things to remember include:

·       Identifying observational units and defining variables, samples vs. populations vs. sampling/ randomization distributions, parameters vs. statistics, explanatory vs. response variable, bias vs. precision, random assignment vs. random sampling (including goals)

·       Experiments vs. Observational Studies

o How to design a randomized experiment, How to properly select a sample 

o Scope of conclusions depending on how study was conducted (Can you draw a cause and effect conclusion? Can you generalize to a larger population?)

o Sampling errors, nonsampling errors, and random sampling errors (and which of these are measured by the “margin of error”?)

·       Describing and comparing distributions of data

o Categorical: segmented bar graphs, conditional percentages, difference in proportions vs. relative risk vs. odds ratio (and how to interpret)

o Quantitative: shape, center, and spread, boxplots, histograms, dotplots, resistance of median and IQR

·       How to interpret probability

·       What the Central Limit Theorem(s) are all about

o  Randomization/sampling distribution vs. sample vs. population

·       How to carry out a test of significance

o About a population proportion and/or population mean and/or treatment effect

o One-sided and two-sided alternatives

o Which technical conditions apply and how to check them and what they tell you

·    e.g., proportions: np > 10 and n(1-p) > 10, means: n > 30 or normal population

o Interpretation of test statistic (if appropriate)

·    General form: (estimate-hypothesized)/standard error

o Ideas and distinctions of sampling distribution and randomization distribution

o How to calculate and/or approximate p-value

o How to make a decision based on the p-value and level of significance α

o How to interpret the p-value

o Factors that affect the size of the p-value

o Defining (and stating the consequences of) Type I and Type II Errors

o How to determine the probabilities of a Type I Error and of a Type II Error

o Factors that affect the probability of Type I and Type II Errors

·       How to calculate and interpret a confidence interval

o General form: estimate ± (critical value)×(standard error)

o Interpret confidence “level” (separate from interpreting interval)

o How to solve for the sample size necessary to obtain a specific margin of error for a stated confidence level

·       Duality between intervals and tests: Any parameter value not contained in a C% CI will be rejected by a two-sided test at the (100-C)/100 significance level

·       Describe the difference between statistical significance and practical significance

·       Calculating p-values for Fisher’s Exact Test and/or binomial process (when ok to do)

o Conditions for a Binomial random variable

·       How to decide which procedure you should use (quantitative or categorical data, one or two populations, Fisher’s Exact Test vs. binomial vs. normal vs. t)
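
To tie together the general forms in the list above (test statistic, confidence interval, sample size, and duality), here is a minimal Python sketch for a single population proportion. It is an illustration under stated assumptions, not the course's Minitab or applet procedure: the data are made up, the function names are hypothetical, and normal-based (z) methods are assumed to be appropriate. One caution: for proportions the test uses the hypothesized value in the standard error while the interval uses the sample proportion, so the duality between the two holds only approximately here.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def one_proportion_z_test(x, n, pi0):
    """Two-sided test: (estimate - hypothesized) / standard error."""
    p_hat = x / n
    se = sqrt(pi0 * (1 - pi0) / n)  # SE computed under the null value
    z = (p_hat - pi0) / se
    return z, 2 * (1 - Z.cdf(abs(z)))


def one_proportion_interval(x, n, conf=0.95):
    """Interval: estimate +/- (critical value) x (standard error)."""
    p_hat = x / n
    z_star = Z.inv_cdf(1 - (1 - conf) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size_for_margin(margin, conf=0.95, guess=0.5):
    """Smallest n whose margin of error is at most `margin`,
    using a prior guess for the proportion (0.5 is the conservative choice)."""
    z_star = Z.inv_cdf(1 - (1 - conf) / 2)
    return ceil(z_star ** 2 * guess * (1 - guess) / margin ** 2)


# Hypothetical data: 60 successes in 100 trials, testing a null value of 0.5.
z, p_value = one_proportion_z_test(60, 100, 0.5)
low, high = one_proportion_interval(60, 100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
print(f"95% CI: ({low:.3f}, {high:.3f})")
print("0.5 inside the interval?", low <= 0.5 <= high)
print("n for a 0.03 margin at 95%:", sample_size_for_margin(0.03))
```

In this example the p-value falls below 0.05 and the 95% interval excludes 0.5, so the test and the interval agree, as the duality bullet describes.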

 

See summary tables (including on technology) on pp. 334-5, 359-60, 467-8!

 

Remember that mini-project 3 is also due on or before the day you take the final exam.