Stat 301 – Final Review
Final Exam: The exam is Tuesday or Wednesday from 10:10am-1:00pm in the Statistics Studio. Check here for a reminder of when you are signed up to take the final. There will be a 30-45 minute, closed-book multiple-choice portion. The rest will be open course notes, text, calculator, and Minitab/applets. Be ready to interpret output, explain processes, carry out analyses with Minitab/applets (know which), justify conclusions, and explain reasoning.
The final will focus on the material since the last exam (Ch. 5), but will definitely have a cumulative component, including a comparison of methods across chapters. You should focus on the entire statistical process: How do we design a study to achieve particular research goals? How do we describe the data we have? How do we test claims about population parameters or processes and/or estimate parameters? How do we state our final conclusions, especially after considering how the study was conducted? You should also think about the reasoning behind the statistical methods, such as standardizing, significance, and confidence in general.
Advice: Understand and be able to apply the procedures first, then worry about more subtle issues and review the how and why behind the development of the procedure. Also, for all the procedures, know what must be done by hand and what can be done on the computer, including Minitab macros.
From Chapter 5 you should know how to:
· Create numerical and graphical summaries for comparing two samples and two treatment groups
· Define parameters and set up the null and alternative hypotheses for two-sample comparisons
o Comparing populations or treatment groups (and how the hypotheses differ, e.g., H0: no treatment effect, δ = 0 vs. H0: no difference in population parameters, μ1 - μ2 = 0)
o Difference between empirical randomization distributions and empirical sampling distributions
· State and assess technical conditions of various procedures (including confidence intervals)
· Carry out two-sample t-tests and z-tests for comparing two sample means and proportions, respectively, using either Minitab or Test of Significance Calculator applet
o Consider alternatives if technical conditions are not met (e.g., FET, transformations)
· Calculate and interpret two-sample confidence intervals (for the difference in population proportions, difference in population means, treatment effect on categorical and quantitative response variables, population odds ratio)
o Consider alternatives if technical conditions are not met (e.g., Wilson adjustment, bootstrapping)
· Interpret Minitab/applet output for two-sample procedures
· Know what factors (e.g., sample size) affect test statistics, p-values, and confidence intervals (width and coverage properties)
· Distinguish between matched pairs designs and two-independent “samples” designs
o Consider benefits and disadvantages (including feasibility) of the two types of designs
· Discuss how (and why we might want) to make inferences for other types of statistics (e.g., medians)
· Consider how the scope of conclusions differs depending on two independent samples vs. random assignment
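As a check on what Minitab or the applet reports for a two-sample z procedure, the calculation can be sketched by hand. This is only an illustration in Python, not a course tool: the function name `two_proportion_z` and the counts (45 of 100 vs. 30 of 100) are made up. It assumes the usual convention of a pooled standard error for the test (because H0 says the two proportions are equal) and an unpooled standard error for the interval.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2, conf=0.95):
    """Two-sided z-test of H0: no difference in proportions, plus a CI for the difference."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    diff = p1_hat - p2_hat
    # Under H0 both groups share one proportion, so pool the data for the test SE.
    pooled = (x1 + x2) / (n1 + n2)
    se_test = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_test  # (estimate - hypothesized) / standard error, hypothesized = 0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # The interval does not assume H0, so it uses the unpooled SE.
    se_ci = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    ci = (diff - z_star * se_ci, diff + z_star * se_ci)
    return z, p_value, ci

# Hypothetical counts: 45 successes out of 100 in group 1, 30 out of 100 in group 2.
z, p_value, ci = two_proportion_z(45, 100, 30, 100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
print(f"95% CI for the difference: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Whether the p-value speaks to a treatment effect or to a difference between populations depends on how the data were collected (random assignment vs. independent random samples), not on the arithmetic.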
Review Question #35 solution
Really from Chapter 4
· How to create a bootstrap distribution, the philosophy behind bootstrapping, and how to get rough bootstrap confidence intervals (using t critical value)
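The bootstrap idea above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the course macro: the data values, the function name `bootstrap_means`, the number of resamples, and the seed are all made up, and the t critical value 2.262 (95% confidence, 9 degrees of freedom) is taken from a t table rather than computed.

```python
import random
from statistics import mean, stdev

def bootstrap_means(sample, reps=5000, seed=301):
    """Resample with replacement (same size as the sample) and record each resample mean."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(reps)]

# Hypothetical sample of n = 10 measurements.
sample = [12.1, 9.8, 11.4, 10.6, 13.0, 8.9, 10.2, 11.7, 9.5, 12.4]
boot = bootstrap_means(sample)

# The spread of the bootstrap distribution estimates the SE of the sample mean.
boot_se = stdev(boot)

# Rough interval: estimate +/- (t critical value) x (bootstrap SE).
t_star = 2.262  # assumed: 95% confidence with n - 1 = 9 df, from a t table
estimate = mean(sample)
ci = (estimate - t_star * boot_se, estimate + t_star * boot_se)
print(f"sample mean = {estimate:.2f}, bootstrap SE = {boot_se:.3f}")
print(f"rough 95% bootstrap CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The same resampling loop works for statistics without a convenient standard error formula (e.g., swap `mean` for `median` inside the resampling step), which is the main reason to bootstrap.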
The Cumulative Component (also see old Review handouts)
Things to remember include:
· Identifying observational units and defining variables, samples vs. populations vs. sampling/randomization distributions, parameters vs. statistics, explanatory vs. response variable, bias vs. precision, random assignment vs. random sampling (including goals)
· Experiments vs. Observational Studies
o How to design a randomized experiment, How to properly select a sample
o Scope of conclusions depending on how study was conducted (Can you draw a cause and effect conclusion? Can you generalize to a larger population?)
o Sampling errors, nonsampling errors, and random sample errors (and which of these are measured by the “margin of error”?)
· Describing and comparing distributions of data
o Categorical: segmented bar graphs, conditional percentages, difference in proportions vs. relative risk vs. odds ratio (and how to interpret)
o Quantitative: shape, center, and spread, boxplots, histograms, dotplots, resistance of median and IQR
· How to interpret probability
· What the Central Limit Theorem(s) are all about
o Randomization/sampling distribution vs. sample vs. population
· How to carry out a test of significance
o About a population proportion and/or population mean and/or treatment effect
o One-sided and two-sided alternatives
o Which technical conditions apply and how to check them and what they tell you (e.g., proportions: np > 10 and n(1-p) > 10; means: n > 30 or normal population)
o Interpretation of test statistic (if appropriate); general form: (estimate - hypothesized)/standard error
o Ideas and distinctions of sampling distribution and randomization distribution
o How to calculate and/or approximate p-value
o How to make a decision based on the p-value and level of significance α
o How to interpret the p-value
o Factors that affect the size of the p-value
o Defining (and stating the consequences of) Type I and Type II Errors
o How to determine the probabilities of a Type I Error and of a Type II Error
o Factors that affect the probability of Type I and Type II Errors
· How to calculate and interpret a confidence interval
o General form: estimate ± (critical value)×(standard error)
o Interpret confidence “level” (separate from interpreting interval)
o How to solve for the sample size necessary to obtain a specific margin of error for a stated confidence level
· Duality between intervals and tests: Any parameter value not contained in a C% CI will be rejected by a two-sided test at the (100-C)/100 significance level
· Describe the difference between statistical significance and practical significance
· Calculating p-values for Fisher’s Exact Test and/or binomial process (when ok to do)
o Conditions for a Binomial random variable
· How to decide which procedure you should use (quantitative or categorical data, one or two populations, Fisher’s Exact Test vs. binomial vs. normal vs. t)
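Several items in this list (the general form of a test statistic, the general form of a confidence interval, solving for sample size, and the duality between intervals and tests) can be tied together in one small worked example for a single proportion. This is a sketch only: the function names and the data (60 successes in 100 trials, hypothesized value 0.5) are made up, and the normal-based procedures are assumed to be appropriate because the technical conditions are checked first.

```python
from math import ceil, sqrt
from statistics import NormalDist

std_normal = NormalDist()

def z_test_proportion(x, n, p0):
    """Two-sided test of H0: proportion = p0 using (estimate - hypothesized) / SE."""
    p_hat = x / n
    se_null = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se_null
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value

def ci_proportion(x, n, conf=0.95):
    """Interval of the form estimate +/- (critical value) x (standard error)."""
    p_hat = x / n
    z_star = std_normal.inv_cdf(1 - (1 - conf) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se

def sample_size_for_margin(margin, conf=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most `margin` (p_guess = 0.5 is conservative)."""
    z_star = std_normal.inv_cdf(1 - (1 - conf) / 2)
    return ceil(z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2)

# Hypothetical data: 60 successes in n = 100 trials, testing H0: proportion = 0.5.
x, n, p0 = 60, 100, 0.5
conditions_ok = n * p0 > 10 and n * (1 - p0) > 10  # technical conditions for the z procedure
z, p_value = z_test_proportion(x, n, p0)
lo, hi = ci_proportion(x, n)
print(f"conditions met: {conditions_ok}")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
print(f"n needed for a 3-point margin at 95%: {sample_size_for_margin(0.03)}")
```

In this example the hypothesized value 0.5 falls outside the 95% interval and the two-sided test rejects at the 0.05 level, as the duality suggests. For proportions the agreement is only approximate in general, since the test uses the hypothesized value in its standard error while the interval uses the sample proportion.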
See summary tables (including on technology) on p. 334-5, 359-60, 467-8!
Remember that mini-project 3 is also due by the time you take the final exam.