Stat 321 – Final Review
Additional Review Problems: p. 212: 43; p. 218: 53, 54; p. 222: 65; p. 224: 80; p. 225: 83; p. 242: 15 (hint: find E(ΣXᵢ²); how would you adjust for its bias?); p. 251: 20; p. 269: 23 (Ans: (.446, .851) with simple formula); p. 281: 48; p. 281: 52(a),(c), 53.
Also: suppose you have a standard beta random variable with α = 2 but β unknown. Derive the method of moments estimator for β.
From Chapter 5 (Combinations of Random Variables) you should:
- Sampling Distributions of Sample Statistics (functions of random variables)
Be able to distinguish (in words and with symbols) between sample statistic vs. parameter (of population or of probability distribution)
Understand what is random and what is constant (though perhaps unknown)
Understand the definition of (simple) random sample (SRS)
Understand what is meant by “sampling distribution of a statistic”
Know how to derive the exact sampling distribution of a statistic for small sample spaces (e.g., X discrete, n = 2)
Understand how we approximate the sampling distribution of a statistic/estimator using simulation
Be able to interpret a “repeated sampling” simulation done in Minitab
Eat, drink, and breathe the Central Limit Theorem
Sampling Distribution of X̄: E(X̄) = μ (always true), SD(X̄) = σ/√n (if independent rv’s), normal (if population normal) or approximately normal (if n > 30)
Sampling Distribution of p̂: E(p̂ = X/n) = p (always true when X is binomial), SD(p̂) = √(p(1−p)/n) (if random sample), approximately normal (if np > 10 and n(1−p) > 10)
Can pay less attention to the distribution of the sum or count (convert to X̄, p̂)
Be able to sketch the theoretical sampling distribution (scale and label the horizontal axis).
Be able to apply the formulas for expected value and variance of linear combination of (independent) random variables (p. 105-6, p. 219).
Know that a linear combination is normal if the individual random variables are normal, and know how to calculate probabilities (p. 221).
Be able to calculate probabilities for sample statistics (e.g., probability we find a p̂ > .30 when p = .2) and how these probabilities are affected by changes in sample size.
Be able to apply the empirical rule
Be able to interpret these probability statements in context. Be able to make decisions about “claimed” values of the parameter based on this probability.
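As an illustration of the Chapter 5 items above, the following sketch estimates P(p̂ > .30) when p = .2 in two ways: by a "repeated sampling" simulation and by the normal approximation with SD(p̂) = √(p(1−p)/n). The course uses Minitab; Python is swapped in here only for illustration. The function names, the sample sizes (50, 100, 400), the number of repetitions, and the seed are all assumptions made for this example.

```python
import random
from statistics import NormalDist

def simulate_phat_tail(p, n, cutoff, reps=5000, seed=321):
    """Estimate P(p-hat > cutoff) by drawing many samples of size n."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(reps):
        # One sample: count successes in n independent trials.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        if successes / n > cutoff:
            exceed += 1
    return exceed / reps

def normal_phat_tail(p, n, cutoff):
    """Normal approximation: p-hat is roughly N(p, sqrt(p(1-p)/n))."""
    sd = (p * (1 - p) / n) ** 0.5
    return 1 - NormalDist(mu=p, sigma=sd).cdf(cutoff)

# Hypothetical sample sizes, chosen to show the effect of n.
for n in (50, 100, 400):
    sim = simulate_phat_tail(0.2, n, 0.30)
    approx = normal_phat_tail(0.2, n, 0.30)
    print(f"n={n:4d}  simulated={sim:.4f}  normal approx={approx:.4f}")
```

As n grows, SD(p̂) shrinks, so a sample proportion above .30 becomes less and less likely when p = .2. Note that n = 50 gives np = 10, which does not satisfy np > 10, so the normal approximation is least trustworthy there.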
From Chapter 6 (Point Estimation) you should:
Understand the difference between an estimator and an estimate
Know how to determine whether an estimator is unbiased by determining its expected value or by looking at the average of simulated values of the estimate
If choosing an estimator, pick among the unbiased estimators first
What does “unbiased” mean? Why is this a desirable property of an estimator? Always?
Know how to find the variance (formula) of an estimator (e.g., using rules for variances)
Understand the definition of bias in terms of overall systematic tendency
Be able to compare performance of estimators using Minitab simulation results
Understand the idea of method of moments and maximum likelihood estimators
Know how to determine estimators by method of moments and maximum likelihood (MLE)
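A small sketch connecting several of the Chapter 6 items above, using an exponential distribution with rate λ as the example (an assumption for illustration, not a problem from the text). Since E(X) = 1/λ, the method of moments sets x̄ = 1/λ and gives λ̂ = 1/x̄; maximizing the log-likelihood n·ln(λ) − λ·Σxᵢ gives the same estimator. The simulation then checks bias by averaging many simulated estimates. The function names, n = 5, λ = 2, and the seed are hypothetical.

```python
import random

def rate_estimate(sample):
    """Method-of-moments (and ML) estimate of an exponential rate: 1 / x-bar."""
    return len(sample) / sum(sample)

def average_estimate(true_rate, n, reps=20000, seed=321):
    """Average of the estimates over many simulated samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.expovariate(true_rate) for _ in range(n)]
        total += rate_estimate(sample)
    return total / reps

avg = average_estimate(true_rate=2.0, n=5)
# Theory for this estimator: E(1/x-bar) = n*rate/(n-1), which is 2.5 here.
print(f"average of simulated estimates: {avg:.3f} (true rate is 2.0)")
```

The average of the estimates settles near nλ/(n−1) rather than λ, so this estimator is biased upward for small n even though it is the MLE.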
From Chapter 7 (Confidence Intervals) you should:
Understand the principle of confidence intervals
General form: estimate ± (critical value)(standard error of estimate)
margin of error = half-width of interval
measures amount of random sampling variability (only)
specifies range of plausible values for parameter based on sample statistic
Know how to interpret “confidence” in your own words without using the words “confidence” or “sure” or “chance” or “probability of parameter”… (lab 7)
Understand limitations, misinterpretations of confidence intervals
Know how intervals are affected by changes in sample size, confidence level, population size, parameter value
Know how to calculate confidence intervals for population mean, population proportion
Know the technical conditions for each method and how to check them
CI for μ: x̄ ± t_{α/2, n−1} · s/√n, if SRS; n > 30 or pop normal (normal prob plot) -- t-interval
PI: x̄ ± t_{α/2, n−1} · s·√(1 + 1/n), if SRS; pop normal
CI for p: p̂ ± z_{α/2} · √(p̂(1−p̂)/n), if SRS; np̂ > 10, n(1−p̂) > 10 -- Wald
(95%) p* ± 1.96 · √(p*(1−p*)/(n+4)), where p* = (X+2)/(n+4), if SRS -- Adjusted Wald
Be able to determine necessary sample size n to ensure desired width or half-width for given confidence level
Know how and when to calculate a prediction interval for an individual value (recognize language asking for this)
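The proportion intervals and the sample size calculation above can be sketched as follows. This is a minimal illustration, not the course's Minitab procedure. The function names and the data (40 successes in 100 trials) are hypothetical, and the adjusted Wald function assumes the common "add two successes and two failures" form with n + 4 in the standard error.

```python
from math import sqrt, ceil
from statistics import NormalDist

def z_critical(conf_level):
    """Critical value z_{alpha/2} for a two-sided interval."""
    return NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

def wald_interval(x, n, conf_level=0.95):
    """Wald interval: p-hat +/- z * sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = x / n
    margin = z_critical(conf_level) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def adjusted_wald_interval(x, n):
    """95% adjusted Wald interval using p* = (x + 2)/(n + 4) (assumed form)."""
    p_star = (x + 2) / (n + 4)
    margin = 1.96 * sqrt(p_star * (1 - p_star) / (n + 4))
    return p_star - margin, p_star + margin

def sample_size_for_proportion(margin, conf_level=0.95, p_guess=0.5):
    """Smallest n whose half-width is at most `margin`, given a guess for p."""
    z = z_critical(conf_level)
    return ceil((z / margin) ** 2 * p_guess * (1 - p_guess))

# Hypothetical data: 40 successes in a sample of 100.
print("Wald:         ", wald_interval(40, 100))
print("Adjusted Wald:", adjusted_wald_interval(40, 100))
print("n for half-width .03:", sample_size_for_proportion(0.03))
```

Using p_guess = 0.5 is the conservative choice when no prior estimate of p is available, because p(1−p) is largest there.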
Earlier material to be especially aware of:
Describing distributions of data numerically and graphically (and in context)
What is probability? What is a random variable?
What is the expected value of a random variable? Standard deviation?
What is a pmf, cdf, pdf? How do I graph them? How do I express the function for all x?
Being able to identify the appropriate discrete probability distribution.
Calculating probabilities (including tables) and expected value involving known continuous and discrete distributions (e.g., binomial, normal) and generic distributions.
Normal approximation to binomial, binomial approximation to hypergeometric
Conditional probability
The distinction between “data” and “model” and between a distribution of data and a probability distribution
SOME PROBLEM SOLVING STRATEGIES
Finding a probability:
1. Are you finding a conditional or an unconditional probability?
2. If it involves X̄ or p̂, can you use the central limit theorem to state the probability distribution?
3. If it involves a random variable which follows one of the common probability distributions (e.g., binomial, gamma) then use the formulas page and/or tables.
Look for the phrase “approximate probability” in case one of the approximations (e.g., normal to binomial, Poisson to binomial) might apply.
You may have to recognize if it belongs to a known discrete probability family yourself.
4. If it involves a random variable but you are given the pmf or pdf, determine the probability directly (summing or integrating).
5. If it does not involve a random variable, use techniques from chapter 2 (permutations, combinations, addition rule, multiplication rule). Make sure you are not applying any results without checking assumptions first (e.g., mutually exclusive, independent).
6. If you are told only a situation, you could be asked to perform or interpret a simulation to determine empirical probabilities.
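To illustrate step 3 above, here is a hedged sketch comparing an exact binomial probability with its normal approximation. The example values (n = 50, p = .3, P(X ≤ 12)) and the function names are hypothetical, and the continuity correction (using 12.5 instead of 12) is a common refinement assumed here rather than something this review sheet specifies.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), summing the pmf."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with a continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)

n, p, k = 50, 0.3, 12  # hypothetical values; np = 15 and n(1-p) = 35 both exceed 10
print(f"exact:  {binomial_cdf(k, n, p):.4f}")
print(f"approx: {normal_approx_cdf(k, n, p):.4f}")
```

When np and n(1−p) are both above 10 the two answers should be close; rerunning with a small n or an extreme p shows where the approximation breaks down.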
Finding an expected value:
1. If the expression is a linear function of random variable(s), first simplify using the rules for expected value
e.g., E(2X+3Y) = 2E(X)+3E(Y)
2. Once you get to E(rv), is the rv a sample mean (X̄) or a sample proportion (p̂)? If so, then E(X̄) = μ = E(X) or E(p̂) = p
3. Once you get to E(Y), is Y a random variable from a common probability distribution family? If so, use the formulas page to determine the expected value
e.g., if Y is a binomial random variable, E(Y)=np
4. If Y is not from a common probability distribution family, determine E(Y) or E(h(Y)) directly given the pmf (summing) or pdf (integrating)
discrete: E(Y) = Σ y·P(Y=y)     E(h(Y)) = Σ h(y)·P(Y=y)
continuous: E(Y) = ∫ y·f(y) dy     E(h(Y)) = ∫ h(y)·f(y) dy
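The discrete formulas above can be sketched in a few lines. The pmf below is made up for illustration, and `expected_value` is a hypothetical helper name.

```python
# Hypothetical pmf: P(Y=0) = .2, P(Y=1) = .5, P(Y=2) = .3
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

def expected_value(pmf, h=lambda y: y):
    """E(h(Y)) = sum of h(y) * P(Y = y); with the default h this is E(Y)."""
    return sum(h(y) * prob for y, prob in pmf.items())

mean_y = expected_value(pmf)                            # E(Y)
mean_y_squared = expected_value(pmf, lambda y: y ** 2)  # E(Y^2)
linear = expected_value(pmf, lambda y: 2 * y + 3)       # E(2Y + 3)

print(f"E(Y) = {mean_y:.2f}, E(Y^2) = {mean_y_squared:.2f}")
print(f"E(2Y + 3) = {linear:.2f}, and 2E(Y) + 3 = {2 * mean_y + 3:.2f}")
```

The last line shows the linearity rule from step 1: computing E(2Y + 3) directly from the pmf agrees with 2E(Y) + 3.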
Finding a variance:
1. If the expression is a linear function of random variable(s), first simplify using the rules for variance
e.g., V(2X+3Y) = 4V(X)+9V(Y) if X and Y are independent
2. Once you get to V(rv), is the rv a sample mean (X̄) or a sample proportion (p̂)? If so, then V(X̄) = σ²/n = V(X)/n or V(p̂) = p(1−p)/n
3. Once you get to V(Y), is Y a random variable from a common probability distribution family? If so, use the formulas page to determine the variance
e.g., if Y is a binomial random variable, V(Y)=np(1-p)
4. If Y is not from a common probability distribution family, determine V(Y) directly given the pmf or pdf using V(Y) = E(Y²) − [E(Y)]²
discrete: E(Y²) = Σ y²·P(Y=y)     continuous: E(Y²) = ∫ y²·f(y) dy
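For the continuous case, here is a minimal sketch of the shortcut V(Y) = E(Y²) − [E(Y)]² using a made-up pdf, f(y) = 2y on [0, 1]. On an exam the integrals would be done by hand; the midpoint-rule `integrate` helper is a hypothetical stand-in used only to check the arithmetic numerically.

```python
def integrate(g, a, b, steps=10000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(g(a + (i + 0.5) * width) for i in range(steps)) * width

def pdf(y):
    """Hypothetical pdf: f(y) = 2y for 0 <= y <= 1."""
    return 2 * y

total_area = integrate(pdf, 0, 1)                        # should be 1 for a valid pdf
mean_y = integrate(lambda y: y * pdf(y), 0, 1)           # E(Y)   = 2/3 by hand
mean_y_squared = integrate(lambda y: y**2 * pdf(y), 0, 1)  # E(Y^2) = 1/2 by hand
variance = mean_y_squared - mean_y ** 2                  # V(Y)   = 1/2 - 4/9 = 1/18

print(f"area = {total_area:.4f}, E(Y) = {mean_y:.4f}, V(Y) = {variance:.4f}")
```

Checking that the pdf integrates to 1 first is a cheap way to catch a mistyped density before computing anything else.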
Note: the previous two sections discuss finding a “mean” or a “standard deviation.” Remember, you could also be given a set of data and asked to use the techniques from Chapter 1 to find x̄ and s.
Constructing a confidence interval:
0. Define the parameter in words (e.g., let p=proportion of all Cal Poly students who…)
1. Is it for a population mean or a population proportion?
If a population mean, are you told a value of σ? (if yes use z, otherwise use t)
2. Check the technical conditions to see if our formulas are valid
3. Calculate the interval and write a one-sentence summary (e.g., “I’m 95% confident that…”), being clear about which parameter you are estimating
4. Be able to interpret the phrase “confidence” in your own words if asked (“95% of intervals..”)
5. Know what factors affect the behavior of the confidence intervals (width, midpoint, coverage)
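The "95% of intervals" interpretation in step 4 can be explored with a repeated-sampling simulation. The sketch below assumes a known p = .3 and n = 100 (hypothetical values), builds a 95% Wald interval from each simulated sample, and reports the fraction of intervals that capture p. The function names and seed are illustrative.

```python
import random
from math import sqrt

def wald_covers(p, n, rng, z=1.96):
    """Draw one sample of size n and report whether its Wald interval contains p."""
    x = sum(1 for _ in range(n) if rng.random() < p)
    p_hat = x / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin <= p <= p_hat + margin

def coverage_rate(p, n, reps=5000, seed=321):
    """Fraction of simulated intervals that capture the true p."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(reps) if wald_covers(p, n, rng))
    return hits / reps

rate = coverage_rate(p=0.3, n=100)
print(f"fraction of intervals containing p: {rate:.3f}")
```

The parameter p stays fixed throughout; it is the interval that changes from sample to sample, which is why "confidence" describes the long-run behavior of the method rather than any single interval. Trying a small n or a p near 0 or 1 shows how coverage can fall below the nominal level when the technical conditions fail.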
· Know the technical conditions required by different procedures and how to check them.
· Be able to make interpretations and explanations of your calculations
· Remember to follow the “of,” probability “of what”?!
WHAT YOU COULD USE MINITAB FOR
· Numerical and graphical summaries of a distribution of data (e.g., histogram, interquartile range)
· Calculate probabilities, cumulative probabilities for known probability distributions (e.g., binomial, normal, gamma)
· Calculate confidence intervals for μ, p
· Perform a small simulation, including through a macro
SOME GENERAL EXAM PREPARATION ADVICE:
· Be prepared to think/explain/interpret
· Understand, don’t memorize
· Don’t plan to rely heavily on the notes pages
· Reread handouts, earlier exams, the text
· Rework examples from class, homework exercises, review problems
· Consider the big ideas from labs
· Make sure you can read/use Minitab output
Notation, Acronyms:
| E(X) | expected value of the random variable X (aka μ) |
| V(X) | variance of the random variable X (aka σ²) |
| Z | standard normal random variable; number of standard deviations from mean, (X−μ)/σ |
| μ | population mean; expected value of a random variable; mean of normal distribution |
| σ | population standard deviation; standard deviation of a random variable; SD of normal |
| x̄ | sample mean |
| s | sample standard deviation |
| p̂ | sample proportion of successes |
| p | population proportion of successes; probability of success |
| θ | generic unknown parameter |
| θ̂ | estimator of generic unknown parameter value |
| n | sample size (number of trials, number of observations recorded) |
| N | population size |
| q | 1−p |
| σ_X̄ | standard deviation of sample means, SD(X̄) |
| μ_X̄ | mean of sample means, E(X̄) |
| α, β | parameters of distribution, e.g., Weibull, Gamma |
| λ | parameter of exponential, Poisson distributions |
| Γ | gamma function, see formulas page |
| Φ | cdf of standard normal distribution |
| rv | random variable |
| pmf p(x) | probability that a discrete random variable is equal to x |
| pdf f(x) | integrated to determine the probability for a continuous random variable over interval |
| cdf F(x) | cumulative distribution function, P(X ≤ x) for continuous or discrete random variable |