Stat 301 – Mini-Project 3
Due at Final Exam
This project is to be completed individually. The goal
is to design two separate studies. You do not have to carry out either
study. You are to turn in a typed proposal for each design.
You will be assessed on: the correctness of your designs and
proposed methods of analysis, whether the designs are appropriate for the
research question proposed, whether the designs would be feasible for a
researcher to carry out, creativity in your research questions and designs, and
quality of your writing. The designs do not need to be feasible
for you to carry out at Cal Poly, only for someone with more time
and resources (e.g., comparing the mating patterns of African and European
bees). You should concentrate more on finding a research question of
interest to you, justifying why it is of interest, and designing an appropriate
study to answer that question.
The requirements for the designs are:
· One should be a randomized comparative experiment and one should be an observational study with random sampling.
· One should involve comparing two groups on a quantitative response variable and one should involve comparing two groups on a categorical response variable.
· You should consider how you would analyze the data obtained from your study using the statistical methods we have discussed this quarter.
· The two topics may or may not be related.
In your report, make sure you:
· State the research question for each design and why it is of interest.
· Provide extensive detail for each design. This design protocol should be detailed enough that I could hand it to someone and they would be able to carry it out exactly to your specifications.
· Use appropriate statistical terminology (e.g., observational/experimental units, explanatory and response variables, randomization and random sampling, sampling frame, sample size).
· Indicate the methods you would use to analyze the data both descriptively (numerical and graphical summaries) and inferentially (e.g., state your null and alternative hypotheses in symbols and in words, identify which test procedure and/or confidence interval procedure you would use, and state which technical conditions you would check).
· State the conclusions you would draw should the difference between the groups prove to be statistically significant. Clarify whether you would draw a cause-and-effect conclusion and to what population you would generalize the result. If you do not feel a cause-and-effect conclusion is warranted, suggest potential confounding variables. You should also indicate how you would decide whether this difference is of practical significance (what “effect size” you would find noteworthy).
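As an illustration of hypotheses stated in symbols, the two designs might lead to statements like the following. This is only a sketch: the subscripts 1 and 2 are hypothetical labels for the two groups, the directions of the alternatives are assumptions chosen for illustration, and your own notation should match what we have used in class.

```latex
% Quantitative response: compare the two group means
H_0: \mu_1 - \mu_2 = 0 \qquad \text{vs.} \qquad H_a: \mu_1 - \mu_2 > 0

% Categorical response: compare the two group proportions
H_0: \pi_1 - \pi_2 = 0 \qquad \text{vs.} \qquad H_a: \pi_1 - \pi_2 \neq 0
```

In words, each null hypothesis says there is no difference between the groups in the long-run mean (or proportion), and each alternative states the direction of difference you conjectured before seeing any data.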
Stat 320 -- Mini-Project 2
Designing and Analyzing a Study (Random Sampling)
The goal of Mini-Project 2 is to apply methods from Chs. 3 and 4. The assignment is to take a random sample (meaning “statistically random,” that is, using scientific sampling methods) from a well-defined population or process and measure one categorical or quantitative response variable. Before you collect the data, you should state a research question about the population parameter. Examples: a majority of words in Webster’s new pocket dictionary have Wikipedia come up as the first entry in a Google search; the average running time of full-length theatrical releases is under 2 hours; less than 75% of San Luis drivers come to a complete stop at intersections next to campus; on average Lucky’s is cheaper than Scolari’s. See the previous topics webpage for more ideas.
The type of study can be an experiment, an observational study, or a survey. The sample size should be at least 30 and the population should be at least 20 times the size of your sample. (Note: The sample does not have to consist of humans. If it does, please do not ask sensitive questions. You should be very careful in how you define your population. Also note that your focus is not on comparing groups, just on making and testing conjectures about your population.)
You are free to choose your own topic(s). The topic may be related to your major or another topic of interest. Make sure you choose a topic for which it is straightforward to gather the data or for which you have access to data from another class or professor. Please run the topic by me before proceeding. You may work with up to two other people.
Final Report: Due Nov 17. This should be a typed report, written collaboratively by all team members. Your report should be written as if it will be read by other student researchers (so use common terminology, but do not be overly technical). Make sure it includes at least the following (and include these section headings to improve the readability of the report):
I. Introduction
Same guidelines as last time (see below). In addition, you should describe the population parameter of interest, your research question about its value including your initial conjecture of its value (that makes sense in the context) and whether you suspected (before you saw any data) the actual value is higher or lower (or just different) than this conjectured value.
II. Data Collection Methods
Same rules as last time: remember to tell me everything, good and bad. Think about designing a study protocol that someone else could follow to replicate exactly the study you carried out. In your discussion, be sure to define your observational units, response variable of interest (and whether it was measured categorically or quantitatively), population of interest, sampling frame (if applicable), and parameter of interest. Which type of probability sampling method did you use (SRS, stratified, cluster, or systematic; see Investigation 3.1.2)? Discuss any potential sources of sampling or non-sampling errors. For example: If you designed a survey, are there any potential wording issues? Did you “field-test” the questions first? How did you ensure confidentiality or take other precautions to ensure honest responses? What was the response rate? How often did you have to make repeat visits in order to obtain the observational units initially selected?
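To make two of the sampling methods named above concrete, here is a minimal Python sketch of a simple random sample and a systematic sample drawn from a sampling frame. The frame of 600 ID numbers, the function names, and the seeds are all hypothetical, chosen only so that the frame is 20 times the minimum sample size of 30; any random-number tool you document in your report would serve the same purpose.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)
    return rng.sample(frame, n)


def systematic_sample(frame, n, seed=None):
    """Choose a random start in the first interval, then take every k-th unit."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)
    k = len(frame) // n        # sampling interval
    start = rng.randrange(k)   # random starting index in [0, k)
    return [frame[start + i * k] for i in range(n)]


# Hypothetical sampling frame: ID numbers 1..600 (20 times a sample of 30).
frame = list(range(1, 601))
srs = simple_random_sample(frame, 30, seed=1)
sys_sample = systematic_sample(frame, 30, seed=1)

print("SRS:", sorted(srs))
print("Systematic:", sys_sample)
```

Recording the seed (or the random digits used) in your report is one way to let someone else reproduce exactly which units were selected.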
III. Analysis of Results
Descriptive Statistics
You will need to make choices as to which numerical and graphical summaries are most relevant. Make sure you integrate the output into the body of the report and include discussions of how you are interpreting the message in these summaries. In your discussion you should fully describe your sample, sample size, and report the sample statistic and whether it supports your conjecture.
Inferential Statistics
In carrying out a test of significance and a confidence interval about your population parameter, make sure you
- define the population and parameter in words,
- state the null and alternative hypotheses in symbols and in words,
- state what a type I and a type II error would represent in this setting,
- discuss whether or not your measurements can reasonably be considered a random sample from the population or process of interest,
- select an appropriate probability model for the sampling distribution and identify and check the technical conditions for the validity of this model,
- calculate the test statistic (if appropriate) and the p-value corresponding to your alternative hypothesis. Indicate what conclusion this p-value leads you to draw about the null hypothesis and include an interpretation of what this p-value represents in this context,
- calculate and interpret an appropriate confidence interval to describe the plausible values of your population parameter,
- state your conclusions (about the test and interval) in context.
Make sure you include all relevant output in the body of your report.
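For a categorical response, the steps listed above can be sketched in Python as a one-proportion z-test with a confidence interval. This is an illustration under stated assumptions, not a required method: the data (24 of 40 drivers stopping) are made up, the function name is hypothetical, the normal model is only one possible probability model (a binomial calculation may be more appropriate for your data), and the "at least 10 expected successes and 10 expected failures" check is a common rule of thumb that may differ from the condition stated in our text.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, pi0, alternative="less", conf_level=0.95):
    """Normal-approximation test of H0: pi = pi0, plus a confidence interval for pi."""
    p_hat = successes / n

    # Rule-of-thumb technical condition for the normal model (assumed threshold of 10).
    conditions_ok = n * pi0 >= 10 and n * (1 - pi0) >= 10

    # Test statistic uses the standard deviation of p-hat computed under H0.
    z = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)

    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    else:  # two-sided
        p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # Confidence interval uses the standard error based on the sample proportion.
    z_star = std_normal.inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)

    return {
        "p_hat": p_hat,
        "conditions_ok": conditions_ok,
        "z": z,
        "p_value": p_value,
        "ci": (p_hat - margin, p_hat + margin),
    }


# Made-up data: 24 of 40 sampled drivers came to a complete stop.
# H0: pi = 0.75 versus Ha: pi < 0.75.
result = one_proportion_z(24, 40, 0.75, alternative="less")
for name, value in result.items():
    print(name, value)
```

The output still has to be interpreted in context: what the p-value measures, what values of the parameter the interval considers plausible, and whether the random-sampling assumption behind both is reasonable for your data.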
IV. Conclusion
Same guidelines as before. Pay particular attention to whether or not the conditions were satisfied for you to generalize from your sample to the larger population. Also discuss whether the p-value reflects true randomness in the study or is more fictitious, measuring the amount of chance variability there would have been had the study involved randomness (that is, it measures the uncertainty, but you do not really think it is reasonable to generalize from your sample to your population). Make sure you include a critique of the study you did, as well as suggestions for future studies.
Stat 301 -- Mini-Project 1
Designing and Analyzing an Experiment
Goal: To collect, describe, and analyze data using the methods of Chapter 1.
Teams: You are to work in teams of 2-3 people. It is up to the members of the group to make sure everyone contributes equally. Plan your schedules so that you will have time to work together on the project outside of class. Teams should be formed and initial project topics selected by Oct. 2. You may be asked to share your proposals with the rest of the class. You are also encouraged to share your ideas with me before you begin collecting any data. Please start early so you have time to ask questions. You should have your data collected by Oct. 6.
The Study: You are free to choose your own topic. You should think of two groups that you can compare on one binary categorical variable through an experiment. Make sure you choose a topic for which it is feasible to gather the data in a relatively short period of time. The question may be related to your major or some other topic of interest. For example, you could have a group member ask someone for change, randomly determining how that group member is dressed at the time, or you could randomly assign people to take a survey with one of two different wordings and see if they respond differently depending on how the question is asked. Your study must obtain at least 10 experimental units in each explanatory variable group. Your experimental units do not have to be people! There will be credit for creativity. You do not need to worry about selecting a representative sample; you will focus more on comparing the groups to each other.
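Random assignment can be carried out in many ways (coin flips, slips of paper, a random digit table). As one hedged illustration, the Python sketch below shuffles a list of experimental units and splits it in half. The unit labels, group names, function name, and seed are all hypothetical.

```python
import random


def randomly_assign(units, labels=("A", "B"), seed=None):
    """Shuffle the units and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(units)   # copy so the original list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {labels[0]: shuffled[:half], labels[1]: shuffled[half:]}


# Hypothetical study: 20 experimental units, 10 per survey wording.
units = [f"unit{i:02d}" for i in range(1, 21)]
groups = randomly_assign(units, labels=("wording 1", "wording 2"), seed=301)

for label, members in groups.items():
    print(label, sorted(members))
```

Whatever mechanism you use, describe it in your data collection section so that a reader can see the assignment was left to chance rather than to the researchers or the subjects.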
Final Report: Due Oct. 9. This should be a typed report, written collaboratively by all team members. Your report should be written as if to other student researchers. Make sure it includes at least the following (and please use these section headings in your report):
I. Introduction (5 pts) – Why did you choose this topic? What did you expect to find? Have similar studies been done elsewhere? Why should the reader be interested in your results and continue reading?
II. Summary of Data Collection Methods (10 pts) – How (e.g., when, where) did you collect the data? What were the experimental units? What groups did you compare, and how did you form them (explanatory variable)? What was your response variable? How was this variable measured? What additional “controls” did you exert on the study? (E.g., did you pre-test any of the questions on a test group to see if the wording was clear?) Any “operational definitions”? (E.g., did you only observe people writing or did you take any behavior such as throwing a football as indication of handedness?) Did you have any problems with non-response or other unexpected results? Did anything go wrong during the course of the study? Note: You can never give me too much detail in this section! In particular, there should be enough information that someone else could replicate your study on their own based only on your description (and hopefully improve upon it based on your suggestions below).
III. Analysis of Results (15 pts) – Include appropriate numerical and graphical summaries of your data, including the two-way table. Write several paragraphs explaining what you found in these data. Use both simulation (using the Java applet) and Fisher’s Exact Test (using Minitab or other technology) to analyze your results, reporting both the approximate and exact p-value (and include the output – you can make a screen capture of the applet window using the “Print Screen” key on the keyboard). Include a careful interpretation of what this p-value tells you. Is the difference between the groups statistically significant? What conclusions can you draw? Be sure to refer back to the type of study conducted in explaining the scope of your conclusions. Address both the question of causation and the question of whether you believe your findings generalize to a larger population. (Note: All computer output should be included in the body of the report. Make sure all figures and graphs are clearly labeled.)
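The report itself should use the applet and Minitab (or other technology) as described above. Purely to show the logic behind the two p-values, here is a Python sketch that is not either of those tools: it approximates the p-value by re-shuffling the observed responses between the groups, and computes the exact one-sided Fisher p-value from hypergeometric probabilities. The two-way table (8 of 10 successes in group 1, 3 of 10 in group 2), the function names, the seed, and the one-sided direction are made up for illustration.

```python
import random
from math import comb


def fisher_exact_one_sided(a, n1, c, n2):
    """Exact P(group 1 has a or more successes) with all table margins fixed."""
    successes = a + c
    total = n1 + n2
    upper = min(n1, successes)
    ways = sum(
        comb(successes, k) * comb(total - successes, n1 - k)
        for k in range(a, upper + 1)
    )
    return ways / comb(total, n1)


def simulated_p_value(a, n1, c, n2, reps=10_000, seed=None):
    """Approximate the same probability by repeatedly re-randomizing the responses."""
    rng = random.Random(seed)
    outcomes = [1] * (a + c) + [0] * (n1 + n2 - a - c)  # 1 = success, 0 = failure
    extreme = 0
    for _ in range(reps):
        rng.shuffle(outcomes)            # mimic a new random assignment
        if sum(outcomes[:n1]) >= a:      # successes landing in group 1
            extreme += 1
    return extreme / reps


# Made-up table: group 1 had 8 successes out of 10, group 2 had 3 out of 10.
exact = fisher_exact_one_sided(8, 10, 3, 10)
approx = simulated_p_value(8, 10, 3, 10, reps=10_000, seed=1)
print("exact p-value:", round(exact, 4))
print("simulated p-value:", round(approx, 4))
```

The two values should be close but not identical, since the simulated p-value varies with the random shuffles. Both answer the same question: how often random assignment alone would produce a difference at least as large as the one observed.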
IV. Conclusion (5 pts) – Summarize the results of your study. What did you learn? Did the data behave as you expected? Critique the methods used to collect the data. Is there anything you would do differently next time? How might this affect the conclusions of the study? What similar questions might someone choose to investigate in the future to build on your results?
5 pts: style, organization, layout, grammar, presentation of report, creativity
Example Previous Project Topics: (Also see http://statweb.calpoly.edu/bchance/projects301.html)
Does being told their results will be made public affect students’ performance on a standard “IQ” test?
Does listening to classical music while studying a group of words improve whether subjects can pass a threshold level of recall?
Are people more likely to loan you money for a phone call depending on how you are dressed?
Does consumption of caffeine affect students’ performance on a test of basic knowledge?
Tell people they are tasting two different types of muffins and see if they are more likely to predict the second taste of the same muffin.
Are people more likely to agree if you randomly decide to ask them “Are you happy with your roommate?” or to disagree if you ask them “Are you unhappy with your roommate?”