Data Collection and Analysis
This analysis project condenses an in-class activity created
by Mary Mortlock and Matt Carlton. For the full version of the activity
(and other similar activities), check out http://statweb.calpoly.edu/carltonm/food/index.html.
Recently, Mary had each of her students measure his/her hand span and
then try to grab as many Tootsie Pops as possible from a large bowl.
The goal: predict the number of Tootsie Pops a person can grab, based
upon his/her hand span. You can use the data from Mary's class: http://inspire.stat.ucla.edu/unit_14/tootsiepops.txt (a
tab-delimited text file). Or collect your own data! Please discuss
your findings on the discussion board (if your instructor is using
one).
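If you plan to work with the tab-delimited file in code, the following is a minimal sketch of one way to read it in Python. It assumes the file has a header row and that every column is numeric. The column names `handspan` and `pops` and the two data rows are hypothetical placeholders, not values from Mary's class, so check the actual header of the file before relying on them.

```python
import csv
import io


def read_tab_delimited(text):
    """Parse tab-delimited text with a header row into a dict of float columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            columns[name].append(float(row[name]))
    return columns


# Placeholder rows for illustration only; these are NOT Mary's class data,
# and the column names are assumed rather than taken from the real file.
sample_text = "handspan\tpops\n20.0\t15\n22.5\t18\n"
data = read_tab_delimited(sample_text)
print(data)
```

To use it on the downloaded file, read the file's contents into a string (for example with `open(path).read()`) and pass that string to `read_tab_delimited`.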
Part I: Descriptive Statistics (review of previous material)
(a) We want to use hand span to predict the number of Tootsie Pops a
person can pick up. Which is the explanatory variable, and which is the
response variable?
(b) Create a scatter plot of the data. Describe all the features you
see.
(c) Compute and interpret r and r^2 for this data.
(d) Compute the least squares regression line for this data. What are
the meanings of the intercept and the slope in this context? Do they
make sense?
(e) Make a residual plot. Identify and discuss any outliers or
influential points.
(f) Predict the number of Tootsie Pops picked up by someone with a hand
span of 22 cm and someone with a hand span of 27 cm. Which prediction
do you feel is more reliable, and why?
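The computations in parts (c) through (f) can be sketched in plain Python as follows. This is one possible approach rather than a required method. The function names are hypothetical, and the five (x, y) pairs are placeholder numbers chosen to make the arithmetic easy to follow, not hand-span data. Substitute the class data or your own.

```python
import math


def fit_line(x, y):
    """Return (slope, intercept, r) for the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def predict(x_new, slope, intercept):
    """Predicted response at x_new from the fitted line."""
    return intercept + slope * x_new


def residuals(x, y, slope, intercept):
    """Observed minus predicted response for each data point."""
    return [yi - predict(xi, slope, intercept) for xi, yi in zip(x, y)]


# Placeholder data for illustration only (NOT the Tootsie Pop data).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept, r = fit_line(x, y)
resid = residuals(x, y, slope, intercept)
print("slope:", slope, "intercept:", intercept)
print("r:", r, "r^2:", r ** 2)
print("residuals:", resid)
```

Plotting `resid` against `x` in any graphing tool gives the residual plot for part (e), and `predict` can be called with the hand spans from part (f) once the line has been fitted to the real data.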
Part II: Statistical Inference
We now want to determine whether there exists a true linear
relationship between hand span and the number of Tootsie Pops a person
can pick up.
(g) What is the relevant parameter?
(h) State the appropriate null and alternative hypotheses.
(i) What conditions must be satisfied to validly conduct this
hypothesis test? (Be aware that you can't check all of the assumptions
using what you've learned so far.)
(j) Look at the residual plot again. Does this plot indicate one or
more of our conditions is satisfied or violated?
(k) Create a normal probability plot of the residuals. Does this plot
indicate one of our conditions is satisfied or violated?
(l) Test your hypotheses at the 5% significance level. Be sure to
include your test statistic (with d.f.) and the p-value of your test.
(m) What is your conclusion? Be specific, and be sure your conclusion is
in the context of the problem.
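For the test in part (l), the test statistic for the slope is t = b1 / SE(b1) with n - 2 degrees of freedom, where SE(b1) = s / sqrt(Sxx) and s is the standard deviation of the residuals. The sketch below computes these quantities in plain Python and approximates the two-sided p-value by numerically integrating the t density with Simpson's rule. The function names are hypothetical, the numerical integration stands in for a t table or statistical software, and the five data pairs are placeholders rather than the class data.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, integrating the density over [0, |t|] by Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


def slope_t_test(x, y):
    """Return (t, df, p) for H0: true slope = 0 versus Ha: true slope != 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    s = math.sqrt(sse / df)          # residual standard deviation
    se_slope = s / math.sqrt(sxx)    # standard error of the slope
    t = slope / se_slope
    return t, df, two_sided_p(t, df)


# Placeholder data for illustration only (NOT the Tootsie Pop data).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

t_stat, df, p_value = slope_t_test(x, y)
print("t =", t_stat, "df =", df, "p-value =", p_value)
```

Compare the resulting p-value with 0.05 to decide whether to reject the null hypothesis. The conditions from parts (i) through (k) still need to be checked separately, since the code does not verify them.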