Regression Revisited

 

   

 Data Collection and Analysis

This analysis project condenses an in-class activity created by Mary Mortlock and Matt Carlton. For the full version of the activity (and other similar activities), check out http://statweb.calpoly.edu/carltonm/food/index.html.

Recently, Mary had each of her students measure his/her hand span and then try to grab as many Tootsie Pops as possible from a large bowl. The goal: predict the number of Tootsie Pops a person can grab, based on his/her hand span. You can use the data from Mary's class: http://inspire.stat.ucla.edu/unit_14/tootsiepops.txt (a tab-delimited text file). Or, collect your own data! Please discuss your findings on the discussion board (if your instructor is using one).


Part I: Descriptive Statistics (review of previous material)


(a) We want to use hand span to predict the number of Tootsie Pops a person can pick up. Which is the explanatory variable, and which is the response variable?
(b) Create a scatter plot of the data. Describe all the features you see.
(c) Compute and interpret r and r² for this data.
(d) Compute the least squares regression line for this data. What are the meanings of the intercept and the slope in this context? Do they make sense?
(e) Make a residual plot. Identify and discuss any outliers or influential points.
(f) Predict the number of Tootsie Pops picked up by someone with a hand span of 22 cm and someone with a hand span of 27 cm. Which prediction do you feel is more reliable, and why?
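
If you would rather compute the Part I quantities by hand than with a statistics package, the sketch below shows one way to do it in plain Python. It is only an illustration: the four data points are made up for the example and are NOT Mary's class data, and the names fit_line, hand_span_cm, and pops are hypothetical. Substitute the values from the text file (or your own measurements) before interpreting anything.

```python
import math

def fit_line(x, y):
    """Return (intercept, slope, r) for the least squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

# Made-up illustration values, NOT the class data
hand_span_cm = [18.0, 20.0, 22.0, 24.0]
pops = [12, 15, 19, 22]

intercept, slope, r = fit_line(hand_span_cm, pops)
r_squared = r ** 2

# Residual = observed count minus the count predicted by the line
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(hand_span_cm, pops)]

# Predicted count for a hand span of 22 cm
prediction_22 = intercept + slope * 22

print(f"line: pops = {intercept:.2f} + {slope:.2f} * span")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print("residuals:", [round(e, 2) for e in residuals])
print(f"prediction at 22 cm: {prediction_22:.1f}")
```

Plotting the residuals against hand span (with any plotting tool you like) gives the residual plot asked for in part (e).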


Part II: Statistical Inference


We now want to determine whether there exists a true linear relationship between hand span and the number of Tootsie Pops a person can pick up.
(g) What is the relevant parameter?
(h) State the appropriate null and alternative hypotheses.
(i) What conditions must be satisfied to validly conduct this hypothesis test? (Be aware that you can't check all of the assumptions using what you've learned so far.)
(j) Look at the residual plot again. Does this plot indicate one or more of our conditions is satisfied or violated?
(k) Create a normal probability plot of the residuals. Does this plot indicate one of our conditions is satisfied or violated?
(l) Test your hypotheses at the 5% significance level. Be sure to include your test statistic (with d.f.) and the p-value of your test.
(m) What is your conclusion? Be specific and be sure your conclusion is in the context of the problem.
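
For part (l), the usual test statistic for a regression slope is the estimated slope divided by its standard error, with n - 2 degrees of freedom. The sketch below works through that arithmetic in plain Python, assuming a two-sided alternative. The p-value is approximated by numerically integrating the t density with Simpson's rule, which is a stand-in for the table or software you would normally use. The data are the same made-up values as before (NOT the class data), and slope_t_test, t_pdf, and two_sided_p are hypothetical helper names.

```python
import math

def slope_t_test(x, y):
    """Return (t statistic, degrees of freedom) for testing a zero slope."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals about the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    # Standard error of the slope: sqrt( (SSE / df) / Sxx )
    se_slope = math.sqrt(sse / df / sxx)
    return slope / se_slope, df

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + u * u / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) with Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # P(-|t| < T < |t|), using symmetry
    return max(0.0, 1.0 - central_area)

# Made-up illustration values, NOT the class data
hand_span_cm = [18.0, 20.0, 22.0, 24.0]
pops = [12, 15, 19, 22]

t_stat, df = slope_t_test(hand_span_cm, pops)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.2f} with {df} d.f., two-sided p-value = {p_value:.4f}")
```

The arithmetic alone does not validate the test: the conditions from parts (i) through (k) still need to be checked with the residual plot and the normal probability plot.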