Regression Revisited
Activity

This activity will (hopefully) reinforce the idea of slope and intercept as statistics, each varying from sample to sample. We will use Beth Chance's regression sampling applet:

http://statweb.calpoly.edu/chance/applets/regcoeff/regcoeff.html

The applet allows you to select the population slope and intercept, which in turn determine the population regression line (in yellow). You can also choose a mean and standard deviation for the x-values, and finally the population standard deviation for the responses about the regression line. For now, let's all be consistent:

• Set the population slope to 1.5 and the population intercept to 2. Keep all other values the same.
• Click the Set Population button to create your new
population of data. (Note: If you would like to see more of the graph,
you can change the window frame using the gray boxes along the 4 sides
of the graph.)

At the bottom of the page, you should see the equation y = 1.50x + 2. That is the population equation. We will now sample from the population displayed on the graph (the blue dots).

• Hit the Draw Samples button once. The applet randomly
selects a sample of points (in red; n = 80 is the default). Then the
applet calculates the least squares regression line for those n points
and graphs that line in red. The equation of the line appears at the
bottom of the page. Is it exactly the same as our population equation?
Do the graphs line up exactly? Why not?
• Hit the Draw Samples button a few more times, just to see
how the samples, and hence the resulting least squares regression
lines, differ from sample to sample. This is an illustration of
sampling variability.
• Change the "num samples" from 1 to 100, then click Draw
Samples. The applet will superimpose all 100 sample least squares lines
onto the graph (the "wave" of red) and launch a window with dot plots
of the sampling distributions of the slope and intercept. Focus on the
slopes: what do you see? Is the center of the dot plot reasonably close
to 1.5? Do you notice a shape forming?
• Before you close the dot plot window, write down the standard deviation of the slopes.
• Hit the Reset button.
• Change the value of sigma from 0.45 (the default) to 2.45,
and click Set Population. What do you notice happened to the population
graph? Remember, sigma is the standard deviation of the y-values about
the regression line.
• Once again, take 100 samples of size n = 80 and look at the
sampling distribution of the slopes. What happened to their spread?
(That is, did the standard deviation of the slopes increase or
decrease, compared to the value you noted earlier?) Is this what you
would expect to happen for a larger value of sigma?
• Change sigma back to 0.45 (the default) and click Set
Population. For our next illustration, change the sample size from 80
to 20. Again, take 100 samples and look at the resulting slopes. What
happened to the spread of the slopes this time? Is this what you would
expect to happen for a smaller sample size?
• Finally, change the sample size back to 80 (the default). How do the x-values play a role? To find out, change "x std" (the standard deviation of the x-values) from 1.84 to 4.84, then take 100 samples again. What happened to the spread of the slopes? Does this result surprise you?
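If the applet is unavailable, the experiment above can be roughly mimicked offline. The following is a minimal sketch, not the applet's actual code. It assumes normally distributed x-values and errors, and it draws each sample fresh from the model rather than from a fixed population of blue dots. The mean of the x-values (x_mean = 10.0) is a made-up placeholder, since the page does not state the applet's default; the function names least_squares and slope_sd are likewise hypothetical.

```python
import random
import statistics


def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def slope_sd(slope=1.5, intercept=2.0, x_mean=10.0, x_std=1.84,
             sigma=0.45, n=80, num_samples=100, seed=0):
    """Standard deviation of the fitted slopes across repeated samples.

    x_mean is an assumed value; the normal distributions for x and for
    the errors are also assumptions, not documented applet behavior.
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(num_samples):
        # Draw n x-values, then responses scattered about the population line.
        xs = [rng.gauss(x_mean, x_std) for _ in range(n)]
        ys = [intercept + slope * x + rng.gauss(0.0, sigma) for x in xs]
        slopes.append(least_squares(xs, ys)[0])
    return statistics.stdev(slopes)


if __name__ == "__main__":
    # One line per setting used in the activity.
    print("baseline (sigma=0.45, n=80, x std=1.84):", slope_sd())
    print("larger sigma (2.45):                    ", slope_sd(sigma=2.45))
    print("smaller sample size (n=20):             ", slope_sd(n=20))
    print("larger x std (4.84):                    ", slope_sd(x_std=4.84))
```

Comparing the four printed standard deviations against the values noted from the applet's dot plots is a quick check on the questions asked in the steps above. The exact numbers will differ from the applet's, since the random draws and the assumed x_mean differ.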
To summarize what you should have observed:
1) The larger the standard deviation of the responses about the line, the more widely varying our estimates of the slope will be.
2) The variability of the sampling distribution of the slopes is larger for smaller sample sizes.
3) Slopes across different samples are less variable when the x-values are more variable.
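All three observations are consistent with the usual formula for the standard deviation of the least squares slope in the simple linear regression model. The notation here is introduced for illustration and does not appear in the applet: b_1 is the sample slope, sigma is the standard deviation of the responses about the line, s_x is the sample standard deviation of the x-values, and n is the sample size. Treating the sampled x-values as fixed:

```latex
\[
  \mathrm{SD}(b_1)
  = \frac{\sigma}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}
  = \frac{\sigma}{s_x \sqrt{n - 1}}
\]
```

A larger sigma increases the numerator, while a larger n or a larger s_x increases the denominator. As a rough check, taking s_x to be close to the "x std" setting of 1.84 with sigma = 0.45 and n = 80 gives 0.45 / (1.84 × √79) ≈ 0.028, which should be in the neighborhood of the standard deviation noted from the first dot plot.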