Hypothesis Tests


Why do we need the t-distribution? This short activity uses simulations on the TI-83 calculator to show why we use a different distribution (t instead of z) when performing inference on a sample mean with an unknown population standard deviation (as is almost always the case in practice). It could also be done on the TI-89 calculator or in Fathom, though the instructions here are specific to the TI-83.

We will first simulate a sample of three women's heights, then compute and standardize the sample mean assuming the standard deviation is known. The following line simulates the sample of heights, taking the population mean to be 65 inches and the standard deviation to be 2.8 inches.

randNorm(65, 2.8, 3)

"randNorm" is found on the TI-83 under the math-->prb menu. You will need to scroll right after simulating the sample to see the entire list of three. If you press "enter" again and again, the same command is executed repeatedly, so you can easily simulate many samples of three.
To compute and standardize the sample mean, enter the following two commands, separated by a colon. The colon is the alpha function of the decimal key.

randNorm(65, 2.8, 3)-->L1:(mean(L1)-65)/(2.8/root(3))

The "-->" represents the "store as" function, located above the ON key, and root( ) denotes the calculator's square root function. mean( ) is found under the 2nd-list-math menu.
If you enter this command and then press enter several times, you will be simulating standardized sample means. We know from theory that the distribution of this statistic should be the standard normal, so you should be seeing numbers that are mostly between -2 and 2. It would be very unusual to see a value larger than 3 in magnitude.
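
For readers without a calculator at hand, the known-sigma simulation can be sketched in Python. This is only an illustration and not part of the original activity: the function names, the seed, and the choice of 20,000 repetitions are assumptions made here, and only the standard library is used.

```python
import math
import random
from statistics import NormalDist, mean

MU, SIGMA, N = 65.0, 2.8, 3  # population mean, known sigma, sample size from the activity

def z_statistic(rng):
    """Simulate one sample of heights and standardize its mean using the known sigma."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    return (mean(sample) - MU) / (SIGMA / math.sqrt(N))

def fraction_beyond(values, cutoff):
    """Fraction of values whose magnitude exceeds the cutoff."""
    return sum(abs(v) > cutoff for v in values) / len(values)

rng = random.Random(1)  # arbitrary seed, chosen only for reproducibility
z_values = [z_statistic(rng) for _ in range(20_000)]

print("simulated fraction with |z| > 2:", fraction_beyond(z_values, 2))
print("simulated fraction with |z| > 3:", fraction_beyond(z_values, 3))
print("standard normal P(|Z| > 3):     ", 2 * (1 - NormalDist().cdf(3)))
```

The simulated fraction beyond 3 should land close to the standard normal value in the last line, which is the computational counterpart of pressing "enter" many times and almost never seeing a value larger than 3 in magnitude.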

Now we will repeat the simulation using the sample standard deviation instead of 2.8.

randNorm(65, 2.8, 3)-->L1:(mean(L1)-65)/(stdDev(L1)/root(3))

The command stdDev( ) is found under the 2nd-list-math menu.

If you enter this command and then press enter several times, you will be simulating standardized sample means using the sample standard deviation. You should see a difference between this simulation's results and the last one: values larger than 3 in magnitude are far less unusual than before. This is the reason we have the t-distribution. With a sample of three, the statistic follows a t-distribution with 2 degrees of freedom, which has heavier tails than the standard normal and takes into account the extra variability that comes from not knowing sigma.
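
The same comparison can be sketched in Python as a rough check on what the calculator shows. The names, seed, and repetition count below are assumptions made for this illustration. The closed-form tail probability is the standard result for a t-distribution with 2 degrees of freedom, whose CDF is 1/2 + t/(2·sqrt(t² + 2)).

```python
import math
import random
from statistics import NormalDist, mean, stdev

MU, SIGMA, N = 65.0, 2.8, 3  # population values and sample size from the activity

def t_statistic(rng):
    """Simulate one sample and standardize its mean using the sample standard deviation."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    return (mean(sample) - MU) / (stdev(sample) / math.sqrt(N))

def t2_two_sided_tail(c):
    """Exact P(|T| > c) for a t-distribution with 2 degrees of freedom."""
    return 1 - c / math.sqrt(2 + c * c)

rng = random.Random(2)  # arbitrary seed, chosen only for reproducibility
t_values = [t_statistic(rng) for _ in range(20_000)]
simulated = sum(abs(t) > 3 for t in t_values) / len(t_values)

print("simulated fraction with |t| > 3:", simulated)
print("t with 2 df, P(|T| > 3):        ", t2_two_sided_tail(3))
print("standard normal, P(|Z| > 3):    ", 2 * (1 - NormalDist().cdf(3)))
```

Under these assumptions the simulated fraction should sit near the t value rather than the much smaller normal value, which is the heavier tail described above.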

If you do this in the classroom, it is interesting to have students say out loud any values they get that exceed 3 in magnitude. It will happen very infrequently with the first simulation, but quite frequently with the second.

It is interesting to stop when you get a very large value of the statistic and then go look at the contents of list L1. Generally, you will see three numbers that are all somewhat far from the mean of 65 inches but close to one another, producing a small sample standard deviation. The numerator of the statistic is fairly large and the denominator is small, hence the large t-statistic. Such an occurrence in real life with a real sample would be misleading: you would see little variability and would have a lot of confidence in your results, but in fact they would (unknown to you) be unusually far from the true mean. The t-distribution quantifies how often such atypical samples occur.
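
A minimal Python sketch of this "stop and look at the list" step follows. It keeps drawing samples of three until the statistic exceeds 3 in magnitude, then prints the sample so its spread can be compared with sigma = 2.8. The function name `draw_until_extreme`, the seed, and the retry limit are hypothetical choices made for this illustration.

```python
import math
import random
from statistics import mean, stdev

MU, SIGMA, N = 65.0, 2.8, 3  # population values and sample size from the activity

def draw_until_extreme(rng, cutoff=3.0, max_tries=10_000):
    """Draw samples until the t-statistic exceeds the cutoff in magnitude."""
    for _ in range(max_tries):
        sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
        s = stdev(sample)
        t = (mean(sample) - MU) / (s / math.sqrt(N))
        if abs(t) > cutoff:
            return sample, s, t
    raise RuntimeError("no extreme sample found within max_tries")

rng = random.Random(3)  # arbitrary seed, chosen only for reproducibility
sample, s, t = draw_until_extreme(rng)

print("sample:", [round(x, 2) for x in sample])
print("sample mean:", round(mean(sample), 2), "(population mean is 65)")
print("sample standard deviation:", round(s, 2), "(sigma is 2.8)")
print("t-statistic:", round(t, 2))
```

Running this with different seeds typically, though not always, shows a sample standard deviation well below 2.8 alongside a sample mean noticeably away from 65.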

You can also repeat the second simulation with a larger sample size than three (say, 10) using the following command:

randNorm(65, 2.8, 10)-->L1:(mean(L1)-65)/(stdDev(L1)/root(10))

This time the large values are relatively unlikely, because the sample standard deviation has less variability in it (and behaves more like the constant sigma) when the sample size is large. t-distributions with large degrees of freedom look more like the standard normal distribution.
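
The effect of sample size can also be sketched in Python by repeating the simulation for several values of n. The sample sizes 3 and 10 come from the activity; the extra size of 30, the seed, and the repetition count are assumptions added for illustration.

```python
import math
import random
from statistics import mean, stdev

MU, SIGMA = 65.0, 2.8  # population values used in the activity

def t_statistic(rng, n):
    """Standardize one simulated sample mean using the sample standard deviation."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
    return (mean(sample) - MU) / (stdev(sample) / math.sqrt(n))

def fraction_beyond_3(rng, n, reps=20_000):
    """Fraction of simulated statistics larger than 3 in magnitude."""
    return sum(abs(t_statistic(rng, n)) > 3 for _ in range(reps)) / reps

rng = random.Random(4)  # arbitrary seed, chosen only for reproducibility
fractions = {n: fraction_beyond_3(rng, n) for n in (3, 10, 30)}

for n, frac in fractions.items():
    print(f"n = {n:2d}: fraction with |t| > 3 = {frac:.4f}")
```

The fractions should shrink as n grows, moving toward the standard normal tail probability of roughly 0.003.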