Main Concepts: The Normal Distribution



Main Concepts

Everybody believes in the normal approximation, the experimenters because they think it is a mathematical theorem, the mathematicians because
they think it is an experimental fact!
G. Lippman, quoted in D'Arcy Thompson's On Growth and Form.

• The normal distribution is a mathematical model that idealizes distributions of variables that are symmetric and unimodal. Keep in mind, however, that other distributions can also model symmetric, unimodal data.

• The normal distribution is merely one of many distributions, all of which are used as idealized summaries of distributions in populations. The area
under a distribution curve between any two values represents the proportion of the population that lies between those two values.

• The standard unit is a useful and fundamental way of comparing observations from two different populations. In essence, we use the standard
deviation as a basic unit of measurement that replaces the natural units a variable was first measured in: an observation's value in standard units is the number of standard deviations it lies above or below the mean.

• Empirical Rule: In a normal curve, about 68% of the data are within one standard deviation of the mean, about 95% are within two standard deviations, and about 99.7%
(almost all) are within three standard deviations. This turns out to be approximately true for many symmetric distributions.

• Fewer populations than you may think, or than your book may suggest, actually have distributions that are well approximated by the normal
distribution. The normal distribution is nevertheless of fundamental importance for statistical analyses because of a result called the Central Limit Theorem, which we
discuss in a later unit.
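
The Empirical Rule and the idea of standard units from the list above can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist` class; the helper names `share_within` and `to_standard_units`, and the two exam scores, are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

# The standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Empirical Rule: areas within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")


def to_standard_units(value, mean, sd):
    """Express a value as the number of standard deviations from the mean."""
    return (value - mean) / sd


# Hypothetical scores from two exams measured on different scales.
z_a = to_standard_units(85, mean=70, sd=10)     # 1.5 SDs above the mean
z_b = to_standard_units(620, mean=500, sd=100)  # 1.2 SDs above the mean
print(z_a, z_b)
```

The printed areas round to the familiar 68%, 95%, and 99.7%. In standard units the first score is the more unusual of the two, even though the raw numbers cannot be compared directly.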

Tie-in to Probability

We're currently using the normal curve as a descriptive summary of a set of data or of a population. For example, by saying
"heights follow a normal distribution with mean 67 inches and standard deviation 3 inches" we are giving quite a bit
of information with very few words. In particular, you can now say approximately what percent of the population or data
is between any two values, say 65 and 68 inches, by finding the appropriate area under the corresponding normal
curve (37.8%, in fact). Note that we are making a statement of fact; if the model is a good fit to the population, then
we are claiming that 37.8% of the population falls between these heights.
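
The 37.8% figure can be reproduced as the difference of two cumulative areas. This is a small sketch, again assuming Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Model from the text: heights with mean 67 inches and SD 3 inches.
heights = NormalDist(mu=67, sigma=3)

# Area under the curve between 65 and 68 inches:
# (area to the left of 68) minus (area to the left of 65).
share_65_to_68 = heights.cdf(68) - heights.cdf(65)

print(f"{share_65_to_68:.3f}")  # prints 0.378
```

The same area results from converting to standard units first: 65 and 68 inches are 2/3 of a standard deviation below the mean and 1/3 of a standard deviation above it, respectively.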

Later, we will talk about probabilities, and we will use the same mathematical function, the normal curve, to model probability
processes. In this context the same mathematical function serves a subtly different
purpose. Now we would say that "if we were to randomly select one person from this population, the
probability that he or she is between 65 and 68 inches is 0.378." This is not a description of the population, but
instead a prediction about what will happen when we carry out a particular action.
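
The probability reading can be illustrated by simulation: repeat the action "randomly select one person" many times and see how often the height lands between 65 and 68 inches. The sketch below assumes the normal model holds exactly and uses Python's standard-library `random` module; the seed and the number of draws are arbitrary choices, not part of the original discussion.

```python
import random
from statistics import NormalDist

rng = random.Random(0)  # fixed seed so the run is repeatable
n_draws = 100_000       # arbitrary number of simulated selections

# Each draw stands for one randomly selected person's height.
hits = sum(1 for _ in range(n_draws) if 65 <= rng.gauss(67, 3) <= 68)
relative_frequency = hits / n_draws

# Probability given by the normal model, for comparison.
model = NormalDist(mu=67, sigma=3)
model_probability = model.cdf(68) - model.cdf(65)

print(f"simulated: {relative_frequency:.3f}, model: {model_probability:.3f}")
```

With this many draws the relative frequency should land close to 0.378, though any single run will differ slightly from the model value.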

"Normality is a myth; there never was, and never will be, a normal distribution." R.C. Geary, "Testing for Normality", Biometrika, Volume 34, 1947 (p. 241)