Exploring Data



 Teaching Tips

    •A fundamental difficulty in this unit is interpreting a graph and then explaining in writing what you've seen at an appropriately sophisticated level.

    •Interpretation and communication are the focus. Students need to know what to ask the software for, but not how the software creates it.

    •When examining the shape of a distribution, we look for the general pattern, but also for exceptions to the pattern. Exceptions include outliers, potential outliers, gaps, or anything that piques interest. For example, self-reported weights often come in multiples of 5, and this can sometimes be detected in a graphic.

    •Identifying outliers is tricky and confusing. There is no technical definition of an outlier; it is instead a "term of art," meant to help us label values that are exceptional with respect to the bulk of the data. There are several techniques for identifying potential outliers, the most common being the "1.5 IQR rule," which flags values more than 1.5 times the interquartile range below the first quartile or above the third quartile. An outlier is not a mathematically defined quantity and books will differ, so it's important not to get hung up on a particular definition.

    • Students will be tempted to remove outliers. Don't let them do this. Outliers should be investigated. Sometimes, the story is the outlier.

    • There are many "fuzzy" terms in Statistics (as you may remember from the conversation in July, Statistics is not Math). Symmetric sample distributions are rarely precisely symmetric. "Spread" is another term that is used colloquially.

    • Be aware that students will have misconceptions based on a clash between their informal definitions and statisticians' formal definitions. Here are some difficult words for which you should make sure the students know what a statistician means: random, independent, expected, "on average", normal.

    •There are (at least) two types of means: sample means and population means. Some books are careful about calling the sample mean the sample mean, and others just call it a mean. Later it will be very important to distinguish sample and population means.

    •n-1 in the standard deviation formula: Tell your students that we can't adequately explain why the divisor is n-1 and not n until later. They should take it as a definition for now; the reasons will be explained when we talk about "unbiasedness."

    • One of the conceptual shifts we're expecting from the students is to stop thinking in terms of individuals and exceptional values, and instead think in terms of groups and general trends. For this reason, when comparing two groups, students should focus on the centers (the "typical"), the spread (how much variety there is within a group), and the general shape (are there exceptional responses, or differences in the overall shapes of the two groups?).

    •Students should focus on the center, spread, and shape of a distribution when comparing groups or interpreting distributions.

    • Insist that your students put the graphs on the same scale, or they may make erroneous comparisons.

    •Be precise with descriptions of data: "Men make more money than women" is too vague to be meaningful. Some women make more than some men, and some women make a LOT more than 95% of the men. What might be true is that the mean/median income for men is higher than the mean/median for women.

    •There are certain "reserved letters" in Statistics that (almost) always represent the same concepts. For example, n is always sample size. X (upper case) is always a random variable. Lower case s is always the sample standard deviation. Lower case r is always sample correlation. You'll find more, and it's worth pointing these out to your students when they arise.
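
Two of the tips above, the "1.5 IQR rule" and the n-1 divisor in the standard deviation, can be made concrete with a small computation. The following is a minimal Python sketch, not a prescribed classroom method. The weights are invented for illustration, the function names `iqr_fences` and `potential_outliers` are hypothetical, and the quartiles use the default convention of Python's `statistics.quantiles`, which may differ slightly from the convention in your textbook or in Fathom.

```python
import statistics


def iqr_fences(values):
    """Return the (lower, upper) fences of the 1.5 IQR rule."""
    # quantiles(n=4) returns the three quartile cut points (Q1, median, Q3).
    # Assumption: Python's default quartile convention; books may differ.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def potential_outliers(values):
    """Return the values that fall outside the 1.5 IQR fences."""
    lower, upper = iqr_fences(values)
    return [v for v in values if v < lower or v > upper]


# Hypothetical self-reported weights in pounds (made up for this example).
weights = [120, 125, 130, 135, 140, 145, 150, 155, 160, 310]

flagged = potential_outliers(weights)
print("Potential outliers to investigate:", flagged)

# stdev divides by n - 1 (sample standard deviation, usually written s);
# pstdev divides by n (population standard deviation).
sample_sd = statistics.stdev(weights)
population_sd = statistics.pstdev(weights)
print("Divide by n - 1:", sample_sd)
print("Divide by n:    ", population_sd)
```

Values outside the fences are only flagged for investigation, in keeping with the advice above not to remove outliers. The last two lines of output show that dividing by n-1 gives a slightly larger standard deviation than dividing by n.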