Solutions to Practice Problems
1. Three things influence the margin of error in a
confidence interval estimate of a population mean: sample size,
variability in the population, and confidence level. For each of these
quantities separately, explain briefly what happens to the margin of
error as that quantity increases.
Answer: As sample size increases, the margin of
error decreases. As the variability in the population increases, the
margin of error increases. As the confidence level increases, the
margin of error increases. Incidentally, population variability is not
something we can usually control, but more meticulous collection of
data can reduce the variability in our measurements. The third of
these, the relationship between confidence level and margin of error,
seems contradictory to many students because they confuse
accuracy (confidence level) with precision (margin of error). If you
want to be surer of hitting a target with a spotlight, then you make
your spotlight bigger.
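The three relationships above can be checked numerically. The following is a minimal sketch, assuming the z-based margin of error z * sigma / sqrt(n) with a known population standard deviation; the function name margin_of_error and the example values (sigma = 10, n = 100) are illustrative and do not come from the problem.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """z-based margin of error for a mean, assuming sigma is known."""
    # Critical value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sigma / sqrt(n)


# Hypothetical baseline values, chosen only for illustration.
base = margin_of_error(sigma=10, n=100, confidence=0.95)
bigger_n = margin_of_error(sigma=10, n=400, confidence=0.95)
more_spread = margin_of_error(sigma=20, n=100, confidence=0.95)
more_confident = margin_of_error(sigma=10, n=100, confidence=0.99)

print("baseline:          ", round(base, 3))
print("larger sample:     ", round(bigger_n, 3))
print("more variability:  ", round(more_spread, 3))
print("higher confidence: ", round(more_confident, 3))
```

Changing one input at a time shows the direction of each effect: only the larger sample makes the margin of error smaller.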
2. A survey of 1000 Californians reports that 48% are
excited by the annual visit of INSPIRE participants to their fair
state. Construct a 95% confidence interval on the true proportion of
Californians who are excited to be visited by these Statistics
teachers.
Answer: We first check that the sample size
is large enough to apply the normal approximation. The true value of p
is unknown, so we can't check that np > 10 and n(1-p) > 10, but
we can check this for p-hat, our estimate of p: 1000*.48 = 480 > 10
and 1000*.52 = 520 > 10. This means the normal approximation will be good,
and we can apply it to calculate a confidence interval for p.
.48 +/- 1.96*sqrt(.48*.52/1000)
.48 +/- .03096552 (that mysterious 3% margin of error!)
(.45, .51) is a 95% CI for the true proportion of all
Californians who are excited about the Stats teachers' visit.
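The arithmetic above can be reproduced with a short script. This is a sketch under the same normal approximation; the helper name proportion_ci is hypothetical, and it uses the exact normal quantile (about 1.95996) rather than the rounded 1.96, so the last digits may differ slightly from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation interval for a proportion: (low, high, margin)."""
    # Large-sample check, applied to p-hat because p itself is unknown.
    if n * p_hat <= 10 or n * (1 - p_hat) <= 10:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


low, high, margin = proportion_ci(0.48, 1000)
print("margin of error:", round(margin, 4))
print("95% CI:", (round(low, 2), round(high, 2)))
```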
3. Since your interval contains values above 50%, and
therefore finds it plausible that more than half of the
state feels this way, there remains a big question mark in your mind.
Suppose you decide that you want to refine your estimate of the
population proportion and cut the width of your interval in half. Will
doubling your sample size do this? How large a sample will be needed to
cut your interval width in half? How large a sample will be needed to
shrink your interval to the point where 50% will not be included in a
95% confidence interval centered at the .48 point estimate?
Answer: The current interval width is about
6%. So the current margin of error is 3%. We want margin of error =
1.5% or
1.96*sqrt(.48*.52/n) = .015
Solve for n: n = (1.96/.015)^2 * .48*.52 = 4261.6
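We'd need at least 4262 people in the sample. So to cut the
width of the CI in half, we'd need about four times as many people.
Doubling the sample size is not enough: the margin of error shrinks in
proportion to 1/sqrt(n), so doubling n only divides it by sqrt(2),
leaving it at about 2.2%.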
Assuming that the point estimate stays at .48, how many people would
we need to make sure our CI doesn't include .50? This means the margin
of error must be less than 2%, so solving for n:
n = (1.96/.02)^2 *.48*.52 = 2397.1
We'd need about 2398 people.
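The sample-size calculations above can be sketched in code. This assumes, as the solution does, z = 1.96 and a point estimate that stays at .48; the function names required_n and margin are hypothetical.

```python
from math import ceil, sqrt

Z_95 = 1.96  # rounded critical value used in the hand calculation


def margin(p, n, z=Z_95):
    """Margin of error for a proportion under the normal approximation."""
    return z * sqrt(p * (1 - p) / n)


def required_n(p, target_margin, z=Z_95):
    """Smallest whole sample size whose margin of error is at most target_margin."""
    return ceil((z / target_margin) ** 2 * p * (1 - p))


n_half_width = required_n(0.48, 0.015)  # halve the original 3% margin
n_exclude_50 = required_n(0.48, 0.02)   # keep .50 outside the interval
ratio_when_doubled = margin(0.48, 2000) / margin(0.48, 1000)

print("n to halve the width:", n_half_width)
print("n to exclude .50:", n_exclude_50)
print("margin ratio after doubling n:", round(ratio_when_doubled, 3))
```

The last line illustrates why doubling the sample falls short: the ratio is 1/sqrt(2), roughly 0.707, rather than 0.5.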
4. A random sample of 67 lab rats is enticed to run through
a maze, and a 95% confidence interval is constructed for the mean time
it takes rats to run it. It is [2.3 min, 3.1 min]. Which of the following
statements is/are true? (More than one statement may be correct.)
(A) 95% of the lab rats in the sample ran the maze in between 2.3 and
3.1 minutes.
(B) 95% of the lab rats in the population would run the maze in between
2.3 and 3.1 minutes.
(C) There is a 95% probability that the sample mean time is between 2.3
and 3.1 minutes.
(D) There is a 95% probability that the population mean lies between
2.3 and 3.1 minutes.
(E) If I were to take many random samples of 67 lab rats and take
sample means of maze-running times, about 95% of the time, the sample
mean would be between 2.3 and 3.1 minutes.
(F) If I were to take many random samples of 67 lab rats and construct
confidence intervals of maze-running time, about 95% of the time, the
interval would contain the population mean. [2.3, 3.1] is one such
possible interval that I computed from the random sample I actually
observed.
(G) [2.3, 3.1] is the set of possible values of the population mean
maze-running time that are consistent with the observed data, where
“consistent” means that the observed sample mean falls in the middle
(“typical”) 95% of the sampling distribution for that parameter value.
Answer: F and G are both correct statements.
None of the others are correct.
If you said (A) or (B), remember that we are estimating a
mean, so the interval says nothing about the times of individual rats.
If you said (C), (D), or (E), remember that the interval
[2.3, 3.1] has already been calculated and is not random. The parameter
mu, while unknown, is not random. So no statements can be made about
the probability that mu does anything or that [2.3, 3.1] does anything.
The probability is associated with the random sampling, and thus the
process that produces a confidence interval, not with the resulting
interval.
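Statement (F) describes a long-run property of the procedure, which a simulation can illustrate. The sketch below draws many samples of 67 from a made-up population and counts how often the interval captures the true mean. The normal population with mean 2.7 and standard deviation 1.6 is purely an assumption for illustration (the problem gives neither), and the interval uses the z critical value with the sample standard deviation, so the observed coverage will be close to, but not exactly, 95%.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def coverage(true_mean, true_sd, n, trials, confidence=0.95, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        centre = mean(sample)
        half_width = z * stdev(sample) / sqrt(n)
        # The true mean is fixed; the interval is what varies between samples.
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / trials


# Hypothetical population values, not taken from the problem.
rate = coverage(true_mean=2.7, true_sd=1.6, n=67, trials=2000)
print("share of intervals containing the true mean:", rate)
```

Each simulated interval either contains 2.7 or it does not; the 95% figure emerges only across the repeated samples.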
5. Two students are doing a statistics project in which they
drop toy parachuting soldiers off a building and try to get them to
land in a hula-hoop target. They count the number of soldiers that
succeed and the number of drops total. In a report analyzing their
data, they write the following:
“We constructed a 95% confidence interval estimate of the proportion of
jumps in which the soldier landed in the target, and we got [0.50,
0.81]. We can be 95% confident that the soldiers landed in the target
between 50% and 81% of the time. Because the army desires an estimate
with greater precision than this (a narrower confidence interval) we
would like to repeat the study with a larger sample size, or repeat our
calculations with a higher confidence level.”
How many errors can you spot in the above paragraph?
Answer: There are three incorrect statements.
First, the first sentence should read “…the proportion of jumps in
which soldiers land in the target.” (We’re estimating a population
proportion.) Second, the second sentence also uses the past tense and
hence refers to the sample proportion rather than the population proportion. It
should read, “We can be 95% confident that soldiers land in the target
between 50% and 81% of the time.” (The difference is subtle but shows a
student misunderstanding.) And the third error is in the last sentence.
A higher confidence level would produce a wider interval, not a
narrower one.