Answers: Confidence Intervals Part
1. A student takes a random sample of 49 books from the
library and finds that the average number of pages per book
is 270 with a standard deviation of 35.
- a) Construct a 90% confidence interval for the true mean
number of pages.
- b) Construct a 95% confidence interval for the true mean
number of pages.
- c) If the student wanted a 95% confidence interval that
was only five pages wide, approximately how big would the
sample size have to be? (Comment: it must be an integer.
You cannot take a sample of 6.345 items.)
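a) Going to the t-tables with 48 degrees of freedom and 90%
confidence gives a t-value of 1.677. The standard error is
35/√49 = 5, so the interval is 270 ± 1.677*5 or 270 ± 8.39.
b) The t-value is 2.011, so the interval is 270 ± 2.011*5
or 270 ± 10.06.
c) Using 2 as the approximate 95% multiplier and treating
five pages as the margin of error: 2*35/√n = 5; 5√n = 70;
√n = 14; n = 196. (If five pages is read as the full width
of the interval, the margin is 2.5 and n = 784.)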
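The arithmetic in parts a) and b) can be checked with a short Python sketch. The function name `t_interval` is hypothetical, and the critical values (1.677 and 2.011 for 48 degrees of freedom) are typed in from a t-table rather than computed, because the Python standard library has no t distribution.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Return (low, high) for mean ± t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Table values for 48 degrees of freedom (assumed lookups, not computed here).
low90, high90 = t_interval(270, 35, 49, 1.677)
low95, high95 = t_interval(270, 35, 49, 2.011)
print(f"90%: {low90:.2f} to {high90:.2f}")
print(f"95%: {low95:.2f} to {high95:.2f}")
```

The same function covers problems 2, 3, and 8 if the matching table value is supplied.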
2. A student wants to estimate the average age of books
in the library. She goes through the stacks, pulling off a
mix of books that she thinks are representative of the
entire collection. Obtaining 64 books, she finds the average
age of her sample is 29 years and the standard deviation is
16 years.
- a) Are there problems with the way she has taken the
sample? Explain carefully.
- b) If she tries to construct a 95% confidence
interval, what should she get?
- c) If she tries to construct a 99% confidence
interval, what should she get?
- d) Should she be 95% confident of the interval she
found in part b? Explain carefully.
a) Yes. Humans do not make good random generators. What they
think is random is often not.
b) With 63 degrees of freedom, the t-table gives a value of
1.998 for a 95% confidence interval. Hence, the interval
would be 29 ± 1.998*16/8, or roughly 29 ± 4.
c) The t-value for 99% confidence with 63 degrees of freedom
is 2.656. Hence, the interval is 29 ± 2.656*2 or 29 ± 5.3.
d) No. The method of constructing confidence intervals
requires that the sample be drawn randomly. With a properly
drawn sample, the only reason that the sample mean will
differ from the true mean is random chance. With a poorly
drawn sample there can be other reasons--various biases.
Since those are rarely understood, no one can tell how
accurate or inaccurate the estimate is.
3. A random sample of 100 books is selected from the
library. The average age of the books is 24.3 years and the
standard deviation of the sample is 16 years.
- a) Compute a 95% confidence interval for the true age
of the books in the library.
- b) Compute a 99% confidence interval for the true age
of the books in the library.
- c) A second random sample of 100 books is selected
and 95% and 99% confidence intervals are constructed.
True or false and explain: the second sample will arrive
at exactly the same confidence interval as the first
sample.
a) Using 99 degrees of
freedom, the t-table gives us a value of 1.984. The interval
is thus 24.3 ± 1.984*16/10 or 24.3 ± 3.2.
b) The t-value is 2.626, so the interval is 24.3 ±
2.626*1.6 or 24.3 ± 4.2.
c) False. The sample mean will vary from sample to sample. The fact
that it varies is the reason that we make the estimate as an
interval. The method of constructing 95% confidence
intervals will give us intervals that contain the true mean
95% of the time.
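A small simulation can illustrate why two samples rarely give the same interval. This is only a sketch: the population is assumed to be Normal with mean 24.3 and standard deviation 16 (an illustrative choice that even allows negative ages), the helper name `sample_interval` is hypothetical, and 1.984 is the table value for 99 degrees of freedom.

```python
import math
import random
import statistics

def sample_interval(rng, n=100, t_crit=1.984, mu=24.3, sigma=16):
    """Draw n ages from a hypothetical Normal(mu, sigma) library; return the 95% interval."""
    ages = [rng.gauss(mu, sigma) for _ in range(n)]
    mean = statistics.mean(ages)
    margin = t_crit * statistics.stdev(ages) / math.sqrt(n)
    return mean - margin, mean + margin

rng = random.Random(0)  # fixed seed so the run is repeatable
first = sample_interval(rng)
second = sample_interval(rng)
print(f"first sample:  {first[0]:.1f} to {first[1]:.1f}")
print(f"second sample: {second[0]:.1f} to {second[1]:.1f}")
```

Both the center and the width change from sample to sample, because the sample mean and the sample standard deviation are both estimates.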
4. A test of 25 felt-tip pens shows that they can write
for an average of 11,291 feet with a standard deviation of
500 feet.
- a) Construct a 95% confidence interval for the
population mean using the normal table.
- b) Construct a 95% confidence interval for the population
mean using the t table.
- c) Why do your results differ in part a and part b, and
which is the more accurate? Explain clearly.
- d) Approximately how much confidence would you have in
the interval 11,291 ± 100?
- e) How large must the sample be if we want a precision of
± 100 with 95% confidence?
a) The standard error is 500/√25 = 100, so the interval is
11291 ± 1.96*100 or 11291 ± 196.
b) 11291 ± 2.064*100 or 11291 ± 206.
c) Using the Normal tables would be correct if in some way
we knew what the standard deviation of the population was.
Usually we do not but estimate it from the sample. This
estimation is usually quite good when the sample size is
big, so the t-tables get closer and closer to the Normal
tables as the degrees of freedom go up. But with small
samples the estimated standard deviation may be off. The
t-tables compensate for the additional source of
uncertainty, so the interval in part b is the more accurate.
d) An interval of ± 100 implies the z-value is 1. On
the Normal tables about 68% of the time the sample mean will
be within one standard error of the true mean, so we would
have about 68% confidence.
e) 100 = 2*500/√n; 100√n = 1000; √n = 10; n = 100.
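The lookup in part d) can be reproduced with `statistics.NormalDist` from the Python standard library. This is a minimal sketch; the function name `confidence_for_margin` is hypothetical, and it uses the Normal curve, as the answer to part d) does.

```python
from statistics import NormalDist

def confidence_for_margin(margin, std_error):
    """Normal-curve probability that the sample mean is within ±margin of the true mean."""
    z = margin / std_error
    return 2 * NormalDist().cdf(z) - 1

std_error = 500 / 25 ** 0.5   # 500 / 5 = 100 feet
conf = confidence_for_margin(100, std_error)
print(f"{conf:.1%}")
```

A margin of one standard error gives roughly 68%, which matches the figure read from the Normal tables.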
5. In a random sample of 225 students from a very large
university, only 20% were able to identify Denmark on a map
of Europe. Construct a 95% confidence interval for the
percentage of all students at this university who can
identify Denmark on the map.
The standard deviation of a
proportion is √(pq), where p is the proportion of
successes and q is the proportion of non-successes. Here, √(.2*.8)
= .4. For the sample proportion, which is just a form of mean,
the standard error will be .4/√225 = .4/15 = .027, or 2.7%.
Using the Normal tables, we get an interval of 20% ±
1.96*2.7% or 20% ± 5%.
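The same calculation as a Python sketch, using the Normal approximation described above. The function name `proportion_interval` is hypothetical; the critical value comes from `NormalDist().inv_cdf` instead of a printed table.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence=0.95):
    """Normal-approximation interval for a proportion: p ± z * sqrt(p*q/n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * std_error, p_hat + z * std_error

low, high = proportion_interval(0.20, 225)
print(f"{low:.1%} to {high:.1%}")
```

The result is close to 15% to 25%, in line with 20% ± 5%.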
6. Below is sample data from a statistical computer
Std. Deviation: 17.8126
Std. Error Mean: 2.2266
- a) How big is the sample size? How big is the
population?
- b) How is the standard error of the mean computed?
- c) What is a 95% confidence interval for the true
mean?
- d) What is a 99% confidence interval for the true
mean?
a) Sample size is 64; population size is unknown.
b) 2.2266 = 17.8126/8; 8 is the square root of 64, the
sample size.
c) It would be approximately 118.89 ± 2*2.2266, or
118.89 ± 4.45.
d) It would be approximately 118.89 ± 2.656*2.2266, using the
t-value for 63 degrees of freedom, or roughly 118.89 ± 5.9.
7. Suppose that a preliminary sample of 50 items yields a
sample mean price of $60 with a standard deviation of $8.
Approximately how big would the sample size have to be so
that a 95% confidence interval would be the sample mean plus
or minus $1?
2*8/√n = 1; √n =
16; n = 256.
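The sample-size step generalizes to n = (multiplier × sd / margin)², rounded up to a whole number. A minimal Python sketch follows; the function name `required_sample_size` is hypothetical, and the default multiplier of 2 mirrors the rounding of 1.96 used in the answer.

```python
import math

def required_sample_size(sd, margin, crit=2.0):
    """Smallest whole n with crit * sd / sqrt(n) <= margin."""
    return math.ceil((crit * sd / margin) ** 2)

n_rounded = required_sample_size(8, 1)             # multiplier rounded to 2
n_exact_z = required_sample_size(8, 1, crit=1.96)  # Normal-table multiplier
print(n_rounded, n_exact_z)
```

Rounding up, never down, keeps the margin at or below the target; using 1.96 instead of 2 lowers the answer only slightly.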
8. A student is interested in the amount of hamburger he
gets at a local fast-food eatery. Over a period of two weeks
he obtains a sample of 16 hamburgers and finds that the mean
weight of a hamburger patty is 2.7 ounces with a standard
deviation of .25 ounces.
- a) What is a 95% confidence interval for the true mean
weight of a hamburger patty?
- b) Why should you use the Student's t distribution in
this problem instead of the Normal curve?
a) The standard error is .25/√16 = .0625, so the interval is
2.7 ± 2.131*.0625 (t-value for 95% confidence with 15
degrees of freedom), or 2.7 ± .133.
b) When the sample size is very small and the standard
deviation is estimated from the sample, we introduce
additional uncertainty into the interval, and the Student's t
distribution compensates for this uncertainty. It is worth
noting that technically we also require that the original
population be Normal or close to Normal, a condition that is
often not met in the real world. So if the distribution of
weights is highly asymmetric, this procedure will not
produce accurate confidence intervals.
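A simulation can show what the t-value buys with a sample of 16. The sketch below assumes a Normal population with mean 2.7 and standard deviation .25 (illustrative values borrowed from the sample, not known facts about the eatery) and compares how often intervals built with 1.96 and with 2.131 contain the true mean. The function name `coverage` is hypothetical.

```python
import math
import random
import statistics

def coverage(n, crit, trials, rng, mu=2.7, sigma=0.25):
    """Fraction of simulated samples whose interval mean ± crit*s/sqrt(n) contains mu."""
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        margin = crit * statistics.stdev(sample) / math.sqrt(n)
        hits += abs(statistics.mean(sample) - mu) <= margin
    return hits / trials

trials = 5000
# The same seed gives both runs identical samples, so only the multiplier differs.
z_cov = coverage(16, 1.96, trials, random.Random(1))
t_cov = coverage(16, 2.131, trials, random.Random(1))
print(f"z = 1.96:  {z_cov:.3f}")
print(f"t = 2.131: {t_cov:.3f}")
```

Under these assumptions the Normal multiplier tends to cover the true mean less than 95% of the time, while the t multiplier comes out close to 95%.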
9. A large lecture class has 210 students. On the last
lecture of the year, a survey is given to all the students
in attendance to see how many read the books. On one book,
112 of the 140 students who handed in the survey indicated
that they read the book and 28 indicated that they did not.
The professor in charge of the class wants a 95% confidence
interval telling how many of the whole class, not just the
sample of 140 students, read the book. Compute the 95%
confidence interval for him, or if it is not possible,
explain why it is not possible. (Hint: is this likely to
satisfy the criteria for a random sample?)
Samples in which people self-select
are not random. The students who did not fill out the
survey probably were not in class, and students who were not
in class may have very different reading habits than those
who do attend. A less important issue here is that the
sample is large relative to the population. However, there
is a way to correct for this, though we are not going to
worry about it here. (The effect of the correction is to
make the interval smaller, which is a good thing.)
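The answer does not name the correction. Assuming it means the standard finite population correction, which multiplies the standard error by √((N − n)/(N − 1)), a sketch of its size for this class follows. The function name `fpc` is hypothetical. The correction does nothing about the self-selection problem, so the numbers only show how much the interval would shrink if the sample had been random.

```python
import math

def fpc(population_size, sample_size):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

p_hat = 112 / 140                                  # 0.8 of respondents read the book
std_error = math.sqrt(p_hat * (1 - p_hat) / 140)   # uncorrected standard error
factor = fpc(210, 140)
print(f"uncorrected SE: {std_error:.4f}")
print(f"corrected SE:   {std_error * factor:.4f} (factor {factor:.3f})")
```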
10. You have a box with 100 marbles. Ninety-five are
green and five are red. You pick a marble out of the box but
do not look at it. Would it be better to say, "The
probability that the marble in my hand is green is 95%," or,
"I am 95% confident that the marble in my hand is green"?
The marble that you picked is
either red or green--once picked, there is no longer any
probability involved even if you do not know what the result
is. It is more accurate to say that you are 95% confident.
To use a somewhat different
example, suppose a marksman hits the target 95% of the time.
He takes his shot but neither you nor he has seen the
result. Would you and he say that the probability that he
hit the target is 95% or that you were 95% confident that he
had hit the target?