Answers: Confidence Intervals

Answers: Confidence Intervals Part 1

1. A student takes a random sample of 49 books from the library and finds that the average number of pages per book is 270 with a standard deviation of 35.

a) Construct a 90 percent confidence interval for the true mean number of pages.
b) Construct a 95% confidence interval for the true mean.
c) If the student wanted a 95% confidence interval that was only five pages wide, approximately how big would the sample size have to be? (Comment: it must be an integer. You cannot take a sample of 6.345 items.)

a) The standard error is 35/√49 = 5. Going to the t-tables with 48 degrees of freedom and 90% confidence gives a t-value of 1.677, so the interval is 270 ± 1.677*5 or 270 ± 8.385.
b) The t-value is 2.011, so the interval is 270 ± 2.011*5 or 270 ± 10.06.
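c) Taking "five pages wide" to mean ± 5 pages and using 2 as the approximate multiplier for 95% confidence: 2*35/√n = 5; 5√n = 70; √n = 14; n = 196.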
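
The arithmetic for Problem 1 can be checked with a short Python sketch. The t-values are typed in from a t-table rather than computed, and the helper name margin_of_error is a hypothetical one chosen for illustration.

```python
import math

def margin_of_error(multiplier, sd, n):
    """Table multiplier times the standard error sd / sqrt(n)."""
    return multiplier * sd / math.sqrt(n)

n, mean, sd = 49, 270, 35

# t-values for 48 degrees of freedom, read from a t-table (assumed inputs)
t_90, t_95 = 1.677, 2.011

moe_90 = margin_of_error(t_90, sd, n)
moe_95 = margin_of_error(t_95, sd, n)
print(f"90% interval: {mean} ± {moe_90:.3f}")
print(f"95% interval: {mean} ± {moe_95:.3f}")

# Part c: solve 2 * sd / sqrt(n) = 5 for n, treating the target as ± 5 pages
# (as the worked answer does) and rounding up to a whole number of books.
required_n = math.ceil((2 * sd / 5) ** 2)
print("sample size for ± 5 pages:", required_n)
```

Rounding up with math.ceil reflects the comment in the question: a sample size must be an integer, and rounding down would give slightly less precision than requested.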

2. A student wants to estimate the average age of books in the library. She goes through the stacks, pulling off a mix of books that she thinks are representative of the entire collection. Obtaining 64 books, she finds the average age of her sample is 29 years and the standard deviation is 16 years.

a) Are there problems with the way she has taken the sample? Explain carefully.

b) If she tries to construct a 95% confidence interval, what should she get?

c) If she tries to construct a 99% confidence interval, what should she get?

d) Should she be 95% confident of the interval she found in part b? Explain carefully.

a) Yes. She chose the books by judgment rather than by a random process, and humans do not make good random generators. What they think is random or representative is often not.
b) With 63 degrees of freedom, the t-table gives a value of 1.998 for a 95% confidence interval. Hence, the interval would be 29 ± 1.998*16/8, or roughly 29 ± 4.
c) The t-value for 99% confidence with 63 degrees of freedom is 2.656. Hence, the interval is 29 ± 2.656*2 or 29 ± 5.3.
d) No. The method of constructing confidence intervals requires that the sample be drawn randomly. With a properly drawn sample, the only reason that the sample mean will differ from the true mean is random chance. With a poorly drawn sample there can be other reasons--various biases. Since those are rarely understood, no one can tell how accurate or inaccurate the estimate is.

3. A random sample of 100 books is selected from the library. The average age of the books is 24.3 years and the standard deviation of the sample is 16 years.

a) Compute a 95% confidence interval for the true age of the books in the library.

b) Compute a 99% confidence interval for the true age of the books in the library.

c) A second random sample of 100 books is selected and 95% and 99% confidence intervals are constructed. True or false and explain: the second sample will arrive at exactly the same confidence interval as the first.

a) Using 99 degrees of freedom, the t-table gives us a value of 1.984. The interval is thus 24.3 ± 1.984*16/10 or 24.3 ± 3.2.
b) The t-value is 2.626, so the interval is 24.3 ± 2.626*1.6 or 24.3 ± 4.2.
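c) False. The sample mean (and the sample standard deviation) will vary from sample to sample, so the second sample will almost certainly produce a different interval. The fact that the sample mean varies is the reason that we make the estimate as an interval. The method of constructing 95% confidence intervals will give us intervals that contain the true mean 95% of the time.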
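
A small simulation can illustrate the point in part c. This is only a sketch under stated assumptions: the population is taken to be Normal with a made-up true mean of 25 years and standard deviation of 16 years (neither value comes from the problem), and the t-value 1.984 for 99 degrees of freedom is typed in from the table.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 25.0, 16.0   # hypothetical population values
N, T_95 = 100, 1.984              # sample size; table t-value for 99 df

def interval_from_sample():
    """Draw one random sample and return its 95% confidence interval."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)
    return mean - T_95 * se, mean + T_95 * se

intervals = [interval_from_sample() for _ in range(2000)]

# Share of the 2000 intervals that contain the true mean
coverage = sum(lo <= TRUE_MEAN <= hi for lo, hi in intervals) / len(intervals)

print("first interval: ", intervals[0])
print("second interval:", intervals[1])
print(f"share containing the true mean: {coverage:.3f}")
```

Each sample yields a different interval, and over many repetitions the share that contains the true mean should land near 95%.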

4. A test of 25 felt-tip pens shows that they can write for an average of 11,291 feet with a standard deviation of 500 feet.

a) Construct a 95% confidence interval for the population mean using the normal table.
b) Construct a 95% confidence interval for the population mean using the t table.
c) Why do your results differ in part a and part b, and which is the more accurate? Explain clearly.
d) Approximately how much confidence would you have in the interval 11,291 ± 100?
e) How large must the sample be if we want a precision of ± 100 with 95% confidence?

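a) The standard error is 500/√25 = 100, so the interval is 11291 ± 1.96*100, or 11291 ± 196.
b) With 24 degrees of freedom the t-value is 2.064, so the interval is 11291 ± 2.064*100, or 11291 ± 206.4.
c) Using the Normal tables would be correct if in some way we knew what the standard deviation of the population was. Usually we do not know it but estimate it from the sample. This estimate is usually quite good when the sample size is big, so the t-tables get closer and closer to the Normal tables as the degrees of freedom go up. But with small samples the estimated standard deviation may be off. The t-tables compensate for this additional source of randomness. Since the standard deviation here is estimated from a sample of only 25 pens, the t interval in part b is the more appropriate one.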
d) An interval of ± 100 implies the z-value is 1. On the Normal tables about 68% of the time the sample mean will be within one standard error of the true mean, so we would have about 68% confidence.
e) 100 = 2*500/√n; 100√n = 1000; √n = 10; n = 100.
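
The following sketch reproduces the Problem 4 calculations using Python's standard library. The Normal quantile comes from statistics.NormalDist; the t-value for 24 degrees of freedom is typed in from a t-table, since the standard library has no t distribution. Variable names are illustrative.

```python
import math
from statistics import NormalDist

n, mean, sd = 25, 11291, 500
se = sd / math.sqrt(n)                  # 500 / 5 = 100 feet

z_95 = NormalDist().inv_cdf(0.975)      # about 1.96
t_95 = 2.064                            # 24 df, from a t-table (assumed input)

print(f"normal table: {mean} ± {z_95 * se:.1f}")
print(f"t table:      {mean} ± {t_95 * se:.1f}")

# Part d: confidence attached to ± 100 feet, i.e. ± one standard error
z = 100 / se
confidence = NormalDist().cdf(z) - NormalDist().cdf(-z)
print(f"confidence for ± 100: {confidence:.3f}")

# Part e: sample size for ± 100, using 2 as the approximate multiplier
required_n = math.ceil((2 * sd / 100) ** 2)
print("required sample size:", required_n)
```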

5. In a random sample of 225 students from a very large university, only 20% were able to identify Denmark on a map of Europe. Construct a 95% confidence interval for the percentage of all students at this university who can identify Denmark on the map.

The standard deviation of a proportion is √(pq), where p is the proportion of successes and q is the proportion of non-successes. Here, √(.2*.8) = .4. For the sample proportion, which is just a form of mean, the standard error will be .4/√225 = .4/15 = .027, or 2.7%. Using the Normal tables, we get an interval of 20% ± 1.96*2.7%, or roughly 20% ± 5%.
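
As a check on the proportion calculation, here is a minimal Python sketch. It uses the Normal-table multiplier 1.96 as in the answer; the variable names are illustrative.

```python
import math

n, p = 225, 0.20
q = 1 - p

se = math.sqrt(p * q / n)     # same as sqrt(p*q) / sqrt(n) = .4 / 15
z_95 = 1.96                   # from the Normal table
moe = z_95 * se

low, high = p - moe, p + moe
print(f"standard error: {se:.4f}")
print(f"95% interval: {low:.1%} to {high:.1%}")
```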

6. Below is sample data from a statistical computer program:

One-Sample Statistics
N: 64
Mean: 118.8910
Std. Deviation: 17.8126
Std. Error Mean: 2.2266

a) How big is the sample size? How big is the population?
b) How is the standard error of the mean computed?
c) What is a 95% confidence interval for the true mean?
d) What is a 99% confidence interval for the true mean?

a) Sample size is 64; population size is unknown.
b) 2.2266 = 17.8126/8; 8 is the square root of 64, the sample size.
c) It would be approximately 118.89 ± 2*2.2266, or about 118.89 ± 4.45.
d) It would be approximately 118.89 ± 2.66*2.2266, or about 118.89 ± 5.92.
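
The figures in the program output can be cross-checked with a few lines of Python. This is only a sketch: the t-values for 63 degrees of freedom are typed in from a t-table (the same ones used in Problem 2), and the variable names are illustrative.

```python
import math

# Values copied from the One-Sample Statistics output
n, mean, sd, reported_se = 64, 118.8910, 17.8126, 2.2266

se = sd / math.sqrt(n)        # standard error = s / sqrt(n)
print(f"computed standard error: {se:.4f} (reported: {reported_se})")

# t-values for 63 degrees of freedom, read from a t-table (assumed inputs)
margins = {}
for label, t in (("95%", 1.998), ("99%", 2.656)):
    margins[label] = t * se
    print(f"{label} interval: {mean - margins[label]:.2f} to {mean + margins[label]:.2f}")
```

The multipliers 2 and 2.66 in the answer are rounded versions of these table values, so the intervals agree to within a few hundredths.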

7. Suppose that a preliminary sample of 50 items yields a sample mean price of $60 with a standard deviation of $8. Approximately how big would the sample size have to be so that a 95% confidence interval would be the sample mean plus or minus $1?

Using 2 as the approximate multiplier for 95% confidence: 2*8/√n = 1; √n = 16; n = 256.
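
The sample-size calculation generalizes to n = (multiplier * sd / margin)², rounded up. The sketch below wraps that in a hypothetical helper, required_sample_size, and shows how the answer changes with the multiplier and with the target margin.

```python
import math

def required_sample_size(sd, margin, multiplier=2.0):
    """Smallest whole n with multiplier * sd / sqrt(n) <= margin."""
    return math.ceil((multiplier * sd / margin) ** 2)

n_approx = required_sample_size(8, 1)            # multiplier 2, as in the answer
n_exact_z = required_sample_size(8, 1, 1.96)     # Normal-table multiplier
n_half = required_sample_size(8, 0.5)            # margin of ± $0.50 instead

print(n_approx, n_exact_z, n_half)
```

Halving the margin quadruples the required sample size, because n grows with the square of 1/margin.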

8. A student is interested in the amount of hamburger he gets at a local fast-food eatery. Over a period of two weeks he obtains a sample of 16 hamburgers and finds that the mean weight of a hamburger patty is 2.7 ounces with a standard deviation of .25 ounces.

a) What is a 95% confidence interval for the population mean?

b) Why should you use the Student's t distribution in this problem instead of the Normal curve?

a) The standard error is .25/√16 = .0625, and the t-value for 95% confidence with 15 degrees of freedom is 2.131, so the interval is 2.7 ± 2.131*.0625, or 2.7 ± .13.
b) When the sample size is very small and the standard deviation is estimated from the sample, we introduce additional uncertainty into the interval, and the Student's t distribution compensates for this uncertainty. It is worth noting that technically we also require that the original population be Normal or close to Normal, a condition that is often not met in the real world. So if the distribution of weights is highly asymmetric, this procedure will not produce accurate confidence intervals.

9. A large lecture class has 210 students. On the last lecture of the year, a survey is given to all the students in attendance to see how many read the books. On one book, 112 of the 140 students who handed in the survey indicated that they read the book and 28 indicate that they did not. The professor in charge of the class wants a 95% confidence interval telling how many of the whole class, not just the sample of 140 students, read the book. Compute the 95% confidence interval for him, or if it is not possible, explain why it is not possible. (Hint: is this likely to satisfy the criteria for a random sample?)

It is not possible to compute a trustworthy interval, because samples in which people self-select are not random. The students who did not fill out the survey probably were not in class, and students who were not in class may have very different reading habits than those who do attend. A less important issue here is that the sample is large relative to the population. However, there is a way to correct for this, though we are not going to worry about it here. (The effect of the correction is to make the interval smaller, which is a good thing.)
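
For reference, the adjustment alluded to above is presumably the standard finite population correction, which multiplies the standard error by √((N - n)/(N - 1)). The sketch below shows only how that factor shrinks the standard error for N = 210 and n = 140. It assumes a random sample, so it does nothing to repair the self-selection problem, and the numbers should not be read as a valid interval for this survey.

```python
import math

N, n = 210, 140            # class size and number of surveys handed in
p = 112 / n                # sample proportion who say they read the book

se = math.sqrt(p * (1 - p) / n)          # usual standard error
fpc = math.sqrt((N - n) / (N - 1))       # finite population correction
se_corrected = se * fpc

print(f"standard error without correction: {se:.4f}")
print(f"correction factor:                 {fpc:.4f}")
print(f"standard error with correction:    {se_corrected:.4f}")
```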

10. You have a box with 100 marbles. Ninety-five are green and five are red. You pick a marble out of the box but do not look at it. Would it be better to say, the probability that the marble in my hand is green is 95%, or, I am 95% confident that the marble in my hand is green?

The marble that you picked is either red or green--once picked, there is no longer any probability involved even if you do not know what the result is. It is more accurate to say that you are 95% confident.

To use a somewhat different example, suppose a marksman hits the target 95% of the time. He takes his shot but neither you nor he has seen the result. Would you and he say that the probability that he hit the target is 95% or that you were 95% confident that he had hit the target?
