Problems: univariate review 1

Problems: Review of Univariate Statistics

We live in an uncertain and risky world. Statistics can sometimes help us deal with this world.

In studying chance, mathematicians developed the concept of a probability distribution. The normal curve is one example. A simple example, but one that is often useful, is the model of tickets in a box. Suppose we have the following tickets in a box: 1, 1, 4.

What is the average of the box?

What is the standard deviation of the box?

Suppose we have a box of unknown tickets and we draw out these tickets: 1, 1, 1, 1, 4, 4. What is the average of this sample? What is the standard deviation of this sample?
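The arithmetic in these questions can be checked with a short Python sketch. It assumes the SD of the box uses the divide-by-n formula. For the sample it reports the SD both ways (divide by n and divide by n - 1), because the text does not say which convention the course uses. The variable names are illustrative only.

```python
import statistics

# The known box of tickets.
box = [1, 1, 4]
box_avg = statistics.mean(box)
box_sd = statistics.pstdev(box)  # divide-by-n SD of the box

# The tickets drawn from the unknown box.
sample = [1, 1, 1, 1, 4, 4]
sample_avg = statistics.mean(sample)
sample_sd_n = statistics.pstdev(sample)   # divide by n
sample_sd_n1 = statistics.stdev(sample)   # divide by n - 1

print("box:   ", box_avg, round(box_sd, 3))
print("sample:", sample_avg, round(sample_sd_n, 3), round(sample_sd_n1, 3))
```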

There is some magic in statistics. Although we cannot predict what will happen on any single draw from a box of tickets (or when taking one random sample), we can be pretty sure about what will happen if we take many draws. This follows from the law of large numbers together with the central limit theorem, and it is the basis of the gambling and insurance industries.

If we draw 200 times, at random and with replacement, from the box of tickets 1, 1, 4, we should get a total of _____ give or take about _____. (This second number is the standard error of the sum.) We expect an average of _____ give or take about _____. (This second number is the standard error of the mean.)
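The blanks can be filled in from three standard box-model formulas: expected sum = n × (average of box), SE of the sum = √n × (SD of box), and SE of the mean = (SD of box) / √n. The sketch below applies them, assuming the draws are made at random with replacement. The function name `box_model_summary` is made up for this illustration.

```python
import math
import statistics

def box_model_summary(box, n_draws):
    """Return (expected sum, SE of sum, expected average, SE of average)."""
    avg = statistics.mean(box)
    sd = statistics.pstdev(box)            # divide-by-n SD of the box
    expected_sum = n_draws * avg
    se_sum = math.sqrt(n_draws) * sd
    expected_avg = avg
    se_avg = sd / math.sqrt(n_draws)
    return expected_sum, se_sum, expected_avg, se_avg

expected_sum, se_sum, expected_avg, se_avg = box_model_summary([1, 1, 4], 200)
print("sum:    ", expected_sum, "give or take", round(se_sum, 2))
print("average:", expected_avg, "give or take", round(se_avg, 3))
```

The same function can be reused for any other box and number of draws.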

Why do we have this tendency to end up close to the expected sum or expected average?
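A simulation can make this tendency visible. The sketch below repeats the 200-draw experiment many times and summarizes the resulting sums, which should cluster near n times the box average with a spread close to √n times the box SD. The number of repeats (2000) and the seed are arbitrary choices for the illustration, and `simulate_sums` is a hypothetical helper name.

```python
import random
import statistics

def simulate_sums(box, n_draws, n_repeats, seed=0):
    """Draw n_draws tickets with replacement, n_repeats times; return the sums."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.choices(box, k=n_draws)) for _ in range(n_repeats)]

sums = simulate_sums([1, 1, 4], n_draws=200, n_repeats=2000)
mean_of_sums = statistics.mean(sums)
spread_of_sums = statistics.pstdev(sums)

print("average of the simulated sums:", round(mean_of_sums, 1))
print("SD of the simulated sums:     ", round(spread_of_sums, 1))
```

High and low draws tend to cancel, so the chance error in the sum grows like √n while the sum itself grows like n.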

Statistical inference uses this theory to turn things around. If we draw 200 times from an unknown population and get an average of 2.1 and a standard deviation of 1.4, would we be suspicious of the claim that the average of the box is in fact 2? How about 2.5? Explain. (This type of procedure is fundamental in quality control.)
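One common way to answer this is a z-test: estimate the SE of the average as (sample SD) / √n, then see how many SEs the observed average lies from the claimed one. The sketch below does this for both claims. It assumes a two-sided test and the normal approximation, neither of which the text specifies, and `z_test_for_mean` is an illustrative name.

```python
import math
from statistics import NormalDist

def z_test_for_mean(sample_avg, sample_sd, n, claimed_avg):
    """Return (z, two-sided p-value) under the normal approximation."""
    se_avg = sample_sd / math.sqrt(n)       # estimated SE of the average
    z = (sample_avg - claimed_avg) / se_avg
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

z_2, p_2 = z_test_for_mean(2.1, 1.4, 200, claimed_avg=2.0)
z_25, p_25 = z_test_for_mean(2.1, 1.4, 200, claimed_avg=2.5)

print("claim 2.0: z =", round(z_2, 2), " p =", round(p_2, 3))
print("claim 2.5: z =", round(z_25, 2), " p =", round(p_25, 5))
```

A z-value near 1 is easily explained by chance, while one near 4 in absolute value is not.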

Suppose we have no claims about the unknown population but just want to know what the average is. How would we explain what we think the average is and how precise that estimate is using a confidence interval if the total of the sample of 100 is 191 and the standard deviation of the sample is 1.4? (This type of inference is used every time you see polls about the election that will be happening in November.)
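A confidence interval for the average of the box is usually built as (sample average) ± z* × SE, with SE = (sample SD) / √n. The sketch below computes it for this sample. The 95% level is an assumption, since the text does not name a confidence level, and `confidence_interval_for_mean` is an illustrative name.

```python
import math
from statistics import NormalDist

def confidence_interval_for_mean(sample_total, sample_sd, n, level=0.95):
    """Return (low, high) for the box average under the normal approximation."""
    sample_avg = sample_total / n
    se_avg = sample_sd / math.sqrt(n)
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z_star * se_avg
    return sample_avg - margin, sample_avg + margin

low, high = confidence_interval_for_mean(sample_total=191, sample_sd=1.4, n=100)
print("95% interval for the average:", round(low, 2), "to", round(high, 2))
```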

Now that we have had this trip down memory lane, are you ready to move on to the much more exciting area of multivariate statistics?

Alternative numbers

1. In studying chance, mathematicians developed the concept of a probability distribution. The normal curve is one example. A simple example, but one that is often useful, is the model of tickets in a box. Suppose we have the following tickets in a box: 1, 3, 4, 5, 7.

What is the average of the box?
What is the standard deviation of the box?

2. Suppose we have a box of unknown tickets and we draw out these tickets: 7, 7, 5, 5, 4, 4, 3, 3, 1, 1. What is the average of this sample? What is the standard deviation of this sample?

There is some magic in statistics. Although we cannot predict what will happen on any single draw from a box of tickets (or when taking one random sample), we can be pretty sure about what will happen if we take many draws. This follows from the law of large numbers together with the central limit theorem, and it is the basis of the gambling and insurance industries.

3. If we draw 90 times, at random and with replacement, from the box of tickets 1, 3, 4, 5, 7, we should get a total of _____ give or take about _____. (This second number is the standard error of the sum.) We expect an average of _____ give or take about _____. (This second number is the standard error of the mean.)

Why do we have this tendency to end up close to the expected sum or expected average?

4. Statistical inference uses this theory to turn things around. If we draw 144 times from an unknown population and get an average of 3.9 and a standard deviation of 2.1, would we be suspicious of the claim that the average of the box is in fact 4? How about 4.5? Explain. (This type of procedure is fundamental in quality control.)

5. Suppose we have no claims about the unknown population but just want to know what the average is. How would we explain what we think the average is and how precise that estimate is using a confidence interval if the total of the sample of 100 is 419 and the standard deviation of the sample is 1.9? (This type of inference is used every time you see polls about the election that will be happening in November.)

1. Professor Box gave his class of 20 students a multiple-choice test with 15 questions on it. Each question had four options, and the average score on the test was just 32% correct. Professor Box wants to know if his students were just guessing randomly, or if they actually knew something. His friend Professor Model says that if they put their heads together, they should be able to figure it out (a Box-Model solution).

Having 20 students randomly answer 15 questions, each question with one correct option and three incorrect options, is like taking 300 draws from the box that has these tickets: ________________

The average of the box is ______ and its standard deviation is ______.

If all the students guess randomly, the expected sum is ______ and the expected average is ______.

The give-or-take value for the sum is ______.

Another name for the give-or-take value is ______.

In trying to determine whether the students are doing better than randomly guessing, we are doing a hypothesis test. Should the null hypothesis be that they are just guessing randomly, or that they know something?

To do the test, we need to compute a z-value. That value is _______

What is the observed level of significance (or p-value)? _________
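The whole chain of calculations can be checked with a short sketch. It assumes the box holds one ticket marked 1 (a correct guess) and three marked 0, treats the 32% figure as exact, and uses a one-sided test, because the question is whether the students did better than guessing. Variable names are illustrative.

```python
import math
import statistics
from statistics import NormalDist

# Assumed box for random guessing among four options: 1 = correct, 0 = wrong.
box = [1, 0, 0, 0]
n_draws = 20 * 15                      # 20 students x 15 questions

box_avg = statistics.mean(box)
box_sd = statistics.pstdev(box)        # divide-by-n SD of the box

expected_sum = n_draws * box_avg       # expected number correct under guessing
se_sum = math.sqrt(n_draws) * box_sd   # give-or-take value for the sum

observed_sum = 0.32 * n_draws          # 32% correct over all draws
z = (observed_sum - expected_sum) / se_sum
p_one_sided = 1 - NormalDist().cdf(z)  # chance of doing this well by guessing

print("expected sum:", expected_sum, "give or take", round(se_sum, 2))
print("z =", round(z, 2), " p =", round(p_one_sided, 4))
```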

