Central Limit Theorem

The central limit theorem tells us what to expect from the sum or the sample mean, x-bar, when we take a random sample from a population. For example, suppose we toss a fair coin 10,000 times. We could get anywhere from zero heads to 10,000 heads, but the central limit theorem says that it is very unlikely that we would get more than 5,200 or fewer than 4,800 heads. It is more precise than this: it says that we should expect about 5,000 heads, give or take 50.

The 5,000 is the expected sum, which is n*(expected value of the original distribution), or 10,000*.5 = 5,000. (Counting each head as 1 and each tail as 0, a single toss has an expected value of .5 and a standard deviation of .5.) We would not get the same value each time we tossed a coin 10,000 times, but rather there would be variation in the sums. We can measure this variation with a standard deviation, which is called the standard error of the sample sum. It is found by multiplying the standard deviation of the original distribution by the square root of n. Here the square root of 10,000 is 100, so the standard error is 100*.5 = 50. Because 5,200 is four standard errors above the expected 5,000, the probability of landing this many standard errors away from the mean is very small.

Further, if we have a big enough sample size (usually 30 is big enough unless the distribution is highly skewed), then the sample sums and means will be distributed in something very close to a normal distribution. (Although the sample sum and mean are fixed numbers for any one sample, when we take repeated samples, we can treat them as variables.)
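The coin-toss arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names `sum_summary` and `two_sided_normal_tail` are made up for illustration, and the tail probability relies on the normal approximation that the theorem suggests, not on the exact binomial distribution.

```python
import math

def sum_summary(n, mu, sigma):
    """Expected value and standard error of the sum of n independent draws."""
    expected_sum = n * mu
    se_sum = math.sqrt(n) * sigma
    return expected_sum, se_sum

def two_sided_normal_tail(z):
    """P(|Z| >= z) for a standard normal Z (normal approximation only)."""
    return math.erfc(z / math.sqrt(2))

n = 10_000                      # number of tosses
p = 0.5                         # chance of heads on one toss
mu = p                          # expected value of one toss (heads = 1, tails = 0)
sigma = math.sqrt(p * (1 - p))  # standard deviation of one toss

expected_sum, se_sum = sum_summary(n, mu, sigma)

# How many standard errors is 5,200 heads above the expected sum?
z = (5200 - expected_sum) / se_sum

# Approximate chance of landing at least that far from 5,000 in either direction.
tail = two_sided_normal_tail(z)

print("expected heads:", expected_sum)
print("standard error of the sum:", se_sum)
print("standard errors from the mean:", z)
print("approximate two-sided tail probability:", tail)
```

Under the normal approximation, the chance of being four or more standard errors from the mean in either direction comes out well under one in ten thousand.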

If we are looking for the sample mean rather than the sample sum, we divide both the expected sum and the standard error of the sum by n. Hence, the expected value of the sample mean will be the mean of the original distribution, and the standard error of the sample mean (that is, its standard deviation) will be the standard deviation of the original distribution divided by the square root of the sample size. In the coin example, the expected proportion of heads is 5,000/10,000 = .5, and its standard error is 50/10,000 = .005, which is the same as .5 divided by 100.
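A small simulation can illustrate the rule for the sample mean. The sketch below assumes a fair six-sided die as the original distribution (a choice made here for illustration, not taken from the text above), with a sample size of 100 and 2,000 repeated samples. The name `simulate_sample_means` is hypothetical, and the fixed seed only serves to make the run repeatable.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Roll a fair die n times, record the sample mean, and repeat reps times."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(reps):
        rolls = [rng.randint(1, 6) for _ in range(n)]
        means.append(sum(rolls) / n)
    return means

mu = 3.5                    # mean of one roll of a fair die
sigma = math.sqrt(35 / 12)  # standard deviation of one roll
n = 100                     # sample size
reps = 2000                 # number of repeated samples

means = simulate_sample_means(n, reps)

theoretical_se = sigma / math.sqrt(n)  # standard error of the sample mean
observed_mean = statistics.mean(means)
observed_se = statistics.stdev(means)

print("mean of the sample means:", observed_mean, "(theory:", mu, ")")
print("std. dev. of the sample means:", observed_se, "(theory:", theoretical_se, ")")
```

The simulated values should land close to the theoretical ones, though not exactly on them, since the simulation is itself a random sample.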

The central limit theorem is a bit of magic because it implies that if we have a process that is totally random, we can get something predictable if we run the process over and over again. This predictability is what makes both the insurance and the gambling industries possible.
