The Central Limit Theorem
The central limit theorem tells us what we can expect to
see happening to the sum or the sample mean, x-bar, if we
take a random sample from a population. For example, suppose
we toss a coin 10,000 times. We could get anywhere from zero
heads to 10,000 heads, but the central limit theorem says
that it is very unlikely that we would get more than 5,200
or fewer than 4,800 heads. It is more precise than this: it
says that we should expect about 5,000 heads, give or take 50.
The 5,000 is the expected sum, n*(expected value of the
original distribution), or 10,000*.5 = 5,000. We would not
get the same value each time we tossed a coin 10,000 times,
but rather there would be variation in the sums. We can
measure this variation with a standard deviation, which is
called the standard error of the sample sum. It is found by
multiplying the standard deviation of the original
distribution (.5 for a single toss, counting heads as 1 and
tails as 0) by the square root of n, or .5*100 = 50.
Because 5,200 is four standard errors above the expected sum
of 5,000, the probability of landing that far from it
is very small. Further, if we have a big
enough sample size (usually 30 is big enough unless the
distribution is highly skewed), then the sample sums and
means will be distributed in something very close to a
normal distribution. (Although the sample sum and mean are
fixed numbers for any one sample, when we take repeated
samples, we can treat them as variables.)
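The coin-toss arithmetic above can be checked with a short simulation. This is only a sketch using the Python standard library: the function names sum_stats and simulate_head_counts are made up for illustration, and the tail probability relies on the normal approximation rather than an exact binomial calculation.

```python
import math
import random
import statistics

def sum_stats(n, mu, sigma):
    """Expected value and standard error of the sum of n independent draws."""
    return n * mu, sigma * math.sqrt(n)

def simulate_head_counts(n_tosses, n_repeats, seed=0):
    """Count heads in n_tosses fair tosses, repeated n_repeats times."""
    rng = random.Random(seed)
    # Each bit of getrandbits(n_tosses) acts as one fair toss (1 = heads).
    return [bin(rng.getrandbits(n_tosses)).count("1") for _ in range(n_repeats)]

# Fair coin scored as heads = 1, tails = 0: mean .5, standard deviation .5.
expected, se = sum_stats(10_000, 0.5, 0.5)

# How many standard errors above the expected sum is 5,200?
z = (5_200 - expected) / se

# Normal-approximation probability of landing at least that far above.
upper_tail = 0.5 * math.erfc(z / math.sqrt(2))

counts = simulate_head_counts(10_000, 2_000)
print("expected sum:", expected, "standard error:", se, "z:", z)
print("approximate upper-tail probability:", upper_tail)
print("simulated mean:", statistics.mean(counts))
print("simulated standard deviation:", statistics.stdev(counts))
```

The simulated mean and standard deviation of the head counts should come out close to 5,000 and 50, and a histogram of counts would look roughly bell-shaped.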
If we are looking for the sample mean rather than the
sample sum, we divide both the expected sum and the standard
error of the sum by n. Hence, the expected value of the
sample mean will be the mean of the original distribution
and the standard deviation or standard error of the sample
mean will be the standard deviation of the original
distribution divided by the square root of the sample
size.
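As a rough illustration of the step from sums to means, the sketch below divides the expected sum and its standard error by n and confirms that the result matches the standard deviation of the original distribution divided by the square root of n. The function name mean_stats is hypothetical, chosen only for this example.

```python
import math

def mean_stats(n, mu, sigma):
    """Expected value and standard error of the sample mean of n draws."""
    expected_sum = n * mu
    se_sum = sigma * math.sqrt(n)
    # Dividing both quantities by n converts sum statistics to mean statistics.
    return expected_sum / n, se_sum / n

# Same coin example: 10,000 tosses, heads = 1, tails = 0.
exp_mean, se_mean = mean_stats(10_000, 0.5, 0.5)
print("expected sample mean:", exp_mean)
print("standard error of the mean:", se_mean)

# The shortcut formula sigma / sqrt(n) gives the same standard error.
print("sigma / sqrt(n):", 0.5 / math.sqrt(10_000))
```

For the coin example this gives an expected proportion of heads of .5 with a standard error of .005, which is the earlier 5,000 give or take 50 expressed as a fraction of the 10,000 tosses.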
The central limit theorem is a bit of magic because it
implies that even when a single run of a process is totally
random, the total or average over many runs is predictable.
This predictability is what makes both the
insurance and the gambling industries possible.