Computer Exercises: Central Limit Theorem
(This exercise was designed to work with SPSS. It may be
modified to work with other programs, and it may have to be
modified to work with the current version of SPSS.)
Simulating the Kerrich Experiment
A mathematician named John Kerrich tossed a coin 10,000
times and recorded all of the results. You might wonder why
someone would do something like this, and in the case of
Kerrich it was because he had time to kill—he was in a
POW camp for several years. In lab today we will try to
replicate Kerrich's experiment in less than an hour.
Open SPSS 15 for Windows and start with an empty sheet.
Holding down the <enter> key, scroll down to line
10,000. (This will take several minutes.) Place a number at
10,000. You should see that the count above your entry is
now black. (If you mess this up, you may have to redo it,
and you really do not want to repeat it.)
Pull down Transform>Compute Variables. You will get
a dialog box. Place the name temp in the target variable and
the number 1 in the box for numeric expression. Press OK.
This will create a column of 10,000 ones.
Pull down Transform>Create Time Series. In the
dialog box, pull down the function menu to Cumulative Sum.
Then click over temp to new variable. (It will put an
equation in the box—that is OK.) Click OK. You should
get a column in your data sheet numbered 1 to 10000. (Ignore
what happens in the output window.) Rename this column to
count.
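For readers working outside SPSS, the two steps above can be
sketched in Python. This is only an illustration, assuming a
plain list stands in for an SPSS column. The names temp and
count follow the handout; N_TOSSES is a name introduced here.

```python
from itertools import accumulate

N_TOSSES = 10_000  # number of rows, as in the handout

# Column of 10,000 ones (the "temp" variable).
temp = [1] * N_TOSSES

# The cumulative sum of the ones gives the row counter 1, 2, ..., 10000.
count = list(accumulate(temp))

print(count[:5], count[-1])
```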
Pull down Transform>Random Number Generators. Check
the "Set Starting Point" box and make sure the Random radio
button is on. Click OK. (If you do not do this, you may get
a preset set of random numbers. I prefer everyone get his or
her own unique random results.)
Pull down Transform>Compute Variables. In the
target variable put the name toss. In the numeric expression
box put trunc(rv.uniform(0,2)). What this will do is create
a series of random numbers between 0 and 2, and then chop
off all the decimals. Anything between 0 and 1 will become a
zero, and anything between 1 and 2 will become a 1. So you
will get a column of zeros and ones once you click the OK
button. (Chopping off the decimals is called truncation.
That is the trunc part of the expression. The
rv.uniform(0,2) creates random numbers between 0 and 2.)
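The same idea, truncating a uniform random number between 0
and 2, might look like this in Python. It is a sketch rather
than a translation of what SPSS does internally: it assumes
2 * rng.random() is an acceptable stand-in for
rv.uniform(0,2), and rng and N_TOSSES are names introduced
here.

```python
import math
import random

N_TOSSES = 10_000
rng = random.Random()  # unseeded: every run gives its own tosses

# 2 * rng.random() is uniform on [0, 2). Truncating drops the decimals,
# so values below 1 become 0 and values from 1 up to 2 become 1.
toss = [math.trunc(2 * rng.random()) for _ in range(N_TOSSES)]

print(toss[:10], "share of ones:", sum(toss) / N_TOSSES)
```

Leaving the generator unseeded plays the same role as the
"Set Starting Point" step: each student gets different
tosses.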
4. Just to make life a bit more difficult, let us change
the zeros to -1. The easiest way to do this is to go back to
Transform>Compute Variables. In the target variable
leave the name toss. In the numeric expression box put
2*toss-1. Click OK. Can you explain why this will leave the
1s as 1s and make the 0s into -1s?
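To check the recoding by hand, the expression 2*toss-1 (two
times toss, minus one) can be applied to a few made-up
tosses. The five example values below are invented for
illustration, and recoded is a name introduced here.

```python
toss = [0, 1, 1, 0, 1]               # a few example tosses coded 0/1
recoded = [2 * t - 1 for t in toss]  # same expression as in Compute
print(recoded)  # prints [-1, 1, 1, -1, 1]
```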
You have just simulated tossing a coin 10000 times. Let
us see what we get.
We want to look at the sum of the tosses as we continue
to toss. We can do this with the Transform>Create Time
Series. In the dialog box, pull down the function menu to
Cumulative Sum. Then click over toss to new variable. Click
OK. (Ignore what happens in the output window.)
Change the name of this new column to sum.
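The running total built by Cumulative Sum can be sketched in
Python as follows. This assumes the tosses are already coded
as -1 and +1. The list is called running_sum here rather than
sum, only because sum is a built-in function in Python.

```python
import random
from itertools import accumulate

rng = random.Random()

# 10,000 tosses coded -1 and +1.
toss = [rng.choice((-1, 1)) for _ in range(10_000)]

# running_sum[i] is the total of the first i + 1 tosses.
running_sum = list(accumulate(toss))

print(running_sum[:10], running_sum[-1])
```

Each entry differs from the one before it by exactly 1,
because every toss moves the total one step up or one step
down.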
5. What does a positive number in this column mean? What
does a negative number mean?
6. What do you think will happen to the sum as the number
of tosses gets bigger? Will it get closer and closer to
zero, or will it get further and further from zero? We can
look at the numbers, or we can look at a graph. Let us look
at the graph. Pull down Graphs>Scatter. Click Define.
Put sum on the Y axis and count on the X axis. Click OK.
Describe what you find.
7. In an imaginary dialog with Kerrich, an assistant
says, "In the long run the number of heads and the number of
tails even out." Kerrich says that is not true. He says that
the more you toss, the more you are likely to be away from
zero. Is that what happens in your case or not?
Kerrich does say that the average should approach 0. We
can see if this happens by dividing sum by count. You can do
this using Transform>Compute. (You can figure out what
you need to do. Call the result average.)
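If you want to check your Compute expression against an
independent calculation, the division of sum by count can be
sketched in Python. The names rng and running_sum are
introduced here, and the tosses are freshly simulated, so the
numbers will not match your SPSS sheet.

```python
import random
from itertools import accumulate

rng = random.Random()
toss = [rng.choice((-1, 1)) for _ in range(10_000)]
running_sum = list(accumulate(toss))

# average = sum / count, where count is the 1-based row number.
average = [s / n for n, s in enumerate(running_sum, start=1)]

for n in (100, 1000, 10_000):
    print(n, running_sum[n - 1], round(average[n - 1], 4))
```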
8. Do a scatter diagram with count on the x axis and
average on the y axis. Does it tend to approach zero?
9. In the next week or so we will show that when you have
tossed the coin 100 times using these numbers of -1 and +1,
you will usually have a result that is between -20 and +20. Do
you? At the 900 toss count, your result is expected to be
between -60 and +60. Is yours? At 3600 your result is
expected to be between -120 and +120. Is yours? And at 10,000,
your result is expected to be between -200 and +200. Is
yours?
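The handout defers the derivation of these ranges. One
reading that reproduces them, offered here as an assumption
rather than as the course's own argument, is that each toss
is -1 or +1 with equal chance, so the sum of n tosses has a
standard error of the square root of n, and each range spans
two standard errors on either side of zero. The name bounds
is introduced here.

```python
import math

# Assumption: each toss is +1 or -1 with equal chance, so the sum of n
# tosses has standard error sqrt(n); each range is 2 standard errors.
bounds = {n: 2 * math.isqrt(n) for n in (100, 900, 3600, 10_000)}

for n, b in bounds.items():
    print(f"{n:>6} tosses: between -{b} and +{b}")
```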
If you have time, repeat the experiment ten times. (Hint: call
your variables toss1, toss2, toss3, etc.) Do all the results
look pretty similar or not? (Hint: do all of each step
at the same time; it goes very fast that way.)
The chance error with 10,000 tosses is likely to be about
100. That means that most of the time your column called sum
is unlikely to be more than +200 or less than -200. Of your
eleven trials, how many had sums outside that range at 10000?
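The count of trials that end outside the range can be
sketched in Python. This assumes eleven independent runs of
10,000 tosses coded -1 and +1. The names final_sum,
final_sums, outside, and LIMIT are introduced here for
illustration.

```python
import random

N_TOSSES = 10_000
N_TRIALS = 11   # the original run plus ten repeats
LIMIT = 200     # the range quoted in the handout

rng = random.Random()

def final_sum(n):
    """Sum of n simulated tosses coded +1 (heads) and -1 (tails)."""
    return sum(rng.choice((-1, 1)) for _ in range(n))

final_sums = [final_sum(N_TOSSES) for _ in range(N_TRIALS)]
outside = sum(1 for s in final_sums if abs(s) > LIMIT)

print(final_sums)
print(outside, "of", N_TRIALS, "trials ended outside the range")
```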
10. Record all the sums that you had at 100, 1000, 5000, and
10000. Suppose you had to pay the house 1 cent every time
you played. You win $1 with heads and lose $1 with tails.
What percentage of times would you be ahead at 100 tosses?
At 1000? At 5000? At 10000?
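One way to explore this question beyond your own eleven
trials is a small simulation. The sketch below rests on one
reading of the setup: after n tosses your net position in
cents is 100 times the sum, minus n cents in fees, and you
are "ahead" when that is positive. The 100 repeats and all
names are assumptions introduced here.

```python
import random
from itertools import accumulate

CHECKPOINTS = (100, 1000, 5000, 10_000)
N_TRIALS = 100      # hypothetical number of repeats, chosen for speed
FEE_CENTS = 1       # paid to the house on every toss
STAKE_CENTS = 100   # win or lose $1 per toss

rng = random.Random()
ahead = {n: 0 for n in CHECKPOINTS}

for _ in range(N_TRIALS):
    toss = [rng.choice((-1, 1)) for _ in range(max(CHECKPOINTS))]
    running_sum = list(accumulate(toss))
    for n in CHECKPOINTS:
        # Winnings in cents minus the fees paid so far.
        net_cents = STAKE_CENTS * running_sum[n - 1] - FEE_CENTS * n
        if net_cents > 0:
            ahead[n] += 1

percent_ahead = {n: 100 * k / N_TRIALS for n, k in ahead.items()}
for n, p in percent_ahead.items():
    print(f"{n:>6} tosses: ahead in {p:.1f}% of trials")
```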
