The Box Model
I had never heard of the Box Model until I used Statistics
by David Freedman, Robert Pisani, and Roger Purves as a
course textbook a few years into the 21st century. It
simplifies a great many problems by seeing them as a process
of pulling tickets from a box. So, for example, suppose you
bet a dollar on red while playing roulette. There are 38
slots on the roulette wheel, 18 red, 18 black, and two
green. Thus, you can view the game as one of drawing a
ticket from a box of 38 tickets. Eighteen of them have $1.00
on them, and 20 of them have -$1.00 on them (because you
will either win a dollar or lose a dollar).
The advantage of converting problems to this form is that
it allows us to analyze problems with a very simple
probability distribution, the uniform distribution, even
when they do not initially appear to be uniform problems.
Each ticket has the same probability, so to find the
expected value or mean of the distribution, we simply add up
the amounts on the tickets and divide by the number of
tickets. If we do this in the case of the roulette wheel, we
get -$2/38, so the expected value is about negative five
cents per play; the odds favor the casino. We can also
compute the standard deviation fairly simply. We use the
formula that we learned near the beginning, but instead of
dividing by n-1, we divide by n. (Check back to discrete
probability distributions to figure out why.)
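As a rough check on the arithmetic, the roulette box can be written out ticket by ticket. This is only a sketch; the names box, box_mean, and box_sd are my own labels, not anything from the textbook.

```python
from math import sqrt

# The roulette box: 18 winning tickets (+$1) and 20 losing tickets (-$1).
box = [1.0] * 18 + [-1.0] * 20

# Every ticket is equally likely, so the expected value is the plain average.
box_mean = sum(box) / len(box)

# SD of the box: average the squared deviations over n tickets (not n - 1),
# then take the square root.
box_sd = sqrt(sum((ticket - box_mean) ** 2 for ticket in box) / len(box))

print(f"expected value per play: {box_mean:.4f}")
print(f"standard deviation of the box: {box_sd:.6f}")
```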
For some boxes there is a short-cut formula for the
standard deviation that is useful. If the box contains only
two kinds of tickets, we can multiply the fraction of
tickets with the big number by the fraction with the small
number. Then we take the square root of that product, and
multiply the result by the difference between the big number
and the small number. In the roulette case, we multiply
18/38 times 20/38 to get .24930747922. Taking the square
root gives .499307, or approximately .5. The difference
between the big number and the small number is $2.00, so the
standard deviation is $.998614 or approximately $1.00.
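The short-cut can be sketched as a small function. The function name two_ticket_sd and its parameter names are hypothetical, chosen here only for illustration.

```python
from math import sqrt

def two_ticket_sd(big, small, n_big, n_small):
    """Short-cut SD for a box holding only two kinds of tickets."""
    total = n_big + n_small
    fraction_big = n_big / total
    fraction_small = n_small / total
    # (big - small) times the square root of the product of the fractions.
    return (big - small) * sqrt(fraction_big * fraction_small)

# Roulette bet on red: 18 tickets marked +1 and 20 tickets marked -1.
shortcut_sd = two_ticket_sd(1.0, -1.0, 18, 20)
print(f"short-cut SD: {shortcut_sd:.6f}")
```

This should agree with the long way of computing the standard deviation (dividing by n), since both describe the same box.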
The power of this model comes when we use it with the
Central Limit Theorem.
If we play roulette 100 times, it would be like pulling 100
tickets from the box with the 38 tickets described above.
The expected sum would be 100 times the expected value of
the box, or -$5.26. However, there will be variation each
time we draw those 100 tickets, and the standard error of
the sum, which is the square root of 100 multiplied by the
standard deviation of the box, tells us how much variation
we can expect in the sums. Ten times $.998614 is $9.99. So
if we play roulette 100 times, always
betting one dollar, we would expect to lose about $5.26 give
or take about $10.00.
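The two formulas for the sum of the draws can be sketched as follows, assuming the tickets are drawn with replacement (each spin of the wheel is a fresh draw from the full box). The helper name sum_of_draws is hypothetical.

```python
from math import sqrt

def sum_of_draws(box_mean, box_sd, n_draws):
    """Expected value and standard error for the sum of n_draws tickets."""
    expected_sum = n_draws * box_mean
    standard_error = sqrt(n_draws) * box_sd
    return expected_sum, standard_error

# The roulette box: mean -2/38, SD from the two-ticket short-cut.
box_mean = -2 / 38
box_sd = 2 * sqrt((18 / 38) * (20 / 38))

expected_sum, standard_error = sum_of_draws(box_mean, box_sd, 100)
print(f"expected sum after 100 plays: {expected_sum:.2f}")
print(f"standard error of the sum: {standard_error:.2f}")
```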
To compute our chances of breaking even or better, we use
the normal curve. It is centered at -5.26 and we want to
know what the probability of getting $0 or above is. We need
to convert to standardized or z-scores. To convert the $0 to
a z-score, we subtract the mean and divide by the standard
error of the sum, or ($0-(-$5.26))/$9.99 = .53. Checking
this on a table shows that the player has about a 30% chance
of walking out of the casino with at least as much as they
started with (and, thus, about a 70% chance of leaving with
less).
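Instead of a printed table, the area under the normal curve can be looked up with NormalDist from Python's standard statistics module (Python 3.8 or later). This is a sketch of the same calculation, with variable names of my own choosing.

```python
from math import sqrt
from statistics import NormalDist

# Sum of 100 draws from the roulette box.
expected_sum = 100 * (-2 / 38)
standard_error = sqrt(100) * 2 * sqrt((18 / 38) * (20 / 38))

# Convert the break-even point ($0) to a z-score.
z = (0 - expected_sum) / standard_error

# Area under the standard normal curve to the right of z.
chance = 1 - NormalDist().cdf(z)

print(f"z-score for breaking even: {z:.2f}")
print(f"approximate chance of $0 or better: {chance:.2f}")
```

Because this uses the unrounded mean and standard error, the z-score may differ slightly in later decimal places from the hand calculation.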
In the long run the casino will win. A player who makes
this bet 10,000 times will have an expected loss of $526.32
give or take $99.86. That puts breaking even more than five
standard errors above the expected sum, so by the normal
curve the chance of leaving the casino ahead after this many
plays is less than one in a million.
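To see how the player's chances shrink as the number of plays grows, the whole calculation can be wrapped in one function. This is a sketch under the normal approximation only; the function name chance_of_not_losing and the constants BOX_MEAN and BOX_SD are hypothetical labels.

```python
from math import sqrt
from statistics import NormalDist

BOX_MEAN = -2 / 38                             # expected value of one ticket
BOX_SD = 2 * sqrt((18 / 38) * (20 / 38))       # SD of the box (short-cut)

def chance_of_not_losing(n_plays):
    """Normal-curve estimate of finishing at $0 or better after n_plays."""
    expected_sum = n_plays * BOX_MEAN
    standard_error = sqrt(n_plays) * BOX_SD
    z = (0 - expected_sum) / standard_error
    # The area to the right of z equals the area to the left of -z.
    return NormalDist().cdf(-z)

for n_plays in (100, 1_000, 10_000):
    print(f"{n_plays:>6} plays: {chance_of_not_losing(n_plays):.2e}")
```

Using the left tail at -z rather than subtracting from 1 is a small design choice that keeps very small probabilities from being lost to rounding.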