Measuring the Center

One of the meanings of the word "statistics" is numbers, so it should be no surprise that the study of statistics begins with the organization of numbers.

The human brain does not deal well with masses of numbers. However, the human brain does deal well with pictures. A big chunk of the brain does nothing but process images. Hence, most statistical analysis begins with a picture. A commonly used picture in statistics is the histogram, a type of bar graph that shows relative frequency.

Pictures can start us on a journey of statistical analysis, but they do not lead very far. Using a more abstract trail built on mathematics will take us much further. Although pictures are an intuitive and widely used first step in analyzing data, they are only a first step. Mathematicians have found that much more analysis is possible if we summarize the information in the data into a few numbers.

One way of summarizing the information in a group of numbers is to find a typical or representative number. Three common measures of this are the arithmetic mean or average, the median, and the mode. The simplest of these three, when it exists, is the mode, or the most frequently occurring number. If we have this set of numbers: {1, 2, 2, 3, 3, 3, 7}, the mode is 3. If we have this set of numbers: {1, 2, 3, 4, 5}, there is no mode, because no number occurs more often than any other. It is possible to have more than one mode. If the numbers are {2, 2, 2, 3, 3, 3, 7}, then both 2 and 3 are modes.
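The three mode examples can be checked with a short Python sketch. The function name `modes` is hypothetical and chosen only for illustration. The sketch follows the convention used above: when every number occurs exactly once, it reports no mode by returning an empty list.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent value, or [] when no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Assumed convention: if every value occurs once, there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3, 3, 3, 7]))  # one mode: 3
print(modes([1, 2, 3, 4, 5]))        # no mode
print(modes([2, 2, 2, 3, 3, 3, 7]))  # two modes: 2 and 3
```

Other conventions exist. Some software reports every value as a mode when all values occur equally often, so it is worth checking what a given tool does.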

More useful than the mode is the median, or the middle number when the numbers are arranged from lowest to highest. In this set of seven numbers: {1, 2, 2, 3, 3, 3, 7} the median is 3.

When there is an even number of numbers, the median is the midpoint or average of the two numbers closest to the middle. For example, in this set of numbers: {1, 6, 7, 2} the median is 4 because after they are put into order 1, 2, 6, 7, the two middle numbers are 2 and 6, and their average or midpoint is (6+2)/2 = 4.
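As a minimal sketch, the two cases (an odd and an even number of numbers) might be written in Python as follows. The function name `median` is an illustrative choice, and the sketch assumes the list is not empty.

```python
def median(values):
    """Return the middle value of a non-empty list of numbers."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)          # arrange from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle number
    # even count: midpoint of the two numbers closest to the middle
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([1, 2, 2, 3, 3, 3, 7]))  # middle of seven numbers is 3
print(median([1, 6, 7, 2]))           # midpoint of 2 and 6 is 4
```

Sorting first matters: in the second example the numbers are given out of order, and the median is found only after they are arranged as 1, 2, 6, 7.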

The final common measure of the middle is the arithmetic mean, which is also called the mean or the average. It is found by adding up the set of numbers and dividing by N, the number of numbers we have added together. In this set of numbers: {1, 2, 2, 3, 3, 3, 7} the mean is 3 because they total 21 and 21/7 = 3.
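The same calculation can be sketched in Python. The name `arithmetic_mean` is hypothetical, and the sketch assumes a non-empty list.

```python
def arithmetic_mean(values):
    """Add up the numbers and divide by N, the count of numbers."""
    if not values:
        raise ValueError("mean of an empty list is undefined")
    total = sum(values)
    n = len(values)
    return total / n

data = [1, 2, 2, 3, 3, 3, 7]
print(sum(data), len(data))    # total 21, N = 7
print(arithmetic_mean(data))   # 21 / 7 = 3
```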

The mean is by far the most important of these three measures for statistical purposes because it has nice mathematical and statistical properties. The mean and median are identical if the distribution is symmetrical. If the distribution is lopsided with a long tail to either the left or right (statisticians call this type of distribution skewed), the mean and median will not be the same. We can see this with a simple example starting with these numbers: {1, 2, 3, 4, 5}. The mean and median are both 3. Now let us increase the 5 to 15, so we have these numbers: {1, 2, 3, 4, 15}. The median is unaffected; it remains 3. The mean, however, is now 25/5 = 5.
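This comparison can be reproduced with Python's standard `statistics` module, which provides `mean` and `median`. The helper name `center_summary` is hypothetical and used only for this illustration.

```python
from statistics import mean, median

def center_summary(values):
    """Return (mean, median) for a non-empty list of numbers."""
    return mean(values), median(values)

symmetric = [1, 2, 3, 4, 5]
skewed = [1, 2, 3, 4, 15]   # the same numbers with the 5 raised to 15

print(center_summary(symmetric))  # mean 3, median 3
print(center_summary(skewed))     # mean 5, median still 3
```

Changing a single number at the end of the list pulls the mean toward the long tail, while the median stays where it was.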

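The example with {1, 2, 3, 4, 15} points to one of the advantages of using the mean as the measure of the center: it is affected by every number in the set. The median and mode, in contrast, are not; they can stay exactly the same when a number changes, as the median did when the 5 became 15.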

