## Problems: Chi-Square

1. In real estate, location is very important. Is it as important in politics? We can explore that question by using Chi-Square analysis to look at the relationship between presidential winner and state location. There are 26 states east of the Mississippi River and 24 west of it. If we cross-tabulate state location and presidential winner in the 2000 election, we get the following tables.

WINNER * WHERE Crosstabulation (Count)

| WINNER | WHERE: east | WHERE: west | Total |
|---|---|---|---|
| bush | 13 | 17 | 30 |
| gore | 13 | 7 | 20 |
| Total | 26 | 24 | 50 |

Chi-Square Test

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 2.257 | 1 | .133 |
| N of Valid Cases | 50 | | |

Computed only for a 2x2 table.

0 cells (.0%) have expected count less than 5. The minimum expected count is 9.60.

a) If we compute the expected number of eastern states that Bush should have won if geography has no influence, how many do we get?
b) Are these results statistically significant? Explain carefully in a paragraph that someone who has not had statistics would understand.
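
One way to check the SPSS output above by hand is sketched below in Python. It assumes the usual independence model, in which each expected count is (row total × column total) / N. The function names `expected_counts`, `pearson_chi_square`, and `p_value_df1` are illustrative and do not come from SPSS. With 1 degree of freedom, the upper-tail probability can be written with the complementary error function, which is the shortcut `p_value_df1` relies on.

```python
import math

def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def p_value_df1(statistic):
    """Upper-tail chi-square probability, valid only for 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Rows: bush, gore. Columns: east, west (counts from the crosstabulation).
observed = [[13, 17], [13, 7]]

expected = expected_counts(observed)
statistic = pearson_chi_square(observed)
p_value = p_value_df1(statistic)

print("expected counts:", expected)
print("chi-square:", round(statistic, 3))
print("p-value:", round(p_value, 3))
```

If the sketch is right, the last two printed values should agree with the Pearson Chi-Square row of the SPSS table, and the smallest expected count should agree with the footnote about the minimum expected count.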

2. A researcher has surveyed a large number of high school students to determine their attitudes toward college. One question is whether or not they have ever heard of Saint Joseph's College. The researcher suspects that as students advance through high school, they learn more about colleges, and hence seniors should be more likely to have heard of Saint Joseph's College than freshmen. She decides to test this hypothesis using Chi-Square. Below are the results she gets:

QUES7 * YEAR Crosstabulation

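| QUES7 | | YEAR: 2003 | 2004 | 2005 | 2006 | Total |
|---|---|---|---|---|---|---|
| 1 | Count | 143 | 87 | 42 | 10 | 282 |
| | Expected Count | 132.7 | 83.7 | 49.4 | 16.1 | 282.0 |
| 2 | Count | 120 | 79 | 56 | 22 | 277 |
| | Expected Count | 130.3 | 82.3 | 48.6 | 15.9 | 277.0 |
| Total | Count | 263 | 166 | 98 | 32 | 559 |
| | Expected Count | 263.0 | 166.0 | 98.0 | 32.0 | 559.0 |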

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 8.853 | ____ | ____ |
| N of Valid Cases | 559 | | |

0 cells (.0%) have expected count less than 5. The minimum expected count is 15.86.

(For this question, an answer of 1 indicates that the student has heard of SJC, while a 2 indicates that the student has not heard of SJC. Year represents year of graduation, so 2003 is a senior and 2006 is a freshman.)

a) The first cell of the table indicates that 143 seniors have heard of SJC. Where does the expected count of 132.7 come from?
b) Just eyeballing these data, do they tend to support the researcher's hypothesis? Explain.
c) To formally test the hypothesis, what does the researcher establish as the null hypothesis? What is the alternative?
d) The results indicate a Chi-Square statistic of 8.853. Set up the equation that yielded this result. (Put in the numbers that you would begin with--no need to do any of the calculation.)
e) How many degrees of freedom would this test have? Explain.
f) If you were to test the null hypothesis with alpha = .05, what would you decide? Explain.
g) If you were to test the null hypothesis with alpha = .01, what would you decide? Explain.
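
For reference, the quantities asked about in parts (a), (d), and (e) follow the standard chi-square test of independence, written here in general notation. The symbols $O_{ij}$, $E_{ij}$, $R_i$, $C_j$, $N$, $r$, and $c$ are notation introduced for this note and do not appear in the SPSS output.

$$
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
df = (r - 1)(c - 1)
$$

Here $O_{ij}$ and $E_{ij}$ are the observed and expected counts in row $i$ and column $j$, $R_i$ is the total for row $i$, $C_j$ is the total for column $j$, $N$ is the number of valid cases, and $r$ and $c$ are the numbers of rows and columns in the body of the table (totals excluded).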

3. A political candidate has a survey done to determine how popular he is with various groups. He finds the following:

| Preference | Age: over 65 | Age: under 65 | Total |
|---|---|---|---|
| support him | 18 | 12 | 30 |
| oppose him | 22 | 48 | 70 |
| Total | 40 | 60 | 100 |

a) Complete a contingency table showing expected frequencies under the null hypothesis that there is no difference in support across age groups.

| Preference | Age: over 65 | Age: under 65 | Total |
|---|---|---|---|
| support him | _____ | _____ | 30 |
| oppose him | _____ | _____ | 70 |
| Total | 40 | 60 | 100 |

b) How many degrees of freedom will we have in the Chi-square test? Explain how you find out.
c) If he wants to test the hypothesis that he is equally supported in different age groups and he sets his alpha level to .05, what is the critical value of the chi-square statistic?
d) If he sets alpha to .01, what is the critical value of the chi-square statistic? What does this value mean?
e) Compute the chi-square statistic in this case.
f) Do you reject or fail to reject the null hypothesis at the .01 level? What does your decision mean?
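
The decision rule behind parts (c) through (f) can be sketched in Python as a way to check hand calculations. The critical values in `CRITICAL_VALUES` are copied from a standard chi-square table and are an addition for illustration, not part of the survey. The names `chi_square_test` and `rejects_null` are likewise hypothetical helpers.

```python
# Upper-tail critical values from a standard chi-square table,
# keyed by (degrees of freedom, alpha). Added for illustration only.
CRITICAL_VALUES = {
    (1, 0.05): 3.841, (1, 0.01): 6.635,
    (2, 0.05): 5.991, (2, 0.01): 9.210,
    (3, 0.05): 7.815, (3, 0.01): 11.345,
    (4, 0.05): 9.488, (4, 0.01): 13.277,
}

def chi_square_test(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if preference and age were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

def rejects_null(statistic, df, alpha):
    """True when the statistic exceeds the tabled critical value."""
    return statistic > CRITICAL_VALUES[(df, alpha)]

# Rows: support him, oppose him. Columns: over 65, under 65.
survey = [[18, 12], [22, 48]]
statistic, df = chi_square_test(survey)

for alpha in (0.05, 0.01):
    critical = CRITICAL_VALUES[(df, alpha)]
    print(alpha, critical, rejects_null(statistic, df, alpha))
```

The lookup table only covers 1 to 4 degrees of freedom at the two alpha levels used in these problems; any other combination raises a `KeyError` rather than guessing a value.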

4. A statistics professor believes that students in his morning classes do better than students in his afternoon classes. His department chairman says the differences are random and that students do equally well regardless of time. The professor reviews his grades from past semesters and finds the following:

| Time the Classes Met | Grade: A | B | C | D | F | Total |
|---|---|---|---|---|---|---|
| Morning | 12 | 18 | 13 | 11 | 6 | 60 |
| Afternoon | 3 | 7 | 17 | 4 | 9 | 40 |
| Total | 15 | 25 | 30 | 15 | 15 | 100 |

Using the Chi-square test, do you conclude that the morning classes and afternoon classes were somehow different?
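
Because this problem does not state an alpha level, one reasonable approach is to compute the statistic and its p-value and then compare the p-value with whichever alpha you choose. The Python sketch below does that, assuming the standard test of independence. It uses a closed-form expression for the chi-square upper tail that holds only when the degrees of freedom are even, which happens to be the case for a 2 x 5 table. The helper names `chi_square_test` and `p_value_even_df` are illustrative.

```python
import math

def chi_square_test(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(table))
        for j in range(len(table[0]))
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

def p_value_even_df(statistic, df):
    """Upper-tail chi-square probability via the closed form for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = statistic / 2
    # exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

# Rows: morning, afternoon. Columns: grades A, B, C, D, F.
grades = [
    [12, 18, 13, 11, 6],
    [3, 7, 17, 4, 9],
]
statistic, df = chi_square_test(grades)
p_value = p_value_even_df(statistic, df)

print("chi-square:", round(statistic, 3), "df:", df)
print("p-value:", round(p_value, 4))
```

A small p-value says only that the two grade distributions differ more than chance alone would suggest. It does not by itself show that the morning classes did better, so the direction of the difference still has to be read from the table.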