
### Problems: ANOVA

1. How important is the make of car in auto racing? This is a problem that can be solved with Analysis of Variance. We are testing the hypothesis that the chances of a top-ten finish are unaffected by the make of the car. To do this, we take a sample of drivers using Fords, a sample of drivers using Chevys, and a sample of drivers using Pontiacs, and count how many top-ten finishes each driver has. If the make of car does not matter, the average number of top-ten finishes in each group should be fairly close.

Here are the ANOVA results from such a study:

| Model | Sum of Squares | Degrees of Freedom | Mean Square | F-statistic | Significance |
|---|---|---|---|---|---|
| Regression | 100.888 | 2 | 50.444 | .874 | .425 |
| Residual | 2540.772 | 44 | 57.745 | | |
| Total | 2641.660 | 48 | | | |

a) How big was the sample? How can you tell?

b) What does this result tell you? Does it convince you that make of car does matter? Does it prove that make of car does not matter? Carefully explain what it tells you and why it tells you that. (Hint: the null hypothesis is that the make of car does not matter.)
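As a reading aid for the ANOVA table in Problem 1, here is a minimal Python sketch of how the columns relate to one another: each mean square is a sum of squares divided by its degrees of freedom, and the F-statistic is the ratio of the two mean squares. The variable names are illustrative choices, and the numbers are copied from the table.

```python
# Values copied from the ANOVA table in Problem 1.
ss_regression, df_regression = 100.888, 2
ss_residual, df_residual = 2540.772, 44

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F = regression (between-group) mean square / residual (within-group) mean square.
f_statistic = ms_regression / ms_residual

print(round(ms_regression, 3), round(ms_residual, 3), round(f_statistic, 3))
# prints: 50.444 57.745 0.874
```

An F-statistic near or below 1 means the variation between groups is about the same size as the variation within groups.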

2. Every fall 20 teams from around the state of Indiana would meet in the cross-country state championships. Some coaches maintain that the four semi-state regions, each of which contributes 5 teams to the state finals, are not equal in talent. They argue that a 6th or 7th place team in a strong semi-state final would easily go to the state finals if they could compete in a weak semi-state final. (The data for this problem was collected before the rules changed and six teams were allowed to advance.)

a) In the graph below we have taken the final placing of the twenty women's teams that made it to the state finals and grouped them by the semi-state that they came from. You can see that the first place team came from group 2, and the 20th place team came from group 4. Based on this graph, which groups look weak and which look strong? Does the contention of the coaches mentioned above look like it may have substance? Explain.

b) Ultimately, we need to do a statistical analysis to see what we can determine. The null hypothesis will be that all the semi-states are equally strong. We can do an Analysis of Variance test on the rank (place 1 to 20) or on the points scored (in cross country, like golf, fewer points are better). Based on the Analysis of Variance results, should we accept the hypothesis that all semi-state regions are equally strong, or should we reject it and decide that the coaches in the introduction are right? Explain.

ANOVA

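| | | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| RANK | Between Groups | 341.400 | 3 | 113.800 | 5.627 | .008 |
| | Within Groups | 323.600 | 16 | 20.225 | | |
| | Total | 665.000 | 19 | | | |
| Points | Between Groups | 153059.800 | 3 | 51019.933 | 6.630 | .004 |
| | Within Groups | 123126.400 | 16 | 7695.400 | | |
| | Total | 276186.200 | 19 | | | |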

c) If we try to predict how well a team does, we can use regression with final rank as the dependent variable. The independent variables are the team's semi-state rank plus variables that indicate in which semi-state the team ran. Valparaiso finished first in the NP semi-state. What rank do we predict for it in the state meet?

d) Penn High School finished sixth in the NP semi-state, and did not go on to the state meet. If they had been allowed to go, where does this regression predict they would have finished?

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .904 | .818 | .769 | 2.8414 |

a. Predictors: (Constant), SEMIRANK, MAN, FC, NP

Coefficients

| Model | | B (Unstandardized) | Std. Error (Unstandardized) | Beta (Standardized) | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | 8.450 | 1.852 | | 4.562 | .000 |
| | NP | -9.200 | 1.797 | -.691 | -5.120 | .000 |
| | FC | -8.400 | 1.797 | -.631 | -4.674 | .000 |
| | MAN | -1.200 | 1.797 | -.090 | -.668 | .514 |
| | SEMIRANK | 2.250 | .449 | .552 | 5.008 | .000 |

a. Dependent Variable: RANK
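The mechanics of turning the coefficients into a prediction can be sketched in Python. This is only an illustration under two assumptions: that NP, FC, and MAN are 0/1 indicator variables, and that the fourth semi-state is the omitted baseline with all three indicators set to 0. The function name `predict_state_rank` and the example team (third place in the FC semi-state) are hypothetical and are not part of the problem.

```python
# Coefficients copied from the Coefficients table (dependent variable: RANK).
CONSTANT = 8.450
SEMIRANK_SLOPE = 2.250
# Assumed 0/1 indicators; a semi-state not listed here is treated as the baseline.
SEMI_STATE_EFFECTS = {"NP": -9.200, "FC": -8.400, "MAN": -1.200}


def predict_state_rank(semi_state, semi_rank):
    """Return the predicted state-meet rank for a team.

    semi_state: label of the semi-state the team ran in ("NP", "FC", "MAN",
                or any other label for the assumed baseline semi-state).
    semi_rank:  the team's finishing place in that semi-state.
    """
    indicator_effect = SEMI_STATE_EFFECTS.get(semi_state, 0.0)
    return CONSTANT + indicator_effect + SEMIRANK_SLOPE * semi_rank


# Hypothetical example: a team that finished third in the FC semi-state.
example = predict_state_rank("FC", 3)
print(round(example, 2))  # 8.450 - 8.400 + 2.250 * 3 = 6.8
```

The same function can be called with the semi-state and semi-state rank given in parts c) and d).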

3. After running a regression, I received the following Analysis of Variance results:

| Source | Sum of Squares | Deg Freedom | Mean Square | F |
|---|---|---|---|---|
| Regression | 8.70 | 4 | 2.18 | 7.18 |
| Residuals | 6.97 | 23 | .30 | |
| Total | 15.67 | 27 | .58 | |

a) What would the R² be for this regression?

b) How could we tell if it is what we would expect to get by random chance, or if it is bigger than what we would expect to get by random chance?
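One common way to approach part b) is to compare the F-statistic with the F distribution for the table's degrees of freedom and ask how often a value at least that large would occur by chance. The sketch below uses only the Python standard library; the function name `f_upper_tail` and the choice of Simpson's rule for the integration are illustrative assumptions, and in practice a statistics package or a printed F table would normally supply this probability.

```python
import math


def f_upper_tail(f_stat, df1, df2, steps=2000):
    """Approximate P(F >= f_stat) under an F(df1, df2) distribution.

    Assumes f_stat > 0 and df2 >= 2 so the integrand stays finite,
    and that steps is even (required by Simpson's rule).
    """
    a, b = df1 / 2.0, df2 / 2.0
    # Map the F value onto the (0, 1) scale of the beta distribution.
    x = df1 * f_stat / (df1 * f_stat + df2)

    def integrand(t):
        return t ** (a - 1.0) * max(0.0, 1.0 - t) ** (b - 1.0)

    # Composite Simpson's rule on the interval [x, 1].
    h = (1.0 - x) / steps
    total = integrand(x) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(x + i * h)
    area = total * h / 3.0

    # Divide by the complete beta function B(a, b) to normalise.
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return area / math.exp(log_beta)


# F and degrees of freedom copied from the table in Problem 3.
p_value = f_upper_tail(7.18, 4, 23)
print(f"p-value = {p_value:.4f}")
```

A small p-value says that an F this large would rarely arise from random chance alone; a large one says the result is in line with what chance would produce.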