# Chapter 10 Analysis of Variance (Hypothesis Testing III)

Statistics: A Tool For Social Research, Eighth Edition, Joseph F. Healey
Chapter 10: Hypothesis Testing III: The Analysis of Variance

Learning Objectives

1. Identify and cite examples of situations in which ANOVA is appropriate.
2. Explain the logic of hypothesis testing as applied to ANOVA.
3. Perform the ANOVA test using the five-step model as a guide, and correctly interpret the results.
4. Define and explain the concepts of population variance, total sum of squares, sum of squares between, sum of squares within, mean square estimates, and post hoc tests.
5. Explain the difference between the statistical significance and the importance of relationships between variables.

Chapter Outline

- Introduction
- The Logic of the Analysis of Variance
- The Computation of ANOVA
- Computational Shortcut
- A Computational Example
- A Test of Significance for ANOVA
- An Additional Example for Computing and Testing the Analysis of Variance
- The Limitations of the Test
- Interpreting Statistics: Does Sexual Activity Vary by Marital Status?

In This Presentation

- The basic logic of analysis of variance (ANOVA)
- A sample problem applying ANOVA
- The Five Step Model
- Limitations of ANOVA
- Post hoc techniques

Basic Logic

ANOVA can be used in situations where the researcher is interested in the differences in sample means across three or more categories. Examples:

- How do Protestants, Catholics, and Jews vary in terms of number of children?
- How do Republicans, Democrats, and Independents vary in terms of income?
- How do older, middle-aged, and younger people vary in terms of frequency of church attendance?

Think of ANOVA as an extension of the t test to more than two groups.

ANOVA asks: are the differences between the samples large enough to reject the null hypothesis and justify the conclusion that the populations represented by the samples are different? (pg. 235)

The H0 is that the population means are the same:

$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$

If the H0 is true, the sample means should be about the same value: there will be little difference between sample means.

If the H0 is false, there should be substantial differences between categories, combined with relatively little difference within categories. The sample standard deviations should be low in value. In other words, there will be big differences between sample means combined with small values for s.

The larger the differences between the sample means, the more likely it is that the H0 is false, especially when there is little difference within categories.
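To make this logic concrete, the sketch below compares two sets of three samples. The scores are invented for illustration (they are not from the text), and `f_ratio` is a hypothetical helper name. Both sets have the same sample means, but the second has far more spread within each category, so its ratio of between-category to within-category variation is much smaller.

```python
from statistics import mean

def f_ratio(groups):
    """Return the F ratio (MSB / MSW) for a list of samples of raw scores."""
    scores = [x for g in groups for x in g]
    n, k = len(scores), len(groups)
    grand_mean = mean(scores)
    # Between-category variation: distance of each sample mean from the grand mean.
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-category variation: distance of each score from its own sample mean.
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical scores: the sample means are 10, 15, and 20 in both sets.
tight = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]   # little spread within samples
spread = [[2, 10, 18], [7, 15, 23], [12, 20, 28]]   # large spread within samples

print(f"tight:  F = {f_ratio(tight):.2f}")
print(f"spread: F = {f_ratio(spread):.2f}")
```

With identical differences between the sample means, only the tightly clustered samples give a large F ratio, which is the pattern the H0-is-false case describes.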

When we reject the H0, we are saying there are differences between the populations represented by the samples.

Basic Logic: Example

Could there be a relationship between religion and support for capital punishment? Consider these two examples.

Steps in Computation of ANOVA

1. Find total sum of squares (SST) by Formula 10.1.
2. Find sum of squares between (SSB) by Formula 10.3.
3. Find sum of squares within (SSW) by subtraction (Formula 10.4).
4. Calculate the degrees of freedom (Formulas 10.5 and 10.6).
5. Construct the mean square estimates by dividing SSB and SSW by their degrees of freedom (Formulas 10.7 and 10.8).
6. Find the F ratio by Formula 10.9.

Example of Computation of ANOVA: Problem 10.6 (255)

Does voter turnout vary by type of election? Data are presented for local, state, and national elections.

| | Local | State | National |
|---|---|---|---|
| $\sum X$ | 441 | 559 | 723 |
| $\sum X^2$ | 20,213 | 27,607 | 45,253 |
| Group Mean | 36.75 | 46.58 | 60.25 |

The difference in the means suggests that turnout does vary by type of election.

Turnout seems to increase as the scope of the election increases. Are these differences statistically significant?

Use Formula 10.1 to find SST:

$$
\begin{aligned}
SST &= \sum X^2 - N\bar{X}^2 \\
SST &= 93{,}073 - (36)(47.86)^2 \\
SST &= 93{,}073 - 82{,}460.87 \\
SST &= 10{,}612.13
\end{aligned}
$$

Use Formula 10.3 to find SSB (SSB = 3,342.99).

Find SSW by subtraction (Formula 10.4):

$$
\begin{aligned}
SSW &= SST - SSB \\
SSW &= 10{,}612.13 - 3{,}342.99 \\
SSW &= 7{,}269.14
\end{aligned}
$$

Use Formulas 10.5 and 10.6 to calculate degrees of freedom.

Use Formulas 10.7 and 10.8 to find the mean square estimates:

- MSW = SSW/dfw = 7,269.14/33 = 220.28
- MSB = SSB/dfb = 3,342.99/2 = 1,671.50

Find the F ratio by Formula 10.9:

F = MSB/MSW = 1,671.50/220.28 = 7.59

Step 1: Make Assumptions and Meet Test Requirements

- Independent random samples.
- Level of measurement is interval-ratio. The dependent variable (e.g., voter turnout) should be I-R to justify computation of the mean. ANOVA is often used with ordinal variables with wide ranges.
- Populations are normally distributed.
- Population variances are equal.

Step 2: State the Null Hypothesis

$H_0: \mu_1 = \mu_2 = \mu_3$

The H0 states that the population means are the same.

H1: At least one population mean is different.

If we reject the H0, the test does not specify which population mean is different from the others.
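The arithmetic for Problem 10.6 can be reproduced from the summary statistics alone. The sketch below is one way to do that, under two assumptions that are not stated on the slides: each election type has 12 cases (inferred from N = 36 and from each sum divided by its mean), and Formula 10.3 has the usual form SSB = Σ N_k (group mean − grand mean)². Means are rounded to two decimals, as in the hand calculation, so the results line up with the figures above.

```python
# Summary statistics from the Problem 10.6 table: (sum of X, sum of X squared).
groups = {
    "Local": (441, 20_213),
    "State": (559, 27_607),
    "National": (723, 45_253),
}
n_per_group = 12  # assumption: inferred from N = 36 and sum / mean, not stated directly

k = len(groups)
n_total = k * n_per_group
sum_x = sum(sx for sx, _ in groups.values())
sum_x2 = sum(sx2 for _, sx2 in groups.values())

# Round the means to two decimals to mirror the hand calculation.
grand_mean = round(sum_x / n_total, 2)
group_means = [round(sx / n_per_group, 2) for sx, _ in groups.values()]

sst = sum_x2 - n_total * grand_mean ** 2          # Formula 10.1
ssb = sum(n_per_group * (m - grand_mean) ** 2     # Formula 10.3 (assumed form)
          for m in group_means)
ssw = sst - ssb                                   # Formula 10.4
dfw, dfb = n_total - k, k - 1                     # Formulas 10.5 and 10.6
msw, msb = ssw / dfw, ssb / dfb                   # Formulas 10.7 and 10.8
f_obtained = msb / msw                            # Formula 10.9

print(f"SST = {sst:.2f}, SSB = {ssb:.2f}, SSW = {ssw:.2f}")
print(f"dfw = {dfw}, dfb = {dfb}")
print(f"MSW = {msw:.2f}, MSB = {msb:.2f}, F = {f_obtained:.2f}")
```

Carrying unrounded means through the calculation changes the sums of squares slightly (SST comes out near 10,608.31), but the F ratio still rounds to 7.59.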

Step 3: Select the Sampling Distribution and Determine the Critical Region

- Sampling distribution = F distribution
- Alpha = 0.05
- dfw = (N - k) = 33
- dfb = (k - 1) = 2
- F(critical) = 3.32

The exact dfw (33) is not in the table, but dfw = 30 and dfw = 40 are. Use the larger of the two tabled F values (the one for dfw = 30) as F(critical).

Step 4: Calculate the Test Statistic

F(obtained) = 7.59

Step 5: Making a Decision and Interpreting the Test Results

F(obtained) = 7.59 exceeds F(critical) = 3.32, so the test statistic is in the critical region. Reject the H0. Voter turnout varies significantly by type of election.

Suggestion

Go carefully through the examples in the book to be sure you understand and can apply ANOVA.

- Support for capital punishment example: Section 10.3.
- Efficiency of three social service agencies: Section 10.6.

Limitations of ANOVA

1. Requires interval-ratio level measurement of the dependent variable and roughly equal numbers of cases in the categories of the independent variable.
2. Statistically significant differences are not necessarily important.
3. The alternative (research) hypothesis is not specific. It asserts only that at least one of the population means differs from the others. Use post hoc techniques for more specific differences. See the example in Section 10.8 for a post hoc technique.
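Returning to Step 3 of the voter turnout example, the critical value can also be computed rather than read from a table. The sketch below relies on a special case: when the numerator degrees of freedom (dfb) equal 2, as with three categories, the upper tail of the F distribution has the closed form P(F > x) = (1 + 2x/dfw)^(-dfw/2). This shortcut does not apply to other values of dfb, and the function name is hypothetical.

```python
def f_critical_dfb2(alpha, dfw):
    """Critical F for dfb = 2 only, from solving (1 + 2x/dfw) ** (-dfw/2) = alpha."""
    return (dfw / 2) * (alpha ** (-2 / dfw) - 1)

alpha = 0.05
for dfw in (30, 33, 40):
    print(f"dfw = {dfw}: F(critical) = {f_critical_dfb2(alpha, dfw):.2f}")

f_obtained = 7.59  # from Step 4
print("Reject H0:", f_obtained > f_critical_dfb2(alpha, 33))
```

At dfw = 30 this reproduces the tabled 3.32. At the exact dfw = 33 the critical value is slightly smaller (about 3.28), so using the dfw = 30 entry makes the test a little more conservative and does not change the decision here.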