Chi-square and F Distributions
Children of the Normal

Questions
What is the chi-square distribution? How is it related to the Normal?
How is the chi-square distribution related to the sampling distribution of the variance?
Test a population value of the variance; put confidence intervals around a population value.

Questions
How is the F distribution related to the Normal? To chi-square?

Distributions
There are many theoretical distributions, both continuous and discrete. Howell calls these test statistics.
We use 4 test statistics a lot: z (unit normal), t, chi-square ($\chi^2$), and F.
z and t are closely related to the sampling distribution of means; chi-square and F are closely related to the sampling distribution of variances.

Chi-square Distribution (1)
z score: $z = \frac{X - \bar{X}}{SD}; \quad z = \frac{X - \mu}{\sigma}$
z score squared: $z^2 = \frac{(X - \mu)^2}{\sigma^2}$
Make it Greek: $\chi^2_{(1)} = z^2$
What would its sampling distribution look like?
Minimum value is zero. Maximum value is infinite.
Most values fall between zero and 1, with the density highest near zero.

Chi-square (2)
What if we took 2 values of $z^2$ at random and added them?
$z_1^2 = \frac{(X_1 - \mu)^2}{\sigma^2}; \quad z_2^2 = \frac{(X_2 - \mu)^2}{\sigma^2}$
$\chi^2_{(2)} = \frac{(X_1 - \mu)^2}{\sigma^2} + \frac{(X_2 - \mu)^2}{\sigma^2} = z_1^2 + z_2^2$
Same minimum and maximum as before, but now the average should be a bit bigger.
Chi-square is the distribution of a sum of squares. Each squared deviation is taken from the unit normal: N(0,1).
The shape of the chi-square distribution depends on the number of squared deviates that are added together.

Chi-square (3)
The distribution of chi-square depends on 1 parameter, its degrees of freedom (df or v). As df gets large, the curve is less skewed, more normal.

Chi-square (4)
The expected value of chi-square is df: the mean of the chi-square distribution is its degrees of freedom.
The variance of the distribution is 2df, so the standard deviation is sqrt(2df).
There are tables of chi-square so you can find the values that cut off 5 or 1 percent of the distribution.
Chi-square is additive (for independent chi-squares): $\chi^2_{(v_1 + v_2)} = \chi^2_{(v_1)} + \chi^2_{(v_2)}$
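
The claims that chi-square is a sum of squared N(0,1) deviates, with mean df and variance 2df, can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function name simulate_chi_square, the number of draws, and the seed are illustrative assumptions, not part of the original slides.

```python
import random
import statistics


def simulate_chi_square(df, n_draws=20000, seed=0):
    """Return n_draws values, each the sum of df squared N(0,1) deviates.

    Hypothetical helper for illustration; n_draws and seed are arbitrary.
    """
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]


if __name__ == "__main__":
    for df in (1, 2, 10):
        draws = simulate_chi_square(df)
        # The sample mean should be near df and the sample variance near 2*df.
        print(df, round(statistics.mean(draws), 2),
              round(statistics.variance(draws), 2))
```

With more draws the sample mean and variance settle closer to df and 2df; the simulated values are never negative because every term is a square.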

Distribution of Sample Variance
$s^2 = \frac{\sum (y - \bar{y})^2}{N - 1}$
$\chi^2_{(N-1)} = \frac{(N - 1) s^2}{\sigma^2}$
$s^2$ is the sample estimate of the population variance (unbiased).
Multiply the variance estimate by N-1 to get the sum of squares. Divide by the population variance to standardize. The result is a random variable distributed as chi-square with (N-1) df.
We can use info about the sampling distribution of the variance estimate to find confidence intervals and conduct statistical tests.

Testing Exact Hypotheses about a Variance
$H_0: \sigma^2 = \sigma_0^2$
$\chi^2_{(N-1)} = \frac{(N - 1) s^2}{\sigma_0^2}$
Test the null that the population variance has some specific value. Pick alpha and the rejection region. Then plug the hypothesized population variance and the sample variance into the equation, along with the sample size used to estimate the variance. Compare the result to the chi-square distribution.

Example of Exact Test
Test about the variance of height of people in inches. Grab 30 people at random and measure height.
$H_0: \sigma^2 \ge 6.25; \quad H_1: \sigma^2 < 6.25$. Note: 1-tailed test on the small side. Set alpha = .01.
$N = 30; \quad s^2 = 4.55$
$\chi^2_{(29)} = \frac{(29)(4.55)}{6.25} = 21.11$
The mean of chi-square with 29 df is 29, so the result is on the small side. But for Q = .99 (Q is the upper-tail probability), the value of chi-square is 14.257. Cannot reject the null.

$H_0: \sigma^2 = 6.25; \quad H_1: \sigma^2 \ne 6.25$. Note: 2-tailed with alpha = .01.
$N = 30; \quad s^2 = 4.55$
Now chi-square with v = 29 and Q = .995 is 13.121, and with Q = .005 it is 52.336. N.S. either way.

Confidence Intervals for the Variance
We use $s^2$ to estimate $\sigma^2$. It can be shown that:
$p\left[\frac{(N - 1) s^2}{\chi^2_{(N-1;\,.025)}} \le \sigma^2 \le \frac{(N - 1) s^2}{\chi^2_{(N-1;\,.975)}}\right] = .95$
Suppose N = 15 and $s^2$ is 10. Then df = 14, and for Q = .025 the value is 26.12. For Q = .975 the value is 5.63.
$p\left[\frac{(14)(10)}{26.12} \le \sigma^2 \le \frac{(14)(10)}{5.63}\right] = .95$
$p\left[5.36 \le \sigma^2 \le 24.87\right] = .95$

Normality Assumption
We assume normal distributions to figure sampling distributions and thus p levels.
Violations of normality have minor implications for testing means, especially as N gets large.
Violations of normality are more serious for testing variances. Look at your data before conducting this test. You can test for normality.

Review
You have sampled 25 children from an elementary school 5th grade class and measured the height of each. You wonder whether these children are more variable in height than typical children. Their variance in height is 4. Compute a confidence interval for this variance. If the variance of height in children in 5th grade nationally is 2, do you consider this sample ordinary?

The F Distribution (1)
The F distribution is the ratio of two variance estimates:
$F = \frac{s_1^2}{s_2^2} = \frac{\text{est. } \sigma_1^2}{\text{est. } \sigma_2^2}$
Also the ratio of two chi-squares, each divided by its degrees of freedom:
$F = \frac{\chi^2_{(v_1)} / v_1}{\chi^2_{(v_2)} / v_2}$
In our applications, v2 will be larger than v1 and v2 will be larger than 2. In such a case, the mean of the F distribution (expected value) is v2/(v2-2).

F Distribution (2)
F depends on two parameters: v1 and v2 (df1 and df2). The shape of F changes with these. Range is 0 to infinity. Shaped a bit like chi-square.
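
The definition of F as a ratio of two chi-squares, each divided by its df, and the stated mean v2/(v2-2), can be illustrated by simulation. This is a rough sketch using only the Python standard library; the names chi_square_draw, simulate_f, and f_mean, along with the draw count and seed, are assumptions made for illustration.

```python
import random
import statistics


def chi_square_draw(rng, df):
    """One chi-square value: the sum of df squared N(0,1) deviates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


def simulate_f(v1, v2, n_draws=20000, seed=0):
    """Return n_draws F values, each (chi2_v1 / v1) / (chi2_v2 / v2).

    The two chi-squares are drawn independently. Hypothetical helper.
    """
    rng = random.Random(seed)
    return [
        (chi_square_draw(rng, v1) / v1) / (chi_square_draw(rng, v2) / v2)
        for _ in range(n_draws)
    ]


def f_mean(v2):
    """Theoretical mean of F, which exists only when v2 > 2."""
    if v2 <= 2:
        raise ValueError("the mean of F is defined only for v2 > 2")
    return v2 / (v2 - 2)


if __name__ == "__main__":
    draws = simulate_f(3, 12)
    # The simulated mean should be close to 12 / (12 - 2) = 1.2.
    print(round(statistics.mean(draws), 2), f_mean(12))
```

The mean depends only on the denominator df, which is why f_mean takes v2 alone.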
F tables show critical values for df in the numerator and df in the denominator.
F tables are 1-tailed; you can figure 2-tailed values if you need to (but you usually don't).

F table critical values
[Table: F critical values at the 5% and 1% levels; columns are numerator df (dfB) 1 to 5, rows are denominator df (dfW).]
e.g. the critical value of F at alpha = .05 with 3 & 12 df is 3.49.

Testing Hypotheses about 2 Variances
Suppose $H_0: \sigma_1^2 \le \sigma_2^2; \quad H_1: \sigma_1^2 > \sigma_2^2$. Note 1-tailed.
We find $N_1 = 16; \; s_1^2 = 5.8; \; N_2 = 16; \; s_2^2 = 1.7$.
Then df1 = df2 = 15, and
$F = \frac{s_1^2}{s_2^2} = \frac{5.8}{1.7} = 3.41$
Going to the F table with 15 and 15 df, we find that for alpha = .05 (1-tailed), the critical value is 2.40. Therefore the result is significant.

A Look Ahead
The F distribution is used in many statistical tests:
Test for equality of variances.
Tests for differences in means in ANOVA.
Tests for regression models (slopes relating one continuous variable to another, like SAT and GPA).

Relations among Distributions: the Children of the Normal
Chi-square is drawn from the normal: N(0,1) deviates squared and summed.
F is the ratio of two chi-squares, each divided by its df. A chi-square divided by its df is a variance estimate, that is, a sum of squares divided by degrees of freedom.
$F = t^2$. If you square t, you get an F with 1 df in the numerator: $t^2_{(v)} = F_{(1, v)}$

Review
How is F related to the Normal? To chi-square?
Suppose we have 2 samples and we want to know whether they were drawn from populations where the variances are equal. Sample 1: N = 50, $s^2$ = 25; Sample 2: N = 60, $s^2$ = 30. How can we test? What is the best conclusion for these data?
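
The arithmetic in the worked examples of this deck can be rechecked with a short script. The sketch below is an illustration only: the function names chi_square_stat, variance_ci, and f_ratio are hypothetical, and the critical values are the table values quoted on the slides rather than values computed here.

```python
def chi_square_stat(n, s2, sigma0_sq):
    """Test statistic (N - 1) * s^2 / sigma_0^2, referred to chi-square with N - 1 df."""
    return (n - 1) * s2 / sigma0_sq


def variance_ci(n, s2, chi_small_q, chi_large_q):
    """Confidence limits for the population variance.

    chi_small_q: table value for the small upper-tail Q (e.g. Q = .025), the larger number.
    chi_large_q: table value for the large upper-tail Q (e.g. Q = .975), the smaller number.
    """
    sum_of_squares = (n - 1) * s2
    return sum_of_squares / chi_small_q, sum_of_squares / chi_large_q


def f_ratio(s2_numerator, s2_denominator):
    """Ratio of two variance estimates."""
    return s2_numerator / s2_denominator


if __name__ == "__main__":
    # Exact test example: N = 30, s^2 = 4.55, hypothesized variance 6.25.
    stat = chi_square_stat(30, 4.55, 6.25)
    print(round(stat, 2), stat < 14.257)   # 21.11 False (cannot reject)

    # Confidence interval example: N = 15, s^2 = 10, table values 26.12 and 5.63.
    low, high = variance_ci(15, 10, 26.12, 5.63)
    print(round(low, 2), round(high, 2))   # 5.36 24.87

    # Two-variance example: s1^2 = 5.8, s2^2 = 1.7, critical value 2.40.
    f_value = f_ratio(5.8, 1.7)
    print(round(f_value, 2), f_value > 2.40)   # 3.41 True (significant)
```

Keeping the table values as explicit arguments makes it plain which numbers come from the chi-square and F tables and which come from the sample.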