Hypothesis Testing – Chi-Squared

The following table lists the weights of 50 randomly sampled 6-year-old boys. Can we test the hypothesis that the weights of 6-year-old boys follow a normal distribution with mean = 25 and standard deviation = 2? We will use a chi-squared goodness-of-fit test to find out.

28  24  27  24  27
26  25  29  22  24
23  25  21  22  25
26  27  27  26  29
28  27  22  23  21
29  24  23  23  22
25  22  29  28  30
24  28  26  25  25
28  29  26  27  30
22  31  25  24  27
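As a quick check, the data can be entered as an R vector and counted in two-unit ranges, the grouping used later in the analysis. This is a minimal sketch: the variable names and the break points (19, 21, ..., 31) are assumptions chosen so that the integer weights fall into the ranges 20-21, 22-23, and so on.

```r
# Weights of the 50 boys, entered row by row from the table above
weights <- c(28, 24, 27, 24, 27,
             26, 25, 29, 22, 24,
             23, 25, 21, 22, 25,
             26, 27, 27, 26, 29,
             28, 27, 22, 23, 21,
             29, 24, 23, 23, 22,
             25, 22, 29, 28, 30,
             24, 28, 26, 25, 25,
             28, 29, 26, 27, 30,
             22, 31, 25, 24, 27)

# Assumed breaks: right-closed intervals (19,21], (21,23], ..., (29,31],
# which for integer weights correspond to 20-21, 22-23, ..., 30-31
groups <- cut(weights,
              breaks = seq(19, 31, by = 2),
              labels = c("20-21", "22-23", "24-25", "26-27", "28-29", "30-31"))

# Observed frequency in each range
observed <- table(groups)

print(length(weights))  # number of samples
print(observed)
```

The printed counts can be compared against the observed-frequency column used in the chi-squared calculation.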

The hypotheses

The null hypothesis, H0, in this case: there is no difference between the observed frequencies and the expected frequencies of a normal distribution with mean = 25 and standard deviation = 2.

The alternative hypothesis, HA: there is a difference between the observed frequencies and the expected frequencies of a normal distribution with mean = 25 and standard deviation = 2.

Estimation of chi-squared

Let us divide the data in the table above into six ranges of equal width and count the observed frequency in each range. The expected frequency for each range is estimated from the cumulative distribution function (CDF) of the normal distribution using the formula

E_i = n × [F(U_i) – F(L_i)]

where n is the number of samples, F is the CDF of the normal distribution with mean = 25 and standard deviation = 2, and U_i and L_i are the upper and lower limits of the i-th range.
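The formula can be sketched in R with pnorm, which returns the normal CDF. The helper name expected_freq is hypothetical, and the class limits are an assumption: the text does not state which boundaries were used for each range. Taking the limits of the lowest range as 20 and 21 gives an expected frequency of about 0.83.

```r
# Expected frequency for one range: E = n * [F(U) - F(L)],
# where F is the normal CDF (pnorm in R).
# expected_freq is a hypothetical helper name, not taken from the text.
expected_freq <- function(n, lower, upper, mean = 25, sd = 2) {
  n * (pnorm(upper, mean = mean, sd = sd) - pnorm(lower, mean = mean, sd = sd))
}

# Assumed limits for the lowest range: L = 20, U = 21
e_first <- expected_freq(n = 50, lower = 20, upper = 21)
print(round(e_first, 2))
```

Different choices of class limits (for example, half-unit boundaries such as 19.5 and 21.5) give different expected frequencies, so the limits should be stated alongside the results.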

Range      Observed Frequency (O)   Expected Frequency (E)   (O-E)²/E
20 – 21    2                        0.83                     1.65
22 – 23    10                       6.65                     1.69
24 – 25    13                       16.45                    0.72
26 – 27    12                       16.07                    1.03
28 – 29    10                       6.21                     2.31
30 – 31    3                        0.94                     4.51
Total                                                        11.92
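The chi-squared statistic is the sum of the (O-E)²/E contributions over all ranges. As a minimal sketch, the calculation can be repeated in R using the observed and expected frequencies from the table; the variable names are assumptions.

```r
# Observed and expected frequencies, copied from the table
O <- c(2, 10, 13, 12, 10, 3)
E <- c(0.83, 6.65, 16.45, 16.07, 6.21, 0.94)

# Contribution of each range to the statistic
contrib <- (O - E)^2 / E

# Chi-squared statistic: sum of the contributions
chi2 <- sum(contrib)

print(round(contrib, 2))
print(round(chi2, 2))
```

The two highest ranges (28 – 29 and 30 – 31) contribute more than half of the total, so the upper tail of the data is where the fit is poorest.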

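With six ranges and no parameters estimated from the data, the test has 6 - 1 = 5 degrees of freedom. The critical value at the 5% significance level and the p-value are estimated using the following R code.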

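# Critical value at the 5% significance level with 5 degrees of freedom
qchisq(0.05, df = 5, lower.tail = FALSE)
# p-value for the estimated chi-squared statistic
pchisq(11.92, df = 5, lower.tail = FALSE)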

The critical value is 11.07, and the p-value is 0.036. Since the estimated chi-squared statistic (11.92) exceeds the critical value, and equivalently the p-value is below 0.05, we reject the null hypothesis that the data follow a normal distribution with mean = 25 and standard deviation = 2.
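The two decision rules, comparing the statistic with the critical value and comparing the p-value with the significance level, always agree. A minimal sketch in R, with assumed variable names:

```r
alpha <- 0.05   # significance level
chi2  <- 11.92  # estimated chi-squared statistic
df    <- 5      # six ranges minus one

critical <- qchisq(alpha, df = df, lower.tail = FALSE)
p_value  <- pchisq(chi2, df = df, lower.tail = FALSE)

# Rule 1: reject H0 if the statistic exceeds the critical value
reject_by_critical <- chi2 > critical
# Rule 2: reject H0 if the p-value is below the significance level
reject_by_p <- p_value < alpha

print(c(critical = critical, p_value = p_value))
print(c(reject_by_critical, reject_by_p))
```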