Here is a step-by-step process for performing a chi-squared test of independence in R. The table below shows survey results from a random sample of 395 people, cross-classified by gender and highest education level. Based on the collected data, is there a relationship between gender and education level? Use a 5% significance level.
Gender | High School | Bachelors | Masters | Ph.d. | Total |
Female | 60 | 54 | 46 | 41 | 201 |
Male | 40 | 44 | 53 | 57 | 194 |
Total | 100 | 98 | 99 | 98 | 395 |
Step 1: Make a Table
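# Enter the observed counts row by row: Female first, then Male
data <- matrix(c(60, 54, 46, 41, 40, 44, 53, 57), ncol = 4, byrow = TRUE)
colnames(data) <- c('High School', 'Bachelors', 'Masters', 'Ph.d.')
rownames(data) <- c('Female', 'Male')
# Convert the matrix to a contingency table
survey <- as.table(data)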
survey
       High School Bachelors Masters Ph.d.
Female          60        54      46    41
Male            40        44      53    57
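Before running the test, it can help to confirm that the table was entered correctly by comparing its margins with the Total row and column of the survey table. The snippet below is a minimal sketch of that check; it rebuilds the table so it runs on its own, and the name with_totals is only an illustrative choice.

```r
# Rebuild the contingency table from Step 1 so this snippet is self-contained
survey <- as.table(matrix(
  c(60, 54, 46, 41, 40, 44, 53, 57),
  ncol = 4, byrow = TRUE,
  dimnames = list(c('Female', 'Male'),
                  c('High School', 'Bachelors', 'Masters', 'Ph.d.'))
))

# addmargins() appends a "Sum" row and a "Sum" column to the table
with_totals <- addmargins(survey)
with_totals
```

The "Sum" entries should match the totals in the survey table: 201 and 194 for the rows, 100, 98, 99 and 98 for the columns, and 395 overall.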
Step 2: Apply chisq.test on the table
chisq.test(survey)
Pearson's Chi-squared test
data: survey
X-squared = 8.0061, df = 3, p-value = 0.04589
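To see where these numbers come from, the statistic can also be computed by hand. The sketch below assumes the usual Pearson formula: each expected count is (row total × column total) / grand total, and the statistic is the sum of (observed - expected)^2 / expected over all cells. The variable names are illustrative only.

```r
# Observed counts from the survey (rows: Female, Male)
observed <- matrix(c(60, 54, 46, 41, 40, 44, 53, 57), ncol = 4, byrow = TRUE)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-squared statistic, summed over all cells
statistic <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

# Common rule of thumb: the approximation is reasonable when all expected counts are at least 5
all(expected >= 5)

round(statistic, 4)
df
round(p_value, 5)
```

The rounded values should agree with the X-squared, df and p-value reported by chisq.test above.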
Step 3: Interpret the results
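The test statistic is X-squared = 8.0061 with 3 degrees of freedom, that is, (2 - 1) × (4 - 1). The null hypothesis is that gender and education level are independent. Because the p-value (0.04589) is less than 0.05, we reject the null hypothesis and conclude that there is a statistically significant association between gender and education level at the 5% significance level.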
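The same decision can be reached by comparing the statistic with the critical value of the chi-squared distribution instead of using the p-value. The sketch below does that and also prints the standardized residuals that chisq.test returns, which give a rough indication of which cells contribute most to the result. The names alpha, critical and result are illustrative.

```r
# Rebuild the contingency table so this snippet is self-contained
survey <- as.table(matrix(
  c(60, 54, 46, 41, 40, 44, 53, 57),
  ncol = 4, byrow = TRUE,
  dimnames = list(c('Female', 'Male'),
                  c('High School', 'Bachelors', 'Masters', 'Ph.d.'))
))

alpha <- 0.05
result <- chisq.test(survey)

# Critical value for a 5% test with 3 degrees of freedom (about 7.815)
critical <- qchisq(1 - alpha, df = result$parameter)

# TRUE means the statistic falls in the rejection region
unname(result$statistic) > critical

# Standardized residuals: large absolute values flag cells far from independence
round(result$stdres, 2)
```

Since 8.0061 exceeds the critical value, this agrees with the p-value approach.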