Last time, we saw the issue with sub-group analysis, which is one example of multiple hypothesis testing. Here, we illustrate the multiple hypothesis testing problem itself. First, a few recaps.
- Hypothesis testing is a statistical procedure for testing assumptions (hypotheses) about a population parameter based on evidence collected from samples.
- The null hypothesis is the default assumption (what we assume is true before seeing the evidence).
- Alpha (the significance level) is the threshold of evidence your sample must clear before an effect is declared statistically significant.
- The p-value is the probability of observing a result at least as extreme as the one in the sample, assuming the null hypothesis is true.
- If p < alpha, the null hypothesis is rejected.
- A type I error occurs when the null hypothesis is true but you reject it.
- Rejection of the null hypothesis may be called a discovery.
From the rejection rule above, when the null hypothesis is true, the probability of a type I error is alpha.
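To make this concrete, here is a minimal simulation sketch (my own illustration, not from the original discussion; it assumes NumPy and SciPy are available). It draws samples from a population whose null hypothesis is true by construction and counts how often a t-test rejects at alpha = 0.05:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_simulations = 10_000

false_rejections = 0
for _ in range(n_simulations):
    # Sample from a population whose true mean really is 0,
    # so the null hypothesis H0: mean = 0 is true by construction.
    sample = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p_value = stats.ttest_1samp(sample, popmean=0.0)
    if p_value < alpha:          # rejection rule: p < alpha
        false_rejections += 1    # every rejection here is a type I error

print(f"Observed type I error rate: {false_rejections / n_simulations:.3f}")
# Expect a value close to 0.05, matching alpha.
```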
Assume five independent tests are done at a 5% significance level, and the null hypothesis is true in every one of them. What is the probability that at least one of the tests rejects its null hypothesis?
We know the familiar rule: P(at least one) = 1 − P(none). Therefore, P(at least one type I error) = 1 − P(no type I error) = 1 − (1 − alpha)^5.
1 − (1 − 0.05)^5 ≈ 0.226, or 22.6%
So, there is a 22.6% chance of rejecting at least one true null hypothesis, i.e., of making at least one type I error.
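The same arithmetic can be checked with a few lines of Python; this is a small sketch of the formula above, with the list of test counts chosen for illustration:

```python
# Family-wise error rate for k independent tests, each run at alpha:
# P(at least one type I error) = 1 - (1 - alpha)^k
alpha = 0.05

for k in (1, 5, 10, 20):
    at_least_one_type_one = 1 - (1 - alpha) ** k
    print(f"{k:>2} tests at alpha = {alpha}: "
          f"P(at least one type I error) = {at_least_one_type_one:.3f}")
# For k = 5 this prints 0.226, i.e. the 22.6% chance discussed above.
```

Note how quickly the rate grows: at 20 tests, the chance of at least one false rejection already exceeds 60%, even though each individual test still controls its own error at 5%.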