Tukey’s Method: Who Made the Difference?

In the previous ANOVA exercises, we found that the data suggested rejecting the null hypothesis. To remind you of the two hypotheses,
H0 – All means are equal
HA – Not all means are equal
So, we rejected the proposition that all means are equal because the p-value was lower than the chosen significance level of 0.05 (equivalently, the F value exceeded the critical F value corresponding to the 0.05 level). But the F-test alone does not tell us which pairs of means differ significantly.

Tukey's method creates confidence intervals for all pair-wise differences while holding the family error rate at whatever level we specify.

Family error rate

You know what an error rate is. It is the probability of rejecting the null hypothesis when it is actually correct. At a significance level of 0.05, there is a 5% chance of getting a p-value below 0.05, and therefore rejecting, even though the null hypothesis is true. This situation is called a false positive (a Type I error).
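The 5% false-positive rate can be checked by simulation. The sketch below is only an illustration: the group layout (four vendors, ten samples each) follows the post, but the mean of 10, the standard deviation of 2, the seed, and the number of simulations are arbitrary assumptions, and `oneway.test()` with `var.equal = TRUE` stands in for the classical one-way ANOVA F-test.

```r
# Simulate many experiments in which the null hypothesis is TRUE
# (all four vendors share the same mean) and count how often the
# F-test still rejects at the 0.05 level.
set.seed(42)                       # arbitrary seed, for reproducibility
n_sims <- 2000                     # assumed number of simulated experiments
alpha  <- 0.05
group  <- factor(rep(paste("Vendor", 1:4), each = 10))

p_values <- replicate(n_sims, {
  strength <- rnorm(length(group), mean = 10, sd = 2)  # made-up values
  oneway.test(strength ~ group, var.equal = TRUE)$p.value
})

# Share of simulated experiments that produced a false positive
false_positive_rate <- mean(p_values < alpha)
print(false_positive_rate)         # should land close to alpha
```

With enough simulations the printed rate should settle near 0.05, which is what "a 5% chance of a false positive" means in practice.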

The p-value we obtained for the material testing problem was 0.03, but it came from a single F-test on the entire family of four vendor groups (each with ten samples). The error rate attached to such a test is the experiment-wise or family-wise error rate. Since our significance level for the F-test was kept at 0.05, we can regard the family error rate as 5%.

Four groups, six comparisons

Since we had four groups of samples (four levels of the vendor factor), each representing one vendor, we have 4 × 3 / 2 = 6 possible pair-wise comparisons. They are:

#   Comparison
1   Vendor 2-Vendor 1
2   Vendor 3-Vendor 1
3   Vendor 4-Vendor 1
4   Vendor 3-Vendor 2
5   Vendor 4-Vendor 2
6   Vendor 4-Vendor 3
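The same list can be generated rather than written by hand. This is a small sketch using base R's `combn()`; the variable names are illustrative, and the labels are pasted as "later vendor-earlier vendor" to match the order used in the table.

```r
# Enumerate every pair of vendors
vendors <- paste("Vendor", 1:4)
pairs   <- combn(vendors, 2)       # 2 x 6 matrix, one column per pair

# Number of pair-wise comparisons: choose(4, 2) = 6
n_comparisons <- ncol(pairs)

# Label each pair as "second-first", matching the table's convention
comparisons <- apply(pairs, 2, function(p) paste(p[2], p[1], sep = "-"))
print(data.frame(Comparison = comparisons))
```

For k groups the count is `choose(k, 2)`, so the number of comparisons grows quickly as groups are added.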

The family-wise error rate is the probability of at least one false positive across all the pair-wise comparisons taken together. If each pair-wise comparison is tested at alpha and the comparisons are treated as independent, family-wise error = 1 - (1 - alpha)^C, where C is the number of comparisons. If you substitute alpha = 0.05 and C = 6, you get a family-wise error of about 0.26. Obviously, 26% is far too high a chance of a false positive.
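The arithmetic is easy to reproduce. The sketch below simply evaluates the formula for four groups; the variable names are illustrative, and the formula assumes the comparisons are independent, which is only an approximation for pair-wise comparisons that share groups.

```r
# Family-wise error rate when every pair is tested at alpha = 0.05
alpha         <- 0.05
n_groups      <- 4
n_comparisons <- choose(n_groups, 2)        # 6 pair-wise comparisons

fwer <- 1 - (1 - alpha)^n_comparisons
print(round(fwer, 4))                        # 0.2649
```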

Tukey's method holds the family-wise error rate at the level we specify, say 0.05, which means the individual pair-wise error rates have to be much smaller, about 0.0085 each.
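The 0.0085 figure can be recovered by solving the family-wise formula for the pair-wise rate. This is only a back-of-the-envelope sketch: Tukey's actual adjustment is based on the studentized range distribution, not on this inversion, so treat the result as a ballpark. The Bonferroni value is included purely for comparison.

```r
# Pair-wise alpha that yields a 0.05 family-wise rate over 6 comparisons,
# assuming independent comparisons
family_alpha  <- 0.05
n_comparisons <- 6

pairwise_alpha <- 1 - (1 - family_alpha)^(1 / n_comparisons)
print(round(pairwise_alpha, 4))              # 0.0085

# Simpler (slightly more conservative) Bonferroni split, for comparison
bonferroni_alpha <- family_alpha / n_comparisons
print(round(bonferroni_alpha, 4))            # 0.0083

# Sanity check: plugging the pair-wise rate back in recovers 0.05
recovered <- 1 - (1 - pairwise_alpha)^n_comparisons
```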

Keeping all these points in mind, let's apply Tukey's method to our dataset using R.

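res.aov <- aov(Strength ~ factor(Sample), data = AN_data)  # one-way ANOVA fit from the previous exercise
TukeyHSD(res.aov)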

Which leads to the following output:

Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Strength ~ factor(Sample), data = AN_data)

$`factor(Sample)`
                         diff        lwr       upr     p adj
Vendor 2-Vendor 1 -2.26479763 -4.7917948 0.2621995 0.0924842
Vendor 3-Vendor 1 -0.51997359 -3.0469707 2.0070236 0.9448076
Vendor 4-Vendor 1 -2.36456760 -4.8915647 0.1624295 0.0736423
Vendor 3-Vendor 2  1.74482404 -0.7821731 4.2718212 0.2632257
Vendor 4-Vendor 2 -0.09976996 -2.6267671 2.4272272 0.9995613
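Vendor 4-Vendor 3 -1.84459400 -4.3715911 0.6824031 0.2197059

Each row gives the estimated difference in means (diff), the lower and upper limits of its 95% family-wise confidence interval (lwr, upr), and the adjusted p-value (p adj). Here every interval contains zero and every adjusted p-value is above 0.05, so no individual pair of vendors differs significantly at the 5% family-wise level, even though the overall F-test rejected equality of means. The nearest candidates are Vendor 4-Vendor 1 (p adj = 0.074) and Vendor 2-Vendor 1 (p adj = 0.092).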
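With more groups the table gets long, and it can help to flag significant pairs programmatically. The sketch below uses simulated stand-in data, not the post's AN_data; the group means, standard deviation, seed, and object names are all made up for illustration. Only `aov()` and `TukeyHSD()` are the same calls used above.

```r
# Simulated stand-in data (NOT the post's AN_data):
# 4 vendors, 10 strength measurements each, with made-up means
set.seed(1)
sim_data <- data.frame(
  Sample   = rep(paste("Vendor", 1:4), each = 10),
  Strength = rnorm(40, mean = rep(c(12, 10, 11.5, 10), each = 10), sd = 2)
)

sim_aov   <- aov(Strength ~ factor(Sample), data = sim_data)
sim_tukey <- TukeyHSD(sim_aov, conf.level = 0.95)

# The result is a list with one matrix per factor term
tab <- sim_tukey[["factor(Sample)"]]

# Flag pairs whose adjusted p-value is below 0.05; for TukeyHSD this
# matches the pairs whose confidence interval excludes zero
significant <- tab[, "p adj"] < 0.05
print(data.frame(round(tab, 3), significant, check.names = FALSE))
```

Calling `plot(sim_tukey)` draws the same intervals, which makes it easy to see which ones cross zero.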

In the next post, we will do a complete exercise of ANOVA including the Post Hoc test.

Reference: Jim Frost, Hypothesis Testing: An Intuitive Guide.