We have seen Simpson’s paradox multiple times before. Here is another illustration. Consider two countries, each with a million people. The following is the number of deaths in each country during a particular episode of an illness. So which country is safer to live in?
Measure | Country A | Country B |
Deaths (per million) | 76.8 | 54.8 |
The conclusion seems pretty obvious, doesn’t it? Until you see the following breakdowns. First, the demographic distribution (percentage of each country’s population in each age group).
Age | A | B |
0 – 9 | 0.8 | 2 |
10 – 19 | 1.2 | 2 |
20 – 29 | 3.5 | 8 |
30 – 39 | 5.5 | 17 |
40 – 49 | 11 | 19 |
50 – 59 | 18 | 22 |
60 – 69 | 21 | 19 |
70 – 79 | 21 | 8 |
> 80 | 18 | 3 |
Overall | 100 | 100 |
And the death rate from the disease in each age group (deaths per million people in that age group):
Age | A | B |
0 – 9 | 0 | 0 |
10 – 19 | 0 | 1 |
20 – 29 | 0 | 1 |
30 – 39 | 1 | 2 |
40 – 49 | 10 | 20 |
50 – 59 | 10 | 30 |
60 – 69 | 80 | 100 |
70 – 79 | 100 | 200 |
> 80 | 200 | 300 |
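Multiplying the respective columns (population share, as a fraction, times the death rate for that age group) gives the number of deaths in each age group per million people in the country.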
Age | A | B |
0 – 9 | 0 | 0 |
10 – 19 | 0 | 0.02 |
20 – 29 | 0 | 0.08 |
30 – 39 | 0.055 | 0.34 |
40 – 49 | 1.1 | 3.8 |
50 – 59 | 1.8 | 6.6 |
60 – 69 | 16.8 | 19 |
70 – 79 | 21 | 16 |
> 80 | 36 | 9 |
Total | 76.755 | 54.84 |
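The arithmetic behind this table can be reproduced with a short script. This is only a sketch: the variable names (`share_pct`, `rate_per_mln`, `deaths_by_age`) are made up for illustration, and it assumes the rates are deaths per million people within each age group, which is what the multiplication above implies.

```python
# Age groups, in the same order as the tables (kept for reference).
age_groups = ["0-9", "10-19", "20-29", "30-39", "40-49",
              "50-59", "60-69", "70-79", "80+"]

# Share of each country's population in each age group, in percent.
share_pct = {
    "A": [0.8, 1.2, 3.5, 5.5, 11, 18, 21, 21, 18],
    "B": [2, 2, 8, 17, 19, 22, 19, 8, 3],
}

# Assumed unit: deaths per million people within each age group.
rate_per_mln = {
    "A": [0, 0, 0, 1, 10, 10, 80, 100, 200],
    "B": [0, 1, 1, 2, 20, 30, 100, 200, 300],
}


def deaths_by_age(country):
    """Deaths in each age group per million people in the whole country."""
    return [s / 100 * r
            for s, r in zip(share_pct[country], rate_per_mln[country])]


# Summing over age groups gives the headline figure for each country.
totals = {c: sum(deaths_by_age(c)) for c in ("A", "B")}

for c, total in totals.items():
    print(f"Country {c}: {total:.3f} deaths per million")
```

The totals match the last row of the table: 76.755 for Country A and 54.84 for Country B.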
Country A had a death rate lower than or equal to Country B’s in every age group, yet it had more fatalities overall, because a much larger share of its population falls in the older age groups, where the illness is most severe.
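One way to see the effect of the age mix is to ask a hypothetical question: what would Country A's overall figure be if it kept its own age-specific death rates but had Country B's age distribution? The sketch below is an illustration, not part of the original analysis; the helper `overall_rate` and the variable names are assumptions, and the rates are again taken as deaths per million people within each age group.

```python
# Share of each country's population in each age group, in percent.
share_pct = {
    "A": [0.8, 1.2, 3.5, 5.5, 11, 18, 21, 21, 18],
    "B": [2, 2, 8, 17, 19, 22, 19, 8, 3],
}

# Assumed unit: deaths per million people within each age group.
rate_per_mln = {
    "A": [0, 0, 0, 1, 10, 10, 80, 100, 200],
    "B": [0, 1, 1, 2, 20, 30, 100, 200, 300],
}


def overall_rate(shares, rates):
    """Deaths per million for a population with the given age mix and rates."""
    return sum(s / 100 * r for s, r in zip(shares, rates))


# Country A is never worse than Country B within any single age group...
a_never_worse = all(a <= b
                    for a, b in zip(rate_per_mln["A"], rate_per_mln["B"]))

# ...yet its overall rate is higher when each country uses its own age mix.
actual_a = overall_rate(share_pct["A"], rate_per_mln["A"])
actual_b = overall_rate(share_pct["B"], rate_per_mln["B"])

# Hypothetical: Country A's death rates applied to Country B's age mix.
a_with_b_mix = overall_rate(share_pct["B"], rate_per_mln["A"])

print(f"A never worse within an age group: {a_never_worse}")
print(f"Actual A: {actual_a:.1f}, actual B: {actual_b:.1f}")
print(f"A's rates with B's age mix: {a_with_b_mix:.1f}")
```

With the same age mix as Country B, Country A's rates would give roughly 33.5 deaths per million, well below Country B's 54.8. The reversal in the headline numbers comes entirely from the difference in demographics.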