Ecological fallacy

Much of the data we encounter describes the general trends of a region or group rather than its individual members. For example, the crime rate of a city is estimated as the total number of crimes divided by the number of people; it is not calculated from surveys of every individual resident. A city can have a high crime rate even though 99% of its residents face no threat to their life or property. In other words, a few pockets of the city may experience disproportionately more crime than the rest.
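To make the arithmetic concrete, here is a minimal sketch in Python. The district populations and crime counts are invented purely for illustration (they are not real data); the point is only that a single city-wide rate can sit far above what most residents experience.

```python
# Hypothetical city: 100 districts, each given as (population, crimes in a year).
# These numbers are made up for illustration only.
districts = [(1000, 2)] * 99 + [(1000, 802)]

total_population = sum(pop for pop, _ in districts)
total_crimes = sum(crimes for _, crimes in districts)

# City-wide rate: total crimes divided by total population, per 1,000 residents.
city_rate = 1000 * total_crimes / total_population

# Share of residents living in districts whose own rate is below the city-wide rate.
population_below = sum(pop for pop, crimes in districts if 1000 * crimes / pop < city_rate)
share_below = population_below / total_population

print(f"City-wide rate: {city_rate:.1f} crimes per 1,000 residents")
print(f"Residents in districts below that rate: {share_below:.0%}")
```

In this made-up city, one district accounts for most of the crime, so the city-wide figure says little about the typical resident.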

The ecological fallacy is the logical error of taking a statistic meant to represent an area or group and applying it to the individuals or objects within it. It gets its name because the data was meant to describe the system, the environment or the ecology.

Many stereotypes arise from the ecological fallacy. A well-known example is racial profiling, in which a person is stereotyped or discriminated against on the basis of her ethnicity, religion or nationality. Simpson’s paradox, which we discussed in the past, is a special case of the ecological fallacy.

A classic case is the 1950 paper published by Robinson in the American Sociological Review. At the individual level, he found a positive correlation between being foreign-born and being illiterate. Yet, at the state level, he found a negative correlation (-0.53) between the illiteracy rate and the share of people born outside the US. This was counterintuitive. One possible explanation is that while migrants tended to be less literate, they tended to settle in regions that were, on average, more literate, such as big cities.
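The reversal Robinson described can be reproduced with a small sketch. The three "states" and their counts below are hypothetical numbers chosen for illustration, not Robinson's data, and `pearson` is a helper written here rather than a library function. Within every state, foreign-born residents are more often illiterate than natives, yet the states with the largest foreign-born share have the lowest overall illiteracy.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical states (illustrative numbers only):
# (foreign_born, foreign_born_illiterate, native, native_illiterate)
states = {
    "A": (300, 30, 700, 14),
    "B": (150, 30, 850, 68),
    "C": (50, 15, 950, 190),
}

foreign_flag, illiterate_flag = [], []   # one 0/1 entry per individual
foreign_share, illiteracy_rate = [], []  # one entry per state

for fb, fb_ill, nat, nat_ill in states.values():
    # Individual-level records: foreign-born first, then natives.
    foreign_flag += [1] * fb + [0] * nat
    illiterate_flag += [1] * fb_ill + [0] * (fb - fb_ill)
    illiterate_flag += [1] * nat_ill + [0] * (nat - nat_ill)
    # State-level aggregates.
    foreign_share.append(fb / (fb + nat))
    illiteracy_rate.append((fb_ill + nat_ill) / (fb + nat))

individual_r = pearson(foreign_flag, illiterate_flag)
state_r = pearson(foreign_share, illiteracy_rate)

print(f"Individual-level correlation: {individual_r:+.2f}")
print(f"State-level correlation:      {state_r:+.2f}")
```

The individual-level correlation comes out positive and the state-level one negative, so reading the state-level figure as a statement about individuals would get the sign wrong.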

Robinson, W. S.; American Sociological Review, 1950, 15 (3), 351-357.