It is the title of a famous paper published by Ioannidis in 2005. While the article goes much deeper in its commentary, here we examine the basic reasoning behind the claim through Bayesian thinking.
Positive predictive value (PPV), the ability of a study to predict a positive outcome correctly, is the posterior probability of a true finding given prior knowledge and the likelihood. In the language of Bayes' theorem, PPV is defined as

PPV = P(T|CT) = P(CT|T) × P(T) / [P(CT|T) × P(T) + P(CT|nT) × P(nT)]

where:
P(T|CT) – The probability that the hypothesis is true given it is claimed to be true (in a publication)
P(CT|T) – The probability that the hypothesis is claimed to be true given that it is actually true (a true hypothesis correctly confirmed)
P(T) – The prior probability of a true hypothesis
P(CT|nT) – The probability that the hypothesis is claimed to be true given that it is actually false (a false hypothesis not rejected, i.e. 1 minus the probability that a false hypothesis is rejected)
P(nT) – The prior probability of an incorrect hypothesis (1 – P(T))
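To make the relationship concrete, here is a minimal Python sketch of the formula above; the function name ppv and its argument names are my own choices, not taken from the paper.

```python
def ppv(p_claim_given_true, p_claim_given_false, prior_true):
    """Positive predictive value P(T|CT) via Bayes' theorem.

    p_claim_given_true  -- P(CT|T): chance a true hypothesis is claimed true
    p_claim_given_false -- P(CT|nT): chance a false hypothesis is claimed true
    prior_true          -- P(T): prior probability that the hypothesis is true
    """
    numerator = p_claim_given_true * prior_true
    denominator = numerator + p_claim_given_false * (1.0 - prior_true)
    return numerator / denominator
```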
Deluge of data
The last few years have seen exponential growth in the number of correlations that can be tested, driven by a flood of data and technological breakthroughs. For example, the US government publishes data on roughly 45,000 economic statistics, and an imaginative economist can find several million correlations among them, most of which are spurious. In other words, the proportion of genuine causal relationships among these millions of correlations declines as the data grows. In the language of our equation, the prior P(T) drops.
Suppose a researcher can correctly confirm a true hypothesis 80% of the time (which is quite impressive) and correctly reject a false one with 90% accuracy. If the prior probability of a true relationship is only 1 in 10, the overall success rate, the PPV, is just (0.8 × 0.1) / (0.8 × 0.1 + 0.1 × 0.9) = 0.08 / 0.17 ≈ 47%.
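Using the ppv sketch from earlier, we can reproduce the 47% figure and watch the PPV collapse as the prior shrinks, which is the deluge-of-data effect described above. The specific priors swept here are illustrative choices, not values from the sources.

```python
# Numbers from the example: P(CT|T) = 0.8, P(CT|nT) = 1 - 0.9 = 0.1
print(round(ppv(0.8, 0.1, 0.10), 2))   # 0.47 -- only ~47% of claimed findings are true

# As the prior P(T) shrinks (more data, more spurious correlations), PPV collapses
for prior in (0.5, 0.1, 0.01, 0.001):
    print(f"P(T) = {prior:<6} ->  PPV = {ppv(0.8, 0.1, prior):.2f}")
# 0.5 -> 0.89, 0.1 -> 0.47, 0.01 -> 0.07, 0.001 -> 0.01
```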
References
Ioannidis, J. P. A. Why Most Published Research Findings Are False. PLoS Medicine, 2005, 2(8)
Silver, N. The Signal and the Noise