Why tests of significance lead to many false discoveries

If you read almost any scientific paper you’ll find statements like “this result was statistically significant (P = 0.047)”. Tests of significance were designed to prevent you from making a fool of yourself by claiming to have discovered something when, in fact, all you are seeing is the effect of random chance. But P values don’t measure that.

It is much more sensible to think about the false discovery rate (FDR). This is the probability that, when a test comes out as ‘statistically significant’, there is actually no real effect. It is therefore the probability that you make a fool of yourself by claiming, falsely, to have discovered a real effect.

Although the false discovery rate is clearly the relevant thing to consider, there has been a great deal of disagreement among statisticians about how to calculate its value. The arguments are often presented as Bayesian, but that description, though not wrong, is unnecessary. Three different approaches to calculating the false discovery rate are given in (1). None of them involve the contentious subjective probabilities that have caused such controversy in Bayesian arguments. All three approaches give much the same answer, despite making rather different assumptions. If you observe, say, P = 0.045 and declare that you have discovered something, your chance of being wrong is at least 30%, and if the sample size is small (under-powered experiments) the false discovery rate may be 70%. (A rough numerical sketch of how such figures arise is given after the references.)

It is still almost universal for authors to claim to have discovered a real effect when they see P < 0.05. The term “significant” has done enormous harm. It is likely that neglect of the false discovery rate is a large part of the reason for the conclusion of John Ioannidis (2) that “most published research findings are false”.

(1) Colquhoun, D. (2014) An investigation of the false discovery rate and the misinterpretation of P values. http://arxiv.org/abs/1407.5296
(2) Ioannidis, J. P. (2005) Why most published research findings are false. PLoS Med. 2, e124.
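To give a concrete feel for where a figure like “at least 30%” can come from, here is a minimal sketch in Python. The prevalence of real effects (10%), the power (80%) and the simulated effect size are illustrative assumptions chosen for this example; they are not values taken from the text above or from reference (1). The first part is a simple counting argument; the second checks it by simulating a large number of two-sample t-tests.

    # Rough sketch of a false discovery rate calculation.
    # Prevalence, power and effect size are illustrative assumptions.
    import numpy as np
    from scipy import stats

    prevalence = 0.10   # assumed fraction of tested hypotheses that are real effects
    alpha = 0.05        # significance threshold
    power = 0.80        # assumed probability of detecting a real effect when present

    # Counting argument ("P < 0.05" interpretation):
    false_pos = (1 - prevalence) * alpha    # no real effect, yet P < 0.05
    true_pos = prevalence * power           # real effect, correctly detected
    fdr = false_pos / (false_pos + true_pos)
    print(f"FDR from the counting argument: {fdr:.1%}")   # about 36% with these numbers

    # The same thing by simulation: run many two-sample t-tests in which only
    # a fraction `prevalence` of experiments have a genuine difference in means.
    rng = np.random.default_rng(1)
    n_experiments = 50_000
    n_per_group = 16        # gives roughly 80% power for the assumed effect size
    effect_size = 1.0       # true difference in means, in units of SD (assumption)

    real = rng.random(n_experiments) < prevalence
    a = rng.normal(0.0, 1.0, size=(n_per_group, n_experiments))
    b = rng.normal(np.where(real, effect_size, 0.0), 1.0,
                   size=(n_per_group, n_experiments))
    p_values = stats.ttest_ind(a, b, axis=0).pvalue

    significant = p_values < alpha
    sim_fdr = np.mean(~real[significant])   # fraction of 'significant' results with no real effect
    print(f"Simulated FDR among 'significant' results: {sim_fdr:.1%}")

With these assumptions roughly a third of the ‘significant’ results are false discoveries, and reducing the power (smaller samples or smaller effects) pushes that fraction higher, which is how the false discovery rate can approach the 70% figure mentioned above for under-powered experiments.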