Abstract:
P-values are a common component and outcome measure in nearly every published observational study or randomized clinical trial. However, many physicians, researchers, journalists, and policy makers have little or no training in statistics and must rely on interpretations of results supplied by the authors or by secondary sources. Statistical analysis of data often involves calculating the p-value and reporting the result as statistically significant or not, without much further thought. But p-values are highly irreproducible, and their definition bears no direct relationship to reproducibility. Findings from clinical studies are not valid if they cannot be reproduced. Although other methodological issues also affect reproducibility, the p-value is arguably at the root of the problem. The p-value is commonly misinterpreted and misused. The American Statistical Association (ASA) recently published its first-ever policy statement on the proper use and interpretation of p-values for scientists and researchers. The statement addresses the misguided practice of interpreting study results based solely on the p-value, given that the p-value is often irreproducible in subsequent, similar studies. We investigated the irreproducibility of the p-value using simulation software and results reported from a published randomized controlled trial. We show that the probability of attaining another statistically significant p-value on replication varies widely. We also show that power alone determines the distribution of the p-value, and that it therefore varies with sample size and effect size. In conclusion, p-values interpreted by themselves can be misleading, potentially leading to biased inferences from clinical studies.
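To make the replication point concrete, the following is a minimal simulation sketch in Python (our illustration, not the authors' original simulation code; the two-sample t-test design, effect size, and sample size are assumptions chosen for demonstration). It repeats the same hypothetical trial many times with the true effect held fixed and shows how widely the resulting p-values spread across replications:

```python
# Minimal sketch (hypothetical design, not the authors' original simulation):
# replicate a two-sample t-test many times at a fixed true effect size and
# sample size, and examine how much the p-value varies across replications.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def replicate_pvalues(effect_size=0.5, n_per_group=50, n_replications=10_000):
    """Return p-values from repeated two-sample t-tests under the same design."""
    pvals = np.empty(n_replications)
    for i in range(n_replications):
        control = rng.normal(0.0, 1.0, n_per_group)        # null group
        treatment = rng.normal(effect_size, 1.0, n_per_group)  # shifted group
        pvals[i] = stats.ttest_ind(treatment, control).pvalue
    return pvals

p = replicate_pvalues()
print(f"empirical power (p < 0.05): {np.mean(p < 0.05):.2f}")
print(f"p-value quartiles: {np.percentile(p, [25, 50, 75])}")
```

Under these assumed settings (standardized effect size 0.5, 50 subjects per group), the design has roughly 70% power, yet the replication p-values range from far below 0.05 to well above it, so a single "significant" result carries no guarantee of replicating.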