By Leslie A. Pray
A 2010 study[^1] in the Journal of the American Medical Association reported that exercising 60 minutes a day leads to less weight gain over time. The study, which involved more than 34,000 women, prompted widespread media coverage. Sure, p < 0.001. But the actual effect was less than a half-pound difference over three years! That is what Kristin Sainani, freelance science writer and professor at Stanford University, Palo Alto, California, discovered when she dug into the paper.
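To see how a sample that large can do this, here is a minimal simulation in Python; the group sizes, the assumed standard deviation of weight change, and the half-pound difference are stand-in values for illustration, not the JAMA data:

```python
# Minimal sketch (stand-in numbers, not the JAMA data): with tens of
# thousands of participants, a practically negligible difference in
# weight gain can still come out "highly significant."
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 17000                      # hypothetical participants per group
sd = 5.0                       # assumed SD of three-year weight change, in pounds
less_active = rng.normal(loc=0.5, scale=sd, size=n)   # ~0.5 lb more gain
more_active = rng.normal(loc=0.0, scale=sd, size=n)

t_stat, p_value = stats.ttest_ind(less_active, more_active)
difference = less_active.mean() - more_active.mean()
print(f"difference = {difference:.2f} lb, p = {p_value:.1e}")
# With these numbers the difference comes out around half a pound, yet p
# falls far below 0.001 -- statistically striking, practically trivial.
```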
Sainani is a big fan of exercise. But based on a half-pound difference over three years, she suggested that exercise is probably not where we should be putting our public health dollars. She cautioned science journalists to ignore the p-value, especially in large studies, and to focus instead on effect size. That was a key message from the NASW workshop session on the perils of p-values.

Veronica Vieland, of Nationwide Children’s Hospital, Columbus, Ohio, highlighted the distinction between p-values and evidence. Unlike a thermometer, which behaves predictably, with mercury rising as temperature increases, p-values do not behave like the evidence they are purported to measure. Sometimes p-values actually increase as evidence accumulates. Vieland said, “The best we can do is to be very, very careful about what the p-value is and what it is not, and one thing it is definitely not is a measure of the strength of the evidence.”
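One rough way to see Vieland’s point is to recompute the same test as simulated data accumulate; the effect size, noise level, and checkpoints below are arbitrary choices for illustration, not anything from her talk:

```python
# Sketch: the p-value for one hypothesis, recomputed as data accumulate.
# It does not shrink steadily the way a true "evidence meter" would;
# between looks it can drift back up even though more data are in hand.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(loc=0.2, scale=1.0, size=500)   # small true effect vs. 0

for n in (50, 100, 200, 300, 400, 500):
    t_stat, p_value = stats.ttest_1samp(data[:n], popmean=0.0)
    print(f"n = {n:3d}   p = {p_value:.3f}")
# Depending on the draw, p can rise between successive sample sizes even
# though the underlying effect never changes.
```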
Regina Nuzzo, freelance journalist and professor at Gallaudet University, Washington, D.C., urged science journalists to remember that p-values are about behavior, not truth: they determine whether a study gets published, whether it makes news, and whether the research continues. She reiterated Sainani’s suggestion to report the effect, not the p-value. Sometimes, as Sainani did with the JAMA paper, you have to dig for it. If you don’t see it, ask. Or look for the r² or the 95% confidence interval. Better yet, put the statistics aside altogether and ask researchers how extraordinary their findings are and whether they had prior reason to believe they would find what they found. Was their study exploratory or validating? “Get them back to the science,” Nuzzo said.
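As an illustration of that advice, here is a short sketch of what reporting an effect with its 95% confidence interval could look like; the numbers are made up, and the large-sample normal-approximation interval is just one simple way to compute it:

```python
# Sketch: report the size of the effect and its 95% confidence interval,
# not just a p-value. All numbers here are hypothetical.
import numpy as np

rng = np.random.default_rng(2)
group_a = rng.normal(loc=0.5, scale=5.0, size=17000)   # weight change, lb
group_b = rng.normal(loc=0.0, scale=5.0, size=17000)

diff = group_a.mean() - group_b.mean()
se = np.sqrt(group_a.var(ddof=1) / len(group_a) + group_b.var(ddof=1) / len(group_b))
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se
print(f"difference = {diff:.2f} lb (95% CI {ci_low:.2f} to {ci_high:.2f} lb)")
# The interval conveys both how large the effect is and how precisely it
# is estimated -- information a bare p-value leaves out.
```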
The “most diabolical fact” about p-values, in moderator and freelance science journalist Tom Siegfried’s opinion, is that the papers most likely to suffer from p-value problems are also the ones most likely to get the greatest news coverage. They are usually first reports, advances in a “hot” research field, or findings “contrary to previous belief.” He encouraged journalists to look beyond the p-values in the abstract and to write stories that provide more context, and he suggested considering ways other than single-study stories to present news.

The presentations prompted not just questions but, for one audience member, a self-described “existential crisis”: how can science journalists trust scientists, given the pervasiveness of p-values in the scientific literature? P-value-free Bayesian statistics, while still in what Nuzzo described as a “magic pony” phase, offer hope. Meanwhile, Nuzzo said, “p-values are where the discussion should begin, not end.”
[^1]: Lee, I.M., et al. 2010. Physical activity and weight gain prevention. JAMA 303:1173-9.