This is a guest editorial by Richard Harris, a science correspondent at NPR and author of “Rigor Mortis: How Sloppy Science Creates Worthless Cures, Crushes Hope, and Wastes Billions.”
Conversations about the “reproducibility crisis” in science often focus on preclinical medical research and social psychology experiments. There’s good reason for that. These fields generally rely on messy systems: living cells, animals, or, in the case of psychology, human volunteers. But are reproducibility problems limited to those fields? The National Academies are about to start a study to look beyond those disciplines. And judging by the causes that drive reproducibility issues, the committee is likely to find problems elsewhere.
Last year, Nature surveyed 1,576 scientists from all disciplines. Overall, 52% perceived a “significant reproducibility crisis,” and another 38% said there was a “slight crisis.” Another 7% said they didn’t know, and only 3% said there was no crisis.
In the case of biomedicine, I find multiple layers of causes, all of which exist to one extent or another in other fields. The first is that scientists put too much faith in the ingredients they use. Some 500,000 antibodies are commercially available for experiments, but the quality of those reagents is all over the map, and labs often don’t run enough controls to identify problems. Immortal cell lines are another example, with cross-contamination a major problem. Scientists pay less attention to more garden-variety reagents, but those can be problematic as well. That warning obviously extends beyond the world of biomedicine.
Another huge area of trouble in biomedicine and psychology involves experimental design and statistical analysis. Chemists may have more predictable systems and more reproducible experimental designs, but to the extent they are trapped in the dubious analytical system built around the P value, chemists should be greatly concerned as well. This system is so often misused in science that the American Statistical Association felt compelled to publish a statement in 2016 decrying the shoddy understanding of P values.
And when John Ioannidis wrote the heavily cited essay “Why Most Published Research Findings Are False,” he did not confine his analysis to the life sciences. “For most study designs and settings, it is more likely for a research claim to be false than true,” he notes.
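For readers who want to see the arithmetic behind that claim, here is a minimal sketch of the basic positive predictive value formula from the Ioannidis essay, in its simplest form with no bias term: PPV = (1 − β)R / ((1 − β)R + α), where R is the pre-study odds that a tested relationship is true, α is the significance threshold, and 1 − β is the statistical power. The function name and the scenario numbers below are illustrative assumptions, not figures from the essay or from any particular field.

```python
def positive_predictive_value(prior_odds, alpha=0.05, power=0.8):
    """Chance that a 'significant' finding is actually true.

    prior_odds: pre-study odds (true : false) of the relationships tested.
    alpha:      significance threshold, i.e. the false-positive rate.
    power:      probability of detecting a true relationship (1 - beta).
    Simplest form of the formula; it ignores bias and multiple teams.
    """
    if prior_odds <= 0:
        raise ValueError("prior_odds must be positive")
    true_positives = power * prior_odds   # true relationships that reach significance
    false_positives = alpha               # null relationships that reach significance
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios, chosen only to illustrate the trend.
scenarios = [
    ("well-powered test of a plausible hypothesis", 1.0, 0.8),
    ("underpowered test of a long-shot hypothesis", 0.1, 0.2),
]

for label, odds, power in scenarios:
    ppv = positive_predictive_value(odds, power=power)
    print(f"{label}: PPV = {ppv:.2f}")
```

Under these assumptions, a significant result is more likely false than true whenever power times the pre-study odds falls below α. That can happen in any discipline that pairs exploratory hypotheses with small samples, which is why the argument is not confined to the life sciences.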
A root cause is that scientists are human beings, and we tend to see what we want (or expect) to see. As the physicist Richard Feynman noted during a memorable commencement address at the California Institute of Technology in 1974, the scientific method is about finding ways to avoid fooling yourself, “and you are the easiest person to fool.” In that speech, Feynman told the story of the circuitous journey to the correct value for the charge of an electron. The famous Millikan oil-drop experiment got close, but it took many iterations to improve that value. Feynman said that was because when experimentalists got a value much higher than Millikan’s benchmark, they assumed they had made a mistake and went looking for a reason to discount the result.
Another common driver in science is the hypercompetitive world of academia, with a grave imbalance between the amount of grant money available and the number of labs vying for it. That disconnect creates perverse incentives—conscious and unconscious—that reward flashy results over careful substance. One manifestation of that is the dramatic increase in the use of superlatives in scientific abstracts. A 2015 study in The BMJ found that, between 1974 and 2014, positive words such as “robust,” “novel,” “innovative,” and “unprecedented” increased in relative frequency in the PubMed database by up to 15,000%.
So what’s a careful scientist to do? First and foremost, be aware of the conditions around you that may increase the risk of irreproducible results, whether they are bad ingredients, dubious statistical traditions, or outside pressures that can shape behavior. Also take heart. This reproducibility “crisis” isn’t really a crisis at all. These are not new problems. Rather, I think of this moment as an awakening. And that’s a good thing, because we need to recognize that a problem exists before we can seek solutions.
Views expressed on this page are those of the authors and not necessarily those of ACS or C&EN.