Major study shows need to improve how scientists approach early-stage cancer research


Preclinical studies, the kind scientists do before testing in humans, don't get as much attention as their clinical counterparts, but they are the vital first steps toward eventual treatments and cures. It is important to get preclinical results right. When scientists get them wrong, they waste resources following false leads. Worse, false discoveries can trigger clinical studies in humans.

Last December, the Center for Open Science (COS) released the disturbing results of its eight-year, $1.5 million Reproducibility Project: Cancer Biology study. Produced in collaboration with the research marketplace Science Exchange, the project's independent scientists found that the odds of replicating the results of 50 preclinical experiments from 23 high-profile published studies were no better than a coin toss.

Praise and controversy have followed the project from the start. The journal Nature applauded replication studies as "the practice of science at its best." But the journal Science noted that the reactions of some scientists whose studies were chosen ranged from "annoyance" to "anxiety" to "indignation," hampering the replications. None of the original experiments were described in sufficient detail to allow scientists to repeat them, a third of the original authors were uncooperative, and some were even hostile when asked for help.


COS executive director Brian Nosek warned that the results pose "challenges to the credibility of preclinical cancer biology." In tacit recognition that biomedical research has not been universally rigorous or transparent, the US National Institutes of Health (NIH), the world's largest funder of biomedical research, announced that it would tighten its requirements for both.

I have taught and written about good scientific practices in psychology and biomedicine for over 30 years. I’ve reviewed more grant applications and journal manuscripts than I can count, and I’m not surprised.


The two pillars of trustworthy science – transparency and impartial rigor – have faltered under the pressure of incentives that advance careers at the expense of reliable science. Too often, proposed – and, surprisingly, published and peer-reviewed – preclinical studies don't follow the scientific method. Too often, scientists do not share their government-funded data, even when the publishing journal requires it.

Bias control

Numerous preclinical experiments lack rudimentary bias controls that are taught in the social sciences but rarely in biomedical disciplines such as medicine, cell biology, biochemistry and physiology. Bias control is a key part of the scientific method because it allows scientists to disentangle the experimental signal from procedural noise.

Confirmation bias, the tendency to see what we want to see, is a type of bias that good science controls by “blinding.” Think of “double-blind” procedures in clinical trials where neither the patient nor the research team knows who is getting the placebo and who is getting the drug. In preclinical research, blinding experimenters to the identity of samples minimizes the risk of them changing their behavior, even subtly, in favor of their hypothesis.
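As a rough illustration of how blinding can work at the bench, the sketch below (in Python, with hypothetical sample names) replaces group labels with random codes. The experimenter records measurements against the codes alone; a key held by someone outside the experiment is used to unblind the data only after measurement is complete.

```python
import random

def blind_samples(samples, seed=None):
    """Assign random codes to samples so the experimenter cannot
    infer group identity while taking measurements.

    `samples` is a list of (sample_id, group) tuples. Returns
    (blinded_ids, key), where `key` maps code -> (sample_id, group)
    and is withheld from the experimenter until analysis.
    """
    rng = random.Random(seed)
    codes = rng.sample(range(1000, 10000), len(samples))  # unique codes
    key = {f"S-{code}": sample for code, sample in zip(codes, samples)}
    blinded_ids = sorted(key)  # the experimenter sees only these
    return blinded_ids, key

# Example: six samples, half treated with a (hypothetical) drug
samples = [(f"mouse{i}", "drug" if i % 2 == 0 else "placebo")
           for i in range(6)]
blinded, key = blind_samples(samples, seed=42)
print(blinded)  # codes carry no hint of group membership

# After all measurements are recorded, the key unblinds the groups:
groups = {code: key[code][1] for code in blinded}
```

The essential design point is that the mapping from code to group exists only in the key, so even subtle, unconscious changes in how a sample is handled cannot track its treatment.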

Seemingly insignificant differences, such as whether a sample is processed in the morning or the afternoon, or whether an animal is caged in the top or bottom row, can also alter results. This is not as unlikely as it might seem: momentary changes in the microenvironment, such as exposure to light or shifts in air ventilation, can alter physiological responses.


If all the animals that receive a drug are caged in one row and all the animals that do not are caged in another, any difference between the two groups may be due to the drug, to their housing location, or to an interaction between the two. You honestly can't choose between those explanations, and neither can the scientists.

Randomizing sample selection and processing order minimizes these procedural biases, makes interpretation of results clearer, and makes them more likely to be reproduced.
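A minimal sketch of what such randomization might look like, assuming hypothetical animal and cage identifiers: treatment assignment, cage placement, and processing order are each shuffled independently, so no one of them can systematically line up with another.

```python
import random

def randomize_experiment(animal_ids, treatments, cage_slots, seed=None):
    """Randomly assign animals to treatment groups, cage positions,
    and processing order, so that housing or timing effects cannot
    align with treatment. A sketch with made-up identifiers.
    """
    rng = random.Random(seed)
    n = len(animal_ids)
    # Balanced treatment assignment, then shuffled
    groups = [treatments[i % len(treatments)] for i in range(n)]
    rng.shuffle(groups)
    # Independent random cage placement and processing order
    cages = rng.sample(cage_slots, n)
    order = rng.sample(range(n), n)
    return [
        {"animal": a, "group": g, "cage": c, "processed": o}
        for a, g, c, o in zip(animal_ids, groups, cages, order)
    ]

animals = [f"rat{i:02d}" for i in range(8)]
slots = [f"{row}-col{c}" for row in ("top", "bottom") for c in range(4)]
plan = randomize_experiment(animals, ["drug", "control"], slots, seed=1)
for entry in plan:
    print(entry)
```

Because each factor is randomized separately, a "drug" animal is as likely to sit in the top row as the bottom one, and as likely to be processed first as last.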

The replication experiments were blinded and randomized, but it's unclear whether the original experiments were. All we know is that of the 15 animal experiments, only one of the original studies reported randomization and none reported blinding. It would not be surprising if many of the original studies were neither randomized nor blinded.

Study design and statistics

According to one estimate, more than half of the million articles published each year have biased study designs, contributing to the waste of 85% of the US$100 billion spent each year on (mostly preclinical) research.

In a widely reported commentary, industry scientist and former academic Glenn Begley said he was able to replicate the results of only six of 53 university studies (11%). He listed six practices of reliable research, including blinding. The six studies that replicated followed all six practices. The 47 studies that did not replicate followed few, and sometimes none, of them.

The misuse of statistics is common in biomedical research despite calls for better data analysis practices.

Another way to skew results is to misuse statistics. As with blinding and randomization, it is unclear which, if any, of the original studies in the reproducibility project misused statistics, given the studies' lack of transparency. But this, too, is common practice.

A glossary of terms describes a host of bad data analysis practices that can produce statistically significant (but false) results, such as HARKing (Hypothesizing After the Results are Known), p-hacking (repeating statistical tests until a desired result is produced), and following a series of data-dependent analysis decisions, called the "garden of forking paths," to publishable results.
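To see why p-hacking inflates false positives, here is a small simulation (a sketch, not any study's actual analysis): a "drug" with no real effect is tested on several outcome measures, and the p-hacker reports whichever outcome happens to cross p < 0.05. A simple two-sample z-test stands in for whatever test a real study would use.

```python
import math
import random
import statistics

def p_value(a, b):
    """Two-sided p-value from a two-sample z-test (normal approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rates(n_experiments=2000, n=30, n_outcomes=5, seed=0):
    """Simulate experiments in which the drug does nothing, and compare
    the honest false-positive rate (one pre-specified outcome) with the
    rate inflated by p-hacking (cherry-picking the best of several)."""
    rng = random.Random(seed)
    honest = hacked = 0
    for _ in range(n_experiments):
        pvals = []
        for _ in range(n_outcomes):
            control = [rng.gauss(0, 1) for _ in range(n)]
            treated = [rng.gauss(0, 1) for _ in range(n)]  # no true effect
            pvals.append(p_value(control, treated))
        honest += pvals[0] < 0.05      # pre-registered single outcome
        hacked += min(pvals) < 0.05    # cherry-pick the best outcome
    return honest / n_experiments, hacked / n_experiments

honest_rate, hacked_rate = false_positive_rates()
print(f"honest: {honest_rate:.3f}, p-hacked: {hacked_rate:.3f}")
```

With five outcomes, the chance that at least one test reaches p < 0.05 by luck alone is roughly 1 − 0.95⁵ ≈ 23%, so the "hacked" rate runs several times the honest one even though the drug does nothing.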

These practices remain mainstream in biomedical research. Decades of advocacy from methodologists, and an unprecedented statement from the American Statistical Association calling for better data analysis practices, have largely gone unheeded.

A better future

Incentives and standards should reward practices that produce reliable science and censure practices that do not, without killing innovation.

Those who are anti-science should not rejoice in these findings. The achievements of preclinical science are real and impressive. Decades of preclinical research led to the development of COVID-19 mRNA vaccines, for example. And most scientists do their best in a system that rewards fast, flashy results over slower, reliable ones.

But science is done by humans, with all the strengths and weaknesses that come with them. The trick is to reward practices that produce trustworthy science and censure practices that don't, without killing innovation.

Changing incentives and enforcing standards are the most effective ways to improve scientific practice. The goal is to improve efficiency by ensuring that scientists who value transparency and rigor over speed and flash have a chance to thrive. This has been tried before, with minimal success. But this time may be different: the Reproducibility Project: Cancer Biology study, and the NIH policy changes it prompted, might just be the impetus needed to get there.

