
Research findings that are probably wrong cited far more than robust ones, study finds

Academics suspect papers with grabby conclusions are waved through more easily by reviewers

Scientific research findings that are probably wrong gain far more attention than robust results, according to academics who suspect that the bar for publication may be lower for papers with grabbier conclusions.

Studies in top science, psychology and economics journals that fail to hold up when others repeat them are cited, on average, more than 100 times as often in follow-up papers as work that stands the test of time.

The finding – which is itself not exempt from the need for scrutiny – has led the authors to suspect that more interesting papers are waved through more easily by reviewers and journal editors and, once published, attract more attention.

“It could be wasting time and resources,” said Dr Marta Serra-Garcia, who studies behavioural and experimental economics at the University of California in San Diego. “But we can’t conclude that something is true or not based on one study and one replication.” What is needed, she said, is a simple way to check how often studies have been repeated, and whether or not the original findings are confirmed.

The study in Science Advances is the latest to highlight the “replication crisis”, where results, mostly in social science and medicine, fail to hold up when other researchers try to repeat experiments. Following an influential paper in 2005 titled “Why most published research findings are false”, three major projects have found replication rates as low as 39% in psychology journals, 61% in economics journals, and 62% in social science studies published in Nature and Science, two of the most prestigious journals in the world.

Working with Uri Gneezy, a professor of behavioural economics at UCSD, Serra-Garcia analysed how often studies in the three major replication projects were cited in later research papers. Studies that failed replication accrued, on average, 153 more citations in the period examined than those whose results held up. For the social science studies published in Science and Nature, those that failed replication typically gained 300 more citations than those that held up. Only 12% of the citations acknowledged that replication projects had failed to confirm the relevant findings.

The academic system incentivises journals and researchers to publish exciting findings, and citations are taken into account for promotion and tenure. But history suggests that the more dramatic the results, the more likely they are to be wrong. Dr Serra-Garcia said publishing the name of the overseeing editor on journal papers might help to improve the situation.

Prof Gary King, a political scientist at Harvard University, said the latest findings may be good news. He wants researchers to focus their efforts on claims that are subject to disagreement, so that they can gather more data and figure out the truth. “In some ways, then, we should regard the results of this interesting article as great news for the health of the scholarly community,” he said.

Prof Brian Nosek at the University of Virginia, who runs the Open Science Collaboration to assess reproducibility in psychology research, urged caution. “We presume that science is self-correcting. By that we mean that errors will happen regularly, but science roots out and removes those errors in the ongoing dialogue among scientists conducting, reporting, and citing each other’s research. If more replicable findings are less likely to be cited, it could suggest that science isn’t just failing to self-correct; it might be going in the wrong direction.”

“The evidence is not sufficient to draw such a conclusion, but it should get our attention and inspire us to look more closely at how the social systems of science foster self-correction and how they can be improved,” he added.


Don’t say ‘prove’: How to report on the conclusiveness of research findings

This tip sheet explains why it's rarely accurate for news stories to report that a new study proves anything — even when a press release says it does.



by Denise-Marie Ordway, The Journalist's Resource February 13, 2023


When news outlets report that new research studies prove something, they’re almost certainly wrong.

Studies conducted in fields outside of mathematics do not “prove” anything. They find evidence — sometimes, extraordinarily strong evidence.

It’s important journalists understand that science is an ongoing process of collecting and interrogating evidence, with each new discovery building on or raising questions about earlier discoveries. A single research study usually represents one small step toward fully understanding an issue or problem.

Even when scientists have lots of very strong evidence, they rarely claim to have found proof because proof is absolute. To prove something means there is no chance another explanation exists.

“Even a modest familiarity with the history of science offers many examples of matters that scientists thought they had resolved, only to discover that they needed to be reconsidered,” Naomi Oreskes, a professor of the history of science at Harvard University, writes in a July 2021 essay in Scientific American. “Some familiar examples are Earth as the center of the universe, the absolute nature of time and space, the stability of continents, and the cause of infectious disease.”

Oreskes points out in her 2004 paper “Science and Public Policy: What’s Proof Got To Do With It?” that “proof — at least in an absolute sense — is a theoretical ideal, available in geometry class but not in real life.”

Math scholars routinely rely on logic to try to prove something beyond any doubt. What sets mathematicians apart from other scientists is their use of mathematical proofs, a step-by-step argument written using words, symbols and diagrams to convince another mathematician that a given statement is true, explains Steven G. Krantz, a professor of mathematics and statistics at Washington University in St. Louis.

“It is proof that is our device for establishing the absolute and irrevocable truth of statements in our subject,” he writes in “The History and Concept of Mathematical Proof.” “This is the reason that we can depend on mathematics that was done by Euclid 2300 years ago as readily as we believe in the mathematics that is done today. No other discipline can make such an assertion.”

If you’re still unsure how to describe the conclusiveness of research findings, keep reading. These four tips will help you get it right.

1. Avoid reporting that a research study or group of studies “proves” something — even if a press release says so.

Press releases announcing new research often exaggerate or minimize findings, academic studies have found. Some mistakenly state researchers have proven something they haven’t.

The KSJ Science Editing Handbook urges journalists to read press releases carefully. The handbook, a project of the Knight Science Journalism Fellowship at MIT, features guidance and insights from some of the world’s most talented science writers and editors.

“Press releases that are unaccompanied by journal publications rarely offer any data and, by definition, offer a biased view of the findings’ value,” according to the handbook, which also warns journalists to “never presume that everything in them is accurate or complete.”

Any claim that researchers in any field outside mathematics have proven something should raise a red flag for journalists, says Barbara Gastel, a professor of integrative biosciences, humanities in medicine, and biotechnology at Texas A&M University.

She says journalists need to evaluate the research themselves.

“Read the full paper,” says Gastel, who’s also director of Texas A&M University’s master’s degree program in science and technology journalism. “Don’t go only on the news release. Don’t go only on the abstract to get a full sense of how strong the evidence is. Read the full paper and be ready to ask some questions — sometimes, hard questions — of the researchers.”

2. Use language that correctly conveys the strength of the evidence that a research study or group of studies provides.

Researchers investigate an issue or problem to better understand it and build on what earlier research has found. While studies usually unearth new information, it’s seldom enough to reach definitive conclusions.

When reporting on a study or group of studies, journalists should choose words that accurately convey the level of confidence researchers have in the findings, says Glenn Branch, deputy director of the nonprofit National Center for Science Education, which studies how public schools, museums and other organizations communicate about science.

For example, don’t say a study “establishes” certain facts or “settles” a longstanding question when it simply “suggests” something is true or “offers clues” about some aspect of the subject being examined.

Branch urges journalists to pay close attention to the language researchers use in academic articles. Scientists typically express themselves in degrees of confidence, he notes. He suggests journalists check out the guidance on communicating levels of certainty across disciplines offered by the Intergovernmental Panel on Climate Change, created by the United Nations and World Meteorological Organization to help governments understand, adapt to and mitigate the impacts of climate change.

“The IPCC guidance is probably the most well-developed system for consistently reporting the degree of confidence in scientific results, so it, or something like it, may start to become the gold standard,” Branch wrote via email.

Gastel says it is important journalists know that even though research in fields outside mathematics does not prove anything, a group of studies, together, can provide evidence so strong it gets close to proof.

It can provide “overwhelming evidence, particularly if there are multiple well-designed studies that point in the same direction,” she says.

To convey very high levels of confidence, journalists can use phrases such as “researchers are all but certain” and “researchers have as much confidence as possible in this area of inquiry.”

Another way to gauge levels of certainty: Find out whether scholars have reached a scientific consensus, or a collective position based on their interpretation of the evidence.

Independent scientific organizations such as the National Academy of Sciences, American Association for the Advancement of Science and American Medical Association issue consensus statements on various topics, typically to communicate either scientific consensus or the collective opinion of a convened panel of subject experts.

3. When reporting on a single study, explain what it contributes to the body of knowledge on that given topic and whether the evidence, as a whole, leans in a certain direction. 

Many people are unfamiliar with the scientific process, so they need journalists’ help understanding how a single research study fits into the larger landscape of scholarship on an issue or problem. Tell audiences what, if anything, researchers can say about the issue or problem with a high level of certainty after considering all the evidence, together.

A great resource for journalists trying to put a study into context: editorials published in academic journals. Some journals, including the New England Journal of Medicine and JAMA, the journal of the American Medical Association, sometimes publish an editorial about a new paper along with the paper, Gastel notes.

Editorials, typically written by one or more scholars who were not involved in the study but have deep expertise in the field, can help journalists gauge the importance of a paper and its contributions.

“I find that is really handy,” Gastel adds.

4. Review headlines closely before they are published. And read our tip sheet on avoiding mistakes in headlines about health and medical research.

Editors, especially those who are not familiar with the process of scientific inquiry, can easily make mistakes when writing or changing headlines about research. And a bad headline can derail a reporter’s best efforts to cover research accurately.

To prevent errors, Gastel recommends reporters submit suggested headlines with their stories. She also recommends they review their story’s headline right before it is published.

Another good idea: Editors, including copy editors, could make a habit of consulting with reporters on news headlines about research, science and other technical topics. Together, they can choose the most accurate language and decide whether to ever use the word ‘prove.’

Gastel and Branch agree that editors would benefit from science journalism training, particularly as it relates to reporting on health and medicine. Headlines making erroneous claims about the effectiveness of certain drugs and treatments can harm the public. So can headlines claiming researchers have “proven” what causes or prevents health conditions such as cancer, dementia and schizophrenia.

Our tip sheet on headline writing addresses this and other issues.

“’Prove’ is a short, snappy word, so it works in a headline — but it’s usually wrong,” says Branch. “Headline writers need to be as aware of this as the journalists are.”


July 5, 2018

Beware those scientific studies—most are wrong, researcher warns

by Ivan Couronne

Seafood is one of many food types that have been linked with lower cancer risks

A few years ago, two researchers took the 50 most-used ingredients in a cookbook and studied how many had been linked with a cancer risk or benefit, based on a variety of studies published in scientific journals.

The result? Forty out of 50, including salt, flour, parsley and sugar. "Is everything we eat associated with cancer?" the researchers wondered in a 2013 article based on their findings.

Their investigation touched on a known but persistent problem in the research world: too few studies have large enough samples to support generalized conclusions.

But pressure on researchers, competition between journals and the media's insatiable appetite for new studies announcing revolutionary breakthroughs have meant such articles continue to be published.

"The majority of papers that get published, even in serious journals, are pretty sloppy," said John Ioannidis, professor of medicine at Stanford University, who specializes in the study of scientific studies.

This sworn enemy of bad research published a widely cited article in 2005 entitled: "Why Most Published Research Findings Are False."

Since then, he says, only limited progress has been made.

Some journals now insist that authors pre-register their research protocol and supply their raw data, which makes it harder for researchers to manipulate findings in order to reach a certain conclusion. It also allows others to verify or replicate their studies.

Because when studies are replicated, they rarely come up with the same results. Only a third of the 100 studies published in three top psychology journals could be successfully replicated in a large 2015 test.

Medicine, epidemiology, population science and nutritional studies fare no better, Ioannidis said, when attempts are made to replicate them.

"Across biomedical science and beyond, scientists do not get trained sufficiently on statistics and on methodology," Ioannidis said.

Too many studies are based solely on a few individuals, making it difficult to draw wider conclusions because such small samples have little hope of being representative.

The wine museum in Bolgheri, Italy: a famous 2013 study on the benefits of the Mediterranean diet against heart disease had to be retracted.

Coffee and Red Wine

"Diet is one of the most horrible areas of biomedical investigation," professor Ioannidis added—and not just due to conflicts of interest with various food industries.

"Measuring diet is extremely difficult," he stressed. How can we precisely quantify what people eat?

In this field, researchers often go on a wild search for correlations within huge databases, without so much as a starting hypothesis.

Even when the methodology is good, with the gold standard being a study where participants are chosen at random, the execution can fall short.

A famous 2013 study on the benefits of the Mediterranean diet against heart disease had to be retracted in June by the most prestigious of medical journals, the New England Journal of Medicine, because not all participants were randomly recruited; the results have been revised downwards.

So what should we take away from the flood of studies published every day?

Ioannidis recommends asking the following questions: is this something that has been seen just once, or in multiple studies? Is it a small or a large study? Is this a randomized experiment? Who funded it? Are the researchers transparent?

These precautions are fundamental in medicine, where bad studies have contributed to the adoption of treatments that are at best ineffective, and at worst harmful.

In their book "Ending Medical Reversal," Vinayak Prasad and Adam Cifu offer terrifying examples of practices adopted on the basis of studies that went on to be invalidated, such as opening a brain artery with stents to reduce the risk of a new stroke.

Studies regularly single out the consumption of red wine as either a cancer risk—or a way to fend off the disease

It was only after 10 years that a robust, randomized study showed that the practice actually increased the risk of stroke.

The solution lies in the collective tightening of standards by all players in the research world, not just journals but also universities and public funding agencies. But these institutions all operate in competitive environments.

"The incentives for everyone in the system are pointed in the wrong direction," Ivan Oransky, co-founder of Retraction Watch, which covers the withdrawal of scientific articles, tells AFP. "We try to encourage a culture, an atmosphere where you are rewarded for being transparent."

The problem also comes from the media, which according to Oransky needs to better explain the uncertainties inherent in scientific research, and resist sensationalism.

"We're talking mostly about the endless terrible studies on coffee, chocolate and red wine," he said.

"Why are we still writing about those? We have to stop with that."


Correspondence

Why Most Published Research Findings Are False: Problems in the Analysis

  • Steven Goodman
  • Sander Greenland


Published: April 24, 2007


Citation: Goodman S, Greenland S (2007) Why Most Published Research Findings Are False: Problems in the Analysis. PLoS Med 4(4): e168. https://doi.org/10.1371/journal.pmed.0040168

Copyright: © 2007 Goodman and Greenland. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The authors received no specific funding for this article.

Competing interests: The authors have declared that no competing interests exist.

The article published in PLoS Medicine by Ioannidis [1] makes the dramatic claim in the title that “most published research claims are false,” and has received extensive attention as a result. The article does provide a useful reminder that the probability of hypotheses depends on much more than just the p-value, a point that has been made in the medical literature for at least four decades, and in the statistical literature for decades previous. This topic has renewed importance with the advent of the massive multiple testing often seen in genomics studies.

Unfortunately, while we agree that there are more false claims than many would suspect—based on poor study design, misinterpretation of p-values, and perhaps analytic manipulation—the mathematical argument in the PLoS Medicine paper underlying the “proof” of the title’s claim has a degree of circularity. As we show in detail in a separately published paper [2], Dr. Ioannidis utilizes a mathematical model that severely diminishes the evidential value of studies—even meta-analyses—such that none can produce more than modest evidence against the null hypothesis, and most are far weaker. This is why, in the offered “proof,” the only study types that achieve a posterior probability of 50% or more (large RCTs [randomized controlled trials] and meta-analysis of RCTs) are those to which a prior probability of 50% or more is assigned. So the model employed cannot be considered a proof that most published claims are untrue, but is rather a claim that no study or combination of studies can ever provide convincing evidence.

Two features of the model are chiefly responsible for this result:

  • Calculating the evidential effect only of verdicts of “significance,” i.e., p ≤ 0.05, instead of the actual p-value observed in a study, e.g., p = 0.001.
  • Introducing a new “bias” term into the Bayesian calculations, which even at a described “minimal” level (of 10%) has the effect of very dramatically diminishing a study’s evidential impact.

In addition to the above problems, the paper claims to have proven something it describes as paradoxical: that the “hotter” an area is (i.e., the more studies published), the more likely studies in that area are to make false claims. We have shown this claim to be erroneous [2]. The mathematical proof offered for this in the PLoS Medicine paper shows merely that the more studies published on any subject, the higher the absolute number of false positive (and false negative) studies. It does not show what the paper’s graphs and text claim, viz., that the number of false claims will be a higher proportion of the total number of studies published (i.e., that the positive predictive value of each study decreases with increasing number of studies).
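
To make the distinction concrete, here is a minimal numerical sketch (not taken from either paper; the parameter values and variable names are invented for illustration) of a positive-predictive-value calculation under fixed pre-study odds R, power, and significance level alpha. It illustrates the point above: publishing more studies raises the expected absolute count of false positives, while the predictive value of any single positive result stays the same.

# Illustrative sketch only (not code from either paper): with pre-study odds R that a
# probed relationship is real, power (1 - beta) and significance threshold alpha,
# the positive predictive value (PPV) of a single "significant" result is fixed,
# while the expected number of false-positive studies grows with the number of studies.

def ppv(R, power, alpha):
    """P(relationship is true | a study reports a statistically significant result)."""
    true_positive_share = power * R / (R + 1)   # true relationships correctly detected
    false_positive_share = alpha / (R + 1)      # null relationships crossing alpha
    return true_positive_share / (true_positive_share + false_positive_share)

R, power, alpha = 0.25, 0.8, 0.05               # invented values for a hypothetical field

for n_studies in (10, 100, 1000):
    expected_false_positives = n_studies * alpha / (R + 1)
    print(f"{n_studies:>4} studies: PPV of each positive = {ppv(R, power, alpha):.2f}, "
          f"expected false-positive studies = {expected_false_positives:.1f}")

On these assumed numbers, the PPV stays at 0.80 whether the field publishes 10 studies or 1,000; only the absolute count of false positives rises, which is the distinction drawn above.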

The paper offers useful guidance in a number of areas, calling attention to the importance of avoiding all forms of bias, of obtaining more empirical research on the prevalence of various forms of bias, and on the determinants of prior odds of hypotheses. But the claims that the model employed in this paper constitutes a “proof” that most published medical research claims are false, and that research in “hot” areas is most likely to be false, are unfounded.

References

  • 1. Ioannidis JPA (2005) Why most published research findings are false. PLoS Med 2(8): e124. https://doi.org/10.1371/journal.pmed.0020124
  • 2. Goodman S, Greenland S (2007) Assessing the unreliability of the medical literature: A response to “Why most published research findings are false”. Johns Hopkins University, Department of Biostatistics. Available: http://www.bepress.com/jhubiostat/paper135. Accessed 21 March 2007.


Feature Article — No. This Study Does Not Prove What You Think It Does: Part I

Published on Mar 10, 2022

Parents PACK

Editorial note: This series was originally published in the Parents PACK newsletter, a free monthly e-newsletter for the public that addresses vaccines and related topics. To learn more about the program, visit vaccine.chop.edu/parents.

Throughout the course of the COVID-19 pandemic, a few scientific papers have gained attention because they were considered to be proof of popular points of view. Typically, a single paper does not prove anything; instead, it supports a theory or adds evidence to a body of knowledge. Indeed, when a study is enthusiastically embraced because it supports a popular idea, a deeper look often finds that its conclusions are not as clear-cut as claimed. Often, this is because the findings were “cherry-picked.” Unfortunately, cherry-picking, or selectively choosing parts of the data to accept while ignoring other parts, is one of many tactics used by people who want to fit science to their beliefs, but that is not how science works.

In this three-part series, we address common misconceptions about the practice of science (and scientists) and use some specific examples of studies to help you sort out the headlines and wade through your social media feeds. Part two looks at some studies that ultimately did change our understanding of previously accepted science, and part three addresses some studies that were misinterpreted with a look at what aspects were at the heart of the confusion and why.

So, let’s get started.

There is no such thing as “my science and your science”

If your car won’t start, you’re likely to go through a variety of possible issues that could be preventing the car from starting — battery, gas, engine, etc. Your goal is to figure out why the car won’t start. This is similar to how science is done. Scientists start with a problem or a question and they go through a variety of experiments to get more information about the matter.

Now back to your car. You may have an idea that the problem is the battery. This “hunch” probably comes from existing information — the age of your battery, a hesitation you noticed when you started the car last week, etc. Your hunch about the battery is your educated guess. In science, it is called your hypothesis. If you find that the battery is not the issue, you will keep looking for the problem. It does not matter that your hypothesis was not correct; it just means something else is going on. The same is true of science. Each study adds a bit more information, and a scientist needs to let go of or adjust a hypothesis that does not agree with their findings.

If we think about that in terms of your car problem, it would not help you to keep focusing on the battery when you have evidence that it’s not the problem because your car still won’t start.  The same is true for science. When done properly, science aims to find an answer — not to support a position. Scientists can’t start out with the end in mind. They can make a hypothesis about what they think they will find, but they can’t work as though the answer is already known. They need to remain open minded and be OK with their hypothesis being wrong. They also can’t assume they are right and that the experiment is wrong just because what they wanted to be true is not. Science is a way of knowing about the world. It is about both finding and ruling out information about a topic.

Maybe you are hoping the problem is your battery because you have a spare one in your garage. In this scenario, you have a pre-existing bias toward the battery being the problem. Because you are biased toward this outcome, you may not be as careful in your evaluation of other possibilities. Rather you may just go get the battery and put it in your car — only to find that your car still won’t start. Your bias got in the way of your assessment. While everyone has biases, scientists are trained to study in a manner that reduces their personal biases in the outcome. Now, just like any profession, some scientists are better at overlooking their biases than others, but it is important to realize that even though scientists can be biased, science cannot. Said another way, even if you really want the issue to be your car battery, your car won’t start simply by replacing the battery if something else is going on — no matter how much you want it to.

Scientists are skeptics

Generally speaking, people who practice science are skeptics. They are not just going to believe what they are told; they will seek out weaknesses and evaluate the validity of the conclusions. This is where the peer-review process comes in. Anytime a study is published, other scientists critically evaluate it, meaning they take a long and careful look under the hood. What method was used? How many samples were taken? Were the appropriate statistics used? Do the conclusions follow from the data? From the outside, this process can look like taking sides based on one’s own beliefs, but it’s not that simple. Peer review requires more than just saying a study isn’t good. It also requires substantiating that feedback with evidence.

Because most journals only publish studies that have been peer reviewed, the process often keeps poorly constructed studies from being published, but that is not always the case. For example, some journals allow people to publish simply by paying a fee. Now, publishing in one of these journals does not necessarily mean the study is poorly done, but it is important to know whether a study was peer reviewed or not because it helps us understand if other scientists got to evaluate it before it was published, and even after publication, studies will continue to be evaluated as more and more scientists review them. In some cases, as more scientists review a study, concerns arise about its quality. On very rare occasions, these concerns can lead to a study being retracted, or removed from publication, but more often, the study will just fade into the background of scientific literature. You can think of studies like pieces of a giant puzzle with scientists from all over the world working to add pieces to their own little corner of that puzzle. Some studies will add a very important piece to the puzzle, filling in a hole and tying together several other findings, but most studies add one little piece to the edge — important, but still a baby step toward seeing the big picture.

Scientists who repeatedly set up studies to get a pre-determined outcome will not contribute much to the puzzle, nor will they gain the respect of their colleagues. Other scientists working on the puzzle will come to realize that those individuals or groups are pushing an agenda rather than practicing science.

Mavericks are rare

Sometimes a scientist keeps publishing studies that go against the generally accepted body of knowledge about a topic. As their work continues to be dismissed or criticized by colleagues, they often take their ideas directly to the media and the public, assuming the role of victim. Sometimes they attempt to position themselves as mavericks whose findings are so earth-shattering that they are upsetting established understanding about a topic. They suggest that they are being targeted for their “breakthrough,” but vow to keep fighting. Often this rhetoric is accompanied by suggestions of a cover-up — by the government, by companies interested only in profits, by scientists, or by some combination of these. To this, we ask: Have you ever tried to get a group of people to keep a secret? How’d that go? When someone is suggesting a massive cover-up, start to think about how many people would need to be involved, and then think about the last surprise party that you planned.

While sometimes scientists do uncover novel findings that dispute established understanding, those who respect the scientific enterprise welcome the criticism of their colleagues and go back into the lab to generate more data that will support their findings or refute their critics. Practicing science means having a thick skin; scientists have to get used to accepting criticism and using it to make their work even better. If they have uncovered something real that will substantially change how we think about a particular topic, they’ll continue generating data that will make their case stronger. Likewise, others will try to repeat the findings. Similar results from other groups of scientists add credibility to the original findings. This is another reason mavericks are rare: as more people find similar results, they too start supporting the idea, and what was originally the idea of one becomes that of many. The saying popularized by Carl Sagan comes to mind: “extraordinary claims require extraordinary evidence.” So, if a scientist is in front of the cameras presenting themselves as a victim, consider that they should be in their lab collecting some extraordinary evidence.

The takeaways

  • It takes more than one study to support a new idea or change current thinking about a topic. Keep watching for a theme to emerge from many papers.
  • Individual scientists may be biased, but science isn’t. As such, don’t rely on a single expert’s point of view. Look to what is being said about a topic by multiple scientists. Do this by not only looking at multiple news outlets, but also by actively seeking out information from science-based organizations, like National Geographic, Smithsonian, the American Association for the Advancement of Science (AAAS) or the National Academy of Sciences (NAS).
  • Be wary of scientists who portray themselves as victims.


Why most published research findings are false

Affiliation

  • Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
  • PMID: 16060722
  • PMCID: PMC1182327
  • DOI: 10.1371/journal.pmed.0020124

There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research.
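
As a rough illustration of the framework this abstract describes, the following is a minimal reconstruction (my own, with invented parameter values; it is not code or notation from the paper) of how the post-study probability that a claimed finding is true depends on the pre-study odds R of a true relationship, study power, the significance level alpha, and a bias term u representing results reported as positive for reasons other than the data.

# Minimal reconstruction of the framework described in the abstract; all values invented.
def prob_finding_true(R, power, alpha, u=0.0):
    """Post-study probability that a claimed positive finding reflects a true relationship."""
    p_pos_if_true = power + u * (1 - power)   # true effects reported as positive
    p_pos_if_null = alpha + u * (1 - alpha)   # null effects reported as positive
    return R * p_pos_if_true / (R * p_pos_if_true + p_pos_if_null)

# Well-powered study, even prior odds, no bias: the claim is probably true.
print(prob_finding_true(R=1.0, power=0.8, alpha=0.05))          # about 0.94
# Small exploratory study, long-shot hypothesis, modest bias: more likely false than true.
print(prob_finding_true(R=0.1, power=0.2, alpha=0.05, u=0.2))   # about 0.13

Under these assumptions, shrinking power and prior odds or raising the bias term pushes the probability below one half, which is the sense in which the essay argues that claimed findings in such settings are more likely false than true.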



The Flaws and Human Harms of Animal Experimentation

Nonhuman animal (“animal”) experimentation is typically defended by arguments that it is reliable, that animals provide sufficiently good models of human biology and diseases to yield relevant information, and that, consequently, its use provides major human health benefits. I demonstrate that a growing body of scientific literature critically assessing the validity of animal experimentation generally (and animal modeling specifically) raises important concerns about its reliability and predictive value for human outcomes and for understanding human physiology. The unreliability of animal experimentation across a wide range of areas undermines scientific arguments in favor of the practice. Additionally, I show how animal experimentation often significantly harms humans through misleading safety studies, potential abandonment of effective therapeutics, and direction of resources away from more effective testing methods. The resulting evidence suggests that the collective harms and costs to humans from animal experimentation outweigh potential benefits and that resources would be better invested in developing human-based testing methods.

Introduction

Annually, more than 115 million animals are used worldwide in experimentation or to supply the biomedical industry. 1 Nonhuman animal (hereafter “animal”) experimentation falls under two categories: basic (i.e., investigation of basic biology and human disease) and applied (i.e., drug research and development and toxicity and safety testing). Regardless of its categorization, animal experimentation is intended to inform human biology and health sciences and to promote the safety and efficacy of potential treatments. Despite its use of immense resources, the animal suffering involved, and its impact on human health, the question of animal experimentation’s efficacy has been subjected to little systematic scrutiny. 2

Although it is widely accepted that medicine should be evidence based, animal experimentation as a means of informing human health has generally not been held, in practice, to this standard. This fact makes it surprising that animal experimentation is typically viewed as the default and gold standard of preclinical testing and is generally supported without critical examination of its validity. A survey published in 2008 of anecdotal cases and statements given in support of animal experimentation demonstrates how it has not been and could not be validated as a necessary step in biomedical research, and the survey casts doubt on its predictive value. 3 I show that animal experimentation is poorly predictive of human outcomes, 4 that it is unreliable across a wide category of disease areas, 5 and that existing literature demonstrates the unreliability of animal experimentation, thereby undermining scientific arguments in its favor. I further show that the collective harms that result from an unreliable practice tip the ethical scale of harms and benefits against continuation in much, if not all, of experimentation involving animals. 6

Problems of Successful Translation to Humans of Data from Animal Experimentation

Although the unreliability and limitations of animal experimentation have increasingly been acknowledged, there remains a general confidence within much of the biomedical community that they can be overcome. 7 However, three major conditions undermine this confidence and explain why animal experimentation, regardless of the disease category studied, fails to reliably inform human health: (1) the effects of the laboratory environment and other variables on study outcomes, (2) disparities between animal models of disease and human diseases, and (3) species differences in physiology and genetics. I argue for the critical importance of each of these conditions.

The Influence of Laboratory Procedures and Environments on Experimental Results

Laboratory procedures and conditions exert influences on animals’ physiology and behaviors that are difficult to control and that can ultimately impact research outcomes. Animals in laboratories are involuntarily placed in artificial environments, usually in windowless rooms, for the duration of their lives. Captivity and the common features of biomedical laboratories—such as artificial lighting, human-produced noises, and restricted housing environments—can prevent species-typical behaviors, causing distress and abnormal behaviors among animals. 8 Among the types of laboratory-generated distress is the phenomenon of contagious anxiety. 9 Cortisone levels rise in monkeys watching other monkeys being restrained for blood collection. 10 Blood pressure and heart rates elevate in rats watching other rats being decapitated. 11 Routine laboratory procedures, such as catching an animal and removing him or her from the cage, in addition to the experimental procedures, cause significant and prolonged elevations in animals’ stress markers. 12 These stress-related changes in physiological parameters caused by the laboratory procedures and environments can have significant effects on test results. 13 Stressed rats, for example, develop chronic inflammatory conditions and intestinal leakage, which add variables that can confound data. 14

A variety of conditions in the laboratory cause changes in neurochemistry, genetic expression, and nerve regeneration. 15 In one study, for example, mice were genetically altered to develop aortic defects. Yet, when the mice were housed in larger cages, those defects almost completely disappeared. 16 Providing further examples, typical noise levels in laboratories can damage blood vessels in animals, and even the type of flooring on which animals are tested in spinal cord injury experiments can affect whether a drug shows a benefit. 17

In order to control for potential confounders, some investigators have called for standardization of laboratory settings and procedures. 18 One notable effort was made by Crabbe et al. in their investigation of the potential confounding influences of the laboratory environment on six mouse behaviors that are commonly studied in neurobehavioral experiments. Despite their “extraordinary lengths to equate test apparatus, testing protocols, and all possible features of animal husbandry” across three laboratories, there were systematic differences in test results in these labs. 19 Additionally, different mouse strains varied markedly in all behavioral tests, and for some tests the magnitude of genetic differences depended on the specific testing laboratory. The results suggest that there are important influences of environmental conditions and procedures specific to individual laboratories that can be difficult—perhaps even impossible—to eliminate. These influences can confound research results and impede extrapolation to humans.

The Discordance between Human Diseases and Animal Models of Diseases

The lack of sufficient congruence between animal models and human diseases is another significant obstacle to translational reliability. Human diseases are typically artificially induced in animals, but the enormous difficulty of reproducing anything approaching the complexity of human diseases in animal models limits their usefulness. 20 Even if the design and conduct of an animal experiment are sound and standardized, the translation of its results to the clinic may fail because of disparities between the animal experimental model and the human condition. 21

Stroke research presents one salient example of the difficulties in modeling human diseases in animals. Stroke is relatively well understood in its underlying pathology. Yet accurately modeling the disease in animals has proven to be an exercise in futility. To address the inability to replicate human stroke in animals, many assert the need to use more standardized animal study design protocols. This includes the use of animals who represent both genders and wide age ranges, who have comorbidities and preexisting conditions that occur naturally in humans, and who are consequently given medications that are indicated for human patients. 22 In fact, a set of guidelines, named STAIR, was implemented by a stroke roundtable in 1999 (and updated in 2009) to standardize protocols, limit the discrepancies, and improve the applicability of animal stroke experiments to humans. 23 One of the most promising stroke treatments later to emerge was NXY-059, which proved effective in animal experiments. However, the drug failed in clinical trials, despite the fact that the set of animal experiments on this drug was considered the poster child for the new experimental standards. 24 Despite such vigorous efforts, the development of STAIR and other criteria has yet to make a recognizable impact in clinical translation. 25

Under closer scrutiny, it is not difficult to surmise why animal stroke experiments fail to successfully translate to humans even with new guidelines. Standard stroke medications will likely affect different species differently. There is little evidence to suggest that a female rat, dog, or monkey sufficiently reproduces the physiology of a human female. Perhaps most importantly, reproducing the preexisting conditions of stroke in animals proves just as difficult as reproducing stroke pathology and outcomes. For example, most animals don’t naturally develop significant atherosclerosis, a leading contributor to ischemic stroke. In order to reproduce the effects of atherosclerosis in animals, researchers clamp their blood vessels or artificially insert blood clots. These interventions, however, do not replicate the elaborate pathology of atherosclerosis and its underlying causes. Reproducing human diseases in animals requires reproducing the predisposing diseases, also a formidable challenge. The inability to reproduce the disease in animals so that it is congruent in relevant respects with human stroke has contributed to a high failure rate in drug development. More than 114 potential therapies initially tested in animals failed in human trials. 26

Further examples of repeated failures based on animal models include drug development in cancer, amyotrophic lateral sclerosis (ALS), traumatic brain injury (TBI), Alzheimer’s disease (AD), and inflammatory conditions. Animal cancer models in which tumors are artificially induced have been the basic translational model used to study key physiological and biochemical properties in cancer onset and propagation and to evaluate novel treatments. Nevertheless, significant limitations exist in the models’ ability to faithfully mirror the complex process of human carcinogenesis. 27 These limitations are evidenced by the high (among the highest of any disease category) clinical failure rate of cancer drugs. 28 Analyses of common mice ALS models demonstrate significant differences from human ALS. 29 The inability of animal ALS models to predict beneficial effects in humans with ALS is recognized. 30 More than twenty drugs have failed in clinical trials, and the only U.S. Food and Drug Administration (FDA)–approved drug to treat ALS is Riluzole, which shows notably marginal benefit on patient survival. 31 Animal models have also been unable to reproduce the complexities of human TBI. 32 In 2010, Maas et al. reported on 27 large Phase 3 clinical trials and 6 unpublished trials in TBI that all failed to show human benefit after showing benefit in animals. 33 Additionally, even after success in animals, around 172 and 150 drug development failures have been identified in the treatment of human AD 34 and inflammatory diseases, 35 respectively.

The high clinical failure rate in drug development across all disease categories is based, at least in part, on the inability to adequately model human diseases in animals and the poor predictability of animal models. 36 A notable systematic review, published in 2007, compared animal experimentation results with clinical trial findings across interventions aimed at the treatment of head injury, respiratory distress syndrome, osteoporosis, stroke, and hemorrhage. 37 The study found that the human and animal results were in accordance only half of the time. In other words, the animal experiments were no more likely than a flip of the coin to predict whether those interventions would benefit humans.

In 2004, the FDA estimated that 92 percent of drugs that pass preclinical tests, including “pivotal” animal tests, fail to proceed to the market. 38 More recent analysis suggests that, despite efforts to improve the predictability of animal testing, the failure rate has actually increased and is now closer to 96 percent. 39 The main causes of failure are lack of effectiveness and safety problems that were not predicted by animal tests. 40

Usually, when an animal model is found wanting, various reasons are proffered to explain what went wrong—poor methodology, publication bias, lack of preexisting disease and medications, wrong gender or age, and so on. These factors certainly require consideration, and recognition of each potential difference between the animal model and the human disease motivates renewed efforts to eliminate these differences. As a result, scientific progress is sometimes made by such efforts. However, the high failure rate in drug testing and development, despite attempts to improve animal testing, suggests that these efforts remain insufficient to overcome the obstacles to successful translation that are inherent to the use of animals. Too often ignored is the well-substantiated idea that these models are, for reasons summarized here, intrinsically lacking in relevance to, and thus highly unlikely to yield useful information about, human diseases. 41

Interspecies Differences in Physiology and Genetics

Ultimately, even if considerable congruence were shown between an animal model and its corresponding human disease, interspecies differences in physiology, behavior, pharmacokinetics, and genetics would significantly limit the reliability of animal studies, even after a substantial investment to improve such studies. In spinal cord injury, for example, drug testing results vary according to which species and even which strain within a species is used, because of numerous interspecies and interstrain differences in neurophysiology, anatomy, and behavior. 42 The micropathology of spinal cord injury, injury repair mechanisms, and recovery from injury varies greatly among different strains of rats and mice. A systematic review found that even among the most standardized and methodologically superior animal experiments, testing results assessing the effectiveness of methylprednisolone for spinal cord injury treatment varied considerably among species. 43 This suggests that factors inherent to the use of animals account for some of the major differences in results.

Even rats from the same strain but purchased from different suppliers produce different test results. 44 In one study, responses to 12 different behavioral measures of pain sensitivity, which are important markers of spinal cord injury, varied among 11 strains of mice, with no clear-cut patterns that allowed prediction of how each strain would respond. 45 These differences influenced how the animals responded to the injury and to experimental therapies. A drug might be shown to help one strain of mice recover but not another. Despite decades of using animal models, not a single neuroprotective agent that ameliorated spinal cord injury in animal tests has proven efficacious in clinical trials to date. 46

Further exemplifying the importance of physiological differences among species, a 2013 study reported that the mouse models used extensively to study human inflammatory diseases (in sepsis, burns, infection, and trauma) have been misleading. The study found that mice differ greatly from humans in their responses to inflammatory conditions. Mice differed from humans in what genes were turned on and off and in the timing and duration of gene expression. The mouse models even differed from one another in their responses. The investigators concluded that “our study supports higher priority to focus on the more complex human conditions rather than relying on mouse models to study human inflammatory disease.” 47 The different genetic responses between mice and humans are likely responsible, at least in part, for the high drug failure rate. The authors stated that every one of almost 150 clinical trials that tested candidate agents’ ability to block inflammatory responses in critically ill patients failed.

Wide differences have also become apparent in the regulation of the same genes, a point that is readily seen when comparing human and mouse livers. 48 Consistent phenotypes (observable physical or biochemical characteristics) are rarely obtained by modification of the same gene, even among different strains of mice. 49 Gene regulation can differ substantially among species and may be as important as the presence or absence of a specific gene. Despite the high degree of genome conservation, there are critical differences in the order and function of genes among species. To use an analogy: just as pianos have the same keys, humans and other animals share (largely) the same genes. Where we mostly differ is in the way the genes, or keys, are expressed. For example, if we play the keys in a certain order, we hear Chopin; in a different order, we hear Ray Charles; and in yet a different order, it’s Jerry Lee Lewis. In other words, the same keys or genes are expressed, but their different orders result in markedly different outcomes.

Recognizing the inherent genetic differences among species as a barrier to translation, researchers have expressed considerable enthusiasm for genetically modified (GM) animals, including transgenic mouse models, wherein human genes are inserted into the mouse genome. However, if a human gene is expressed in mice, it will likely function differently from the way it functions in humans, being affected by physiological mechanisms that are unique to mice. For example, a crucial protein that controls blood sugar in humans is missing in mice. 50 When the human gene that makes this protein was expressed in genetically altered mice, it had the opposite effect from that in humans: it caused loss of blood sugar control in mice. The use of GM mice has failed to successfully model human diseases or to translate into clinical benefit across many disease categories. 51 Perhaps the primary reason why GM animals are unlikely to be much more successful than other animal models in translational medicine is that the “humanized” or altered genes are still operating in nonhuman animals.

In many instances, nonhuman primates (NHPs) are used instead of mice or other animals, with the expectation that NHPs will better mimic human results. However, there have been sufficient failures in translation to undermine this optimism. For example, NHP models have failed to reproduce key features of Parkinson’s disease, both in function and in pathology. 52 Several therapies that appeared promising in both NHP and rat models of Parkinson’s disease showed disappointing results in humans. 53 The campaign to prescribe hormone replacement therapy (HRT) to millions of women to prevent cardiovascular disease was based in large part on experiments on NHPs. HRT is now known to increase the risk of these diseases in women. 54

HIV/AIDS vaccine research using NHPs represents one of the most notable failures in animal experimentation translation. Immense resources and decades of time have been devoted to creating NHP (including chimpanzee) models of HIV. Yet all of the roughly 90 HIV vaccines that succeeded in animals have failed in humans. 55 After the HIV vaccine gp120 failed in clinical trials, despite positive outcomes in chimpanzees, a BMJ article commented that important differences between NHPs and humans with HIV misled researchers, taking them down unproductive experimental paths. 56 Gp120 failed to neutralize HIV grown and tested in cell culture. However, because the serum protected chimpanzees from HIV infection, two Phase 3 clinical trials were undertaken 57 —a clear example of how the expectation that NHP data are more predictive than data from other testing methods (in this case, cell culture) can be unproductive and harmful. Despite the repeated failures, NHPs (though not chimpanzees or other great apes) remain widely used for HIV research.

The implicit assumption that NHP (and indeed any animal) data are reliable has also led to significant and unjustifiable human suffering. For example, clinical trial volunteers for gp120 were placed at unnecessary risk of harm because of unfounded confidence in NHP experiments. Two landmark studies involving thousands of menopausal women being treated with HRT were terminated early because of increased stroke and breast cancer risk. 58 In 2003, Elan Pharmaceuticals was forced to prematurely terminate a Phase 2 clinical trial when an investigational AD vaccine was found to cause brain swelling in human subjects; no significant adverse effects had been detected in GM mice or NHPs. 59

In another example of human suffering resulting from animal experimentation, six human volunteers were injected with an immunomodulatory drug, TGN1412, in 2006. 60 Within minutes of receiving the experimental drug, all of the volunteers suffered a severe adverse reaction resulting from a life-threatening cytokine storm that led to catastrophic systemic organ failure. The compound was designed to dampen the immune system, but it had the opposite effect in humans. Prior to this first human trial, TGN1412 had been tested in mice, rabbits, rats, and NHPs with no ill effects. NHPs also underwent repeat-dose toxicity studies and were given 500 times the human dose for at least four consecutive weeks. 61 None of the NHPs manifested the ill effects that humans showed almost immediately after receiving minute amounts of the test drug. Cynomolgus and rhesus monkeys were specifically chosen because their CD28 receptors demonstrated an affinity for TGN1412 similar to that of human CD28 receptors. On the basis of such data, it was confidently concluded that results obtained from these NHPs would most reliably predict drug responses in humans—a conclusion that proved devastatingly wrong.

As exemplified by the experiences with HIV/AIDS vaccines, TGN1412, and others, 62 experiments with NHPs are not necessarily any more predictive of human responses than experiments with other animals. The repeated failures in translation from studies with NHPs belie arguments favoring the use of any nonhuman species to study human physiology and diseases and to test potential treatments. If experiments using chimpanzees and other NHPs, our closest genetic cousins, are unreliable, how can we expect research using other animals to be reliable? The bottom line is that animal experiments, no matter the species used or the type of disease research undertaken, are highly unreliable—and they have too little predictive value to justify the resultant risks of harm to humans, for reasons I now explain.

The Collective Harms That Result from Misleading Animal Experiments

As medical research has explored the complexities and subtle nuances of biological systems, problems have arisen because the differences among species along these subtler biological dimensions far outweigh the similarities, as a growing body of evidence attests. These profoundly important—and often undetected—differences are likely one of the main reasons human clinical trials fail. 63

“Appreciation of differences” and “caution” about extrapolating results from animals to humans are now almost universally recommended. But, in practice, how does one take into account differences in drug metabolism, genetics, expression of diseases, anatomy, influences of laboratory environments, and species- and strain-specific physiologic mechanisms—and, in view of these differences, discern what is applicable to humans and what is not? If we cannot determine which physiological mechanisms in which species and strains of species are applicable to humans (even setting aside the complicating factors of different caging systems and types of flooring), the usefulness of the experiments must be questioned.

It has been argued that some information obtained from animal experiments is better than no information. 64 This thesis neglects the fact that misleading information from animal tests can be worse than no information at all. The use of nonpredictive animal experiments can cause human suffering in at least two ways: (1) by producing misleading safety and efficacy data and (2) by prompting the abandonment of useful medical treatments and misdirecting resources away from more effective testing methods.

Humans are harmed because of misleading animal testing results. Imprecise results from animal experiments may result in clinical trials of biologically faulty or even harmful substances, thereby exposing patients to unnecessary risk and wasting scarce research resources. 65 Animal toxicity studies are poor predictors of toxic effects of drugs in humans. 66 As seen in some of the preceding examples (in particular, stroke, HRT, and TGN1412), humans have been significantly harmed because investigators were misled by the safety and efficacy profile of a new drug based on animal experiments. 67 Clinical trial volunteers are thus given raised hopes and a false sense of security because of misguided confidence in efficacy and safety testing that uses animals.

An equal if indirect source of human suffering is the opportunity cost of abandoning promising drugs because of misleading animal tests. 68 Because candidate drugs generally proceed down the development pipeline to human testing largely on the basis of successful results in animals 69 (i.e., apparent efficacy and absence of adverse effects), drugs are sometimes not developed further because of unsuccessful results in animals (i.e., apparent lack of efficacy and/or presence of adverse effects). Because much pharmaceutical company preclinical data are proprietary and thus publicly unavailable, it is difficult to know the number of missed opportunities due to misleading animal experiments. However, of every 5,000–10,000 potential drugs investigated, only about 5 proceed to Phase 1 clinical trials. 70 Potential therapeutics may be abandoned because of results in animal tests that do not apply to humans. 71 Treatments that fail to work, or that show adverse effects, in animals because of species-specific influences may be abandoned in preclinical testing even though they might have proved effective and safe in humans had they been allowed to continue through the drug development pipeline.
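
Taking the figures quoted in this article at face value (roughly 5,000–10,000 compounds investigated per campaign, about 5 reaching Phase 1, and a clinical failure rate of about 96 percent for drugs that passed animal tests), a back-of-the-envelope calculation, sketched below, illustrates how steep the overall attrition is; the arithmetic simply restates those reported estimates and is not an independent analysis.

```python
# Back-of-the-envelope attrition estimate using figures quoted in the text
# (illustrative arithmetic only, not new data).
compounds_screened = (5_000, 10_000)   # potential drugs investigated (range given in text)
reach_phase1 = 5                       # compounds reaching Phase 1 clinical trials
clinical_failure_rate = 0.96           # share of clinical-stage drugs that never reach market

expected_approvals = reach_phase1 * (1 - clinical_failure_rate)
print(f"Expected approvals per campaign: {expected_approvals:.1f}")   # about 0.2

for screened in compounds_screened:
    print(f"Roughly 1 approved drug per {screened / expected_approvals:,.0f} compounds investigated")
# On these figures, that is on the order of one marketed drug per 25,000-50,000
# compounds investigated, with 96 percent of the clinical-stage candidates failing
# even though they had already passed the animal tests.
```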

An editorial in Nature Reviews Drug Discovery describes cases involving two drugs whose development could have been derailed by animal test results arising from species-specific influences. In particular, it describes how tamoxifen, one of the most effective drugs for certain types of breast cancer, “would most certainly have been withdrawn from the pipeline” if its propensity to cause liver tumors in rats had been discovered in preclinical testing rather than after the drug had been on the market for years. 72 Gleevec provides another example of an effective drug that could have been abandoned on the basis of misleading animal tests: this drug, which is used to treat chronic myelogenous leukemia (CML), showed serious adverse effects in at least five species tested, including severe liver damage in dogs. However, liver toxicity was not detected in human cell assays, and clinical trials proceeded, which confirmed the absence of significant liver toxicity in humans. 73 Fortunately for CML patients, Gleevec is a success story of predictive human-based testing. Many useful drugs that have been used safely by humans for decades, such as aspirin and penicillin, might not be available today if current animal-testing regulatory requirements had been in place during their development. 74

A further example of near-missed opportunities is provided by experiments on animals that delayed the acceptance of cyclosporine, a drug widely and successfully used to treat autoimmune disorders and prevent organ transplant rejection. 75 Its immunosuppressive effects differed so markedly among species that researchers judged that the animal results limited any direct inferences that could be made to humans. Providing further examples, PharmaInformatic released a report describing how several blockbuster drugs, including aripiprazole (Abilify) and esomeprazole (Nexium), showed low oral bioavailability in animals; they would likely not be on the market today if animal tests had been relied on exclusively. Drawing out the implications of its findings for drug development in general, PharmaInformatic asked, “Which other blockbuster drugs would be on the market today, if animal trials would have not been used to preselect compounds and drug-candidates for further development?” 76 These near-missed opportunities and the overall 96 percent failure rate in clinical drug testing strongly suggest the unsoundness of animal testing as a precondition of human clinical trials and provide powerful evidence of the need for a new, human-based paradigm in medical research and drug development.

In addition to potentially causing the abandonment of useful treatments, use of an invalid animal disease model can lead researchers and industry in the wrong research direction, wasting time and significant investment. 77 Repeatedly, researchers have been lured down the wrong line of investigation because of information gleaned from animal experiments that later proved to be inaccurate, irrelevant, or discordant with human biology. Some claim that we cannot know what benefits animal experiments, particularly in basic research, may provide down the road. Yet human lives hang in the balance, waiting for effective therapies, and funding must be strategically invested in the research areas that offer the most promise.

The opportunity costs of continuing to fund unreliable animal tests may also impede the development of more accurate testing methods. Human organs grown in the lab, human organs on a chip, cognitive computing technologies, 3D printing of living human tissues, and the Human Toxome Project are examples of new human-based technologies that are attracting widespread enthusiasm. The benefit of using these methods in the preclinical setting, rather than animal experiments, is that they are based on human biology; their use therefore eliminates much of the guesswork required when attempting to extrapolate physiological data from other species to humans. Additionally, these tests offer whole-systems biology, in contrast to traditional in vitro techniques. Although they are gaining momentum, these human-based tests are still in their relative infancy, and funding must be prioritized for their further development. Recent advances in more predictive, human-based systems and biological approaches to chemical toxicity testing show how newer and better tests can be developed when priorities shift. 78 Apart from toxicology, though, financial investment in the development of human-based technologies generally falls far short of investment in animal experimentation. 79

The unreliability of applying animal experimental results to human biology and diseases is increasingly recognized. Animals are in many respects biologically and psychologically similar to humans, perhaps most notably in the shared characteristics of pain, fear, and suffering. 80 In contrast, evidence demonstrates that critically important physiological and genetic differences between humans and other animals can invalidate the use of animals to study human diseases, treatments, pharmaceuticals, and the like. In significant measure, animal models specifically, and animal experimentation generally, are inadequate bases for predicting clinical outcomes in human beings in the great bulk of biomedical science. As a result, humans can be subject to significant and avoidable harm.

The data showing the unreliability of animal experimentation and the resultant harms to humans (and nonhumans) undermine long-standing claims that animal experimentation is necessary to enhance human health and therefore ethically justified. Rather, they demonstrate that animal experimentation poses significant costs and harms to human beings. It is possible—as I have argued elsewhere—that animal research is more costly and harmful, on the whole, than it is beneficial to human health. 81 When considering the ethical justifiability of animal experiments, we should ask if it is ethically acceptable to deprive humans of resources, opportunity, hope, and even their lives by seeking answers in what may be the wrong place. In my view, it would be better to direct resources away from animal experimentation and into developing more accurate, human-based technologies.

Aysha Akhtar , M.D., M.P.H., is a neurologist and preventive medicine specialist and Fellow at the Oxford Centre for Animal Ethics, Oxford, United Kingdom.

1. Taylor K, Gordon N, Langley G, Higgins W. Estimates for worldwide laboratory animal use in 2005. Alternatives to Laboratory Animals 2008;36:327–42.

2. Systematic reviews that have been conducted generally reveal the unreliability and poor predictability of animal tests. See Perel P, Roberts I, Sena E, Wheble P, Briscoe C, Sandercock P, et al. Comparison of treatment effects between animal experiments and clinical trials: Systematic review. BMJ 2007;334:197. See also Pound P, Bracken MB. Is animal research sufficiently evidence based to be a cornerstone of biomedical research? BMJ 2014;348:g3387. See Godlee F. How predictive and productive is animal research? BMJ 2014;348:g3719. See Benatar M. Lost in translation: Treatment trials in the SOD1 mouse and in human ALS. Neurobiology Disease 2007;26:1–13. And see Akhtar AZ, Pippin JJ, Sandusky CB. Animal studies in spinal cord injury: A systematic review of methylprednisolone. Alternatives to Laboratory Animals 2009;37:43–62.

3. Mathews RAJ. Medical progress depends on animal models—doesn’t it? Journal of the Royal Society of Medicine 2008;101:95–8.

4. See Shanks N, Greek R, Greek J. Are animal models predictive for humans? Philosophy, Ethics, and Humanities in Medicine 2009;4:2. See also Wall RJ, Shani M. Are animal models as good as we think? Theriogenology 2008;69:2–9.

5. See note 3, Mathews 2008. See also Hartung T, Zurlo J. Food for thought… alternative approaches for medical countermeasures to biological and chemical terrorism and warfare. ALTEX 2012;29:251–60. See Leist M, Hartung T. Inflammatory findings on species extrapolations: Humans are definitely no 70-kg mice. Archives in Toxicology 2013;87:563–7. See Mak IWY, Evaniew N, Ghert M. Lost in translation: Animal models and clinical trials in cancer treatment. American Journal in Translational Research 2014;6:114–18. And see Pippin J. Animal research in medical sciences: Seeking a convergence of science, medicine, and animal law. South Texas Law Review 2013;54:469–511.

6. For an overview of the harms-versus-benefits argument, see LaFollette H. Animal experimentation in biomedical research. In: Beauchamp TL, Frey RG, eds. The Oxford Handbook of Animal Ethics. Oxford: Oxford University Press; 2011:812–18.

7. See Jucker M. The benefits and limitations of animal models for translational research in neurodegenerative diseases. Nature Medicine 2010;16:1210–14. See Institute of Medicine. Improving the Utility and Translation of Animal Models for Nervous System Disorders: Workshop Summary. Washington, DC: The National Academies Press; 2013. And see Degryse AL, Lawson WE. Progress towards improving animal models for IPF. American Journal of Medical Science 2011;341:444–9.

8. See Morgan KN, Tromborg CT. Sources of stress in captivity. Applied Animal Behaviour Science 2007;102:262–302. See Hart PC, Bergner CL, Dufour BD, Smolinsky AN, Egan RJ, LaPorte L, et al. Analysis of abnormal repetitive behaviors in experimental animal models. In: Warrick JE, Kauleff AV, eds. Translational Neuroscience and Its Advancement of Animal Research Ethics. New York: Nova Science; 2009:71–82. See Lutz C, Well A, Novak M. Stereotypic and self-injurious behavior in rhesus macaques: A survey and retrospective analysis of environment and early experience. American Journal of Primatology 2003;60:1–15. And see Balcombe JP, Barnard ND, Sandusky C. Laboratory routines cause animal stress. Contemporary Topics in Laboratory Animal Science 2004;43:42–51.

9. Suckow MA, Weisbroth SH, Franklin CL. The Laboratory Rat. 2nd ed. Burlington, MA: Elsevier Academic Press; 2006, at 323.

10. Flow BL, Jaques JT. Effect of room arrangement and blood sample collection sequence on serum thyroid hormone and cortisol concentrations in cynomolgus macaques (Macaca fascicularis). Contemporary Topics in Laboratory Animal Science 1997;36:65–8.

11. See note 8, Balcombe et al. 2004.

12. See note 8, Balcombe et al. 2004.

13. Baldwin A, Bekoff M. Too stressed to work. New Scientist 2007;194:24.

14. See note 13, Baldwin, Bekoff 2007.

15. Akhtar A, Pippin JJ, Sandusky CB. Animal models in spinal cord injury: A review. Reviews in the Neurosciences 2008;19:47–60.

16. See note 13, Baldwin, Bekoff 2007.

17. See note 15, Akhtar et al. 2008.

18. See Macleod MR, O’Collins T, Howells DW, Donnan GA. Pooling of animal experimental data reveals influence of study design and publication bias. Stroke 2004;35:1203–8. See also O’Neil BJ, Kline JA, Burkhart K, Younger J. Research fundamentals: V. The use of laboratory animal models in research. Academic Emergency Medicine 1999;6:75–82.

19. Crabbe JC, Wahlsten D, Dudek BC. Genetics of mouse behavior: Interactions with laboratory environment. Science 1999;284:1670–2, at 1670.

20. See Curry SH. Why have so many drugs with stellar results in laboratory stroke models failed in clinical trials? A theory based on allometric relationships. Annals of the New York Academy of Sciences 2003;993:69–74. See also Dirnagl U. Bench to bedside: The quest for quality in experimental stroke research. Journal of Cerebral Blood Flow & Metabolism 2006;26:1465–78.

21. van der Worp HB, Howells DW, Sena ES, Poritt MJ, Rewell S, O’Collins V, et al. Can animal models of disease reliably inform human studies? PLoS Medicine 2010;7:e1000245.

22. See note 20, Dirnagl 2006. See also Sena E, van der Worp B, Howells D, Macleod M. How can we improve the pre-clinical development of drugs for stroke? Trends in Neurosciences 2007;30:433–9.

23. See Gawrylewski A. The trouble with animal models: Why did human trials fail? The Scientist 2007;21:44. See also Fisher M, Feuerstein G, Howells DW, Hurn PD, Kent TA, Savitz SI, et al. Update of the stroke therapy academic industry roundtable preclinical recommendations. Stroke 2009;40:2244–50.

24. See note 23, Gawrylewski 2007. There is some dispute as to how vigorously investigators adhered to the suggested criteria. Nevertheless, NXY-059 animal studies were considered an example of preclinical studies that most faithfully adhered to the STAIR criteria. For further discussion see also Wang MM, Guohua X, Keep RF. Should the STAIR criteria be modified for preconditioning studies? Translational Stroke Research 2013;4:3–14.

25. See note 24, Wang et al. 2013.

26. O’Collins VE, Macleod MR, Donnan GA, Horky LL, van der Worp BH, Howells DW. 1,026 experimental treatments in acute stroke. Annals of Neurology 2006;59:467–7.

27. See note 5, Mak et al. 2014.

28. See note 5, Mak et al. 2014.

29. See Perrin S. Preclinical research: Make mouse studies work. Nature 2014;507:423–5. See also, generally, Wilkins HM, Bouchard RJ, Lorenzon NM, Linseman DA. Poor correlation between drug efficacies in the mutant SOD1 mouse model versus clinical trials of ALS necessitates the development of novel animal models for sporadic motor neuron disease. In: Costa A, Villalba E, eds. Horizons in Neuroscience Research. Vol. 5. Hauppauge, NY: Nova Science; 2011:1–39.

30. Traynor BJ, Bruijn L, Conwit R, Beal F, O’Neill G, Fagan SC, et al. Neuroprotective agents for clinical trials in ALS: A systematic assessment. Neurology 2006;67:20–7.

31. Sinha G. Another blow for ALS. Nature Biotechnology 2013;31:185. See also note 30, Traynor et al. 2006.

32. See Morales DM, Marklund N, Lebold D, Thompson HJ, Pitkanen A, Maxwell WL, et al. Experimental models of traumatic brain injury: Do we really need a better mousetrap? Neuroscience 2005;136:971–89. See also Xiong YE, Mahmood A, Chopp M. Animal models of traumatic brain injury. Nature Reviews Neuroscience 2013;14:128–42. And see commentary by Farber: Farber N. Drug development in brain injury. International Brain Injury Association; available at http://www.internationalbrain.org/articles/drug-development-in-traumatic-brain-injury/ (last accessed 7 Dec 2014).

33. Maas AI, Roozenbeek B, Manley GT. Clinical trials in traumatic brain injury: Past experience and current developments. Neurotherapeutics 2010;7:115–26.

34. Schneider LS, Mangialasche F, Andreasen N, Feldman H, Giacobini E, Jones R, et al. Clinical trials and late-stage drug development in Alzheimer’s disease: An appraisal from 1984 to 2014. Journal of Internal Medicine 2014;275:251–83.

35. Seok J, Warren HS, Cuenca AG, Mindrinos MN, Baker HV, Xu W, et al. Genomic responses in mouse models poorly mimic human inflammatory diseases. Proceedings of the National Academy of Sciences USA 2013;110:3507–12.

36. Palfreyman MG, Charles V, Blander J. The importance of using human-based models in gene and drug discovery. Drug Discovery World 2002 Fall:33–40.

37. See note 2, Perel et al. 2007.

38. Harding A. More compounds failing phase I. The Scientist 2004 Sept 13; available at http://www.the-scientist.com/?articles.view/articleNo/23003/title/More-compounds-failing-Phase-I/ (last accessed 2 June 2014).

39. See note 5, Pippin 2013.

40. See note 5, Hartung, Zurlo 2012.

41. Wiebers DO, Adams HP, Whisnant JP. Animal models of stroke: Are they relevant to human disease? Stroke 1990; 21 :1–3. [ PubMed ] [ Google Scholar ]

42. See note 15, Akhtar et al. 2008.

43. See note 2, Akhtar et al. 2009.

44. Lonjon N, Prieto M, Haton H, Brøchner CB, Bauchet L, Costalat V, et al. Minimum information about animal experiments: Supplier is also important. Journal of Neuroscience Research 2009;87:403–7.

45. Mogil JS, Wilson SG, Bon K, Lee SE, Chung K, Raber P, et al. Heritability of nociception I: Responses of 11 inbred mouse strains on 12 measures of nociception. Pain 1999;80:67–82.

46. Tator H, Hashimoto R, Raich A, Norvell D, Fehling MG, Harrop JS, et al. Translational potential of preclinical trials of neuroprotection through pharmacotherapy for spinal cord injury. Journal of Neurosurgery: Spine 2012;17:157–229.

47. See note 35, Seok et al. 2013, at 3507.

48. Odom DT, Dowell RD, Jacobsen ES, Gordon W, Danford TW, MacIsaac KD, et al. Tissue-specific transcriptional regulation has diverged significantly between human and mouse. Nature Genetics 2007;39:730–2.

49. Horrobin DF. Modern biomedical research: An internally self-consistent universe with little contact with medical reality? Nature Reviews Drug Discovery 2003;2:151–4.

50. Vassilopoulous S, Esk C, Hoshino S, Funke BH, Chen CY, Plocik AM, et al. A role for the CHC22 clathrin heavy-chain isoform in human glucose metabolism. Science 2009;324:1192–6.

51. See Guttman-Yassky E, Krueger JG. Psoriasis: Evolution of pathogenic concepts and new therapies through phases of translational research. British Journal of Dermatology 2007;157:1103–15. See also The mouse model: Less than perfect, still invaluable. Johns Hopkins Medicine; available at http://www.hopkinsmedicine.org/institute_basic_biomedical_sciences/news_events/articles_and_stories/model_organisms/201010_mouse_model.html (last accessed 10 Dec 2014). See note 23, Gawrylewski 2007. See note 2, Benatar 2007. See note 29, Perrin 2014 and Wilkins et al. 2011. See Cavanaugh S, Pippin J, Barnard N. Animal models of Alzheimer disease: Historical pitfalls and a path forward. ALTEX online first; 2014 Apr 10. And see Woodroofe A, Coleman RA. ServiceNote: Human tissue research for drug discovery. Genetic Engineering and Biotechnology News 2007;27:18.

52. Lane E, Dunnett S. Animal models of Parkinson’s disease and L-dopa induced dyskinesia: How close are we to the clinic? Psychopharmacology 2008;199:303–12.

53. See note 52, Lane, Dunnett 2008.

54. See note 5, Pippin 2013.

55. Bailey J. An assessment of the role of chimpanzees in AIDS vaccine research. Alternatives to Laboratory Animals 2008;36:381–428.

56. Tonks A. The quest for the AIDS vaccine. BMJ 2007;334:1346–8.

57. Johnston MI, Fauci AS. An HIV vaccine—evolving concepts. New England Journal of Medicine 2007;356:2073–81.

58. See Rossouw JE, Andersen GL, Prentice RL, LaCroix AZ, Kooperberf C, Stefanick ML, et al. Risks and benefits of estrogen plus progestin in healthy menopausal women: Principle results from the Women’s Health Initiative randomized controlled trial. JAMA 2002;288:321–33. See also Andersen GL, Limacher A, Assaf AR, Bassford T, Beresford SA, Black H, et al. Effects of conjugated equine estrogen in postmenopausal women with hysterectomy: The Women’s Health Initiative randomized controlled trial. JAMA 2004;291:1701–12.

59. Lemere CA. Developing novel immunogens for a safe and effective Alzheimer’s disease vaccine. Progress in Brain Research 2009;175:83.

60. Allen A. Of mice and men: The problems with animal testing. Slate 2006 June 1; available at http://www.slate.com/articles/health_and_science/medical_examiner/2006/06/of_mice_or_men.html (last accessed 10 Dec 2014).

61. Attarwala H. TGN1412: From discovery to disaster. Journal of Young Pharmacists 2010;2:332–6.

62. See Hogan RJ. Are nonhuman primates good models for SARS? PLoS Medicine 2006;3:1656–7. See also Bailey J. Non-human primates in medical research and drug development: A critical review. Biogenic Amines 2005;19:235–55.

63. See note 4, Wall, Shani 2008.

64. Lemon R, Dunnett SB. Surveying the literature from animal experiments: Critical reviews may be helpful—not systematic ones. BMJ 2005;330:977–8.

65. Roberts I, Kwan I, Evans P, Haig S. Does animal experimentation inform human health care? Observations from a systematic review of international animal experiments on fluid resuscitation. BMJ 2002;324:474–6.

66. See note 60, Allen 2006. See also Heywood R. Target organ toxicity. Toxicology Letters 1981;8:349–58. See Fletcher AP. Drug safety tests and subsequent clinical experience. Journal of the Royal Society of Medicine 1978;71:693–6.

67. See note 60, Allen 2006. See note 5, Pippin 2013. See also Greek R, Greek J. Animal research and human disease. JAMA 2000;283:743–4.

68. See note 60, Allen 2006. See also note 5, Leist, Hartung 2013.

69. Food and Drug Administration. Development & approval process (drugs); available at http://www.fda.gov/Drugs/DevelopmentApprovalProcess/ (last accessed 7 Dec 2014). See also http://www.fda.gov/drugs/resourcesforyou/consumers/ucm143534.htm (last accessed 7 Dec 2014).

70. Drug discovery pipeline. IRSF; available at http://www.rettsyndrome.org/research-programs/programmatic-overview/drug-discovery-pipeline (last accessed 24 Sept 2014).

71. See note 60, Allen 2006.

72. Follow the yellow brick road. Nature Reviews Drug Discovery 2003;2:167, at 167.

73. See note 5, Pippin 2013.

74. For data on aspirin, see Hartung T. Per aspirin ad astra… Alternatives to Laboratory Animals 2009;37(Suppl 2):45–7. See also note 5, Pippin 2013. For data on penicillin, see Koppanyi T, Avery MA. Species differences and the clinical trial of new drugs: A review. Clinical Pharmacology and Therapeutics 1966;7:250–70. See also Schneierson SS, Perlman E. Toxicity of penicillin for the Syrian hamster. Proceedings of the Society for Experimental Biology and Medicine 1956;91:229–30.

75. See note 67, Greek, Greek 2000.

76. Oral bioavailability of blockbuster drugs in humans and animals. PharmaInformatic; available at http://www.pharmainformatic.com/html/blockbuster_drugs.html (last accessed 19 Sept 2014).

77. Sams-Dodd F. Strategies to optimize the validity of disease models in the drug discovery process. Drug Discovery Today 2006;11:355–63.

78. Zurlo J. No animals harmed: Toward a paradigm shift in toxicity testing. Hastings Center Report 2012;42(Suppl):s23–6.

79. There is no direct analysis of the amount of money spent on animal testing versus alternatives across all categories; however, in 2008 the Chronicle of Higher Education reported that the National Institutes of Health’s (NIH’s) funding of research involving animals (under basic research) had remained steady at about 42 percent since 1990. See Monastersky R. Protesters fail to slow animal research. Chronicle of Higher Education 2008:54. In 2012, NIH director Francis Collins noted that the NIH’s support for basic research has held steady at 54 percent of the agency’s budget for decades. Much of the remainder of the NIH’s budget goes to clinical research, suggesting that preclinical human-based testing methods receive far less funding. See also Wadman M. NIH director grilled over translational research centre. Nature News Blog 2012 Mar 20. Available at http://blogs.nature.com/news/2012/03/nih-director-grilled-over-translational-research-center.html (last accessed 5 Mar 2015). There are no data suggesting that the NIH’s funding of animal experimentation has decreased. A 2010 analysis estimates that at least 50 percent of the NIH’s extramural funding is directed into animal research; see Greek R, Greek J. Is the use of sentient animals in basic research justifiable? Philosophy, Ethics, and Humanities in Medicine 2010;5:14.

80. For a helpful discussion on animal pain, fear, and suffering, see DeGrazia D. Taking Animals Seriously: Mental Lives and Moral Status. New York: Cambridge University Press; 1996:116–23.

81. See Akhtar A. Animals and Public Health: Why Treating Animals Better Is Critical to Human Welfare. Hampshire, UK: Palgrave Macmillan; 2012:chap. 5.

06 November 2023

How big is science’s fake-paper problem?

Richard Van Noorden


The scientific literature is polluted with fake manuscripts churned out by paper mills — businesses that sell bogus work and authorships to researchers who need journal publications for their CVs. But just how large is this paper-mill problem?

Nature 623, 466–467 (2023)

doi: https://doi.org/10.1038/d41586-023-03464-x

