Scientific papers need better feedback systems. Here's why

The current peer-review system is limited to asking two people for their opinions - this is not enough

Somewhere between 65 and 90 per cent of biomedical literature is considered non-reproducible. This means that if you try to reproduce an experiment described in a given paper, 65 to 90 per cent of the time you won't get the same findings. We call this the reproducibility crisis.

The issue came to prominence thanks to a study by Glenn Begley, who ran the oncology department at Amgen, a pharmaceutical company. In 2011, Begley decided to try to reproduce the findings in 53 foundational papers in oncology: highly cited papers published in the top journals. He was unable to reproduce 47 of them - 89 per cent [1].

Bayer, another pharmaceutical company, reported in the same year that 65 per cent of the findings in papers it tried to reproduce could not be reproduced [2]. Reproducibility has also emerged as a concern in psychology and computer science [3].

What causes these high levels of non-reproducibility in scientific literature? A variety of factors are at play. It is often hard, when you read a paper, to work out how the study was performed, because the methods may be poorly described or the data set may not be available. That makes the study hard to reproduce. Another factor is that scientists writing papers do not have a strong enough incentive to ensure that their papers are reproducible. A result might be a fluke, and there is little incentive, beyond one's own integrity, to check whether it is.

A third factor is that when determining how good an article is, our current peer-review system is limited to asking two people for their opinions. Two people is not enough.

I argue that the way to fix the reproducibility crisis is to build a new peer-review system. It should do two things: first, for a given paper, it should gather the views of every academic who reads it, not just the two peer reviewers who read it for the journal; second, it should give credit to academics who share code, data sets and other materials that don't get published in journals.

On the first point, the way peer review works today is that a journal editor receives a submission, then asks two academics to review it. The article is then published and read by the larger academic community: depending on the paper, 500 or 1,000 people may read it. What did they think? This is lost data that a peer-review system should pick up. Academics, as they read a paper, are noticing aspects of it that they deem robust, and aspects they deem weak. The more of these opinions we can surface, the more we can develop a view of the strength of an individual paper.

When a weakness is identified, the author will have the opportunity to respond, and this back and forth will lead to a more complete understanding of the paper's methods.

On the second point - rewarding academics for sharing source data and other material - journals currently don't like to publish things like data sets, code, replications of other people's work, failed replications or negative results. Why is this the case? Largely because journals are obsessed with impact factor: roughly, the average number of citations that a journal's papers attract in the two years after publication. Data sets and code don't get cited much, so they drag down the average citation count, and therefore the impact factor.
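
To make that arithmetic concrete, here is a minimal sketch of how adding rarely cited items such as data sets and code drags down a journal's average citations per published item - the quantity the impact factor approximates. The citation counts are hypothetical, invented purely for illustration.

```python
# Hypothetical citation counts, used only to illustrate the averaging effect.
research_articles = [30, 22, 18, 40, 25]   # citations to five research papers
datasets_and_code = [1, 0, 2]              # citations to three data/code items

articles_only = sum(research_articles) / len(research_articles)
everything = (sum(research_articles) + sum(datasets_and_code)) / (
    len(research_articles) + len(datasets_and_code)
)

print(f"average citations, articles only:           {articles_only:.1f}")  # 27.0
print(f"average citations, with data sets and code: {everything:.1f}")     # 17.2
```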

Journals obsess about impact factor because academics obsess about it. And academics obsess about it because tenure and grant committees obsess about it. Impact factor is part of how an academic is evaluated for jobs and grants.

We need to break out of this loop. We need an academic to be able to go to his or her tenure committee and say, "Here are all the data sets and pieces of code I have published. I haven't published them in journals, because, as we know, journals don't publish this material. But you can see that each of them has multiple recommendations from senior academics on various platforms, so these endorsements from the scientific community should count for something."

There are various companies working on these new peer-review systems. My own, Academia.edu, is a social network of academics who comment on and recommend each other's work. Each paper on Academia.edu is assigned a score - a "PaperRank" - based on how many recommendations the paper has received, weighted by how well recommended the recommender is. The publisher F1000 and Pubpeer, an online journal club, allow academics to write reviews of papers (anonymously on Pubpeer, non-anonymously on F1000). ResearchGate is a social network of academics that lets them put questions to authors.
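
For readers curious what a recommendation-weighted score could look like, here is a minimal sketch in Python. It assumes a PageRank-style recursion in which an academic's weight is derived from how well their own papers score; the exact PaperRank formula isn't described here, so the function, data structures and numbers below are illustrative assumptions rather than Academia.edu's implementation.

```python
# A minimal, illustrative recommendation-weighted scorer. The recursion
# (a recommender's weight comes from how well their own papers score) is an
# assumption for the sake of the sketch, not Academia.edu's actual PaperRank.

def paper_scores(recommendations, authorship, iterations=20):
    """recommendations: {paper: [recommending academic, ...]}
       authorship:      {academic: [paper they authored, ...]}"""
    papers = set(recommendations) | {p for ps in authorship.values() for p in ps}
    score = {p: 1.0 for p in papers}         # neutral starting score for each paper
    weight = {a: 1.0 for a in authorship}    # neutral starting weight for each academic

    for _ in range(iterations):
        # A paper's raw score is the summed weight of the academics who recommend it.
        score = {p: sum(weight.get(r, 1.0) for r in recommendations.get(p, []))
                 for p in papers}
        # Normalise so scores stay comparable from one iteration to the next.
        total = sum(score.values()) or 1.0
        score = {p: s / total for p, s in score.items()}
        # An academic's weight grows with how well their own papers score.
        weight = {a: 1.0 + sum(score[p] for p in ps) for a, ps in authorship.items()}
    return score

print(paper_scores(
    recommendations={"paper_A": ["alice", "bob"], "paper_B": ["bob"]},
    authorship={"alice": ["paper_B"], "bob": []},
))
```

The normalisation step simply keeps the numbers comparable between iterations; a real system would also have to handle new papers with no recommendations yet, self-recommendation and how strongly a recommender's standing should count.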

Work is also being done to highlight the scale of the reproducibility crisis. For example, a project called the Reproducibility Initiative is examining reproducibility levels in multiple scientific fields [4].

As these peer-review platforms get built out over the next few years, and others get invented, ideally a more robust consensus will emerge from the scientific community about a given paper, based on a wider pool of expertise.

And ideally we will also see academics rewarded for sharing data sets, code and other material.

Richard Price is the founder of Academia.edu and previously a fellow of All Souls College, Oxford

1. In cancer science, many “discoveries” don’t hold up

2. Reliability of ‘new drug target’ claims called into question

3. Collberg, C.; Proebsting, T.; Warren, A. (2015) Repeatability and Benefaction in Computer Systems Research

4. Reproducibility Initiative

This article was originally published by WIRED UK