Does peer review fail to spot outstanding research?

A paper by Siler et al was published last week which attracted quite a bit of attention among those of us who take an interest in scientific publishing and the peer review process. It looked at the citation counts of papers that had been submitted to 3 high-impact medical journals and subsequently published, either in one of those 3 journals or, if rejected, in another journal.

The accompanying press release from the publisher told us that “scientific peer review may have difficulties identifying unconventional and/or outstanding work”. This wasn’t too far off what was claimed in the paper, where Siler et al concluded that their work suggested that peer review “had difficulties in identifying outstanding or breakthrough work”.

The press release was reported uncritically by several organisations that should have known better, including Science, Nature, and Retraction Watch.

It’s an interesting theory. It goes like this: peer reviewers don’t like to get out of their comfort zone, and while they may give good reviews to small incremental advances in their field, they don’t like radical new research that breaks new ground, so such research may be rejected.

The only problem with this theory is that Siler et al’s paper provides absolutely no data to support it.

Let’s look at what they did. They examined 1008 manuscripts that were submitted to 3 top-tier medical journals (Annals of Internal Medicine, British Medical Journal, and The Lancet). Most of those papers were rejected, but subsequently published in other journals. Siler et al then tracked the papers to see how many times each one was cited.

Now, there we have our first problem. Using the number of times a paper is cited as a measure of groundbreaking research is pretty crude. Papers can be highly cited for many reasons, and presenting groundbreaking research is only one of them. I am writing this blogpost on the same day that I found that the 6th most important paper of the year according to “Altmetrics” (think of it as citation counting for the Facebook generation) was about how long it takes for boxes of chocolates on hospital wards to be eaten. A nicely conducted and amusing piece of research, to be sure, but hardly breaking new frontiers in science.

There’s also something rather fishy about the citation numbers reported in the paper. The group of papers with the lowest citation rate were cited an average of 69.8 times each. That’s an extraordinarily high number. Of the 3 top-tier journals studied, The Lancet has the highest impact factor, at 39.2, which means that papers published in The Lancet over the preceding two years were cited an average of 39.2 times each in a single year. Doesn’t it seem rather odd that papers rejected from it are cited almost twice as often? I’m not sure what to make of that, but it does make me wonder if there is a problem with data quality.

Anyway, the main piece of evidence used to support the idea that peer review was bad at recognising outstanding research is that the 14 most highly cited papers of the 1008 papers examined were rejected by the 3 top journals. The first problem with that is that 12 of those 14 were rejected by the journals’ in-house editorial staff without being sent for peer review. So even if there were no further problems with the paper, we couldn’t draw any conclusions about failings of peer review: the failings would be down to journals’ in-house staff.

Another problem is that those 14 papers were not, of course, rejected by the peer review system. They were all published in peer reviewed journals: just not the first journal that the authors tried. So we really can’t conclude that peer review is preventing groundbreaking work from being published.

But in any case, even if we ignore those flaws and ask whether it is still true that groundbreaking (or at least highly cited) research is being rejected, I think we’d want to know whether highly cited research is more likely to be rejected than other research.

And I’m afraid the evidence for that is totally lacking.

Rejecting the top 14 papers sounds bad. But it’s important to realise that the overall rejection rate was very high: only 6.2% of the papers submitted were accepted. If the probability of accepting each of the top 14 papers was 6.2%, like all the others, then there is about a 40% chance that all 14 of them would be rejected. And that is ignoring the fact that looking specifically at the top 14 papers is a post-hoc analysis. The only robust way to see if the more highly cited papers were more likely to be rejected would have been to specify the hypothesis in advance, rather than to focus on what came out of the data as being the most impressive statistic.
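To make the arithmetic explicit, here is a quick back-of-the-envelope calculation in Python (my own sketch, assuming each of the top 14 papers independently faced the same 6.2% acceptance probability as every other submission):

acceptance_rate = 0.062   # overall acceptance rate reported by Siler et al
n_top_papers = 14         # the 14 most highly cited papers

# probability that every one of the 14 is rejected, if each faces the
# same acceptance probability as any other submission
p_all_rejected = (1 - acceptance_rate) ** n_top_papers
print(round(p_all_rejected, 2))   # 0.41, i.e. roughly a 40% chance

In other words, all 14 of the top papers being rejected is entirely consistent with their having been treated just like everyone else’s.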

So, to recap, this paper used a crude measure of whether papers were groundbreaking, did not look at what peer reviewers thought of them, found precisely zero high-impact articles that were rejected by the peer review system, and found no evidence whatsoever that high-impact articles were more likely to be rejected than any others.

Call me a cynic if you like, but I’m not convinced. The peer review process is not perfect, of course. But if you want to convince me that one of its flaws is that it is biased against groundbreaking research, you’re going to have to come up with better evidence than Siler et al’s paper.

 
