Friday, October 27, 2017

The End of an Error?

In 1998 the Lancet, one of Elsevier’s most prestigious journals, published a paper by Andrew Wakefield and twelve colleagues that suggested a link between the MMR vaccine and autism. Further studies were quickly carried out, which failed to confirm such a link. In 2004, ten of the twelve co-authors publicly dissociated themselves from the claims in the paper, but it was not until 2010 that the paper was formally retracted by the Lancet, soon after which Wakefield was struck off the UK medical register.

A few years after Wakefield’s article was published, the Russian mathematician Grigori Perelman claimed to have proved Thurston’s geometrization conjecture, a result that gives a complete description of mathematical objects known as 3-manifolds, and in the process proves a conjecture due to PoincarĂ© that was considered so important that the Clay Mathematics Institute had offered a million dollars for a solution. Perelman did not submit a paper to a journal; instead, in 2002 and 2003 he simply posted three preprints to the arXiv, a preprint server used by many theoretical physicists, mathematicians and computer scientists. It was difficult to understand what he had written, but such was his reputation, and such was the importance of his work if it was to be proved right, that a small team of experts worked heroically to come to grips with it, correcting minor errors, filling in parts of the argument where Perelman had been somewhat sketchy, and tidying up the presentation until it finally became possible to say with complete confidence that a solution had been found. For this work Perelman was offered a Fields Medal and the million dollars, both of which he declined.

A couple of months ago, Norbert Blum, a theoretical computer scientist from Bonn, posted to the arXiv a preprint claiming to have answered another of the Clay Mathematics Institute’s million-dollar questions, the P versus NP problem. Like Perelman, Blum was an established and respected researcher. The preprint was well written, and Blum made clear that he was aware of many of the known pitfalls that await anybody who tries to solve the problem, giving careful explanations of how he had avoided them. So the preprint could not simply be dismissed as the work of a crank. After a few days, however, by which time several people had pored over the paper, a serious problem came to light: one of the key statements on which Blum’s argument depended directly contradicted a known (but not at all obvious) result. Soon after that, a clear understanding was reached of exactly where he had gone wrong, and a week or two later he retracted his claim.

These three stories are worth bearing in mind when people talk about how heavily we rely on the peer review system. It is not easy to have a paper published in the Lancet, so Wakefield’s paper presumably underwent a stringent process of peer review. As a result, it received a very strong endorsement from the scientific community. This gave a huge impetus to anti-vaccination campaigners and may well have led to hundreds of preventable deaths. By contrast, the two mathematics preprints were not peer reviewed, but that did not stop the correctness or otherwise of their claims being satisfactorily established.

An obvious objection to that last sentence is that the mathematics preprints were in fact peer-reviewed. They may not have been sent to referees by the editor of a journal, but they certainly were carefully scrutinized by peers of the authors. So to avoid any confusion, let me use the phrase “formal peer review” for the kind that is organized by a journal and “informal peer review” for the less official scrutiny that is carried out whenever an academic reads an article and comes to some sort of judgement on it. My aim here is to question whether we need formal peer review. It goes without saying that peer review in some form is essential, but it is much less obvious that it needs to be organized in the way it usually is today, or even that it needs to be organized at all.

What would the world be like without formal peer review? One can get some idea by looking at what the world is already like for many mathematicians. These days, the arXiv is how we disseminate our work, and the arXiv is how we establish priority. A typical pattern is to post a preprint to the arXiv, wait for feedback from other mathematicians who might be interested, post a revised version of the preprint, and send the revised version to a journal. The time between submitting a paper to a journal and its appearing is often a year or two, so by the time it appears in print, it has already been thoroughly assimilated. Furthermore, looking a paper up on the arXiv is much simpler than grappling with most journal websites, so even after publication it is often the arXiv preprint that is read and not the journal’s formatted version. Thus, in mathematics at least, journals have become almost irrelevant: their main purpose is to provide a stamp of approval, and even then one that gives only an imprecise and unreliable indication of how good a paper actually is. (...)

Defences of formal peer review tend to focus on three functions it serves. The first is that it is supposed to ensure reliability: if you read something in the peer-reviewed literature, you can have some confidence that it is correct. This confidence may fall short of certainty, but at least you know that experts have looked at the paper and not found it obviously flawed.

The second is a bit like the function of film reviews. We do not want to endure a large number of bad films in order to catch the occasional good one, so we leave that to film critics, who save us time by identifying the good ones for us. Similarly, a vast amount of academic literature is being produced all the time, most of it not deserving of our attention, and the peer-review system saves us time by selecting the most important articles. It also enables us to make quick judgements about the work of other academics: instead of actually reading the work, we can simply look at where it has been published.

The third function is providing feedback. If you submit a serious paper to a serious journal, then whether or not it is accepted, it has at least been read, and if you are lucky you receive valuable advice about how to improve it. (...)

It is not hard to think of other systems that would provide feedback, but it is less clear how they could become widely adopted. For example, one common proposal is to add (suitably moderated) comment pages to preprint servers. This would allow readers of articles to correct mistakes, make relevant points that are missing from the articles, and so on. Authors would be allowed to reply to these comments, and also to update their preprints in response to them. However, attempts to introduce systems like this have not, so far, been very successful, because most articles receive no comments. This may be partly because only a small minority of preprints are actually worth commenting on, but another important reason is that there is no moral pressure to do so. Throwing away the current system risks throwing away all the social capital associated with it and leaving us impoverished as a result. (...)
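To make the proposal a little more concrete, here is a minimal sketch in Python of the kind of data model a moderated comment system for a preprint server might rest on. It is purely illustrative: the class names, fields and workflow are my own assumptions, not a description of the arXiv or of any existing service.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Comment:
    """A reader's comment on a particular version of a preprint."""
    author: str
    body: str
    preprint_version: int
    approved: bool = False  # a moderator must approve the comment before it is shown
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    replies: List["Comment"] = field(default_factory=list)


@dataclass
class Preprint:
    """A preprint with successive revisions and a moderated comment thread."""
    identifier: str
    title: str
    versions: List[str] = field(default_factory=list)
    comments: List[Comment] = field(default_factory=list)

    def post_revision(self, summary: str) -> int:
        """Authors post a revised version, typically in response to comments."""
        self.versions.append(summary)
        return len(self.versions)

    def submit_comment(self, comment: Comment) -> None:
        """Readers attach comments to the version they read; moderation happens afterwards."""
        self.comments.append(comment)

    def visible_comments(self) -> List[Comment]:
        """Only moderator-approved comments are displayed."""
        return [c for c in self.comments if c.approved]


# A toy illustration of the workflow described above (hypothetical names throughout).
paper = Preprint(identifier="example/0001", title="An illustrative preprint")
v1 = paper.post_revision("Version 1: initial posting")
remark = Comment(author="a reader",
                 body="Lemma 3 seems to need an extra hypothesis.",
                 preprint_version=v1)
paper.submit_comment(remark)
remark.approved = True  # a moderator checks that the comment is relevant and civil
remark.replies.append(Comment(author="the author",
                              body="Agreed; fixed in version 2.",
                              preprint_version=v1,
                              approved=True))
paper.post_revision("Version 2: hypothesis of Lemma 3 corrected")
print([c.body for c in paper.visible_comments()])
```

The only design point the sketch tries to capture is the one made in the paragraph above: comments are tied to a specific version, pass through a moderator before becoming visible, and can prompt authors to reply or to post a revision. None of this addresses the harder problem, which is social rather than technical: getting people to write the comments in the first place.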

Why does any of this matter? Defenders of formal peer review usually admit that it is flawed, but go on to say, as though it were obvious, that any other system would be worse. But it is not obvious at all. If academics put their writings directly online and systems were developed for commenting on them, one immediate advantage would be a huge amount of money saved. Another would be that we would actually get to find out what other people thought about a paper, rather than merely knowing that somebody had judged it to be above a certain not very precise threshold (or not knowing anything at all if it had been rejected). We would be pooling our efforts in useful ways: for instance, if a paper had an error that could be corrected, this would not have to be rediscovered by every single reader.

by Timothy Gowers, TLS | Read more:
Image: “Perelman-PoincarĂ©” by Roberto Bobrow, 2010
[ed. Crowdsourcing peer review. Why not? Possibly because the current dysfunctional scientific-journal business, with its outsized influence on what gets published and therefore what is deemed important, might be threatened?]