UPDATE: comments welcome on Dr. Richard Tol’s draft paper on this issue, see below. This will be a top post for a day, new posts will appear below this one – Anthony
“Men, it has been well said, think in herds; it will be seen that they go mad in herds, while they only recover their senses slowly, and one by one.”
That is from Charles Mackay in his book, Extraordinary Popular Delusions and the Madness of Crowds, first published in 1841.
I think it is an apt description of the process that led to Cook et al. (2013), “Quantifying the consensus on anthropogenic global warming in the scientific literature,” because that paper is, in fact, a product of a crowd evaluating a crowd. As an example, Dr. Richard Tol has just discovered that, using Cook’s own data, the consensus number Cook should have published is 98%, rather than 97%.
Dr. Tol writes in a critique of the Cook et al. paper:
In fact, the paper by Cook et al. may strengthen the belief that all is not well in climate research. For starters, their headline conclusion is wrong. According to their data and their definition, 98%, rather than 97%, of papers endorse anthropogenic climate change. While the difference between 97% and 98% may be dismissed as insubstantial, it is indicative of the quality of manuscript preparation and review.
He shows the Cook data as he compiles it:
You’d think such simple, elementary errors in data would have been caught in peer review. After all, that is what peer review is for.
I think Cook and his crowd had a goal, and that goal was to match the 97% number that has become a popular meme in the literature and the media. This intent seems confirmed by a recent statement by one of the co-authors, Dana Nuccitelli, in a media argument titled “97% global warming consensus meets resistance from scientific denialism”:
However, we have used two independent methods and confirmed the same 97% consensus as in previous studies.
It is that branding of Dr. Tol’s criticism as “denialism” by Nuccitelli, when Tol is hardly a “denier” on climate change even by the loosest definition, that has given Tol incentive to now start systematically deconstructing the paper. It also opens a window into the mind of co-author Nuccitelli, who can’t seem to assimilate useful criticisms, no matter how valid, but instead publicly attributes the discovery of real errors in the Cook et al. paper to “denialism” rather than to the self-correcting process of science. Nuccitelli’s actions suggest to me a mindset of zealotry, rather than one of discovery. His branding of valid criticisms from Dr. Tol and others seems to fit the textbook definition of the word:
As an aside, it seems truly laughable that the Guardian has created an entire regular opinion column based on and named after this 97% number, and it supports the idea that this was the “target number” rather than the number that the actual data would report. Richard Tol has just proven their own data doesn’t even match the title of their paper. Will the Guardian now correct the title?
Tol goes on to say this about the crowd-sourcing:
The results thus depend on the quality of the volunteers. Are they neutral observers, or are they predisposed to endorsing or rejecting anthropogenic climate change? Did they suffer from fatigue after rating a certain number of abstracts? 12 volunteers rated on average 50 abstracts each, and another 12 volunteers rated an average of 1922 abstracts each. Fatigue may well have [been] a problem. This level of effort by a volunteer could indicate a strong interest in the issue at hand.
Indeed, and he backs this up by saying it is evident in the data:
WoS generates homoskedastic data. Rating made the data heteroskedastic. Sign of tiredness or manipulation.
So which is it? Tiredness or manipulation, or perhaps both? Based on what has been observed so far, I’d say there is a combination, but given the obvious 97% target, it is more likely an unconscious manipulation by the chosen crowd of volunteer reviewers, which included no climate skeptics and consisted mostly of insiders from Cook’s antithetically named website, “Skeptical Science”. Tol goes on to comment:
No neutral person would volunteer to do 1922 tasks. Cook’s data duly show bias: 35% of abstract[s] were misclassified, 99% towards endorsement.
To support the idea that bias played a role in reaching the conclusions of the Cook et al. paper, there seems to be a systemic sloppiness in the sampling process, as Tol points out in his critique:
In fact, 34.6% of papers that should have been rated as neutral were in fact rated as non-neutral. Of those misrated papers, 99.4% were rated as endorsements. It is therefore reasonable to assume that the volunteers were not neutral, but tended to find endorsements where there were none. Because rater IDs were not reported, it is not possible to say whether all volunteers are somewhat biased or a few were very biased.
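To make the arithmetic in that passage concrete, here is a quick sketch in Python. The variable names are mine, and the only inputs are the two percentages Tol reports; it simply multiplies them out to show what share of the papers that should have been neutral ended up on each side.

```python
# Figures quoted from Tol's critique above; nothing else is assumed.
misrated_share = 0.346   # share of should-be-neutral papers rated non-neutral
endorse_share = 0.994    # share of those misrated papers rated as endorsements

# Share of should-be-neutral papers that were pushed in each direction.
to_endorsement = misrated_share * endorse_share
to_rejection = misrated_share * (1 - endorse_share)

print(f"rated as endorsement: {to_endorsement:.1%}")  # 34.4%
print(f"rated as rejection:   {to_rejection:.1%}")    # 0.2%
```

On those figures, roughly a third of the papers that should have been neutral were counted as endorsements, while only about one in five hundred went the other way.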
Tol also says this about the 97% scientific consensus claim:
It is a strange claim to make. Consensus or near-consensus is not a scientific argument. Indeed, the heroes in the history of science are those who challenged the prevailing consensus and convincingly demonstrated that everyone thought wrong. Such heroes are even better appreciated if they take on not only the scientific establishment but the worldly and godly authorities as well.
Well-known examples include the challenge to the theory that Earth was the center of the universe, and the once-resisted findings that infection was spread by surgeons who didn’t wash their hands, that the Earth’s crust has plates that move, and that gastric ulcers are caused by a bacterial infection, not by stress as physicians once widely believed. As William Briggs writes:
There was once a consensus among astronomers that the heavens were static, that the boundaries of the universe constant. But in 1929, Hubble observed his red shift among the stars, overturning that consensus. In 1904, there was a consensus among physicists that Newtonian mechanics was, at last, the final word in explaining the workings of the [universe]. All that was left to do was to mop up the details. But in 1905, Einstein and a few others soon convinced them that this view was false.
Consensus can also cause disaster, as NASA proved with a management consensus that solid rocket booster O-rings affected by unusual cold weren’t worth worrying about, and later that a foam strike during launch wouldn’t damage the wing of the space shuttle and was “not even worth mentioning”.
Clearly, the power of thousands in agreement on a scientific consensus can’t stand up to stubborn facts. That is the self-correcting process of science, which sometimes works slowly and other times dramatically quickly. Given that consensus by itself means nothing in the face of such facts, it seems to me that consensus is just another manifestation of herd-like thinking, as illustrated by Mackay.
From the Amazon summary of Mackay’s insightful book on crowds:
First published in 1841, Extraordinary Popular Delusions and the Madness of Crowds is often cited as the best book ever written about market psychology. Author Charles Mackay chronicles many celebrated financial manias, or ‘bubbles’, which demonstrate his assertion that “every age has its peculiar folly; some scheme, project, or fantasy into which it plunges, spurred on by the love of gain, the necessity of excitement, or the mere force of imitation.” This still holds fast today! Among the alleged ‘bubbles’ described by Mackay is the infamous Dutch tulip mania, the South Sea Company bubble and the Mississippi Company bubble. And what do bubbles do? Why they burst of course.
The Cook et al. paper bubble is about to burst.
UPDATE: Read the draft paper Tol is working on here, comments welcome: