Bjørn Lomborg writes on his Facebook Page
Ugh. Do you remember the “97% consensus”, which even Obama tweeted?
Turns out the authors don’t want to reveal their data.
It has always been a dodgy paper (http://iopscience.iop.org/1748-9326/8/2/024024/article). Virtually everyone I know in the debate would automatically be included in the 97% (including me, but also many who are much more skeptical).
The paper looks at about 12,000 papers written over the last 25 years (the paper doesn’t actually specify the numbers; see http://notalotofpeopleknowthat.wordpress.com/2013/07/12/watch-the-pea/). It ditches about 8,000 papers because they don’t take a position.
They put the papers that agree into three different bins: 1.6% that explicitly endorse global warming with numbers, 23% that explicitly endorse global warming without numbers, and 74% that “implicitly endorse” because they look at other issues related to global warming, which supposedly must mean their authors agree with human-caused global warming.
Voilà, you get about 97% (actually 98% here, but because the authors haven’t released the numbers themselves, we have to rely on other quantitative assessments).
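As a quick check, here is a minimal arithmetic sketch of how those bins yield the headline figure, using the round numbers quoted above rather than the paper’s exact counts:

```python
# Back-of-the-envelope check of the headline consensus figure.
# All numbers are the round figures quoted above, not the paper's exact counts.
total_papers = 12_000   # abstracts examined
no_position = 8_000     # ditched for taking no position
rated = total_papers - no_position
print(f"Position-taking abstracts: about {rated:,}")  # ~4,000

# The three endorsement bins, as shares of the position-taking papers
explicit_with_numbers = 0.016    # explicitly endorse, with quantification
explicit_without_numbers = 0.23  # explicitly endorse, without quantification
implicit = 0.74                  # "implicitly endorse"

endorse_share = explicit_with_numbers + explicit_without_numbers + implicit
print(f"Endorsing share: {endorse_share:.1%}")
# -> 98.6%, i.e. the "about 97% (actually here 98%)" headline
```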
Notice that *nobody* said anything about *dangerous* global warming; this meme simply got attached afterwards (by Obama and many others).
Now, Richard Tol has tried to replicate their study and it turns out they have done pretty much everything wrong. And they don’t want to release the data so anyone else can check it. Outrageous.
Read Tol’s letter to Peter Høj, Vice-Chancellor of the University of Queensland: “the main finding of the paper is incorrect, invalid and unrepresentative.” (http://www.uq.edu.au/about/vice-chancellor)
It would be hilarious if it wasn’t so sad.
==============================================================
Dear Professor Høj,
I was struck by a recent paper published in Environmental Research Letters with John Cook, a University of Queensland employee, as the lead author. The paper purports to estimate the degree of agreement in the literature on climate change. Consensus is not an argument, of course, but my attention was drawn to the fact that the headline conclusion had no confidence interval, that the main validity test was informal, and that the sample contained a very large number of irrelevant papers while simultaneously omitting many relevant papers.
My interest piqued, I wrote to Mr Cook asking for the underlying data and received 13% of the data by return email. I immediately requested the remainder, but to no avail.
I found that the consensus rate in the data differs from that reported in the paper. Further research showed that, contrary to what is said in the paper, the main validity test in fact invalidates the data. And the sample of papers does not represent the literature. That is, the main finding of the paper is incorrect, invalid and unrepresentative.
Furthermore, the data showed patterns that cannot be explained by either the data gathering process as described in the paper or by chance. This is documented at https://docs.google.com/file/d/0Bz17rNCpfuDNRllTUWlzb0ZJSm8/edit?usp=sharing
I asked Mr Cook again for the data so as to find a coherent explanation of what is wrong with the paper. As that was unsuccessful, also after a plea to Professor Ove Hoegh-Guldberg, the director of Mr Cook’s work place, I contacted Professor Max Lu, deputy vice-chancellor for research, and Professor Daniel Kammen, journal editor. Professors Lu and Kammen succeeded in convincing Mr Cook to release first another 2% and later another 28% of the data.
I also asked for the survey protocol but, violating all codes of practice, none seems to exist. The paper and data do hint at what was really done. There is no trace of a pre-test. Rating training was done during the first part of the survey, rather than prior to the survey. The survey instrument was altered during the survey, and abstracts were added. Scales were modified after the survey was completed. All this introduced inhomogeneities into the data that cannot be controlled for as they are undocumented.
The later data release reveals that what the paper describes as measurement error (in either direction) is in fact measurement bias (in one particular direction). Furthermore, there is drift in measurement over time. This makes a greater nonsense of the paper.
This is documented here http://richardtol.blogspot.co.uk/2013/08/the-consensus-project-update.html and http://richardtol.blogspot.co.uk/2013/08/biases-in-consensus-data.html.
I went back to Professor Lu once again, asking for the remaining 57% of the data. Particularly, I asked for rater IDs and time stamps. Both may help to understand what went wrong.
Only 24 people took the survey. Of those, 12 quickly dropped out, so that the survey essentially relied on just 12 people. The results would be substantially different if only one of the 12 were biased in one way or the other. The paper does not report any test for rater bias, an astonishing oversight by authors and referees. If rater IDs are released, these tests can be done.
Because so few took the survey, these few answered on average more than 4,000 questions. The paper is silent on the average time taken to answer these questions and, more importantly, on the minimum time. Experience has shown that interviewees find it difficult to stay focused if a questionnaire is overly long. The questionnaire used in this paper may have set a record for length, yet neither the authors nor the referees thought it worthwhile to test for rater fatigue. If time stamps are released, these tests can be done.
Mr Cook, backed by Professors Hoegh-Guldberg and Lu, has blankly refused to release these data, arguing that a data release would violate confidentiality. This reasoning is bogus.
I don’t think confidentiality is relevant. The paper presents the survey as a survey of published abstracts, rather than as a survey of the raters. If these raters are indeed neutral and competent, as claimed by the paper, then tying ratings to raters would not reflect on the raters in any way.
If, on the other hand, this was a survey of the raters’ beliefs and skills, rather than a survey of the abstracts they rated, then Mr Cook is correct that their identity should remain confidential. But this undermines the entire paper: It is no longer a survey of the literature, but rather a survey of Mr Cook and his friends.
If need be, the association of ratings to raters can readily be kept secret by means of a standard confidentiality agreement. I have repeatedly stated that I am willing to sign an agreement that I would not reveal the identity of the raters and that I would not pass on the confidential data to a third party either on purpose or by negligence.
I first contacted Mr Cook on 31 May 2013, requesting data that should have been ready when the paper was submitted for peer review on 18 January 2013. His foot-dragging, condoned by senior university officials, does not reflect well on the University of Queensland’s attitude towards replication and openness. His refusal to release all data may indicate that more could be wrong with the paper.
Therefore, I hereby request, once again, that you release rater IDs and time stamps.
Yours sincerely,
Richard Tol
http://richardtol.blogspot.co.uk/2013/08/open-letter-to-vice-chancellor-of.html
Jorge says:
August 28, 2013 at 1:52 pm
KNR, it’s “effective” to liberals because they repeat it…. Now it means your energy bills go up by 20%, regulations stifle business and kill jobs. So it’s not as appealing to people anymore, and the “true believers” have responded to these people peeling away with even MORE HYSTERICS. …
>>>>>>>>>>>>>>>>
The problem for the masses is there are two different mindsets. One is the philosophy of Karl Marx and Hegel: if I REALLY REALLY BELIEVE that it is TRUE (and I click the ruby slippers three times), IT WILL COME TRUE. For those not insulated inside government bureaucracy and Academia, Mother Nature has a tendency to whomp you upside the head with reality so you lose this belief, especially the belief in the pure Hegelian philosophy.
The IPCC and the CO2 climate-control knob have to be viewed with those philosophies in mind. From their point of view, “the basic thesis, antithesis, and synthesis” has already occurred (“We have a Consensus”); therefore it is time to move on to the implementation phase. This is why those of us still stuck in the “thesis, antithesis” phase are called D*ni*rs. It is not that we deny climate change but that we deny the PROCESS of reaching a “Consensus”.
You can see the fingerprints of these philosophies in this NUSAP (Numeral, Unit, Spread, Assessment, Pedigree) definition.
So hard scientists “…need to recover from the mindset they might absorb unconsciously from their instruction…” That is, the philosophy embedded in the scientific method must give way to the ‘New Philosophy’.
The fact that most people have no training in science or logic and very little training in math makes this ‘Consensus Process’ or ‘Post-Normal Science’ with its appeal to authority a very strong argument. It is only when reality intrudes and bites them on the behind that they reluctantly engage the brain and question what the heck is going on.
> It is no longer a survey of the literature, but rather a survey of Mr Cook and his friends.
You can say that again!
If it weren’t for its being believed by uninformed people, with regrettable results, I’d say I wouldn’t give a rat’s sphincter even if 97 percent did agree with the CAGW meme and Cook’s conclusion were correct. As Einstein said, it only takes one experiment (and, by extension, one person) to prove a theory wrong. And we’ve got lots of those experiments and persons on the skeptic side.
No amount of consensus can alter a physical fact.
4,000 questions? Let’s say I can answer 10 questions a minute; that’s 400 minutes, or 6 2/3 hours, to complete the questionnaire. If the questions were something along the lines of “which of these three numbers is largest”, then I think an average person could achieve 10 answers per minute, maybe a bit better.
But I thought the questions were basically asking the respondents to place each paper into a category. So presumably, the respondent would need to read the abstract, comprehend it, think about it a little bit, and then place it in a category. Let’s say that takes 10 minutes per abstract. Now we’re talking 40,000 minutes, or 667 hours, or about 100 days, assuming you put about 7 hours per day into it.
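For anyone who wants to check that arithmetic, here is a minimal sketch; the per-question rates are the assumptions above, not figures from the paper:

```python
# Sanity check of the timing arithmetic above. The answering rates are
# assumptions from this comment, not data from Cook et al.
questions = 4_000

# Fast case: trivial questions, answered at 10 per minute
fast_minutes = questions / 10
print(f"Fast case: {fast_minutes:.0f} min ({fast_minutes / 60:.1f} hours)")
# -> 400 min (6.7 hours)

# Realistic case: read, comprehend, and categorize each abstract,
# at an assumed 10 minutes per abstract
slow_minutes = questions * 10
slow_hours = slow_minutes / 60
slow_days = slow_hours / 7  # at roughly 7 working hours per day
print(f"Slow case: {slow_minutes:,} min = {slow_hours:.0f} hours "
      f"= about {slow_days:.0f} days")
# -> 40,000 min = 667 hours = about 95 days
```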
What am I missing?
I worked with the now Vice Chancellor some years ago in a previous life. He was an ardent warmist/alarmist. Unless he has changed his views, don’t expect too much in the way of impartiality there.
Reblogged this on Oracle of Liberty and commented:
Bjorn Lomborg, a global warming skeptic who nonetheless believes somewhat in AGW, is one of the world’s foremost experts on climate change and its actual effect on humanity, all the more so because the alarmists have hijacked the message, concentrating their efforts on empowering the global elite, whereas he shines light on where the focus should be: the poverty-stricken people around the world who are neglected so that the Hollywood types can feel good about contributing money to a solar panel farm in Malawi instead of using that money to build a clean coal plant or produce clean drinking water. The latter truly helps eradicate poverty and improve living conditions, while the solar farm only increases the ‘charitable effect’ for those who sit in clean-coal-powered homes sipping their cucumber-infused clean drinking water.
Bjorn chooses to make a difference on what will actually make a difference. Not what will make a liberal greenie feel better about saving the environment.
I’ve tried asking this elsewhere, but I haven’t gotten an answer yet. I’ll try here.
How does one conclude 57% of the data is unreleased? According to Tol, all we’re missing is timestamps (which we don’t know were collected) and rater IDs. That’s ~60,000 data points. There are ~125,000 data points in a single data file released by Cook et al. That’s far more than what Tol says is unreleased. How can more data be released in a single file than is missing if 57% of the data hasn’t been released?
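To make the point concrete, here is the rough arithmetic implied by those figures (the counts are the approximate numbers quoted in this comment, not exact values from the released files):

```python
# Rough arithmetic behind this objection. The counts are the approximate
# figures quoted above, not exact values from Cook et al.'s files.
missing = 60_000                # ~ rater IDs + timestamps said to be withheld
released_single_file = 125_000  # data points in one released file alone

# Even counting only that single file as "released", the missing share is
missing_share = missing / (missing + released_single_file)
print(f"Missing share: {missing_share:.0%}")
# -> 32%, well short of the claimed 57%
```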
As an additional point, I see Richard Tol says:
> The later data release reveals that what the paper describes as measurement error (in either direction) is in fact measurement bias (in one particular direction).
Tol says the supposed measurement error is actually measurement bias. That is, he says there is no random error, only bias. How could this possibly be true? Even if there is measurement bias, we would still expect there to be measurement error.
@ZT
You wouldn’t expect it, would you?
But we have.
University of Queensland (very highly rated, Cook notwithstanding)
Queensland University of Technology
Griffith University
University of Southern Queensland
University of the Sunshine Coast
Central Queensland University
Bond University (Private!)
James Cook University
Australian Catholic University has a campus in Brisbane.
Southern Cross University has a campus in the Gold Coast.
(I’ve actually taught in six of those.)
dccowboy says:
August 28, 2013 at 9:56 am: “…This paper appears to be an example of the new ‘post-normal’ science….” I think the term may be “post-moral science”.
@mpaul
That’s indeed one of the key points. The raters read, on average, 2,000 abstracts. Sarah Green read more than 4,000.
This raises two questions. (1) What human would do this voluntarily? Can this person be considered impartial? (2) Can even a highly motivated human do this without loss of focus?
I lose patience with a survey that is too long. I then rush to the end. In a good survey, this is noticed as the time taken to answer a question is recorded.
Given the way this survey was conducted, Cook must have had time stamps. I don’t know whether he never saved the data, or saved it and destroyed it later. But it is crucial information for assessing the quality of the answers.
Richard Tol, excellent work, and an excellent letter.
Strangely, Richard S.J. Tol says:
There is nothing about the way this survey was conducted that would require timestamps to have been collected.
I admire Richard Tol’s tenacity, but I feel somewhat like Sancho Panza observing it all.
The question to ask is whether the reviewers realize that we are entering year 17 without any increase in atmospheric temperatures.
Think 97% would agree???
In reality, the scientists who do not take a position could possibly take the position that the data cannot support a clear conclusion. So they would be against a clear conclusion that AGW is real.
It is a three-sided debate:
1- AGW is true.
2- AGW is false.
3- There is no way we can tell if AGW is real or not.
Number 3 might be the best conclusion right now.
Marc77 8:12 am
No, no, no. It is not a matter of TRUE or FALSE.
It is literally a matter of DEGREE.
1a AGW is true and is a planetary emergency to all life.
1b AGW is true and a mild discomfort to which life will adapt.
1c AGW is true and is beneficial to most plants and animals.
1d AGW is true, but a political mountain made from an ecological molehill.
1e AGW is true, but its effects are nearly invisible.
1f AGW is true and will put off another ice age by at least a thousand years.
Brandon Shollenberger 4:46 am
There is nothing about the way this survey was conducted that would require timestamps to have been collected.
What if the subjective responses were recorded at super-human rates?
What if the subjective responses are highly correlated sequentially?
What if the subjective responses correlated with the day’s weather?
I can think of NO circumstances that would NOT require timestamps to be collected, particularly given the ease of collection.
@Brandon
It was a distributed, computerized survey. Hard to imagine that time stamps were never recorded. Besides, Cook has never told me he could not give me time stamps. Only that he would not.
Stephen Rasey, timestamps may have been recorded. They certainly should have been recorded. But unless we know they were recorded, we can’t say they’re being hidden. Cook et al cannot hide data they don’t have.
Richard Tol, it may be “hard to imagine” timestamps weren’t recorded, but that doesn’t mean we know they were. And if we don’t know they were, it’s inappropriate to say they’re being hidden.
By the way, why do you respond to that comment of mine yet not the comment of mine that raises substantial issues with your claims? I argued you’ve massively exaggerated a criticism of Cook et al; you ignored me. I pointed out a minor issue about what data exists; you responded. That’s silly.
@Brandon
I suspected time stamps were recorded, so I asked for them. Cook’s response confirmed that they have them.
As to your other point, I’ve told you before that you’re wrong. No need to repeat that discussion.
Richard Tol:
It’s weird you never said this before. You didn’t say it when you argued we should believe timestamps were recorded. Unless you’re claiming Cook told you this in the last twelve hours, you’re being inconsistent.
You told me I’m wrong so that’s the end of the story? That’s a fun approach to discussions. Just say, “You’re wrong!” and leave. John Cook should try it with you. I’m sure you’d react with as much disbelief as I do.
The data you say hasn’t been released is approximately 60,000 data points. Over 125,000 data points were released in a single file (you even host that data yourself).
Your numbers don’t add up.
@Brandon
Calculations are here http://www.sussex.ac.uk/Users/rt220/consensus.html “data and graphs”, “data”
The calculations were done on the basis of the information in the paper. These numbers turn out to be only roughly correct, but you’d only move far away from my 57% if you count classifications or keystrokes. I do not, because I’m not interested.
Richard Tol, I’ve looked at every file listed on that page, and I didn’t find any such calculations. There is no “data and graphs” file. There are two files listed with that phrase included, “Data and graphs on abstract ratings” and “Bootstrap data and graphs.” Neither has a “data” tab. And as far as I can tell, neither has any sort of calculation that comes up with your 57% missing value. It’s possible I missed something amongst the dozens of tabs in those spreadsheets, but if that’s the case, you ought to provide a reference I can actually use.
I’ve provided numbers for people to use to check my claims. If they don’t believe those numbers, they can look at the data provided and verify it for themselves. It’s simple and easy for them to replicate my work. It’s simple and easy for you to do so. If my numbers are wrong, all you have to do is say so.
I’m not quite sure what you mean by “classifications,” and nobody other than you has suggested keystrokes be recorded. Regardless, the question at hand is how much of the data collected by Cook et al has been released. You don’t get to change numbers by saying you’re “not interested” in certain data so you won’t count it.
I believe fatigue, among other things, played an important role in shaping the results of this project. The other major factor is the difficulty of accurately classifying the most abundant categories: 3, 4 and 2. The two are inter-related: difficulty in classification worsens rater fatigue, and fatigue forces the raters’ hand, making them commit ‘errors’ and/or make stereotypical rating choices. Rater time stamps would settle this issue once and for all, out in the open.
It is quite evident Cook et al, in their amateurish manner, anticipated none of the difficulties in performing a study of this type. They have neither the expertise to design a study of this kind, nor the experience to anticipate and plan around the analytic issues. The data that comes out of exercises of this kind is more a reflection of the methodology than the ‘true’ content of the abstracts.
Examining time stamps or keystroke logs should provide a lot of the information required to examine the process (because the process is the result). I’m now more inclined to think Cook had absolutely no clue such things might be done, or be useful, or be required. But there is a chance he did record keystrokes and timestamps. If so, he should release the data. If they were not recorded, he should come straight out and say it, instead of jerking people around and leading them on.