More pear-shaped trouble for John Cook’s ‘97% consensus’

Readers may recall the post John Cook’s 97% consensus claim is about to go ‘pear-shaped’. This is an update to that, in two parts. First is the introduction, and the second part is a backstory on Cook’s hidden data, the data he didn’t want to share.

Brandon Shollenberger writes:

Introduction for the Upcoming TCP Release

As you may have heard, I recently came into possession of previously undisclosed material for a 2013 paper by John Cook and others of Skeptical Science. The paper claimed to find the consensus on global warming is 97%.

That number was reached by having a group of people read abstracts (summaries) of ~12,000 scientific papers then say which endorse or reject the consensus. Each abstract was rated twice, and some had a third rater come in as a tie-break. The total number of ratings was 26,848. These ratings were done by 24 people. Twelve of them, combined, contributed only 873 ratings. That means 12 people did approximately 26,000 ratings.
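The rating totals above can be checked with a few lines of Python. This is only a sketch using the figures quoted in this post; the variable names are illustrative and not taken from the Cook et al. data files.

```python
# Figures quoted in the paragraph above.
total_ratings = 26_848
low_activity_ratings = 873  # combined total for the 12 least active raters

# Ratings left over for the 12 most active raters.
high_activity_ratings = total_ratings - low_activity_ratings
share = high_activity_ratings / total_ratings

print(high_activity_ratings)  # 25975
print(f"{share:.1%}")         # 96.7%
```

In other words, roughly 97% of all ratings came from half of the raters.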

Cook et al. have only discussed results related to the ~27,000 ratings. They have never discussed results broken down by individual raters. They have, in fact, refused to share the data which would allow such a discussion to take place. This is troubling. Biases in individual raters are always a problem when having people analyze text.

Biases can arise because of differences in worldviews, differences in how people understand the rating system or any number of other things. These biases don’t mean the raters are bad people or even bad raters. It just means their ratings represent different things. If you take no steps to address that, your ratings can wind up looking like:

[Figure: 5-10-pre-reconciliation (ratings broken down by individual rater, before reconciliation)]

This image shows the ratings broken down by individual rater for the Cook et al. paper. The columns go from zero to seven. Zero meant no rating was given. The others were given as:

1 Explicitly endorses and quantifies AGW as 50+%
2 Explicitly endorses but does not quantify or minimise
3 Implicitly endorses AGW without minimising it
4 No Position
5 Implicitly minimizes/rejects AGW
6 Explicitly minimizes/rejects AGW but does not quantify
7 Explicitly minimizes/rejects AGW as less than 50%

The circles in each column are colored according to rater. Their size indicates the number of times the rater selected that endorsement level. Their position on the y-axis represents the percentage of ratings by that rater which fell on that level.

As you can see, these circles do not line up. Some circles are higher than others, meaning those raters were more likely to pick that particular value. Some circles are lower than others, meaning those raters were less likely to pick that particular value. That shows the raters were biased. If they weren’t, the circles would have lined up.
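For readers who want to see how the y-axis positions described above are derived, here is a minimal sketch. The ratings in it are invented purely for illustration (they are NOT the Cook et al. data), and the rater names are placeholders.

```python
from collections import Counter

# HYPOTHETICAL ratings, invented for illustration: rater -> list of levels (0-7).
ratings = {
    "rater_a": [4, 4, 3, 4, 2, 4, 3, 4],
    "rater_b": [3, 3, 4, 2, 3, 4, 3, 1],
}

def level_shares(levels):
    """Return {level: fraction of this rater's ratings that fell on that level}."""
    counts = Counter(levels)
    n = len(levels)
    return {level: counts.get(level, 0) / n for level in range(8)}

shares = {rater: level_shares(levels) for rater, levels in ratings.items()}

# One row per endorsement level, one column per rater.
for level in range(8):
    row = "  ".join(f"{shares[rater][level]:6.1%}" for rater in ratings)
    print(level, row)
```

If the columns of that table differ noticeably between raters, the circles in the figure will not line up.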

Now then, the authors of the paper did take a step to try to address this issue. When two raters gave different ratings to the same abstract, they were given the opportunity to discuss the disagreement and modify their ratings. This reduced the biases present in the ratings, making the data look like this:

[Figure: 5-10-tease (ratings broken down by individual rater, after reconciliation)]

As you can see, the post-reconciliation data has no zero ratings. It also has fewer biases. Fewer is not none, however. The problem of bias still clearly exists. That problem will necessarily affect the study’s results. The biases of raters whose circles are largest will necessarily influence the results more than those of raters whose circles are smaller.

To see why this is a problem, remember each circle’s size is dependent largely upon how active a rater was. Had different raters been more active, the larger circles would have been in different locations. That means the combined result would have been in a different location as well.

To demonstrate, I’ve created a simple image. Its layout is the same as the last figure, but it shows the data for the 12 most active raters combined (yellow). It also shows what the combined result would have been if the activity of those 12 raters had been reversed (red):

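[Figure: 5-11-test (combined result for the 12 most active raters in yellow, result with their activity reversed in red)]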

There are readily identifiable differences given this simple test. That shows the effect of the bias in raters affects the final results. It’s true this particular test resulted in differences favoring the Cook et al results, but that doesn’t mean it’s okay. Bias influencing results isn’t okay, and a different test could have resulted in a different pattern.
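The idea behind that test can be sketched in a few lines. The numbers below are HYPOTHETICAL and chosen only to show the mechanics, and "reversed" is interpreted here as swapping rating counts so the least active rater gets the most active rater's count and so on; that interpretation is an assumption, since the exact procedure is not spelled out above.

```python
# HYPOTHETICAL per-rater data: (number of ratings, share of ratings at one level).
raters = [
    (4000, 0.70),
    (2500, 0.62),
    (1000, 0.55),
    (500, 0.48),
]

def combined_share(counts, shares):
    """Activity-weighted share across all raters."""
    return sum(c * s for c, s in zip(counts, shares)) / sum(counts)

counts = [c for c, _ in raters]
shares = [s for _, s in raters]

# Swap activity by rank: least active rater gets the largest count, and so on.
order = sorted(range(len(counts)), key=lambda i: counts[i])
reversed_counts = [0] * len(counts)
for low, high in zip(order, reversed(order)):
    reversed_counts[low] = counts[high]

actual = combined_share(counts, shares)
reversed_result = combined_share(reversed_counts, shares)

print(f"actual:   {actual:.3f}")
print(f"reversed: {reversed_result:.3f}")
```

Each rater's own rating pattern is left untouched. Only the weights change, yet the combined figure moves, which is the point of the test.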

Regardless, we now know the results of the Cook et al. paper are influenced by the raters’ individual biases. That’s a problem in and of itself, but it raises a larger question. All the people involved in this study belong to the same group (Skeptical Science). All of these people know each other, talk to one another and have similar overall views related to global warming.

If biases between such a homogenous group can influence their results, what would the results have been if a different group had done the ratings? How would we know which results are right?

Update: It’s worth pointing out the paper explicitly said, “Each abstract was categorized by two independent, anonymized raters.” That would have mitigated concerns of bias if true. However, it’s difficult to see how a small group of friends can be considered “independent” of one another. That’s especially true when the group actively talked to one another (on a forum run by the lead author), even about how to rate specific papers, while the “independent” ratings were going on. This issue was first noted here, and it’s highly relevant when considering issues of bias.

============================================================

Cook et al’s Hidden Data

You might think this post is about the previously undisclosed material I recently gained possession of. It’s not. Even with the additional material I now have, there is still data not available to anyone.

You see, while people have talked about rater ID information and timestamps not being available, everybody seems to ignore the fact Cook et al chose not to release data for 521 of the papers they examined.

I bring this up because Dana Nuccitelli, second author of the paper, recently said:

Morph – all the data are available, except confidential bits like author self-ratings. We even created a website where people can attempt to replicate our results. We could not be more transparent.

Actually, they could be much more transparent. Here is the data file they released showing all papers and their ratings. It has 11,944 entries. Here is a concordance showing the ID numbers of the papers they rated. It has 11,944 entries. Here is a data file showing all the ratings done by the group. It has entries for 11,944 papers.

The problem is there were 12,465 papers:

The ISI search generated 12 465 papers. Eliminating papers that were not peer-reviewed (186), not climate-related (288) or without an abstract (47) reduced the analysis to 11 944 papers

Cook et al eliminated 521 papers from their analysis. That’s fine. Filtering out inappropriate data is normal. What’s not normal is hiding that data. People should be allowed to look at what was excluded and why. Authors should not be able to remove ~4% of their data in a way which is completely unverifiable.
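The figures in the paper's description are at least internally consistent, as a quick check shows. This sketch uses only the numbers quoted above; the dictionary keys are just labels.

```python
search_results = 12_465
excluded = {
    "not peer-reviewed": 186,
    "not climate-related": 288,
    "no abstract": 47,
}

total_excluded = sum(excluded.values())
remaining = search_results - total_excluded
fraction_excluded = total_excluded / search_results

print(total_excluded)              # 521
print(remaining)                   # 11944
print(f"{fraction_excluded:.1%}")  # 4.2%
```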

But it’s worse than that. The authors didn’t do what their description suggests they did. Their description suggests only 47 papers their search generated had missing abstracts. That’s not true. Over two hundred of the results did not have abstracts. We know this because John Cook said so in his own forum. In a topic titled Tracking down missing abstracts, he said:

Well, we’ve got the ‘no abstracts’ down to 70 which is not too shabby out of 12,272 papers (and starting with over 200 papers without abstracts). I’m guessing a number of those will be opinion/news/commentary rather than peer-reviewed papers.

The 12,272 doesn’t match the 12,465 number because more papers were added later. That doesn’t matter though. What matters is at least 200 of their search results did not have abstracts. The group went out and looked for missing abstracts, inserting ones they found. No documentation of these insertions has ever been released. The fact their search results were modified has never even been disclosed.

It’s impossible to know which abstracts were added. That means it is impossible to verify the correct abstracts were added without verifying all ~12,000 results. That also means it is impossible to ensure the abstracts added were not a biased sample.

There’s more. We’re told 186 papers were excluded for not being peer-reviewed. No explanation is given as to how they determined which papers were and were not peer-reviewed. Comments in the forums show there was no formal methodology. People just investigated results which seemed suspicious to them. There is no way to know how good a job they did of removing non-peer-reviewed material.

And there’s still more. We’re told 288 papers were excluded for not being climate-related. Again, no explanation is given as to how this filter was applied. It does not seem to have been applied well. For example, while 288 papers were excluded for this reason, one of the most active raters said this in the forum:

I have started wondering if there’s some journals missing from our sample or something like that, because I have now rated 1300 papers and I think I have only encountered a few papers that are actually relevant to the issue of AGW. There are lots ond lots of impacts and mitigation papers but I haven’t seen much of papers actually studying global warming itself. This might be something to consider and check after rating phase.

If only “a few” out of 1300 papers were “actually relevant to the issue of AGW,” how is it 12,177 papers out of 12,465 were “climate related”? The only explanation I can find is most papers are “climate related” but not “actually relevant to the issue of AGW.” This is an example. It’s one of the 64 papers placed in the highest category (explicitly claiming humans cause 50%+ of recent warming), and it says:

This work shows that carbon dioxide, which is a main contributor to the global warming effect, could be utilized as a selective oxidant in the oxidative dehydrogenation of ethylbenzene over alumina-supported vanadium oxide catalysts. The modification of the catalytically active vanadium oxide component with appropriate amounts of antimony oxide led to more stable catalytic performance along with a higher styrene yield (76%) at high styrene selectivity (>95%). The improved catalytic behavior was attributable to the enhanced redox properties of the active V-sites.

If you don’t know what any of that means, don’t feel bad. The paper is about a narrow chemistry subject which has no bearing on global warming. Its only relation to climate is that one clause, “which is a main contributor to the global warming effect.” According to Cook et al, that is apparently enough to make it “climate related.” In fact, that’s enough to make this paper one of the 64 which most strongly support global warming concerns.

Given that, it’s difficult to imagine what papers might have been rated “not climate related.” Fortunately, we don’t have to use our imaginations. While it’s true Cook et al excluded all this data from their data files, it turns out that data is available via the search function they built.

Nobody could have guessed that. Nobody who downloaded data files would have thought to go to a website and use a function to find information excluded from those data files. Even if they had, the site only shows the final ratings of those papers. It doesn’t show any intermediary data like that in the data files.

Regardless, it does allow us to check some of the concerns raised in this post. For example, we can do a search to see what sort of papers were considered “not climate related.” I’ll provide the only title in category 1 and part of its abstract:

Now What Do People Know About Global Climate Change? Survey Studies Of Educated Laypeople

When asked how to address the problem of climate change, while respondents in 1992 were unable to differentiate between general “good environmental practices” and actions specific to addressing climate change, respondents in 2009 have begun to appreciate the differences. Despite this, many individuals in 2009 still had incorrect beliefs about climate change, and still did not appear to fully appreciate key facts such as that global warming is primarily due to increased concentrations of carbon dioxide in the atmosphere, and the single most important source of this carbon dioxide is the combustion of fossil fuels.

This abstract is more forceful in its endorsement of global warming concerns than many of the ones labeled “not climate related.” Its topic, what people know about global warming, is certainly more relevant than topics like the molecular chemistry in material production. I could post example after example showing the same pattern. Papers excluded for not being “climate related” are often far more relevant than papers Cook et al included.

You could never find this out by examining Cook et al’s data files though. Those data files exclude the information necessary to check things like this. It’s only because of an undisclosed difference in their data sets that we could ever hope to check their work on this.

By the way, I encourage everyone to use that search feature to find examples of what I refer to. It’s amazing how many of the papers making up the “consensus” are not “actually relevant to the issue of AGW.”


72 thoughts on “More pear-shaped trouble for John Cook’s ‘97% consensus’”

  1. one of the most active raters said this in the forum:

    I have started wondering if there’s some journals missing from our sample or something like that, because I have now rated 1300 papers and I think I have only encountered a few papers that are actually relevant to the issue of AGW. There are lots ond lots of impacts and mitigation papers but I haven’t seen much of papers actually studying global warming itself. This might be something to consider and check after rating phase.

    Why didn’t they do that (exclude irrelevancies) before the rating phase?

    I suspect that this same flaw–inclusion of the authors of irrelevant impacts and mitigation papers–led to the 97% result in the other two notable 97%-result surveys.

  2. The “fact” that there is a consensus is the keystone of the alarmist argument. Take that away and the whole house of cards collapses. Expose a charlatan and he will scream ever more loudly that he must be believed. The emperor has no clothes.

  3. I don’t know the answer to this question, and maybe it’s a dumb one. Anyway, here it is: Why hasn’t somebody–anybody–just hired a few of the polling companies like Gallup or Rasmussen or Harris to do a poll of scientists on the CAGW issue? I know those companies usually do political polls, but maybe they would do a poll on this issue as well. Seems to me that this is a more trustworthy and credible way of determining the degree to which the CAGW theory is believed and supported in the scientific community (which I hope is actually very low).

    Any ideas out there?

  4. Hahahaha! The vanadium paper is great!

    I wrote a little bit about the peer-review process from a chemistry standpoint not long back, and that vanadium paper fits right in. All the global warming BS in the abstract is just decoration … just filler to make it sound more interesting. Nothing more.

    Another example is the use of ionic liquids (horrible molten solids, actually) as “green” solvents. You can write a paper about doing reactions in the worst “solvents” imaginable, add the word “green” to it, and get it published pronto. It’s all smoke and mirrors though, and usually the real goal is to score a quickie publication, damn the environment.

    http://zombiesymmetry.com/2014/04/08/the-peer-reviewed-scientific-literature-is-mostly-crap/

    The peer-review process from an organic chemist’s point of view. :-P

  5. To have validity for its conclusions, such a study must rigorously select the subject papers on a priori criteria, find raters whose opinions are neutral (as much as possible) toward the subject papers, test and prove the neutrality of raters, and then train them to evaluate using a tested rubric. After the rating is done, tests need to be applied to see if ratings drifted due to fatigue and familiarity. Anything less is an unsound research method that allows too many opportunities for error to creep in.

    I sort of admire the dozen raters plowing through thousands of abstracts, even in a cursory manner, but take the results as nothing more than a curious exploration, useful only for doing a more rigorous study.

  6. Yes, I’m sure he will retract his position once all the facts are exposed (eye roll)

  7. Excellent Work Brandon!

    ” Biases in individual raters are always a problem when having people analyze text.”

    Rater bias is a problem in many situations beyond textual analysis. In short, any time you use a human being as your instrument you have to take certain precautions and you must record certain data. Suppose I am asking a group of raters to rate a photograph of a person or anything according to objective criteria. Here are the steps one typically goes through.

    1. Norming the raters. The researcher will pick exemplars that demonstrate how the rating system works. Raters are all trained using the exemplars.
    2. Calibrating the raters. Raters are then calibrated against a test set.
    3. Rating. Raters are then asked to rate unrated items. You keep track of who rates what, the time they rated it and their rating.
    4. Renorming. Over time raters will drift. For example, it is no shock that you see so many 4s. One reason for this is that raters tend over time to regress to the norm: on a scale of 1-5 they will tend to call everything a 3. When you look at the ratings over time you can clearly see this. So, you renorm. That is, you put the raters through another norming process.
    5. Multiple raters. Many people think that multiple raters solves these problems. It doesn’t.
    6. Conflict resolution. Many people think that having a third person re-rate disagreements solves the problem. It doesn’t. You still have to do this, but again keeping good records will allow you to answer charges that your process is biased.

    In short, in any study, and I mean any study, that uses humans as a judging instrument, you must have a written protocol. You must record the important details of the rating process. Who rated what? When did they rate it? How often were they in conflict with other raters (compute kappa, for example)? How were they in conflict? Who resolved conflicts, and how did they resolve them? Without this, a study that uses humans as an instrument to make a decision (does picture X show Y; does text Z indicate P; is Sally more pretty than Betty) is not valid. The issue of rater bias must be addressed, and it can only be addressed with this data.

    The other point you raise about data availability is important. Certain papers were dropped. If I am rating a collection of items, say 1200 things, and I decide that 300 of these things can’t be rated, then I can’t merely say they can’t be rated. I have to present the data and the method I used to exclude them.
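The kappa statistic mentioned in the comment above measures how often two raters agree beyond what chance alone would produce. Here is a minimal sketch of Cohen's kappa for two raters. The ratings are HYPOTHETICAL, invented for illustration, and computing this for the Cook et al. study would require the per-rater data discussed in the post.

```python
from collections import Counter

def cohens_kappa(first, second):
    """Cohen's kappa for two raters who each rated the same items."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(first)
    # Fraction of items on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from each rater's own rating frequencies.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(counts_first[k] * counts_second[k] for k in counts_first) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)

# HYPOTHETICAL ratings of the same ten abstracts by two raters.
rater_1 = [4, 4, 3, 2, 4, 3, 4, 4, 3, 4]
rater_2 = [4, 3, 3, 2, 4, 4, 4, 4, 2, 4]

kappa = cohens_kappa(rater_1, rater_2)
print(f"{kappa:.2f}")  # 0.46
```

In this made-up example the raters agree on 7 of 10 abstracts, but because both lean heavily on category 4, a good part of that agreement would be expected by chance, so kappa comes out well below 0.7.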

  8. Glad I asked that question now ;-)

    But is he right in that all of it is out there or not ? I mean raw and then processed at each stage.

  9. Wait, the content of the papers wasn’t graded? That is, whether or not the paper offered evidence that supported the consensus wasn’t judged, just the abstracts were read? The quality of the science wasn’t considered? Whether or not the content supported the conclusions presented in the abstracts wasn’t judged?

    Talk about junk science!

  10. Naked as a jaybird. We are living in an absurd world where reality is hidden from society by a fog of selfish apathy/silence among real scientists and partisan bluster among politicians and bureaucrats. I’m afraid that science will suffer greatly as society eventually figures out just how much blood and treasure have been expended on this utter nonsense.

  11. Kieth A. Nonemaker:

    “The “fact” that there is a consensus is the keystone of the alarmist argument. Take that away and the whole house of cards collapses. Expose a charlatan and he will scream ever more loudly that he must be believed. The emperor has no clothes.”

    There will always be a consensus though. The question is, what is it a consensus of? It all depends on how you go about phrasing things.

    These guys report a 97% consensus, and that in turn gets batted around the media as a consensus of scientists think the world is about to end. In reality, it’s just a consensus of people who haven’t explicitly said that humans do not contribute to global warming. It’s all smoke and mirrors.

    You write a paper, like that vanadium catalyst paper. You want to dress it up, put some color in it, etc. So you toss in a little blurb about global warming. Nobody, and I mean NOBODY, is going to do the opposite … write a vanadium catalyst paper and state in the abstract that global warming is BS. You either mention global warming in a way that is acceptable to the fashion of the times, or you don’t mention it at all. That a paper such as this is one that was considered by Cook’s crew is amusing as hell.

  12. Dare I say Tree Hut Conspiracy and Idiotation nutitelli Cooked up!! par for the course it seems

  13. It seems to me that the 97% claim was unrealistic from the start. He should have had his study produce something more like 60%. But that would have conflicted with other studies that had around 97%. I doubt if 97% of scientists even agree on the time of day. Even if the 97% was true, why would that be relevant? If 97% turned out to be wrong and 3% turned out to be correct, would majority rule? That would be strange science. Studies like that are not a substitute for making the case.

  14. bet they didn’t read the abstracts, just went apple- command +f MAN

    if it highlighted man – yep man made warming.

  15. Maybe I’ve not understood what Cook had set out to do, but one would think that papers about mitigation would be based on an assumption that there was something to mitigate and therefore consensus.

  16. They are masters of the psychology of propaganda. They know their papers are crap, and that someone will see through it….or even many people. But by then, the TITLE will have been picked up by the media and the damage has been done. They know that no one in the media bothers to actually check the results, and in between papers, they simply attack and demean the people who are onto them so no one will believe them when they say the study is flawed and they have proof.

    This has nothing to do with facts or actual science. Its a multi pronged campaign designed to undermine everything solid. They are undermining their own credibility too, but they don’t care if they get what they want in the end.

  17. I think there might be something wrong with me. Everytime I read the phrase “pear shaped”, I think “bootilicious!!!”

  18. My first question when exploring the Consensus Tool posted by Cook et al was why…if you want to examine a supposed SPECIFIC consensus on “anthropogenic global warming/climate change”, you omit the word “anthropogenic” from the search terms you used to establish your data pool….

    I found the answer when I added that specific word, and other similar ones like “man made”, “human caused” etc to my searches in THEIR system. Doing so turns up less than 20 papers that actually address the SPECIFIC type of global warming consensus they set out to verify.

  19. Yeah, 97% of scientists supporting the AGW consensus…..if you think you live in the former USSR maybe.

    Similar propaganda lets us believe a majority of Europeans is in favor of the EU.

  20. WWS-I flash to Carol Channing singing Diamonds are a Girls Best Friend…”but square cut or pear shaped, these rocks won’t lose their shape…” ( I know…not Carol’s song, but her version of it was hilarious and the one I think of first)

  21. CD (@CD153) says:
    May 13, 2014 at 8:09 am
    I don’t know the answer to this question, and maybe it’s a dumb one. Anyway, here it is: Why hasn’t somebody–anybody–just hired a few of the polling companies like Gallup or Rasmussen or Harris to do a poll of scientists on the CAGW issue? I know those companies usually do political polls, but maybe they would do a poll on this issue as well. Seems to me that this is a more trustworthy and credible way of determining the degree to which the CAGW theory is believed and supported in the scientific community (which I hope is actually very low).

    Any ideas out there?

    I’ve posted repeatedly here that there should be a survey of scientists in climatology and in the neighboring disciplines asking well-thought-out (sophisticated) questions about many facets of this controversy, similar to those posed by the past surveys of the AMS and AGU (last in 2008) by George Mason Univ. (and executed by the Harris polling organization). This would cut the 97% consensus claim down to size. (Unfortunately, I don’t have the ear of Big Oil, apparently.) Here’s a link to their results:

    http://stats.org/stories/2008/global_warming_survey_apr23_08.html

  22. j ferguson says:
    May 13, 2014 at 9:13 am

    Maybe I’ve not understood what Cook had set out to do, but one would think that papers about mitigation would be based on an assumption that there was something to mitigate and therefore consensus.

    But mitigation-authors are not climatologists (specialists in the causes of global warming), so their opinions are not authoritative, they are only assumptions. And who cares what they assume?

  23. @Steven Mosher says:
    May 13, 2014 at 8:16 am

    Calm, reasoned, insightful and helpful commentary. No snarkiness.

    Who are you, and what have you done with the real Steven Mosher???

  24. Forgive me if I am misreading this data, but it looks to me like the MAJORITY of papers have no opinion on Global Warming. If I go to 100 people asking whether it will be warm or cold tomorrow, and 50 of them say “Not Sure”, 40 say warm, and 10 say cold, I haven’t found “consensus” on anything. I have found that a majority of people take no opinion.

  25. I’m still boggled by the OCD required here. 2,237 papers reviewed on average by each reviewer? How long would it take to open the file, read the paper, ascertain its POV, record the result and close the file? Let say you can do 10/hour with allowances for fatigue and delay. That means 224 hours — full time effort for 5.6 weeks. These were highly motivated reviewers — on a mission, so to speak. Of course, you can dramatically speed up the process if you are simply scanning the abstract for confirmatory words. Sounds like zealots on a mission to find confirmation. We need time stamps.
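The workload estimate in the comment above can be reproduced as follows. This is a sketch of the commenter's arithmetic only: the pace of 10 per hour is the commenter's guess, and the 40-hour week is an assumption implied by the 5.6 week figure.

```python
total_ratings = 26_848   # total ratings, from the post
active_raters = 12       # the 12 most active raters
ratings_per_hour = 10    # the commenter's assumed pace
hours_per_week = 40      # assumed full-time week

per_rater = total_ratings / active_raters
hours = per_rater / ratings_per_hour
weeks = hours / hours_per_week

print(round(per_rater))  # 2237
print(round(hours))      # 224
print(f"{weeks:.1f}")    # 5.6
```

Changing `ratings_per_hour` shows how sensitive the estimate is to that single guess.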

  26. In the field of Military Studies, I have several colleagues that have gotten papers published by including references to climate change in the abstract after first being rejected. There was never any intention that their theses examine climate as an aspect of their research, but they discovered that with a few minor modifications their research would be in demand. Two papers I can remember offhand: the first examined how the lessons of anti-access/area denial in the Falklands War could apply to a future conflict over Taiwan, while the second studied the development of countermeasures to low-observable technology. Both tied in a paragraph or two into their conclusions about how “climate change” could play a role and then modified the abstracts.

    These papers were written using qualitative methods and only a very liberal reviewer would conclude there was any relevance to climate science, but I have to wonder if they made the cut and became part of the consensus. I know the publications these works ended up in had nothing to do with climate, but with the shenanigans that Shollenberger is describing, I wouldn’t be shocked if they were included.

  27. Once you perfect the measurement of the consensus, you will have an estimate of the consensus among scientists who managed to publish in the sampled journals. You will have measured the degree to which the editors have censored information that does not conform to the dogma du jour. You will have evidence of Post Modern Science relying on its tenet of consensus forming. See the discussion between Mann and Jones in their whistle-blown emails.

    Consensus is irrelevant to Modern Science. It ranks scientific models as conjectures, hypotheses, theories and laws according to the degree to which they fit the Real World, make predictions of the Real World, and validate them by independent measurements from the Real World.

    >>Scientific theories are ways of explaining phenomena and providing insights that can be evaluated by comparison with physical reality. Each successful prediction adds to the weight of evidence supporting the theory, and any unsuccessful prediction demonstrates that the underlying theory is imperfect and requires improvement or abandonment. IPCC, AR4, ¶1.2 The Nature of Earth Science, p. 95.

    Go figger! That from the IPCC!! PMS Headquarters. Read on, though, and watch IPCC wander off into PMS:

    >>The attributes of science briefly described here can be used in assessing competing assertions about climate change. Can the statement under consideration, in principle, be proven false? [What happened to predictions and success?] Has it been rigorously tested? Did it appear in the peer-reviewed literature? Did it build on the existing research record where appropriate? If the answer to any of these questions is no, then less credence should be given to the assertion until it is tested and independently verified. [Independent?] The IPCC assesses the scientific literature to create a report based on the best available science (Section 1.6). Id.

    Pass the Midol.

  28. mpaul:

    “That means 224 hours — full time effort for 5.6 weeks. These were highly motivated reviewers — on a mission, so to speak. Of course, you can dramatically speed up the process if you are simply scanning the abstract for confirmatory words. Sounds like zealots on a mission to find confirmation.”

    Bingo!

    The work, if done appropriately, is unimaginable unless the person doing the work is on some kind of holy mission. I would imagine it went along the lines you suggest … scan for certain phrases, open up the “hits,” do a Ctrl-F to find the sentence, read it, check the box, and move on.

  29. If the count was number of papers, that’s wrong. It should be on number of authors, otherwise authors that write multiple papers are given more weight.
    The main problem here is the fact that the study was so stupidly designed.
    Number one problem: this study is supposed to reflect current scientific opinion but reviews papers written not in the present but in the past.
    Number two problem: if you want to know the author’s opinion, well JUST ASK HIM/HER so there is no problem interpreting the author’s opinion.
    Number three problem: bias. If an author provides no opinion in his paper about global warming, it’s likely because he isn’t sure. “Isn’t sure” is an opinion that should be counted, rather than tossed out because no opinion is provided.
    This study is garbage, inside and out.

  30. @rogerknights, I read that survey (thanks for the link). I wonder how many have changed their minds since 2008 seeing that in those days they might have feared for their jobs tenure or even grants ?

  31. If you wanted to use the worst graphing method possible, you were highly successful with your pear-shaped method.

    I quit reading when I got to [the] graphs, because they convey little information to me, without spending a lot of extra time deciphering.

  32. The motivated group had 23061 ratings, the drudgery group had 607 ratings, and two ratings were made per paper. That should result in 11834 papers, not the 11944 they claim to have rated. That is, unless not all papers received two ratings.

    Another minor detail. Their paper says they downloaded the papers in March when their progress graph shows them rating them since early February.
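
    The arithmetic in the first paragraph of this comment can be checked with a short sketch. The rating counts and the 11944 figure are taken from the comment itself and are not independently verified; the function name and the assumption of exactly two ratings per paper are illustrative only.

```python
def implied_papers(total_ratings: int, ratings_per_paper: int = 2) -> int:
    """Papers implied if every paper received the same number of ratings."""
    return total_ratings // ratings_per_paper


# Figures quoted in the comment (assumed, not re-derived from the data).
motivated = 23061
drudgery = 607
claimed_papers = 11944

total_ratings = motivated + drudgery
papers = implied_papers(total_ratings)
shortfall_papers = claimed_papers - papers
shortfall_ratings = claimed_papers * 2 - total_ratings

print(total_ratings, papers, shortfall_papers, shortfall_ratings)
# prints: 23668 11834 110 220
```

    On these numbers the total falls 220 ratings short of two per paper, which fits the comment's closing guess that not every paper received two ratings.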

  33. The biases of the raters are unavoidable when top-raters included people like Ari Jokimäki, who based on the information on his own blog (http://agwobserver.wordpress.com) clearly has a mission to “debunk anti-AGW papers”. Such AGW-soldiers can hardly provide scientifically sound ratings of anything when their true mission was to prove the existence of a consensus.

  34. asybot says:
    May 13, 2014 at 10:22 am

    @rogerknights, I read that survey (thanks for the link). I wonder how many have changed their minds since 2008, seeing that in those days they might have feared for their jobs, tenure or even grants?

    Agreed–the Pause must have taken a toll. OTOH, the recent re-endorsements by the official bodies of the AGU and AMS will probably sway some members in the alarmist direction.

  35. None of this careful analysis is going to make the slightest bit of difference. This isn’t to say that it shouldn’t be done of course. But the 97 percent myth is now written in stone. I see some variant of it in almost every alarmist screed I’ve ever read…

  36. Where can I find the fisking of all three of the “97%” studies? I know I’ve read about all of them here, but have been having an email conversation with my US Senator dealing with their all night chat session a few months back. He likes to throw the 97% at me and I would like to present some cogent facts back in his direction – but time is limited. Thanks.

  37. It is worth pointing out that there will be random variations in any such study. Even if the papers simply had the numerals “1” through “7” written on them and the ‘raters’ were simply recording the numbers on the papers they were assigned, some people will get more 3’s and others will get more 4’s. This does not — in and of itself — imply any bias. It would be worth doing a little more statistics to see if there truly were statistically significant differences among the raters.
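
    One way to do "a little more statistics" of the kind suggested here is a chi-square test of homogeneity on a rater-by-category table, with a permutation test standing in for a lookup table of critical values. The sketch below is only an illustration: the function names and the example tables are hypothetical, and none of the numbers come from the Cook et al. data.

```python
import random
from collections import Counter


def chi_square_statistic(table):
    """Pearson chi-square statistic for a rater-by-category table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(table, n_shuffles=2000, seed=0):
    """Share of random reassignments of ratings to raters whose statistic
    is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = chi_square_statistic(table)
    n_categories = len(table[0])
    # One category label per rating, laid out rater by rater.
    labels = [j for row in table for j, count in enumerate(row) for _ in range(count)]
    sizes = [sum(row) for row in table]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(labels)
        start = 0
        shuffled = []
        for size in sizes:
            counts = Counter(labels[start:start + size])
            shuffled.append([counts.get(j, 0) for j in range(n_categories)])
            start += size
        if chi_square_statistic(shuffled) >= observed:
            hits += 1
    return hits / n_shuffles


# Hypothetical tables: rows are raters, columns are rating categories.
same_mix = [[30, 70], [60, 140]]       # identical proportions per rater
different_mix = [[90, 10], [10, 90]]   # very different proportions

print(chi_square_statistic(same_mix))       # 0.0
print(chi_square_statistic(different_mix))  # 128.0
print(permutation_p_value(different_mix, n_shuffles=200))
```

    A small p-value says the differences between raters are larger than random assignment of abstracts would usually produce. It does not say why they differ.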

  38. On your second section, I don’t think either example is a smoking gun. As much as I would have liked to see evidence of scientific misconduct, neither example worked.

    “This abstract is more forceful in its endorsement of global warming concerns than many of the ones labeled “not climate related.” Its topic, what people know about global warming, is certainly more relevant than topics like the molecular chemistry in material production.”
    Is simply wrong. That paper was a poll of laymen’s opinions. It was relevant in subject matter, but did not reflect a scientist’s position on AGW, and was not appropriate for Cook’s survey.

    OTOH, the catalyst paper seemed exactly to state scientists’ positions. The reference to AGW was odd, and not relevant, but I would infer that the authors (chemists/scientists) did subscribe to the current CO2 induced AGW theory.

  39. Climate scientists are just as good at social science as they are at statistics, geology, meteorology, biology, chemistry, physics and math.

  40. RogerKnights, if the assumption of having a need to mitigate by a non-climatologist doesn’t add to the consensus, then the paper shouldn’t have been in the survey, wouldn’t you think? And if papers by non-climatologists shouldn’t have been in there, how could they ever have reached those numbers in the high 20,000s?

    Did these guys get anything right?

  41. DrTorch:

    “OTOH, the catalyst paper seemed exactly to state scientists’ positions. The reference to AGW was odd, and not relevant, but I would infer that the authors (chemists/scientists) did subscribe to the current CO2 induced AGW theory.”

    You know, it probably started out as:

    “This work shows that carbon dioxide, could be utilized as a selective oxidant in the oxidative dehydrogenation of ethylbenzene over alumina-supported vanadium oxide catalysts.”

    Until the prof or one of his grad. students got the idea to dress it up with the clause “which is a main contributor to the global warming effect.”

    This particular example is just awesome! It’s like, you write a paper on the first synthesis of some natural product which may have some therapeutic uses and so you dress up the abstract and introduction with a little cancer talk. “Breast cancer affects millions of women world-wide, and there is a need for better drugs to treat it, blah, blah, blah.” Or, you write some crappy paper about some crappy reaction that nobody cares about, but one of the reagents is “green,” so that’s what you push. It’s hysterical.

  42. D’oh. Reading this, I saw three typos. You guys highlighted one. Another was typing “vai” instead of “via.” The third is the most important though. My third to last paragraph begins with:

    This abstract is more forceful in its endorsement of global warming concerns than many of the ones labeled “not climate related.”

    The word “not” shouldn’t have been in that sentence.

    Apparently I should proofread better. Or get an editor. Either way, I’m going to go fix those mistakes now.

    [Done. Though the last sentence actually reads legibly both ways. 8<) Mod]

  43. As far as I vaguely recall there is no scientific consensus to the question whether continued output (BAU) of man-made greenhouse gases will cause dangerous warming by 2100.

    Is there a consensus that man is responsible for MOST of the surface warming since 1950? If yes, can I see a reference?

  44. mpaul, I’ve actually posted a bit of information about how quickly ratings were done. Without timestamps, we can’t get a lot of information, but we can see some interesting things. For example:

    I think one thing everyone would be curious to know is the largest number of ratings done in one day was 283. I think that’s a remarkable number. Also remarkable is the day before, the same rater did 238 ratings, and the day after, he did 244. That’s three of the four largest values in the data set….

    In total, there were 500 rater-day values (ratings done by one rater in one day). Six exceed 200; 75 exceed 100 (and two more are exactly 100). Nearly half (199) come in at 50 or more.

    ZombieSymmetry, planebrad, there are a lot of examples of people having simply thrown in a short statement about global warming in the hope it’d help their paper get published. Cook et al could have easily highlighted this problem. Their data set shows it wonderfully. I think one could make a compelling case off it that their “consensus” is just a bandwagon effect.

    Bob Koss, yup. There are a surprising number of issues with this paper. Even if you write multiple posts like I have, it’s difficult to cover them all.

    Tim Folkerts, I’ve done such tests (and even posted the results of a few), but there’s a more direct response. The agreement between authors is best in categories with the fewest ratings. If random variance could possibly explain the differences in this graph, we’d expect to see it have a far greater impact in categories 1, 5, 6 and 7.

  45. Specter:

    Until someone more knowledgeable responds, I’ll suggest that your time would be spent most efficiently by focusing initially on two points. First, Cook et al.’s paper is in its own words the “most comprehensive analysis of its kind to date.” Second, almost all of the 97% reported by that paper got into that 97% by falling into categories so defined that almost all skeptics would fall into them, too. You can verify for yourself by spending an hour with the data here: http://iopscience.iop.org/1748-9326/8/2/024024/media/erl460291datafile.txt.

    In looking at the category definitions (http://iopscience.iop.org/1748-9326/8/2/024024/article, Table 2), note by comparison with the wording of the first “level” how easily the other “levels” could have been defined to distinguish catastrophic-anthropogenic-global-warming believers from skeptics. It is hard to escape the conclusion that the ambiguity was intentional.

    Also note that the paper does not break the results down by “level.” It is even harder to escape the conclusion that this omission was intentional.

    And this is the best such study so far?

  46. Mod, thanks. I agree that sentence was legible both ways. The problem is I was trying to contrast a “not climate related” paper to “climate related” papers to show there’s no clear distinction. It wouldn’t make sense to do that by comparing a “not climate related” paper to “not climate related” papers in general!

    Steven Mosher, thanks! The thing which amazes me the most is these people discussed a number of the problems found in their study in their forum, long before publishing their paper.

    For example, they knew rater bias was an issue. They talked about doing analyses to examine it. If it was an important enough issue John Cook wanted to look at it, why are people saying it’s meaningless when I look at it? Also, did Cook look at this? He said he wanted to once the ratings were completed. If he did look at it, why did he never discuss what he found?

    It’s sort of like how they never really defined their “consensus.” I pointed that out, and they said I was wrong. People like Roy Spencer pointed out they fit in the “consensus,” and Cook et al mocked them for it. This is despite the fact they openly discussed the lack of an actual definition in their forum. They even pointed out how weak a “consensus” position they might find.

  47. Can someone explain exactly how 97% is given as a consensus number for nearly 12,000 papers when the largest bubbles indicate that no position was taken….

  48. Specter says:
    May 13, 2014 at 11:59 am

    Where can I find the fisking of all three of the “97%” studies? I know I’ve read about all of them here, but have been having an email conversation with my US Senator dealing with their all night chat session a few months back. He likes to throw the 97% at me and I would like to present some cogent facts back in his direction – but time is limited. Thanks.

    Three papers “creating” the 97% claim?

    PLEASE QA this summary!

    I can remember only one other:
    13,000 members of a “science society” were sent paper-copy letters as a survey.
    7,000 members of that society self-selected themselves by replying (on paper) to one of only five questions.
    700 people were selected from those 7000 who replied based on how many papers they had written.
    ALL non-government “scientists” from those 700 were thrown out.
    Of the “chosen” 77, 75 answered “Yes” to two questions:
    1. Has Global Warming occurred the past 100 years?
    2. Has Man caused part of that Global Warming?

    And, of course, to those two questions, 97% of the Chosen Few answered “Yes” ….
    Or should that be: Only 75 of the original 13,000 answered “Yes” ?

    PS. We do not know what the other three questions were. Nor who answered them, nor what their answers were.

  49. Here is another 97% survey:

    http://www.pnas.org/content/early/2010/06/04/1003187107.full.pdf

    Although preliminary estimates from published literature and expert surveys suggest striking agreement among climate scientists on the tenets of anthropogenic climate change (ACC), the American public expresses substantial doubt about both the anthropogenic cause and the level of scientific agreement underpinning ACC. A broad analysis of the climate scientist community itself, the distribution of credibility of dissenting researchers relative to agreeing researchers, and the level of agreement among top climate experts has not been conducted and would inform future ACC discussions. Here, we use an extensive dataset of 1,372 climate researchers and their publication and citation data to show that (i) 97–98% of the climate researchers most actively publishing in the field support the tenets of ACC outlined by the Intergovernmental Panel on Climate Change, and (ii) the relative climate expertise and scientific prominence of the researchers unconvinced of ACC are substantially below that of the convinced researchers.

  50. Nobody could have guessed that. Nobody who downloaded data files would have thought to go to a website and use a function to find information excluded from those data files.

    Actually you could and I did, I found that last year.

    This reminds me, I was working on a research post related to this a year ago but got sidetracked when Dr. Tol started writing his rebuttal last year. Maybe I will finish it, just to prove a point.

  51. Specter says:
    May 13, 2014 at 11:59 am

    Where can I find the fisking of all three of the “97%” studies? I know I’ve read about all of them here, but have been having an email conversation with my US Senator dealing with their all night chat session a few months back. He likes to throw the 97% at me and I would like to present some cogent facts back in his direction – but time is limited. Thanks.

    Regarding two “97%” surveys that warmists more often cite, here is a summary of most of their flaws, by WUWT-commenter Robin Guenier:

    “The flaws in the Doran paper are well known: (A) it used a hopelessly inadequate sample size (79 respondents) and demographic (nearly all from N America) and (B) in any case, most sceptics would agree with both its propositions: (1) that the world has warmed since the 1700s and (2) that mankind contributed. It made no mention of GHG emissions.

    “Anderegg is more sophisticated than the hopeless Doran. But there’s a basic problem: it’s concerned with whether or not respondents agree that “anthropogenic greenhouse gases have been responsible for “most [i.e. more than 50%] of the “unequivocal” warming of the Earth’s average global temperature over the second half of the 20th century”. The only scientists qualified to evaluate that are those engaged in detection and attribution (both difficult and uncertain). Yet the research was not confined to such scientists.

    “And, in any case, the research itself is flawed. First, the total number of “climate researchers” who accepted the above statement was, according to the paper, 903 and the total that did not was 472. In other words, 66% – not the much-claimed 97%. The researchers got their 97% by restricting their findings to researchers “most actively publishing in the field” – in other words, the paper’s findings do not cover all “climate scientists”. Further, it wasn’t an opinion survey at all, but an analysis of scientists who signed pro/anti statements – not the most useful documents. And, again, it was essentially confined to North America and was not concerned with whether or not the warming was dangerous. For these reasons, it’s valueless as a measure of climate scientists’ opinion about the dangers of AGW.”

    This George Mason poll http://stats.org/stories/2008/global_warming_survey_apr23_08.html surveyed 489 randomly selected members of either the American Meteorological Society or the American Geophysical Union. It did not cherry pick the respondents who gave them the answer they wanted, and it asked more sophisticated questions, below. Under its “Major Findings” are these paragraphs:

    “Ninety-seven percent of the climate scientists surveyed believe “global average temperatures have increased” during the past century.

    “Eighty-four percent say they personally believe human-induced warming is occurring, and 74% agree that “currently available scientific evidence” substantiates its occurrence. Only 5% believe that human activity does not contribute to greenhouse warming; the rest [11%] are unsure.

    “Scientists still debate the dangers. A slight majority (54%) believe the warming measured over the last 100 years is NOT “within the range of natural temperature fluctuation.”

    “A slight majority (56%) see at least a 50-50 chance that global temperatures will rise two degrees Celsius or more during the next 50 to 100 years. (The United Nations’ Intergovernmental Panel on Climate Change cites this increase as the point beyond which additional warming would produce major environmental disruptions.)

    “Based on current trends, 41% of scientists believe global climate change will pose a very great danger to the earth in the next 50 to 100 years, compared to 13% who see relatively little danger. Another 44% rate climate change as moderately dangerous.”

    IOW, 59% doubt the “catastrophic” potential of AGW. I suspect that number would be higher now, after six more flat years.
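
    The two percentages argued over in this comment can be reproduced from the figures quoted above. The sketch below uses only those quoted numbers (903 and 472 from the Anderegg summary, 41/44/13 from the George Mason poll); the helper name is illustrative.

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Anderegg figures as quoted in the comment.
accepted, not_accepted = 903, 472
share_accepting = percent(accepted, accepted + not_accepted)
print(round(share_accepting))  # 66

# George Mason poll danger ratings as quoted in the comment (percent).
very_great, moderate, little = 41, 44, 13
not_very_great = 100 - very_great
unaccounted = 100 - (very_great + moderate + little)
print(not_very_great, unaccounted)  # 59 2
```

    Note that the 59% is simply the complement of the 41% answering “very great danger,” so it includes the 44% who rated climate change as moderately dangerous and about 2% not accounted for in the quoted figures.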

  52. Out of mild interest, I randomly accessed 20 of these papers. Not one had anything demonstrating carbon dioxide would cause global warming.

    There were often references to past global warming events, or if global warming occurs because of carbon dioxide, but none researched the relationship between global warming and carbon dioxide. Some references were so obtuse I cannot see why they were included, and of course, there were the dodgy computer models. Anyhow, here are the relevant excerpts from the first five papers on the list.

    I cannot see how anyone could see this exercise of John Cook’s as being anything other than complete and utter BS.

    1. The observed retention of population stability despite shifts in the patterns of abundance implies some predictability, and potential effects of climate change (increased temperature, rainfall and raindays) are examined in a context of global warming.

    2. The possibility of global warming resulting from the anthropogenic addition of carbon dioxide and other industrial gases into the atmosphere is a topic of much recent concern.

    3. Global warming may have many consequences for natural ecosystems, including a change in disturbance regimes. No current model of landscapes subject to disturbance incorporates the effect of climatic change on disturbances on decade to century time scales, or addresses quantitative changes in landscape structure as disturbances occur. A new computer simulation model, DISPATCH, which makes use of a geographical information system for managing spatial data, has been developed for these purposes.

    4. A remarkable oxygen and carbon isotope excursion occurred in Antarctic waters near the end of the Palaeocene (-52.3m.yr ago), indicating rapid global warming and oceanographic changes that caused one of the largest deep-sea benthic extinctions of the past 90 million years.

    5. Clearly, the response of outlet systems along the periphery of the East Antarctic ice sheet during the mid-Holocene was expansion. This may have been a direct consequence of climate warming during an Antarctic “Hypsithermal.”

  53. This is interesting. I’ll read it a couple more times, but one issue leaps out. Those graphics with the circles are absolutely indecipherable to me, anyway, and I’m someone with a fair amount of experience with graphs, having been in the finance business for more than a decade, not including two years in business school.

    You really need to find a different way to present that data. The words make sense, but I’m sorry, the pictures don’t. Please regard this as the friendliest and most constructive possible criticism. You really need to go back to the drawing board on those pictures.

  54. Jake J, personally, I don’t understand why the graphs give some people so much trouble. The color of each circle shows the rater, and the size of the circle shows the number of ratings. That much seems obvious to me. The only part which doesn’t seem immediately intuitive to me is the meaning of the y-scale. That’s not bad for a graph which conveys four dimensions of information in a two-dimensional image.

    I think the best solution would be to add fully labeled axes, titles and whatnot. Then start with an image which shows data for only one rater and explain the image. Once people had gotten used to the layout with one color, more could be added in.

    I’d be willing to hear suggestions for an alternative approach. In the meantime, it’s worth pointing out these posts of mine haven’t been about proving anything. They’re just to get people thinking/talking while I deal with the issues surrounding the release of the data.* I can’t publish much in the way of graphics until I’ve resolved those.

    *For an update on that issue, I’m supposed to be contacted by the Deputy Vice Chancellor of Research from the University of Queensland today or tomorrow.

  55. Thank you very much for the analysis and for shedding more light on the matter.
    It is clear why they do not want damning transparency. This is unfortunately a pattern, of giving one just the information to be driven to the desired result but no proper check & analysis made possible.

    “There are lots and lots of impacts and mitigation papers …”
    That clarifies about papers which “endorse” CAGW. A mitigation paper that received funding to address problems created by CAGW would endorse CAGW implicitly else it would have no reason to exist!

  56. Well, okay. I think a lot of people fall for the premise that science advances by “consensus”. It doesn’t, so let’s not say we have a “consensus” that the original “consensus” was false.

  57. Brandon Shollenberger says:
    May 14, 2014 at 3:06 am

    [. . .] In the meantime, it’s worth pointing out these posts of mine haven’t been about proving anything. They’re just to get people thinking/talking while I deal with the issues surrounding the release of the data.* I can’t publish much in the way of graphics until I’ve resolved those.

    *For an update on that issue, I’m supposed to be contacted by the Deputy Vice Chancellor of Research from the University of Queensland today or tomorrow.

    – – – – – – – – – – –

    Brandon Shollenberger,

    Thank you for letting us view the in-process development of your study.

    This is a rewarding scientifically oriented dialog.

    So, you have stimulated the interest of the Deputy Vice Chancellor of Research from the University of Queensland. I hope you will be able to share any communication with that person. I hope it will not be stipulated as privileged communication by the Deputy Vice Chancellor of Research UQ.

    John

  58. John Whitman, I have no intention of participating in a confidential discussion with him. I’m happy to communicate with people in private regarding their concerns on this topic, but he’s contacting me in an official capacity. I can’t see any reason to have a formal, confidential exchange on this topic.

  59. Brandon Shollenberger says:
    May 14, 2014 at 8:29 am

    – – – – – – – – – –

    Brandon Shollenberger,

    I look forward to hearing from you about your upcoming communication with the Deputy Vice Chancellor of Research from the University of Queensland.

    John

  60. Brandon, the visual sense provides analytic simplicity. Graphs need to be so clear-cut that they barely have to be explained, if at all. Look, I’m really interested in what you’re doing, and highly supportive. I wish I was enough of a specialist in the design and use of graphs to give you useful suggestions, but I’m not.

    You really need to find someone — a friend, a colleague — who is facile and knowledgeable in that sub-specialty. I think those pear shaped graphs very badly get in the way, and actually degrade the impact of what you’re trying to convey. I’d love to be able to help more than that, but I’m afraid the most helpful I can be is to tell you that I don’t know enough about the world of graphs to point you to the right ones.

    But you really, really need to look for a different approach.

  61. Brandon Shollenberger says:
    May 14, 2014 at 3:06 am
    Jake J, personally, I don’t understand why the graphs give some people so much trouble.

    ROFLMAO, we all know you did not major in marketing. The first time I saw this I knew many people reading it are not going to understand it. Most are only commenting because it is against the 97% consensus even if they don’t understand why. This argument is not even necessary as most people do not even understand the paper was written by a bunch of Skeptical Science team members – that is all the information they need to know the ratings were biased.

  62. Rating is normally used by engineers to establish the condition of assets such as road pavements and bridges, and it is not rocket science.

    The scoring system needs to be documented and detailed, more than the one liners used in the above post. In this instance I would probably include examples of existing extracts that would be included in each category.

    The raters used in the study should be provided with comprehensive training in the rating methodology used. The rating manual should be their bible.

    There should be a record of who evaluated each abstract, the date and time and score.

    There should be regular random auditing of the rating by the study owner so as to assure the quality of the data collected, say a 5% to 10% sample. In most instances rating is subject to judgment, and it is important that regular random auditing is carried out to reduce any biases in the collection of data.

    Also, I would recommend that these studies use a project management methodology such as PRINCE2 to monitor the study and sign off on the study at each stage, i.e. approval of rating manual and methodology. This could be part of the peer review process.

    It is an unfortunate fact that universities are not providing adequate training to science graduates in fields of statistical analysis and project management. This has become clearly evident in review of abstracts undertaken by Steve McIntyre of Climate Audit.
