Chiefio Smith examines GHCN and finds it “not fit for purpose”

E.M. Smith over at the blog Musings from the Chiefio earlier this month posted an analysis comparing versions 1 and 3 of the GHCN (Global Historical Climate Network) data set.  WUWT readers may remember a discussion about GHCN version 3 here.   He described why the GHCN data set is important:

 There are folks who will assert that there are several sets of data, each independent and each showing the same thing, warming on the order of 1/2 C to 1 C. The Hadley CRUtemp, NASA GIStemp, and NCDC. Yet each of these is, in reality, a ‘variation on a theme’ in the processing done to the single global data set, the GHCN. If that data has an inherent bias in it, by accident or by design, that bias will be reflected in each of the products that do variations on how to adjust that data for various things like population growth ( UHI or Urban Heat Island effect) or for the frequent loss of data in some areas (or loss of whole masses of thermometer records, sometimes the majority all at once).

 He goes on to discuss the relative lack of methodological analysis and discussion of the data set and the socio-economic consequences of relying on it. He then poses an interesting question:

What if “the story” of Global Warming were in fact, just that? A story? Based on a set of data that are not “fit for purpose” and simply, despite the best efforts possible, can not be “cleaned up enough” to remove shifts of trend and “warming” from data set changes, of a size sufficient to account for all “Global Warming”; yet known not to be caused by Carbon Dioxide, but rather by the way in which the data are gathered and tabulated?…

 …Suppose there were a simple way to view a historical change of the data that is of the same scale as the reputed “Global Warming” but was clearly caused simply by changes of processing of that data.

 Suppose this were demonstrable for the GHCN data on which all of NCDC, GISS with GIStemp, and Hadley CRU with HadCRUT depend? Suppose the nature of the change were such that it is highly likely to escape complete removal in the kinds of processing done by those temperature series processing programs?….

 He then discusses how to examine the question:

…we will look at how the data change between Version 1 and Version 3 by using the same method on both sets of data. As the Version 1 data end in 1990, the Version 3 data will also be truncated at that point in time. In this way we will be looking at the same period of time, for the same GHCN data set. Just two different versions, with somewhat different thermometer records being in and out of each. Basically, these are supposedly the same places and the same history, so any changes are a result of the thermometer selection done on the set and the differences in how the data were processed or adjusted. The expectation would be that they ought to show fairly similar trends of warming or cooling for any given place. To the extent the two sets diverge, it argues for data processing being the factor we are measuring, not real changes in the global climate… The method used is a variation on a Peer Reviewed method called “First Differences”…

 …The code I used to make these audit graphs avoids making splice artifacts in the creation of the “anomaly records” for each thermometer history. Any given thermometer is compared only to itself, so there is little opportunity for a splice artifact in making the anomalies. It then averages those anomalies together for variable sized regions….
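Smith links his actual scripts from the post; purely as a hypothetical sketch of the general "first differences" idea (invented station names and readings, not his code), the self-comparison step looks like this in Python:

```python
# Hypothetical sketch of a "first differences" style anomaly calculation
# (invented station data; NOT Smith's published code).
# Each thermometer is compared only with itself: year-to-year differences
# are taken within one record and cumulated into an anomaly series,
# so no cross-station splice is ever made.

def first_difference_anomalies(temps):
    """temps: list of (year, mean_temp) for ONE station, sorted by year.
    Returns (year, anomaly) pairs with the first year as the zero point."""
    anomalies = [(temps[0][0], 0.0)]
    running = 0.0
    for (_, t0), (y1, t1) in zip(temps, temps[1:]):
        running += t1 - t0          # station differenced against itself
        anomalies.append((y1, running))
    return anomalies

# Two invented station records (degrees C), same years:
station_a = [(1950, 14.0), (1951, 14.2), (1952, 14.1)]
station_b = [(1950, 9.0), (1951, 9.1), (1952, 9.4)]

# Regional series: average the per-station anomalies, year by year.
anoms = [first_difference_anomalies(s) for s in (station_a, station_b)]
regional = [(year, round((a + b) / 2, 2))
            for (year, a), (_, b) in zip(*anoms)]
print(regional)   # [(1950, 0.0), (1951, 0.15), (1952, 0.25)]
```

Because each station is differenced only against itself, a site reading around 14 C and one reading around 9 C contribute compatible anomalies; the absolute offset between them never enters the regional average.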

 What Is Found

What is found is a degree of “shift” of the input data of roughly the same order of scale as the reputed Global Warming.

 The inevitable conclusion of this is that we are depending on the various climate codes to be nearly 100% perfect in removing this warming shift, or being insensitive to it, for the assertions about global warming to be real.

 Simple changes of composition of the GHCN data set between Version 1 and Version 3 can account for the observed “Global Warming”; and the assertion that those biases in the adjustments are valid, or are adequately removed via the various codes are just that: Assertions….

 Smith then walks the reader through a series of comparisons, both global and regional, and comes to the conclusion:

 Looking at the GHCN data set as it stands today, I’d hold it “not fit for purpose” even just for forecasting crop planting weather. I certainly would not play “Bet The Economy” on it. I also would not bet my reputation and my career on the infallibility of a handful of Global Warming researchers whose income depends on finding global warming; and on a similar handful of computer programmers who’s code has not been benchmarked nor subjected to a validation suite. If we can do it for a new aspirin, can’t we do it for the U.S. Economy writ large?  

The article is somewhat technical but well worth the read and can be found here.

 h/t to commenters aashfield, Ian W, and rilfeld



120 Comments
barry
June 21, 2012 3:55 pm

There is hardly any difference in the global temperature record between the different GHCN versions and the raw data set. Skeptics (like Jeff Condon and Roman M at the Air Vent) have come up with global temp records that are actually warmer than those produced by the institutes. But they all rest within margin of error.
Adjustments are more readily noticeable at local scale, but the big picture is not qualitatively affected by adjustments. At a few hundredths of a degree per decade, the fixation on slight differences in global trend between one method or another is overly pedantic.

June 21, 2012 4:23 pm

Barry…if this statement is in dispute…“What is found is a degree of “shift” of the input data of roughly the same order of scale as the reputed Global Warming.” Then please clarify why you think there is “hardly any difference”?

June 21, 2012 4:27 pm

barry says: June 21, 2012 at 3:55 pm
Doesn’t look like you’ve even looked at EMS’ work.
I did, and it’s a tour de force. Thank you EMS. But IMHO it needs re-presentation (a) to grasp in one go (b) with say 20-50 bite-size statements that accumulate the wisdom gained. Say, a rewrite like Bishop Hill did for Steve McIntyre.

Rob R
June 21, 2012 4:27 pm

Barry
This is not the point that chiefio (EM Smith) is addressing.
The analyses you mention (Jeff Condon etc) all take the GHCN version 2 or version 3 thermometer compilation as a “given” and produce global trends from a single version.
The Chiefio is looking primarily at the differences between the different GHCN versions. He is not really focussing on the trend that you can get from a single GHCN version.

Kev-in-Uk
June 21, 2012 4:28 pm

A fair analysis of the REAL state of play, IMHO – but really it’s what we have known for ages, i.e. that the manipulated data is essentially worthless!
*drums fingers* – waits for someone like Mosh to come along and remind us that it’s all we have, etc, etc….
I don’t actually mind necessary data manipulations IF they are logged and recorded and explained in sufficient detail that they can be reviewed later. To my knowledge such explanation(s) is/are not available to the general public? What is more worrying is that I doubt the reasoning behind the adjustments are still ‘noted’ anywhere – a la ‘the dog ate my homework’, etc – effectively making any and every subsequent use of the ‘data’ pointless!
We have seen various folk analyse and debunk single stations (or a few at a time) – but does anyone think Jones has been through every stations’ data and ‘checked’ through each and every time series, site change, etc? Somehow I think not – it is likely all computer manipulation at the press of a button and we all know what that means…….(where’s Harry? LOL)

Ian W
June 21, 2012 4:28 pm

barry says:
June 21, 2012 at 3:55 pm
There is hardly any difference in the global temperature record between the different GHCN versions and the raw data set. Skeptics (like Jeff Condon and Roman M at the Air Vent) have come up with global temp records that are actually warmer than those produced by the institutes. But they all rest within margin of error.

Barry I would really suggest you read Chiefio’s post. Then after that come back here and tell us all the errors that you have found in his approach.

Chris
June 21, 2012 4:50 pm

Way back in the olden days during the brief time I was doing science as a grad student, I was taught that measured results can either be accepted, or with good reason rejected. There was no option to “adjust” numbers. Adjusting was fraud.
Now in the temperature records we no longer have data. We have a bunch of numbers, but the numbers are not data. If adjustments are necessary, they should be presented in painstaking detail, if need be as a separate step. Adjustments are never part of the data: data cannot be adjusted and still remain data. Adjustments are part of the model, not part of the data. They have to be documented and justified as part of the model. Anything else is fraud.

u.k.(us)
June 21, 2012 5:13 pm

Chris says:
June 21, 2012 at 4:50 pm
=========
+1
Though using the term “fraud”, leaves no room to maneuver.
I assume many would like to wriggle out of the trap they have entered.

pouncer
June 21, 2012 5:16 pm

barry says:” the big picture is not qualitatively affected by adjustments.”
Yep, Right. The hairs on the horse’s back may be thicker or thinner but the overall height of the horse isn’t affected.
The thing is that the “simple physics” everybody tells me settles the science of the Earth’s black body radiation budget is based on the average temperature of the Earth in degrees KELVIN. An adjustment of one to two degrees to an average temperature of 20 is already small. Such a variation on an absolute temperature of about 288 kelvin is — well, you tell me.
The effect being measured is smaller than the error inherent in the measuring tool. In order to account for that, very strict statistical protocols must be chosen, documented, tested, applied, and observed for all (not cherry picked samples of) the data.
Note that problems, if any, with the historic instrument record propagate into the pre-historic reconstructions. When calibrating ( “screening” if you like, or “cherry picking”) proxies against “the instrumental record” — does the researcher use the world wide average? The closest local station record? A regional, smoothed, aggregated record? As it turns out the regional extracts of the GHCN record are themselves “proxies” (or as Mosher explains, an “index”) for the actual factor of interest in the “big picture” — the T^4 in degree Kelvin in black body models. If you’re matching your speleo, isotopic or tree-ring record to a debatable instrument proxy — the magical teleconnection screen — why wouldn’t you expect debate about temperatures in the past thousand years?
Chiefio says the record is unfit for use. Barry, what use case do you have in mind for which the record is well fit?
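pouncer’s scale argument can be made concrete. Taking a mean surface temperature near 288 K (a standard textbook figure, used here as an assumption), a 1 K adjustment is about 0.35% of the absolute temperature, but because black-body flux goes as T^4 it moves the radiation budget by roughly four times that fraction:

```python
# Back-of-envelope: how big is a ~1 K data adjustment relative to the
# absolute (kelvin) temperature used in black-body budgets?
# Assumes a mean surface temperature near 288 K; purely illustrative.

SIGMA = 5.670e-8           # Stefan-Boltzmann constant, W m^-2 K^-4
T_MEAN = 288.0             # approximate global mean surface temperature, K

def radiated_flux(t_kelvin):
    """Ideal black-body flux (W/m^2) at temperature t_kelvin."""
    return SIGMA * t_kelvin ** 4

frac_temp = 1.0 / T_MEAN                                    # ~0.35%
frac_flux = radiated_flux(T_MEAN + 1.0) / radiated_flux(T_MEAN) - 1.0

print(f"1 K is {frac_temp:.2%} of 288 K")          # 0.35%
print(f"but shifts T^4 flux by {frac_flux:.2%}")   # ~1.40%
```

So a one-degree shift that looks tiny on a Celsius anomaly plot is still about a 1.4% perturbation of the T^4 term those “simple physics” budgets rest on.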

KR
June 21, 2012 5:23 pm

This does not address the fact that the various surface temp readings match the satellite readings (see http://tinyurl.com/6nl33kz), with the caveat that satellite readings are known to be more sensitive to ENSO variations, and that land variations are higher than global variations. Those surface temperatures have been confirmed by the entirely separate satellite series.
Sorry, but the opening post is simply absurd.

June 21, 2012 5:32 pm

EMS has put a lot of work into this analysis and I wish to congratulate him. He has posted a great comment at Jonova’s site here http://joannenova.com.au/2012/06/nature-and-that-problem-of-defining-homo-sapiens-denier-is-it-english-or-newspeak/#comments at comment 49, worth reading. I wish I had time to put up some more technical information at my pathetic attempt of a blog. I would suggest EMS gets together with Steve (McI) and Ross (McK) to publish some of this.
Keep up the good work. Eventually, truth will out but I am concerned that only legal action will stop the flow of misinformation and power seeking by the alarmists and climate (pseudo)scientists.

pat
June 21, 2012 6:06 pm

Seth is all over the MSM with his simple facts!
20 June: Canadian Press: AP: Seth Borenstein: Since Earth summit 20 years ago, world temperatures, carbon pollution rise as disasters mount
http://ca.news.yahoo.com/since-earth-summit-20-years-ago-world-temperatures-155231432.html

Ian W
June 21, 2012 6:23 pm

KR says:
June 21, 2012 at 5:23 pm
This does not address the fact that the various surface temp readings match the satellite readings (see http://tinyurl.com/6nl33kz), with the caveat that satellite readings are known to be more sensitive to ENSO variations, and that land variations are higher than global variations. Those surface temperatures have been confirmed by the entirely separate satellite series.
Sorry, but the opening post is simply absurd.

Note that EMS was just assessing the differences between GHCN V1 and GHCN V3 from 1880 to 1990. So I am somewhat confused by your comment – can you describe what “satellite surface temp readings” were produced from 1880 – 1990? and how they were assimilated into GHCN V1 and GHCN V3? The issue is the accuracy of anomalies and warming rates over a century based on the GHCN records. These show differing ‘adjustments’ to temperature readings back over a century ago between versions. How do satellite temp readings affect this?

u.k.(us)
June 21, 2012 6:25 pm

Chiefio Smith examines GHCN and finds it “not fit for purpose”
===============
When did we switch from charting the weather’s vagaries, to predicting it ?
How will changing previous weather data, enhance our understanding ?
A battle surely lost.

wayne
June 21, 2012 6:38 pm

Thanks E.M., good article.
How does this affect the GHCN or is it a different matter?
http://www.ncdc.noaa.gov/img/climate/research/ushcn/ts.ushcn_anom25_diffs_urb-raw_pg.gif
All I see is the very rise in temperature that has been created by the adjustments themselves! (but a small ~0.3C difference)

Michael R
June 21, 2012 6:42 pm

This does not address the fact that the various surface temp readings match the satellite readings (see http://tinyurl.com/6nl33kz), with the caveat that satellite readings are known to be more sensitive to ENSO variations, and that land variations are higher than global variations. Those surface temperatures have been confirmed by the entirely separate satellite series.
Sorry, but the opening post is simply absurd.

As Ian pointed out above, I am curious to know how surface temperatures and satellite correlation has anything to do with pre-satellite temperature data? The only way that argument is valid is if we could hindcast temperatures using those satellites – which I would love to know how that is done…
Having good correlation between recent ground thermometers and satellites means squat about previous temperature readings. I also find it curious that almost the entire warming that is supposed to have occurred in the last 150 years occurred BEFORE the satellite era and if that data shows it has been adjusted several times to create higher and higher warming trends, the effect is lots of warming pre-satellite era and a sudden flattening of temp rise post-satellite era…. but hang on…. that’s exactly how the data looks and your link shows it.
Unfortunately your link does not prove your argument, but it does support E.M. Smith’s. Maybe try a different tack.

Mark T
June 21, 2012 7:09 pm

Grammar nazi comment: who’s is a contraction for who is; the possessive is whose.
Mark

mfo
June 21, 2012 7:19 pm

A very interesting and thorough analysis by EM. When considering the accuracy of instruments I like the well known example used for pocket calculators:
0.000,0002 X 0.000,0002 = ?
[Reply: I hate you for making me go get a calculator to do it. ~dbs, mod.]
[PS: kidding about the hating part☺. Interesting!]
[PPS: It also works with .0000002 X .0000002 = ?]
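The calculator example works because the true product, 4e-14, sits far below what a fixed 8-digit decimal display can represent; IEEE double precision keeps the value, the display does not. A two-line Python illustration:

```python
# The true product below is 4e-14: fine for IEEE doubles, invisible
# on a fixed 8-digit decimal display.
product = 0.0000002 * 0.0000002

print(f"{product:.1e}")   # 4.0e-14  (double precision keeps the value)
print(f"{product:.7f}")   # 0.0000000  (what a fixed display would show)
```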

thelastdemocrat
June 21, 2012 7:22 pm

Puh leezze EDIT POSTS!
“Data” is plural!!! “Datum” is singular!!!
“If that data has an inherent bias in it, by accident or by design, that bias will be reflected in each of the products that do variations on how to adjust that data for various things like population growth ( UHI or Urban Heat Island effect) or for the frequent loss of data in some areas (or loss of whole masses of thermometer records, sometimes the majority all at once).”
This datum, these data.
[REPLY: The memo may not have reached you yet, but language evolves. Ursus horribilis, horribilis est. Ursus horribilis est sum. -REP]
[Reply #2: Ever hear of Sisyphus? He would know what it’s like to try and correct spelling & grammar in WUWT posts. ~dbs, mod.]

E.M.Smith
Editor
June 21, 2012 7:33 pm

@Lucy Skywalker:
It’s open and public. The code is published, so folks can decide if there is any hidden ‘bug’ in it as they feel like it. The whole set, including links to the source code, can be easily reached through a short link:
http://chiefio.wordpress.com/v1vsv3/
Anyone who wants to use it as a ‘stepping off point’ for a rewrite as a more approachable non-technical “AGW data have issues” posting is welcome to take a look and leverage off of it.
@Barry:
The individual data items do not need to all be changed for the effect to be a shift of the trend. In particular, “Splice Artifacts”. They are particularly sensitive to ‘end effects’ where the first or last data item (the starting and ending shape of the curve) have excessive effect.
In looking through the GIStemp code (yes, I’ve read it. Ported it to Linux and have it running in my office. Critiqued quite a bit of it.) I found code that claims to fix all sorts of such artifacts, but in looking at what the code does, IMHO, it tries but fails. So if you have a 1 C “splice artifact” effect built into the data, and code like GISTemp is only 50% effective at removal, you are left with a 1/2 C ‘residual’ that isn’t actually “Global Warming” but an artifact of the Splice Artifacts in the data and imperfect removal from codes like GIStemp and HadCRUT / CRUTEMP.
The code I used is particularly designed not to try to remove those splice artifacts. The intent is to “characterize the data”. To find how much “shift” and “artifact” is IN the data, so as to know how much programs like GIStemp must remove, then compare that to what they do. (My attempts to benchmark GIStemp have run into the problem that it is incredibly brittle to station changes, so I still have to figure out a way to feed it test data without having it flat out crash. But in the testing that I have done, it looks like it can remove about 1/2 of the bias in a data set, but perhaps less.)
So depending on how the particular “examination” of the data set is done, you may find that it “doesn’t change much” or that it has about 1 C of “warming” between the data set versions showing up as splice artifacts and subtle changes of key data items, largely at ends of segments.
In particular, the v1 vs v3 data often show surprising increases in the temperature reported in the deep past, while the ‘belly of the curve’ is cooled. If you just look at “average changes” you would find some go up, and some go down, net about nothing. Just that the ones that go up are in the deep past that is thrown away by programs like GIStemp (that tosses anything before 1880) or Hadley (that tosses before 1850). Only if you look at the shape of the changes will you find that the period of time used as the ‘baseline’ in those codes is cooled and the warming data items are thrown away in the deep past; increasing the ‘warming trend’ found.
Furthermore, as pointed out by Rob R, the point of the exercise was to dump the Version One GHCN and the Version Three GHCN data aligned on exactly the same years, through exactly the same code, and see “what changes”. The individual trends and the particular “purity” of the code don’t really matter. What is of interest is how much what is supposedly the SAME data for the SAME period of time is in fact quite different from version to version.
That the changes are about the same as the Global Warming found from V2 and V3 data, and that the data do produce different trends is what is of interest. It shows the particular data used does matter.
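The splice artifact described above is easy to reproduce with toy numbers (hypothetical stations, not GHCN data and not Smith’s code). Two perfectly flat records at different absolute temperatures, one starting later, and a naive average of absolute readings manufactures a 5 C warming step at the join:

```python
# Toy demonstration of a splice artifact (hypothetical stations,
# not GHCN data). Neither station warms at all, but station B's
# record only begins in 1952.
cool_station = {1950: 10.0, 1951: 10.0, 1952: 10.0, 1953: 10.0}
warm_station = {1952: 20.0, 1953: 20.0}   # later start, warmer site

naive = []
for year in sorted(cool_station):
    # Naive approach: average whatever absolute readings exist that year.
    readings = [s[year] for s in (cool_station, warm_station) if year in s]
    naive.append(sum(readings) / len(readings))

# The splice manufactures a 5 C "warming" step out of two flat records:
print(naive)   # [10.0, 10.0, 15.0, 15.0]
```

In per-station anomalies, as in a first-differences approach, both records are flat and the combined series shows no trend; the “warming” here comes entirely from which thermometers are in or out of the mix each year.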
@Kev-in-UK:
It is important to realize that the difference highlighted in this comparison is NOT item by item thermometer by thermometer changes of particular data items. ( i.e. June 18 1960 at Alice Springs) but rather the effect of changes of what total data is “in” vs “out” of the combined data set. Yes, some of what it will find will be influenced by “day by day adjustments”, but the bulk of what is observed is due to wholesale changes of what thermometers are in, vs out, over time.
I know, harvesting nits… but it’s what computer guys do 😉
W:
Good suggestion…
Also note that the v1 data are no longer on line as near as I can tell. I had saved a copy way back when (being prone to archiving data. Old habit.) It may be hard for other folks to find a copy to do a comparison. I’ve sent a copy to at least one other person ‘for safe keeping’ but it is a bit large to post.
Frankly, most folks seem to have forgotten about v1 once v2 was released, and paid little attention to both now that v3 is out.
@Chris:
What I was taught as well. You got an F in my high school Chemistry class if you had an erasure in your lab book. Any change was ONLY allowed via a line-out and note next to it. Then the new data written below.
FWIW, the “adjustments” are now built into GHCN v3. In v2 each station record had a ‘duplicate number’. Those are now “spliced” and QA changes made upstream of the V3 set. This is particularly important as one of the first things I found was that the major shift that is called warming happened with the change of “duplicate number” at about 1987-1990 (depending on station). It is no longer possible to inspect that ‘join’, as it is hidden in the pre-assembly process at NCDC. (But I’ve saved a copy of v2 as well, so I can do it in the future as desired 😉
:
Well said. 😉
@KR:
Reaching for “The Usual Answers” eh?
Did you notice that this is a comparison of v1 and v3 aligned on 1990 when v1 ends? Did you even read the article where that is pointed out? Look at any ONE of the graphs that all start in 1990?
So it has all of about 12 years of overlap with the satellites. Just not relevant. Recent dozen+ years temperatures have shown no warming anyway, so that they match the sats is just fine with me…
:
Thanks for the support. FWIW, I’ve got a version where I cleaned up a typo or two and got the bolding right posted here:
http://chiefio.wordpress.com/2012/06/21/response-to-paul-bain/
with a couple of ‘lumpy’ sentences cleaned up a bit. It’s a bit of a ‘rant’, but I stand by it.
@UK (US):
It’s not predicting weather that bothers me. Folks like Anthony can do that rather well and it doesn’t need the GHCN. It’s the notion of predicting “climate change” and the notion that we can influence at all the climate that is just broken.
When I learned about “climates”, we were taught that they were determined by: Latitude, distance from water, land form (think mountain ranges between you and water), and altitude. So a desert climate is often found behind a mountain range where rain is squeezed out (often as snow) on the mountains (making an Alpine climate).
In the world of Geology I learned, unless you change one of those factors, you are talking weather, not climate…
CO2 does not change latitude, distance from water, land form, nor altitude. So the Mediterranean still has a Mediterranean Climate and the Sahara Desert still has a Desert Climate and the Rocky Mountains still have an Alpine Climate and the Arctic is still an Arctic Tundra Climate. Then again, I’m old fashioned. I like both my science definitions and my historical data to remain fixed…

Mike Jowsey
June 21, 2012 7:37 pm

KR 5:23 says: Those surface temperatures have been confirmed by the entirely separate satellite series.
Were those satellite measuring devices ever calibrated to surface temperature? If so, then the satellite series is not “entirely separate”, but in fact joined at the hip. If not, then how does the proxy of an electronic device in orbit translate to accurate surface land temperature?

June 21, 2012 7:43 pm

barry
Thanks for being the first to comment. It was clear in the past that some global warming believers, or more simply put, those that disagree with this web site, seemed to sit in front of their computers waiting for a new post so they could be the first to comment.

Gail Combs
June 21, 2012 7:55 pm

Chris says:
June 21, 2012 at 4:50 pm
Way back in the olden days during the brief time I was doing science as a grad student, I was taught that measured results can either be accepted, or with good reason rejected…..
_________________________________
BINGO!
That is one of the reasons for doubting the entire con-game in the first place. How the Heck does Jones or Hansen KNOW that all the guys who took the readings in 1910 did it wrong and all the results need to be adjusted DOWN by a couple of hundredths or the guys in 1984 screwed up and all the data needs to be raised by 0.4. A few hundredths in 1910??? The data was never measured that precisely in the first place. http://cdiac.ornl.gov/epubs/ndp/ushcn/ts.ushcn_anom25_diffs_urb-raw_pg.gif
Even more suspicious is the need to CONSTANTLY change the readings. http://jonova.s3.amazonaws.com/graphs/giss/hansen-giss-1940-1980.gif
The data has been so massaged, manipulated and mangled that no honest scientist in his right mind would trust it now especially since Jones says the “dog ate my homework” and New Zealand’s ‘leading’ climate research unit NIWA says the “goat ate my homework”

…In December, NZCSC issued a formal request for the schedule of adjustments under the Official Information Act 1982, specifically seeking copies of “the original worksheets and/or computer records used for the calculations”. On 29 January, NIWA responded that they no longer held any internal records, and merely referred to the scientific literature.
“The only inference that can be drawn from this is that NIWA has casually altered its temperature series from time to time, without ever taking the trouble to maintain a continuous record. The result is that the official temperature record has been adjusted on unknown dates for unknown reasons, so that its probative value is little above that of guesswork. In such a case, the only appropriate action would be reversion to the raw data record, perhaps accompanied by a statement of any known issues,” said Terry Dunleavy, secretary of NZCSC.
“NIWA’s website carries the raw data collected from representative temperature stations, which disclose no measurable change in average temperature over a period of 150 years. But elsewhere on the same website, NIWA displays a graph of the same 150-year period showing a sharp warming trend. The difference between these two official records is a series of undisclosed NIWA-created ‘adjustments’…. http://briefingroom.typepad.com/the_briefing_room/2010/02/breaking-news-niwa-reveals-nz-original-climate-data-missing.html

And if you did not follow ChiefIO’s various links, this one graph of the GHCN data set, Version 3 minus Version 1, says it all: [graph: GHCN Version 3 minus Version 1]
Nothing like lowering the past data by about a half degree and raising the current data by a couple tenths to get that 0.6 degree per century change in temperature….
What is interesting is that the latter half of the 1700’s had times that were warmer than today and overall the temperature was more variable.

KR
June 21, 2012 8:03 pm

E.M.Smith“Reaching for “The Usual Answers” eh?”
You have written a great deal, but included little evidence.
I would point you to http://forums.utsandiego.com/showpost.php?p=4657024&postcount=86 where a reconstruction from _raw_, unadjusted data, from the most rural stations possible (readily available, mind you) has been done. Results? Area weighted temperatures estimated from the most rural 50 GHCN stations closely match the NASA/GISS results, as confirmed by the separately run, separately calibrated satellite data.
You have asserted quite a lot – without proving it. You have cast aspersions aplenty – with no evidence. And when uncorrected data is run, the same results as the NASA/GISS adjusted data comes out, with only a small reduction in uncertainties and variances.
I don’t often state things so strongly, but your post is b******t. If you don’t agree with the results, show your own reconstruction, show us that the adjustments are incorrect. Failing that, your claims of malfeasance are as worthy as the (zero) meaningful evidence you have presented.
[Reply: Please feel free to submit your own article for posting here. ~dbs, mod.]

Luther Wu
June 21, 2012 8:06 pm

“What is found is a degree of “shift” of the input data of roughly the same order of scale as the reputed Global Warming.”
That’s the “money shot”.
