Temperature is such a simple finite thing. It is amazing how complex people can make it.
– commenter and friend of WUWT, ossqss at Judith Curry’s blog
Sometimes, you can believe you are entirely right while simultaneously believing that you’ve done due diligence. That’s what confirmation bias is all about. In this case, a whole bunch of people, including me, got a severe case of it.
I'm talking about the claim made by Steve Goddard that 40% of the USHCN data is "fabricated", which I and a few other people thought was clearly wrong.
Dr. Judith Curry and I have been conversing a lot via email over the past two days, and she has written an illuminating essay that explores the issue raised by Goddard and the sociology going on. See her essay:
http://judithcurry.com/2014/06/28/skeptical-of-skeptics-is-steve-goddard-right/
Steve Goddard aka Tony Heller deserves the credit for the initial finding, Paul Homewood deserves the credit for taking the finding and establishing it in a more comprehensible
way that opened closed eyes, including mine, in this post entitled Massive Temperature Adjustments At Luling, Texas. Along with that is his latest follow-up, showing the problem isn't limited to Texas, but shows up in Kansas as well. And there's more about this below.
Goddard early on (June 2) gave me his source code that made his graph, but I
couldn't get it to compile and run. That's probably more my fault than his, as I'm not an expert in the C++ programming language. Had I been able to, things might have gone differently. Then there was the fact that the problem Goddard noted doesn't show up in GHCN data, and I didn't see it in any of the data we had for our USHCN surface stations analysis.
But, the thing that really put up a wall for me was this moment on June 1st, shortly after getting Goddard’s first email with his finding, which I pointed out in On ‘denying’ Hockey Sticks, USHCN data, and all that – part 1.
Goddard initially claimed 40% of the STATIONS were missing, which I said right away was not possible. It raised my hackles, and prompted my "you need to do better" statement. Then he switched the text in his post from stations to data while I was away for a couple of hours at my daughter's music recital. When I returned, I noted the change, with no note of it on his post, and that is what really put up the wall for me. He probably looked at it as if he were just fixing a typo; I looked at it as sweeping an important distinction under the rug.
Then there was my personal bias over previous episodes where Goddard had made what I considered grievous errors and refused to admit to them. There was the claim of CO2 freezing out of the air in Antarctica, later shown to be impossible by an experiment, the GISStimating 1998 episode, and the comment thread where, when the old data was checked, it was clear Goddard/Heller's claim didn't hold up.
And then just over a month ago there was Goddard’s first hockey stick shape in the USHCN data set, which turned out to be nothing but an artifact.
All of that added up to a big heap of confirmation bias. I was so used to Goddard being wrong that I expected it again. But this time Steve Goddard was right, and my confirmation bias prevented me from seeing that there was in fact a real issue in the data: NCDC has dead stations that are reporting data that isn't real. Mea culpa.
But that's the same problem many climate scientists have: they are used to some skeptics being wrong on some issues, so they put up a wall. That is why the careful and exacting analyses we see from Steve McIntyre should be a model for us all. We have to "do better" to make sure the claims we make are credible, documented, phrased in non-inflammatory language, understandable, and, most importantly, right.
Otherwise, walls go up, confirmation bias sets in.
Now that the wall is down, NCDC won't be able to ignore this. Even John Nielsen-Gammon, who was critical of Goddard along with me in the PolitiFact story, now says there is a real problem. So does Zeke, and we have all sent or forwarded email to NCDC advising them of it.
I've also been on the phone Friday with the assistant director of NCDC and chief scientist (Tom Peterson), and with the person in charge of USHCN (Matt Menne). Both were quality, professional conversations, and both thanked me for bringing this to their attention. There is lots of email flying back and forth too.
They are taking this seriously. They have to, as the final data as currently presented for USHCN is clearly wrong. John Nielsen-Gammon sent me a cursory analysis of Texas USHCN stations, noting he found a number of stations that had "estimated" data in place of actual good data that NCDC has in hand and that appears in the RAW USHCN data file on their FTP site:
From: John Nielsen-Gammon
Sent: Friday, June 27, 2014 9:27 AM
To: Anthony
Subject: Re: USHCN station at Luling Texas
Anthony –
I just did a check of all Texas USHCN stations. Thirteen had estimates in place of apparently good data.
410174 Estimated May 2008 thru June 2009
410498 Estimated since Oct 2011
410639 Estimated since July 2012 (exc Feb-Mar 2012, Nov 2012, Mar 2013, and May 2013)
410902 Estimated since Aug 2013
411048 Estimated July 2012 thru Feb 2014
412906 Estimated since Jan 2013
413240 Estimated since March 2013
413280 Estimated since Oct 2012
415018 Estimated since April 2010, defunct since Dec 2012
415429 Estimated since May 2013
416276 Estimated since Nov 2012
417945 Estimated since May 2013
418201 Estimated since April 2013 (exc Dec 2013).
What is going on is that while the RAW data file has the actual measurements, for some reason the final data they publish doesn't get the memo that good data is actually present for these stations, so the USHCN code "infills" them with estimated values derived from surrounding stations. It's a bug, a big one. And when Zeke did a cursory analysis Thursday night, he discovered it was systemic to the entire record: up to 10% of stations have "estimated" data, some of it spanning over a century:
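Zeke's chart makes the point graphically. For the mechanically minded, here is a minimal sketch, in Python, of what "infilling from surrounding stations" amounts to conceptually. This is not NCDC's actual algorithm (their production code is far more involved); the weighting scheme and numbers below are invented purely for illustration.

```python
# Hypothetical illustration of "infilling": estimate one station's monthly
# value from anomalies at surrounding stations. This is NOT the NCDC
# algorithm; the distance weighting and numbers are invented for the sketch.

def infill_month(neighbor_anoms, neighbor_weights, target_climatology):
    """Estimate a monthly mean (degC) for a target station.

    neighbor_anoms     : each neighbor's anomaly vs. its own climatology (degC)
    neighbor_weights   : weights for each neighbor (e.g. inverse distance)
    target_climatology : long-term mean for the target station/month (degC)
    """
    total_w = sum(neighbor_weights)
    weighted_anom = sum(a * w for a, w in zip(neighbor_anoms, neighbor_weights)) / total_w
    return target_climatology + weighted_anom

# Three neighbors running 0.3-0.5 degC above their own normals yield an
# "estimated" value of ~21.6 degC for the target, whether or not the target
# station actually reported a measurement of its own that month.
print(infill_month([0.4, 0.3, 0.5], [1.0, 0.7, 0.5], 21.2))
```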

And here is the real kicker, “Zombie weather stations” exist in the USHCN final data set that are still generating data, even though they have been closed.
Remember Marysville, CA, the poster child for bad station siting? It was the station that gave me my “light bulb moment” on the issue of station siting. Here is a photo I took in May 2007:
It was closed just a couple of months after I introduced it to the world as the prime example of “How not to measure temperature”. The MMTS sensor was in a parking lot, with hot air from a/c units from the nearby electronics sheds for the cell phone tower:
Guess what? Like Luling, TX, which is still open but getting estimated data in place of the actual data in the final USHCN data file, Marysville, even though it was marked closed in 2007 by NOAA's own metadata, is still producing estimated monthly data, marked with an "E" flag:
USH00045385 2006 1034E 1156h 1036g 1501h 2166i 2601E 2905E 2494E 2314E 1741E 1298E 848i 0
USH00045385 2007 797c 1151E 1575i 1701E 2159E 2418E 2628E 2620E 2197E 1711E 1408E 846E 0
USH00045385 2008 836E 1064E 1386E 1610E 2146E 2508E 2686E 2658E 2383E 1906E 1427E 750E 0
USH00045385 2009 969E 1092E 1316E 1641E 2238E 2354E 2685E 2583E 2519E 1739E 1272E 809E 0
USH00045385 2010 951E 1190E 1302E 1379E 1746E 2401E 2617E 2427E 2340E 1904E 1255E 1073E 0
USH00045385 2011 831E 991E 1228E 1565E 1792E 2223E 2558E 2536E 2511E 1853E 1161E 867E 0
USH00045385 2012 978E 1161E 1229E 1646E 2147E 2387E 2597E 2660E 2454E 1931E 1383E 928E 0
USH00045385 2013 820E 1062E 1494E 1864E 2199E 2480E 2759E 2568E 2286E 1807E 1396E 844E 0
USH00045385 2014 1188E 1247E 1553E 1777E 2245E 2526E -9999 -9999 -9999 -9999 -9999 -9999
Source: USHCN Final : ushcn.tavg.latest.FLs.52i.tar.gz
Compare to USHCN Raw : ushcn.tavg.latest.raw.tar.gz
In the USHCN V2.5 folder, the readme file describes the “E” flag as:
E = a monthly value could not be computed from daily data. The value is estimated using values from surrounding stations
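For anyone who wants to check the final-versus-raw files themselves, here is a rough sketch of the comparison. It assumes record lines shaped like the Marysville rows quoted above (station id, year, then twelve value+flag tokens, with -9999 for missing); the real v2.5 files are fixed-width with multiple flag columns, so treat this as a starting point, not a faithful parser.

```python
# Rough sketch: list the months flagged "E" (estimated) in a USHCN *final*
# record even though the *raw* record has an apparently good value for the
# same station and year. Assumes lines shaped like the Marysville rows above:
# station_id, year, then twelve value+flag tokens, with -9999 for missing.
# The real v2.5 files are fixed-width; whitespace splitting is a shortcut.

def parse_record(line):
    parts = line.split()
    station, year = parts[0], int(parts[1])
    months = []
    for tok in parts[2:14]:                    # twelve monthly tokens
        if tok.startswith("-9999"):
            months.append((None, ""))          # missing value
        else:
            digits = "".join(c for c in tok if c.isdigit() or c == "-")
            months.append((int(digits), tok[len(digits):]))   # (value, flag)
    return station, year, months

def estimated_despite_raw(final_line, raw_line):
    """Return month numbers (1-12) estimated in final but present in raw."""
    _, _, final_m = parse_record(final_line)
    _, _, raw_m = parse_record(raw_line)
    return [i + 1 for i, ((_, f_flag), (r_val, _)) in enumerate(zip(final_m, raw_m))
            if f_flag.upper().startswith("E") and r_val is not None]
```

Run over a final/raw pair of records, it should pick out exactly the E-flagged months that have a real raw counterpart, the same pattern John Nielsen-Gammon tabulated by hand for Texas.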
There are quite a few "zombie weather stations" in the USHCN final dataset, possibly up to 25% of the 1218 stations that make up the network. In my conversations with NCDC on Friday, I was told these were kept in and "reporting" as a policy decision to provide "continuity" of data for scientific purposes. While there "might" be some justification for that sort of thinking, few people know about it; there's no disclaimer or caveat in the USHCN FTP folder at NCDC or in the readme file that describes this. They "hint" at it, saying:
The composition of the network remains unchanged at 1218 stations
But that really isn't true, as some of those 1218 USHCN stations have been closed and are no longer reporting real data; instead they are reporting estimated data.
NCDC really should make this clear. While it "might" be OK to produce a data file that contains estimated data, not everyone is going to understand what that means, or that stations that have long been dead are producing estimated data. NCDC has failed to notify the public, and even their colleagues, of this. Even the Texas State Climatologist, John Nielsen-Gammon, didn't know about these "zombie" stations until I showed him; had he known, his opinion on the Goddard issue might have been different. When even professional people in your sphere of influence don't know you are infilling data for dead weather stations like this, you can be sure that your primary mission to provide useful data is FUBAR.
NCDC needs to step up and fix this along with other problems that have been identified.
And they are. I expect some sort of statement, and possibly a correction, next week. In the meantime, let's let them do their work and go through their methodology. It will not be helpful to ANYONE if we start beating up the people at NCDC ahead of such a statement and/or correction.
I will be among the first, if not the first, to know what they are doing to fix the issues, and as soon as I know, so will all of you. Patience and restraint are what we need at the moment. I believe they are making a good-faith effort, but as you all know, the government moves slowly; they have to get policy wonks to review documents and all that. So we'll likely hear something early next week.
These lapses in quality control, and the thinking that infilling estimated data for long-dead weather stations is acceptable, are the sort of thing that happens when the only people you interact with are inside your sphere of influence. The "yeah, that seems like a good idea" approval mumble probably resonated in that NCDC meeting, but it was a case of groupthink. Imagine The Wall Street Journal providing "estimated" stock values for long-dead companies to provide "continuity" of its stock quotes page. Such a thing would boggle the mind, and the SEC would have a cow, not to mention readers. Scams would erupt trying to sell stocks in these long-dead companies: "It's real, see, it's reporting a value in the WSJ!"
It often takes people outside of climate science to point out the problems they don’t see, and skeptics have been doing it for years. Today, we are doing it again.
For absolute clarity, I should point out that the RAW USHCN monthly datafile is NOT being infilled with estimated data, only the FINAL USHCN monthly datafile. But that is the one that many other metrics use, including NASA GISS, and it goes into the mix for things like the NCDC monthly State of the Climate Report.
While we won't know until all of the data is corrected and new numbers are run, this may affect some of the absolute temperature claims made in SOTC reports, such as "warmest month ever", 3rd warmest, etc. The magnitude of such shifts, if any, is unknown at this point. The long-term trend will probably not be affected.
It may also affect the comparisons between raw and final adjusted USHCN data that we have been doing for our paper, such as this one from our draft:
The exception is BEST, which starts with the raw daily data, but they might be getting tripped up into creating some “zombie stations” of their own by the NCDC metadata and resolution improvements to lat/lon. The USHCN station at Luling Texas is listed as having 7 station moves by BEST (note the red diamonds):
But there have really been only two, and the station has been just like this since 1995, when it was converted from a Stevenson Screen to MMTS. Here is our survey image from 2009:
Photo by surfacestations volunteer John Warren Slayton.
NCDC’s metadata only lists two station moves:
As you can see below, some improvements in lat/lon accuracy can look like a station move:
http://www.ncdc.noaa.gov/homr/#ncdcstnid=20024457&tab=LOCATIONS
http://www.ncdc.noaa.gov/homr/#ncdcstnid=20024457&tab=MISC
Thanks to Paul Homewood for the two images and links above. I’m sure Mr. Mosher will let us know if this issue affects BEST or not.
And there is yet another issue: the recent change to using something called "climate divisions" to calculate the national and state temperatures.
Certified Consulting Meteorologist and Fellow of the AMS Joe D’Aleo writes in with this:
I had downloaded the Maine annual temperature plot from NCDC Climate at a Glance in 2013 for a talk. There was no statistically significant trend since 1895. Note the spike in 1913 following super blocking from Novarupta in Alaska (similar to the high-latitude volcanoes in the late 2000s, which helped with the blocking and maritime influence that spiked 2010, as snow was gone by March with a steady northeast maritime Atlantic flow). 1913 was close to 46°F, and the long-term mean just over 41°F.
Then, seemingly in a panic late this frigid winter, big changes occurred at NCDC. I wanted to update the Maine plot for another talk and got this from NCDC CAAG.
Note that 1913 was cooled nearly 5 degrees F and does not stand out. There is a warming of at least 3 degrees F since 1895 (they list 0.23/decade) and the new mean is close to 40F.
Does anybody know what the REAL temperature of Maine is/was/is supposed to be? I sure as hell don’t. I don’t think NCDC really does either.
In closing…
Besides moving toward a more accurate temperature record, the best thing about all this hoopla over the USHCN data set is the PolitiFact story where we have all these experts lined up (including me as the token skeptic) who stated without a doubt that Goddard was wrong, and which rated the claim "pants on fire".
They’ll all be eating some crow, as will I, but now that I have Gavin for dinner company, I don’t really mind at all.
When the scientific method is at work, eventually, everybody eats crow. The trick is to be able to eat it and tell people that you are honestly enjoying it, because crow is so popular, it is on the science menu daily.
![marysville_badsiting[1]](http://wattsupwiththat.files.wordpress.com/2014/06/marysville_badsiting1.jpg?resize=480%2C360&quality=83)






Personally I think the code looks for just odd ‘cold’ readings and moves them up.
Oh, it does both. If a station is disproportionately warm, it will be adjusted lower. Even some Class 1/2 stations have downward adjustments.
BUT . . . the code adjusts towards the majority. And (during our study period of 1979 – 2008) 80% of the stations are carrying, on average, an extra ~0.14C warming per decade. As a result, the code adjusts relatively few stations downward, but adjusts just about all the lower-trend, well sited stations ‘way up.
They are not dishonest, just victims of confirmation bias: They assume the dataset is essentially sound. They think the logic of their code is okay. They think the results are just peachy-keen. So they look no further. They do not consider what homogenization does if the majority of the dataset is compromised.
Any game developer worth half his salt would never have missed a flub like that. I am a game designer/developer. And I did not miss that. #B^)
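A toy, numbers-only illustration of the point made above about adjusting toward the majority. This is not NOAA's pairwise homogenization algorithm; the trends, the 0.14 °C/decade bias, and the "pull toward the median" step are invented for the example.

```python
# Toy illustration only -- not NOAA's pairwise homogenization algorithm.
# 80% of stations carry a spurious extra trend; "homogenization" here simply
# nudges every station 70% of the way toward the network median trend.

import random
random.seed(0)

true_trend = 0.10        # degC/decade seen by a perfectly sited station
bias = 0.14              # spurious extra trend carried by poorly sited stations
trends = [true_trend + bias + random.gauss(0, 0.02) for _ in range(80)] + \
         [true_trend + random.gauss(0, 0.02) for _ in range(20)]

median = sorted(trends)[len(trends) // 2]
adjusted = [t + 0.7 * (median - t) for t in trends]

well_sited_after = sum(adjusted[80:]) / 20
print("network median trend:     %.2f" % median)           # ~0.24 degC/decade
print("well-sited after adjust:  %.2f" % well_sited_after)  # ~0.20, not their true 0.10
```

In this toy setup the biased majority is barely touched, while the well-sited minority gets pulled most of the way up toward the biased network median.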
So the United States Historical Climatology Network is effectively making up a very significant slice of its temperature data, some of it for stations that don't exist, has been doing so for decades, and they were completely unaware of it?
Really?
Any of you gentlemen want to buy a really nice bridge?
All it takes is an errant piece of code. Even the regional managers don’t appear to love their stations like the volunteer curators generally do. So they fall through the cracks.
Anthony, while you’re free not to agree with what I say, you’re not free to simply state you “don’t agree” with me and leave it at that. I made my case very clear. If it is wrong, you should show it is wrong. You should not merely assert it is wrong. It would be a trivial matter to provide quotations to back up what you say, if you were correct. The fact you refuse to take such a simple step would seem to imply you aren’t correct.
Similarly, saying you don’t agree with my post’s title is meaningless unless you say why you don’t agree with it. In fact, it’s kind of pathetic. The entire point of skepticism is to argue points of disagreement. You’re doing the exact opposite. You’re refusing to argue anything, choosing to instead wave your hands and say I’m wrong.
This is simple. If I’m wrong, show I’m wrong. Don’t just editorialize and posture.
REPLY: see here’s the thing, I just don’t care. I’ve tried to explain but you don’t like my explanations. Your points have no value to me, right or wrong, and I’m not going to waste any more time arguing over the value of somebody’s else’s article nor your points which I still don’t understand why you have your knickers in a twist over. For the purpose of getting you to stop being pedantic and cluttering up this thread with an issue I don’t consider important, I’ll just say I’m wrong. But nothing is going to change in the article above. This will be the last comment on the subject. – Anthony
tomcourt says:
June 29, 2014 at 6:15 am
Recall that there were really two experiments, one being loose samples of dry ice in a perforated box, and one I suggested, samples in a not quite sealed plastic bag. The results were:
Given this was a lab-grade refrigerator with NIST traceability of the thermostat, I suspect there was a circulation fan. Even if there weren't, the results were exactly what I expected – the amount of dry ice was not enough to displace all the air, so I wasn't concerned that the freezer would become a pure CO2 environment. Indeed, the unbagged dry ice completely sublimated. I thought only some would be left, but complete sublimation was fine. I expected some sublimation of the bagged ice, and perhaps redeposition on the surface, à la freezer burn on frozen food, but I expected more than half would remain, and much more than the unbagged ice. That 90% survived was at the high end of my expectations.
I thought all that was well described in the WUWT post, I’m surprised at your critique. Please reread the post. The refrigerator result pretty much confirmed that dry ice cannot form at Vostok and provided a good tangible reference to go along with the theoretical claims I and others made.
I never expected we’d still be talking about this on multiple fronts five years later.
Another "lie of the year" from PolitiFact. They are becoming monotonous with their goofs.
If the USHCN dataset is ‘what we know’, the key question is ‘how do we know it?’
And, with regard to this latter question, there are two separate issues, as Anthony has pointed out.
Firstly, what is the ‘quality’ of the temperature records from each of the stations? E.g. is the siting acceptable, are the sensors and recording mechanism trustworthy, is actual data being provided, etc., etc.?
Secondly, what, exactly, is the ‘process’ of converting the data from each of the various stations into the USHCN dataset?
There are 1218 stations, I think. So would it be possible to crowd-solve the "quality" issue by taking a representative random sample of those stations (say 100) and asking approximately 20 volunteers to each physically examine five of those sample stations and assess them, and their temperature records, for relative quality? Clearly some kind of standard assessment (a simple questionnaire?) would be needed for judging the stations, but that shouldn't be too hard to create.
Once the ‘quality’ issue is clear it will be easier to understand exactly how the ‘process’ does work. And as part of that discovery, perhaps those same volunteers could also look for and document revisions to the temperature records of each station over time.
I’m not sure how feasible all this would be, and maybe there’s a better way to audit the data. But this, at least, puts action in our hands.
Just a thought.
Dear Mr. Watts:
Congratulations on an essay demonstrating true courage – most people, when wrong, hem, haw, and cavil until the problem fades into the past. You faced it head on and did absolutely the right thing. Way to go!
(It's sad that this kind of behavior is rare, but good to see it.)
Waiting for a permanent link to Steven Goddard's website (I usually have to go to BH for the link). You could at least put it under Rants or Unreliable, but it should be there somewhere. (I know I should use bookmarks but never do.)
Glad you’re seeing the light here Tony. As I said before, I was skeptical of Steve’s claim too, until I downloaded and did my own analysis, writing my own code from scratch (I posted the code at Steve’s blog).
Steve’s certainly tendentious, but he was also definitely correct on this point: 43% of the last month’s data was marked with an E. It will be very interesting to compare this data to CRN and SurfaceStations.
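For reference, the kind of count described above takes only a few lines. The sketch below assumes whitespace-tokenized records like the Marysville rows quoted earlier and an illustrative file name; the real final file is fixed-width, so check the readme before relying on it.

```python
# Count the fraction of a given month's values in the final USHCN file that
# carry the "E" flag. The concatenated file name and token layout are
# assumptions for the sketch; the real file is fixed-width (see the readme).

def e_flag_fraction(lines, year, month):
    flagged = total = 0
    for line in lines:
        parts = line.split()
        if len(parts) < 2 + month or not parts[1].isdigit() or int(parts[1]) != year:
            continue
        tok = parts[1 + month]                 # the month-th value token
        if tok.startswith("-9999"):
            continue                           # no value at all
        total += 1
        flagged += tok.upper().endswith("E")
    return flagged / total if total else float("nan")

# Hypothetical usage against a concatenated final file:
# with open("ushcn_final_all_stations.txt") as f:
#     print(e_flag_fraction(f, 2014, 5))      # share of May 2014 values flagged E
```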
Where is the accountability in all of this? I have the greatest respect for you, Anthony and Steven, and great appreciation for what you both do. In the couple of years I lurked at both of your sites, I strove to stay out of the spats because, when it comes to what you guys do, I'm an unarmed man. But I do know something about management and leadership, and I can't buy the "groupthink" excuse for those who are supposed to be managing this system and/or monitoring the quality of data it is producing. It sounds like something a politician would say when caught red-handed. It blames a bureaucracy and holds no individual accountable for anything. No matter what the excuse, we have here one or more of three things: gross neglect, gross mismanagement, or blatant fraud. For any of them there has to be accountability; otherwise all the revelations about what has been discovered will lead to a disappointing and most likely only temporary resolution.
Brandon Shollenberger says: June 28, 2014 at 8:07 pm
Yes, the PolitiFact piece is horrible. It conflates the animated GIF with the missing-data claims, and purports to "debunk" both of them – even though they are both simple analyses of officially released data.
Gell-Mann amnesia at its finest. Journalists often seem to have no skills beyond an ability to type.
Jimbo says:
June 29, 2014 at 1:39 pm
“It seems to me that with the temperature standstill Warmists will also learn about ‘truth’. If you are right you are right, and if you are wrong you are wrong. Time is the arbiter with climate. If it starts getting too hot, then we will also learn the ‘truth’.”
That we are essentially talking about +0.7°C in a century or more – warming for ~20 years and now no warming, or slight cooling, for 17 years – speaks volumes about the science. No matter what happens in the future, the CO2 warming hypothesis is dead. One seventh of this century has elapsed with slight cooling. Every year that goes by shrinks CO2's potential share in warming toward insignificance.
Add to that a growing likelihood that, no matter what CO2's sensitivity in fact is, the earth – mainly through its oceans – countervails the warming (from any cause). Much of the proof of the earth's thermostatic control (over and above the brilliant work by Eschenbach) has been hanging out there for all to see. An unbroken chain of life for at least 1B years speaks loudly of the stability of earth's climate. Despite asteroid impacts, massive volcanic outpourings in various eras (unbreathable air in the Archean), ice ages, evidence for Snowball Earth (as a geologist I'm not convinced of this, though), etc., the earth has countered these "tipping point" events and restored temperatures to a moderate level about which the "extremes" oscillate ~2-3%K. Moreover, there is no evidence to suggest we are through with ice age cold, and it is likely we are well past the halfway mark going back into it. Let us pray we can find a way to alter climate and make a bit of global warming at some point.
I am mad as hell and not going to take it anymore. I just don't understand how scientific papers need to be "juried" and approved by many people who are on the line for the articles published, but people at NOAA and NCDC, or wherever these algorithms and modifications get "codified", can do this willy-nilly without anyone having to justify what they are doing. Is there a paper on the modifications made, the code used, examples of how it applies in practice, and evidence to back up these modifications?
There should be papers, multiple papers on all these factors:
1) time-of-observation bias
2) station movements, late station reports
3) station reporting out of line data
4) stations decommissioned
For time-of-observation bias, the adjustments seem crazy high given the data that SG and others have presented over the years. I believe I read that 90%+ of the ~1°C of adjustment in the last century can be traced to that. If so, then there should be a paper on this alone that explains how it is done and justifies it in terms of other measurements made at the same location at different times. If the entire GW temperature change depends critically on the amount of this adjustment we add in or subtract out, then there should be a lot more than a footnote saying "we did this". If this is being done wrong, then the entire amount of warming could change radically.
Station movements and decommissioning raise the issue of averaging where there are too few stations. There should be a detailed set of papers discussing how good these kinds of estimates are. For instance, in Antarctica we could put a bunch of automated stations out there and monitor how they change in relation to how we predict those stations should have reported. I've heard there are only a couple of stations in all of Antarctica, and few in northern Canada, etc. Because of the lack of uniform station density, this could mean that we really have no clue what's happened in these places.
Stations reporting out-of-line data. I understand that the algorithms take into account whether a station shows a big movement relative to its nearby peers. The station gets ignored or adjusted to reflect a more reasonable temperature. I personally would like some field work on this. I would like us to understand what is really going on with these stations to justify the modifications being made. Do we know why they get anomalous readings, and can we verify specific instances where these corrections are made and show WHY they were justified?
All I am asking for is what I consider reasonable. If the entire temperature trend of the 20th century depends on these adjustments, then a rigorous study should be made to determine whether the adjustments really make sense. Are there real-world reasons, explainable to anyone, for why the adjustments are being made, with specific examples of how they make sense? As I understand it, the vast majority of the data are tampered with. Therefore the entire proof of climate change depends on justifying these adjustments, yet they seem to get less peer review than ordinary science articles. I just don't get this. Nobody seems accountable.
Even if NOAA or NCDC fix the data as Steve suggests, I still have no faith that the data they now say is "clean" is actually clean. I believe, as any rational person must, that without anyone explaining in peer-reviewed papers, as I describe above, precisely what they are doing to the data, with real-life examples of why the adjustments actually make sense, experimenter bias is likely responsible for a large part of the temperature trend.
[snip – Brandon, I’m sorry but as stated above I’m not discussing this anymore. We’ll simply have to agree to disagree – Anthony]
I totally agree with Joe Bastardi: overall, this infighting is ultimately self-defeating and plays right into the AGW thugs' hands. At the very minimum, the passive-aggressive Mannesque posturing and comments do not belong. I really look up to you, Anthony, and even your apology was, for lack of a better word, sad.
You guys need to hug this out and keep these sort of things worked out by a phone call rather than this public display.
I do love you still, Anthony. I hope you do better, in the future.
🙁
I wonder: Are all these adjustments really necessary? How many well designed and reliable points of measurement will you need to get a sufficiently accurate yearly average?
Or to be more precise: how many points of measurement will you need to get a measured yearly average with sufficiently low standard uncertainty to be able to detect a positive trend of 0.015 K/year (1.5 K/century)?
I consider the calculated average of a number of temperature readings, performed at a defined number of identified locations, to be a well-defined measurand. Hence, the standard uncertainty of the average value can be calculated as the standard deviation of all your measurements divided by the square root of the number of measurements. (See the openly available ISO standard: Guide to the Expression of Uncertainty in Measurement.)
Let us say that you have 1000 temperature measurement stations, which are read 2 times each day, 365 days each year. You will then have 730,000 samples each year.
(Let us disregard potential correlation for a moment.)
If we assume that 2 standard deviations for the 730,000 samples is 20 K.
(This means that 95% of the samples are within a temperature range of 40 K.)
An estimate for the standard uncertainty for the average value of all samples will then be:
2 standard uncertainties for the average value = 2 standard deviations for all measurements / (square root of the number of measurements)
20 K / sqrt(730,000) = 20 K / 854 ≈ 0.02 K
This means that a year-to-year variation in the average temperature that is larger than 0.02 K cannot reasonably be attributed to uncertainty in the determination of the average. This further means that a variation larger than 0.02 K can reasonably be attributed to the intrinsic variation of the measurand.
If I further assume that 2 standard deviations of the yearly average temperature, measured at a high number of locations, is on the order of 0.1 K (the remaining variation of the feature when trends are removed), this means that 95% of the calculated yearly average temperatures are within the range +0.1 K to -0.1 K of the average of all yearly averages (if trends are removed).
Since the standard uncertainty of the measured average (0.02 K) is much less than the standard uncertainty of the feature we are studying (0.1 K), I regard the uncertainty as sufficiently low. Hence 1000 locations and 2 daily readings seem to be sufficient for the defined purpose.
However, the variation of the measurand, the yearly average of your temperature measurements, now seems too high to be able to see a trend of 0.01 K/year. One approach is then to calculate the average over several years. The standard uncertainty of the average temperature for a number of years will be equal to the standard deviation of the yearly average (0.1 K) divided by the square root of the number of years. Let us try an averaging period of 16 years. 2 standard uncertainties for the average temperature over a period of 16 years can then be calculated as 0.1 K / sqrt(16) = 0.1 K / 4 = 0.025 K.
If you choose an averaging period of 16 years, the standard uncertainty of the measured average value can be recalculated, as the number of measurements has increased 16-fold to 16 × 730,000 = 11,680,000. Two standard uncertainties for the average value is now 0.006 K. Hence the number of measurement locations can be reduced. Even if I select as few as 250 measurement points, 2 standard uncertainties will be as low as 0.01 K.
Consequently, it seems we should only need on the order of 250 good temperature measurement locations to be able to identify a trend in the average temperature. Adding more measurement locations does not seem to add significant value, as the year-to-year variation in temperature seems to be intrinsic to the average temperature and not due to a lack of measurement locations. Hence the variation cannot be reduced by adding more measurements.
So, if the intended use of the data set is to monitor the development of the average temperature, all the operations performed on the data sets seem to be a waste of effort. The effort to calculate temperature fields, compensate for the urban heat effect, and estimate measurements for discontinued locations all seems to be meaningless. What should be done is to throw overboard all the questionable and discontinued measurement locations and keep on the order of 250 good temperature measurement stations randomly spread around the world.
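The back-of-envelope arithmetic above is easy to reproduce; a short sketch, under the commenter's own simplifying assumption of uncorrelated readings:

```python
# Reproduce the back-of-envelope numbers above (uncorrelated readings assumed).
from math import sqrt

two_sd_single = 20.0                     # 2 standard deviations of individual readings, K
n_one_year = 1000 * 2 * 365              # 1000 stations, 2 readings/day, 1 year
print(two_sd_single / sqrt(n_one_year))  # ~0.023 K: uncertainty of one year's mean

two_sd_yearly = 0.1                      # 2 sd of the yearly average itself, K
print(two_sd_yearly / sqrt(16))          # 0.025 K: uncertainty of a 16-year mean

n_250_stations = 250 * 2 * 365 * 16      # 250 stations over 16 years
print(two_sd_single / sqrt(n_250_stations))  # ~0.012 K
```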
Anyone who has waded through the "harry_read_me.txt" file from the original UEA Climategate release knows just how intractable encoding weather data can be, much less getting reliable summaries and detailed analyses. This problem shows that there are very likely endemic problems in most historic weather data sets, and worse, adjustments and corrections are quite likely making matters worse, or at least more uncertain. It is also highly likely that each new version release of the data (BEST, CRU, GISS, USHCN) incorporates its own idiosyncratic problems, which are inevitably difficult to identify unless there is either a glaring problem (Harry mentions, for instance, a least-squares value that becomes increasingly negative), or a local looks at a local record and sees that it differs in some important way from their personal experience – an anecdotal process not highly regarded in some scientific circles, and easily written off by the authorities in the field. It might be a good thing for NCDC and other repositories to produce public annual or semiannual reports on bug hunting and on audits of data and processing code.
Bryan says:
June 29, 2014 at 12:45 pm
I would add my voice to those who suggest that we should not be quick to assume that these issues with the USHCN data set result from unintentional mistakes. Consider:…
The problem is that the process of getting the data encoded into a single set of consistent files is very complex. When you are dealing with even tens of thousands of records it can be difficult. When you attempt to address hundreds of thousands of records that are spatially distributed time series, the difficulty jumps several orders of magnitude. Simply identifying problems is immensely difficult, and if a problem looks like what you actually expect, then, as Anthony points out, confirmation bias can blind you very thoroughly. Occam's Razor recommends not complicating explanations beyond necessity. Until there is clear evidence of deliberate and intentional biasing of the record, there's no reason to assume it.
It’s surprising to me that Nick Stokes’s earlier very reasonable explanation was ignored and/or dismissed by what looks like possibly more confirmation bias (or hasty reading).
Nick wrote: “sometimes Nature just has a warmist bias. (…) Positive resistance will reduce the voltage. Negative will increase it. But they don’t do negative. (…) Same with TOBS. If you go from afternoon to morning reading, the trend will reduce”.
This means there might be good reasons that the adjustments/estimates for those singled-out stations will have some "warmist" bias, meaning that the raw readings will more often be *lower* than *higher* relative to any "correct" measurement. Correcting for this by "infilling" will, if the above were true, necessarily create that "fabricated" trend (Goddard's red graph). In other words, the raw data on average would be too cold because of the stated mechanical, electronic, or impedance problems.
So in the end we have a simple explanation which could very well explain all the stated observations from all sides. It can also be falsified further by sampling more of those stations flagged as "estimated" which were then compensated upwards.
This is not the same discussion as “are USHCN stations in general trustworthy?”. As Nick stated, if you’d already dismiss the whole dataset, then why bother struggling with this infilling problem? It doesn’t make sense!
Hugh Eaven says:
June 30, 2014 at 1:14 pm
“This is not the same discussion as “are USHCN stations in general trustworthy?”. As Nick stated, if you’d already dismiss the whole dataset, then why bother struggling with this infilling problem? It doesn’t make sense!”
It makes perfect sense. When you have billions of dollars at your fingertips, which the AGW cult does, they can simply create another data set and claim "they fixed the problem". They've already done this a bunch of times.
The reality is this is a game of whack-a-mole where the cultists will use any dirty trick in the book until they can't. When you have near-unlimited funds to produce near-unlimited propaganda, you can keep making minor changes and carry on.
We already see this process with the non-stop renaming of global warming: whenever it's proven wrong, you have 30 papers produced within weeks saying "oh, the ocean ate my global warming", etc., etc., etc.
The general hope is that people fighting this mess will burn out from overwork, miss the next dirty trick, or give up as threats to person, job, and family take hold, etc., etc., etc.
As the cultists say, the science is settled. AGW is wrong, but science is meaningless in arguments based on emotion, logical fallacies, propaganda and such... which is what AGW is and will be until it is finally killed off as an idea.
Trying to get ahead of the next round of propaganda is about the only way to force the next edition to come out even faster, hopefully causing easier-to-spot errors.
At this point I am stuck on a "who" question. In the example I used above, who made the decision to append the Nampa Sugar Factory record to the earlier Caldwell record by dropping Caldwell from USHCN and adding Nampa? And by contrast, who decided to create the zombie Fremont, OR, giving us (so far) an 18-year record of temperatures estimated from who knows what other stations, which cannot be visited for metadata observation?
Gad, what a great excuse for the alarmists. The adjustments ate my homework.
==========
Anthony Watts, thank you for a good post. Nick Stokes, richardscourtney, Steven Mosher, thank you for your comments, rejoinders and elaborations.
As people who have followed this debate know, this is far from the first time that data produced by NOAA or other institutions have been grossly in error. Several incidents occur to me, including one time when "someone" copied the August temperature data to September for the whole country of Russia. The climate reporting agencies were quick to jump on what they said was the hottest August on record, smashing previous records. It took very little research to uncover this error, it being so blatant and unbelievable. After they were told, they quickly fixed the result, putting the temperature well back from record territory. This is just one of numerous incidents of this nature. It always seems to take independent people in the real world to find these egregious errors, and the error is ALWAYS that the data is higher than it should have been. ALWAYS.
It is entirely clear that either they are purposefully futzing with the climate record or, suffering from experimenter bias, they are simply blind to data that backs their presuppositions. If they get a result that says temperatures are setting a new record, they don't even spend the time to see whether they copied an entire country's dataset wrongly. That one country comes out suspiciously super hot doesn't concern them in the slightest? It's hard to believe they weren't just hoping the data would slip by everyone's view and nobody would look into it.
How many things like this have happened that we don't know about, that are not so blatant? The latest scandal points out how functioning reporting stations can be ignored for more than a year, with substitute data fabricated from models obviously constructed by people who are already invested in their theory. How could such a blatant thing happen, with huge numbers of stations given estimated values when we know that many of them have real, valid data? This is obviously so politicized that it's affecting the data in amazing ways that skip right by the "brilliant" minds of our scientists. They never question data which backs them, and they look for every way to minimize any appearance that climate change isn't massive! They come up with every possible excuse to explain why temperatures weren't so warm in the 1930s, even to the point of denying the existence of well-known phenomena, but don't spend the smallest effort to determine whether they are making mistakes in the other direction.
I was amazed, talking to a climate modeler, when he said he believed the MWP and LIA were Northern Hemisphere phenomena. So, apparently, for HUNDREDS of years temperatures in northern Europe were hotter than in the rest of the world. The rest of the world apparently compensated for the heat in northern Europe somehow. I've never heard an explanation of HOW temperatures in northern Europe could be so hot (or cold) for hundreds of years while the rest of the world was unaffected. They have NO curiosity about how such a thing would be possible or what the explanation for such bizarre weather behavior might be. Further, they don't have any explanation for WHY temperatures would do this for hundreds of years, nor any other examples of such multi-hundred-year variations in regional temperatures that apparently don't affect anything else. Yet they were happy to accept the argument that it was a regional effect for many years! Astounding to me.
When I hear arguments like this I can't imagine that a sane person is making such statements. Seriously, if you are going to deny the MWP or LIA, then you should at least have some plausible explanation for how such a thing would be possible. An aberration of temperatures like that should have interested them, considering their scientific curiosity. However, it appears their scientific curiosity stops 100% at the point where it might lead to some question about the veracity of the global warming religion.
The latest debacle comes as no surprise to those of us who have read the amazing stories of the data and statistical errors of our climate elite.
It seems to me that since the amount of "modification" to the temperature record is around 1°F or more, sometimes 2 or 3°F, the level of uncertainty is on the order of the amount of temperature change in dispute; i.e., these adjustments are huge. What if I said your speed was 60 mph ± 60 mph? That's not a very good estimate. I could be standing still.
There is a notion of DATA QUALITY that computer people have been pushing for a while now. First we need a rigorous way of describing the modifications being made to the temperature record, reviewed and analyzed carefully, with real case studies of how such modifications work.
Further, every report should include statistics on how frequently these "adjustments" are made, how severe the adjustments are, and which regions are experiencing abnormal cold or heat. They should produce numerous reports that can be used to verify the quality of the data we are depending on. When data fall outside the range of acceptable normal movement, a person should be assigned to analyze whether such a variation in fact existed. This is common practice in the financial field, where you might trade billions of dollars based on numbers that were simply typed wrongly into a spreadsheet at some point (or some other error). Data quality is an important discipline in being able to use the data for further analysis. It is clear that the level of professionalism of these people is zilch. The number of errors, the frequency of errors, and the obvious bias in the errors are so blatant that they raise the question of fraud. The agencies not only need to fix the current problem, they must give the rest of us the tools to check up on them, and they need procedures for people to call data into question. They need reviews of anomalous data by real people who do research to discover whether these adjustments reflect reality. If the adjustments are so frequent and significant that this can't be done, then basic improvements to the infrastructure must be contemplated so that the data can be trusted.
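A screening step of the kind described above is straightforward to sketch. The threshold, field names, and sample values below are illustrative only, not anyone's operational procedure.

```python
# Sketch of a simple data-quality screen of the kind described above: flag any
# month whose adjustment (final minus raw) exceeds a threshold for human review.
# Threshold, field names, and sample values are illustrative only.

def flag_large_adjustments(records, threshold=1.0):
    """records: dicts with 'station', 'year', 'month', 'raw', 'final' (degC)."""
    flagged = []
    for r in records:
        if r["raw"] is None or r["final"] is None:
            continue
        delta = r["final"] - r["raw"]
        if abs(delta) > threshold:
            flagged.append(dict(r, adjustment=round(delta, 2)))
    return flagged

# Example usage with made-up data:
sample = [
    {"station": "USH0000TEST", "year": 2013, "month": 7, "raw": 27.6, "final": 29.1},
    {"station": "USH0000TEST", "year": 2013, "month": 8, "raw": 26.9, "final": 27.1},
]
print(flag_large_adjustments(sample))   # only the July record (a 1.5 degC shift) is flagged
```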