CRUTEM3 error getting attention by Met Office

This is a repost of two articles from John Graham-Cumming’s blog. I watched with interest earlier this month as he and a colleague identified what they thought to be a math error in the error calculation as applied to grid cells. It now appears, through a journalistic backchannel, that the Met Office is taking the issue seriously.

http://hadobs.metoffice.com/hadcrut3/diagnostics/CRUTEM3_bar.png

What I found most interesting is that while the error he found may lead to slightly less uncertainty, the magnitude of the uncertainty (especially in homogenization) is quite large in the context of the AGW signal being sought. John asks in his post: “If you see an error in our working please let us know!” I’m sure WUWT readers can weigh in. – Anthony


The station errors in CRUTEM3 and HadCRUT3 are incorrect

I’m told by a BBC journalist that the Met Office has said through their press office that the errors pointed out by Ilya Goz and me have been confirmed. The station errors are being incorrectly calculated (almost certainly because of a bug in the software), and the Met Office is rechecking all the error data.

I haven’t heard directly from the Met Office yet; apparently the Met Office is waiting to write to me when they have rechecked their entire dataset.

The outcome is likely to be a small reduction in the error bars surrounding the temperature trend. The trend itself should stay the same, but the uncertainty about the trend will be slightly less.

===============================================

Something odd in the CRUTEM3 station errors

Out of the blue I got a comment on my blog about CRUTEM3 station errors. The commenter wanted to know if I’d tried to verify them: I said I hadn’t since not all the underlying data for CRUTEM3 had been released. The commenter (who I now know to be someone called Ilya Goz) correctly pointed out that although a subset had been released, for some years and some locations on the globe that subset was in fact the entire set of data and so the errors could be checked.

Ilya went on to say that he was having a hard time reproducing the Met Office’s numbers. I encouraged him to write a blog post with an example. He did that (and it looks like he had to create a blog to do it). Sitting in the departures lounge at SFO I read through his blog post and Brohan et al. Ilya’s reasoning seemed sound, his example was clear and I checked his underlying data against that given by the Met Office.

The trouble was Ilya’s numbers didn’t match the Met Office’s. And his numbers weren’t off by a constant factor or constant difference. They followed a similar pattern to the Met Office’s, but they were not correct. At first I assumed Ilya was wrong and so I checked and double-checked his calculations. His calculations looked right; the Met Office numbers looked wrong.

Then I wrote out the mathematics from the Brohan et al. paper and looked for where the error could be. And I found the source. I quickly emailed Ilya and boarded the plane to dream of CRUTEM and HadCRUT as I tried to sleep upright.

Read the details at JGC’s blog: Something odd in the CRUTEM3 station errors

181 Comments
Editor
February 25, 2010 1:18 pm

Comments for this entry bring up Tamino. Also mentioned is Tamino’s attitude regarding release of code and information.
Quite interesting given the recent post by Judith Curry on ‘trust’ and the perception by the public of the pro-AGW side of the discussion.
—————————-

Jean Parisot
February 25, 2010 1:20 pm

“1) How to undertake the spatial weighting. While this is a bit beyond my programming ability (hence my interest in Tamino’s script), the basic idea is simple: split up the world into grid cells based on lat/lon, and calculate the average anomaly of stations in the “pre-cutoff” and “post-cutoff” series (or calculate the average anomaly of all stations in each grid to compare with GISS et al). Weight each grid cell by its area to determine the contribution to global temp anomalies. This leads us to…”
Where and when was the decision to use arbitrary grid polygons made? While it may have made the “math” easier, it is not an optimal approach. Modern spatial statistics and GIS systems can handle far more complex data aggreations. Has anyone considered making the “grid” as homogeneous as possible, with an “even” depth of datum?
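
The grid-cell weighting described in the quoted passage can be sketched roughly as follows. This is only an illustration of the general idea (bin stations into lat/lon cells, average the anomalies in each cell, weight each cell by its area), not the code used by the Met Office, GISS, or Tamino. The function names, the 5-degree cell size, and the sample station values are all assumptions made for the example.

```python
import math
from collections import defaultdict

def grid_cell(lat, lon, size=5.0):
    """Return the (row, col) index of the size-by-size degree cell holding a point."""
    rows = int(round(180.0 / size))
    cols = int(round(360.0 / size))
    # Clamp so that lat = 90 or lon = 180 falls into the last cell.
    row = min(math.floor((lat + 90.0) / size), rows - 1)
    col = min(math.floor((lon + 180.0) / size), cols - 1)
    return (row, col)

def cell_weight(row, size=5.0):
    """Relative area of a lat/lon cell, proportional to sin(north) - sin(south)."""
    south = math.radians(-90.0 + row * size)
    north = math.radians(-90.0 + (row + 1) * size)
    return math.sin(north) - math.sin(south)

def global_anomaly(stations, size=5.0):
    """Area-weighted mean anomaly from (lat, lon, anomaly) tuples.

    Cells with no stations are simply left out (an assumption of this sketch).
    """
    cells = defaultdict(list)
    for lat, lon, anom in stations:
        cells[grid_cell(lat, lon, size)].append(anom)
    total = 0.0
    weight_sum = 0.0
    for (row, _col), anoms in cells.items():
        w = cell_weight(row, size)
        total += w * sum(anoms) / len(anoms)  # cell mean times cell area weight
        weight_sum += w
    return total / weight_sum

# Hypothetical stations: two share a cell near the equator, one sits at 62.5N.
sample = [(2.5, 2.5, 1.0), (2.6, 2.7, 3.0), (62.5, 2.5, 0.0)]
print(global_anomaly(sample))
```

Because cells shrink toward the poles, the high-latitude station counts for less than the equatorial cell even though each cell contributes one average.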

Jean Parisot
February 25, 2010 1:21 pm

g – the missing letter from aggregations above.

carrot eater
February 25, 2010 1:34 pm

Michael Jankowski (13:02:24) :
There are many right ways to do something, but averaging together absolute temperatures is definitely a wrong way. And for getting decent global figures, not using spatial averaging is also definitely wrong.
That still leaves you with many good ways of doing it.

Zeke Hausfather
February 25, 2010 1:42 pm

Mosh,
Tamino is reading this thread (he even has a comment from here in a blog post), so I’m sure he is well aware of your advocacy that he should respond to Roman’s augmentation of his optimal method. I try not to go out of my way to needlessly antagonize folks :-p

sturat
February 25, 2010 1:56 pm

“REPLY: Two things. 1) I’ve been studying USHCN, not GHCN, different networks 2) Yes it is a simple question, but I’m not the person to answer it. E.M. Smith is the one that raised this claim and did all the GHCN analysis, and I’ve sent him an email with the link and the question. I’m sure he’ll respond either here or at his blog. http://chiefio.wordpress.com”
1) I guess you’re saying that if one performs a similar analysis using the USHCN network then you would arrive at an opposite conclusion to one using the GHCN network. So, where is your reproducible analysis of the USHCN network?
2) Please provide a link to the specific analysis by E.M. Smith using his methods (or Roman’s) that shows that the station dropout problem is significant. By analysis I mean a set of runs, graphs, and supported conclusions that are comparable to those of Tamino and show the opposite effect. Comparable means using more than a handful of stations and for at least the entire Northern Hemisphere.
3) As to your assertion of my “defense of the phantom researcher” I am puzzled by a couple of things. While it can be inferred from my original post that I lean towards Tamino’s analysis being correct, my question to you was in regards to can you show (by yourself or through the work of others) that it is incorrect. As to having to have Tamino’s exact algorithm in order to show that it is incorrect that would be the case if the algorithm was the important item in question. But, in this case, the conclusion of the algorithm is in question, or rather your assertion that it is incorrect, and this could be shown by the presentation of analysis supporting the opposite conclusion (as I asked in the first place.)
Try to keep your responses (and those of your readers) to the technical aspects of the question: Is there a reproducible, rigorous analysis that shows that the station dropouts are a problem?
Oh, and I see that Tamino will be publishing his algorithm and analysis. I would like to believe that if no significant errors are found in his analysis that you would be willing to publicly accept the results.
REPLY: Like I said I’ll let Mr. Smith respond on GHCN, you can view his many different GHCN analyses here: http://chiefio.wordpress.com/
I suggest you ask him questions directly rather than relying upon me as a proxy. As for USHCN, there’s a full analysis in process now, being prepared for a journal paper. When that paper is accepted, we’ll publish an SI that has the data, the methods, code, and results. This is the standard practice that I will follow. – Anthony

David Segesta
February 25, 2010 2:10 pm

Pamela Gray (08:39:18) :
“Anthony, I am at home too dealing with a raging head cold and solidly blocked up Eustachian tubes.”
Well that makes three of us. Can you catch a cold over the internet?
REPLY: Not if you have an anti-virus program that is up to date ;-P

February 25, 2010 2:17 pm

“Jason F (11:23:33) : …actually I don’t think I’ve seen a more vile AGW propaganda site or app”
Then you haven’t seen this:
http://www.climatecops.com/
On topic, we wouldn’t be having this discussion if all the raw data, code, documentation (if there is any), and such had been made publicly available. So many errors have been found to date it is mind boggling.
Have they ever thought of the idea that other people could help? Only if you’re in the club, I guess.

Peter of Sydney
February 25, 2010 2:18 pm

No matter how small they make the error flags, the trend depicted is still meaningless and does not prove anything, least of all that AGW is factual. I suppose if the world temperatures were declining instead of rising gradually over the past 100+ years, they would be calling it AGC instead of AGW. That in fact was the case for a short time during the panic of the new ice-age mania a few decades ago. They didn’t call it AGC but the threat of a new ice-age was blamed on man. So, unless the climate stops changing for the first time ever in the history of the planet, there will always be alarmists running around like mad hatters declaring we are doomed unless we do such and such, which when analyzed in detail is proven to be useless and nothing more than a scheme by the very rich and powerful to become even more rich and powerful at the expense of the rest of us.

Dave Andrews
February 25, 2010 2:21 pm

Zeke
“I try not to go out of my way to needlessly antagonize folks “
Judging from your posts on a number of blogs I would agree with your statement. Where I might disagree with you is in aiming that statement at Steven M when Tamino’s record on antagonising is incomparably worse.

carrot eater
February 25, 2010 2:29 pm

If people are finding that Tamino moderates heavily, I’ve found the same of EM Smith. So unless one of the two opens up, the only cross-discussion can occur here.
Which is unfortunate, because the topic of this particular thread is something entirely unrelated, and interesting in its own right.

steven mosher
February 25, 2010 2:30 pm

Zeke Hausfather (13:42:06) :
Mosh,
Tamino is reading this thread (he even has a comment from here in a blog post), so I’m sure he is well aware of your advocacy that he should respond to Roman’s augmentation of his optimal method. I try not to go out of my way to needlessly antagonize folks :-p
I can appreciate that. here is what you know. tamino will probably be antagonized if you dont ask. precautionary principle. You can be fairly certain that if tamino does not answer the question, then he will be antagonized ALOT. But if you antagonize him a little, and suggest that he avoid the comment catastrophe by taking a little pain today.. why you are doing a noble thing!

steven mosher
February 25, 2010 2:38 pm

Zeke I think what happens is that people read stuff for Tamino and then send him mails. people like you and nick stokes who have math ability should just go read the stuff roman did. Then make a private comment to Tamino, behind the scenes. Explain to him that he might be wrong about the optimality of his approach.
Then Tamino will give you his argument. You can then carry that argument into Romans blog. It’s a bit bizarre but that is how it happens. Tamino can’t be seen talking to anybody in the CA gang, unless of course he is calling them a criminal or stooge. he can’t even take questions from Lucia and she believes in AGW! Any way, I trust you to have a fair read of things. go talk to tammy in private. Who knows Romans approach may show MORE WARMING.
dunno. I do know its silly not to talk.

steven mosher
February 25, 2010 2:48 pm

Roman,
Tamino will now not publish his paper. he’ll post a PDF or something.
Probably not share code. he will say ” I’ll share code with people who are not denialists” but then wont explain what he means by denialist ( I believe in AGW ).
The climate science ditto heads will link his stuff ” as tamino’s optimal method shows..” and the issue of his behavior will remain front
and center.
Now, Tamino is stuck. If he publishes his PDF or whatever, then the story
of how he could not address simple questions will always be there.
And if he doesnt publish people will point to the episode and say ” what the heck?” was he that afraid of answering questions from a retired professor?
Hansen’s bulldog. right.

James Sexton
February 25, 2010 2:54 pm

sturat (13:56:01) :
Question. Do you read the replies to your questions? Do you read other posts here responding? Did you go to the link that was provided by steven mosher (12:37:35)? Did you read Tamino’s post in its entirety? While the graphs were pretty, I didn’t see anything other than “It is so, because I said it is so.” He’s only asserted he’s done something. He hasn’t shown anything. Would it make you feel better if I make up a graph and label it for you, stating I proved something? What is with you guys that you believe everything you read but only from people who claim to know everything about climate science? (And now, apparently he’s developed a statistical analysis skill beyond a statistician. Strange that he didn’t seem to possess that ability in some of his earlier works.) For the love of all that is holy, TRY SOME CRITICAL THINKING!!

wayne
February 25, 2010 2:57 pm

rbateman (11:34:27) :
The ratio of rural to urban stations should accurately reflect the current land use of any area…

I assume you mean that since there is much more rural area than urban area the two anomalies should not be merely summed together. It is commonly seen as ruralAnom + urbanAnom on GISS calculations (I think it was GISS). Instead it should more accurately be ruralAnom * ruralAreaPct + urbanAnom * urbanAreaPct, ruralAreaPct + urbanAreaPct being equal to one. Was this your point to Pamela on urban-rural?
However, I’ve never seen data on the ratio of urban to rural land use stated as such, or even per grid cell. Rural anom usually being a small fraction of urban anom.
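
The area-weighted combination written out above (ruralAnom * ruralAreaPct + urbanAnom * urbanAreaPct, with the two fractions summing to one) can be sketched in a few lines. The function name and the numbers in the usage line are hypothetical, chosen only to show the arithmetic; they are not real land-use or anomaly data.

```python
def blended_anomaly(rural_anom, urban_anom, rural_area_pct):
    """Combine rural and urban anomalies weighted by their share of land area.

    rural_area_pct is a fraction between 0 and 1; the urban share is the remainder.
    """
    if not 0.0 <= rural_area_pct <= 1.0:
        raise ValueError("rural_area_pct must be between 0 and 1")
    urban_area_pct = 1.0 - rural_area_pct
    return rural_anom * rural_area_pct + urban_anom * urban_area_pct

# Hypothetical values: 97% rural land, rural anomaly 0.2, urban anomaly 0.8.
print(blended_anomaly(0.2, 0.8, 0.97))
```

With a large rural share, the blended figure stays close to the rural anomaly, which is the point of weighting by area rather than simply summing.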

1DandyTroll
February 25, 2010 3:02 pm

@Anthony “Do I trust “Tamino”, a man supposedly of science but who won’t put his name to his criticisms, who regularly denigrates others, and who now won’t share what he claims falsifies the work of people who do put their name to their work? In a word, no. -A”
you mean that a dude who’s into math and aavso and used to have tamino as part of his email address and does, or did, work as a data analyst concerning customer service or some such, and wrote some data analysis software for that amateur astronomy club aavso. Only chipped in on the climate arena. And apparently likes to write simple for simple folks and newbies.
But I’m sure it’s just a seized upon coincidence turned into a trick.

carrot eater
February 25, 2010 3:03 pm

Well, I wandered over to this statpad website, and was surprised to see myself being referenced. Is the main complaint coming from my question, over whether the offsets are month-specific or not? I think they should be month-specific, but I also don’t think it’ll make that big a deal to the final results. It certainly won’t affect the main points.

James Sexton
February 25, 2010 3:04 pm

carrot eater (14:29:47) :
“Which is unfortunate, because the topic of this particular thread is something entirely unrelated, and interesting in its own right.”
Yes, it is. It’s probably out of my range of understanding error margins. But the dual discussion on this thread does have a commonality. It is dreadfully apparent that the scientists doing the research haven’t properly engaged mathematicians. (Nor, apparently, database administrators.) Much of climate science requires proper statistical analysis to come to the proper conclusions and data integrity is of the utmost importance. Hopefully, when some of our friends engage in their “do over”, they won’t fail to seek expertise when they engage in something outside their purview.
Cheers

sturat
February 25, 2010 3:26 pm

James Sexton (14:54:39) :
Yes, I do read the threads and the link. At least the ones that don’t resort to name calling and shouting. From your comments I take it that you have generated your own analysis on the topic of station dropouts. I look forward to your publication of your results.
As to whether he has shown anything, I agree that the atomic details of his algorithm have not been presented (yet), but his explanations and “pretty” graphs seem to me to merit consideration.
Since Mr. Watts and others have stated many times and quite loudly that there is something wrong with the temperature record (one aspect being the station dropout issue), I felt it was appropriate to ask for similar analysis that contradicts Tamino. If, and when, they can produce such an analysis that does “prove” their assertion, I will be interested in following the critiques that are sure to follow.
As to the link from Steven Mosher, I did go there, but all I saw was an interesting post on a potential error in Tamino’s algorithm, with no supportive analysis to show that it made a significant difference when applied to actual data. It is entirely possible that I have not read the relevant post with a complete analysis, though.
I do agree with carrot eater, steven mosher, and others that it is unfortunate that these discussions can’t be held in a more civil manner. Summarily blocking posts at either site adds to the mistrust and detracts from the important results.
Perhaps Tamino, E.M. Smith, and RomanM could each publish their analyses and the discussion could center on the technicalities of analysis rather than the personalities of the parties.

George E. Smith
February 25, 2010 3:26 pm

“”” RockyRoad (08:18:32) :
Please fix this sentence:
What I found most interesting is that while the error he found may lead to slightly less uncertainty
Thanks! “””
And please fix the conclusion; fixing of errors can only steer away from absolute uncertainty.
There is no measure of certainty in results known to contain errors.
Seems to me, that in the case of the Hubble Telescope main mirror; the known error in construction was also known to a high degree of certainty.
Mox nix; the resulting images were still garbage.

Phillep Harding
February 25, 2010 3:34 pm

Time for someone to create a AGW persona to get the inside scoop on this?

Steve M. From TN
February 25, 2010 4:02 pm

sturat (15:26:30) :
James Sexton (14:54:39) :
Yes, I do read the threads and the link. At least the ones that don’t resort to name calling and shouting

You must not read Tamino’s website then.

carrot eater
February 25, 2010 4:16 pm

James Sexton (14:54:39) :
Tamino more than just asserted that he did something; he described what he was doing, step by step. His results seem reasonable, though there is always a chance there is some basic math error in there somewhere. We may yet see revisions as he improves his methods, but it’s highly unlikely that the basic point will change.
The question is, had EM Smith done this sort of analysis, in order to support his ideas? It doesn’t have to be the exact same, but you do need some sort of calculation that emulates the GISS processing. So far as anybody can tell, the answer is no; it certainly isn’t in the SPPI document, but we’ll have to see what he says.

Richard Saumarez
February 25, 2010 4:21 pm

I have trawled through the HADCRUT data. The lack of sophistication is extraordinary. The mean temperature is the integral of temperature over a period divided by the period in question, not the arithmetic mean. The analysis of “mean” temperature is mathematically illiterate. The spatial analysis is deficient (“Delaunay Triangulation is done by a package and we don’t have the code”, paraphrased from Harry_read_me.txt). The gridding allows interpolation across the Himalayas and the Alps. Some stations are 700 km apart (i.e., the length of the British Isles).
The only hope on the horizon is that the EPA endangerment finding has led to a legal challenge from multiple sources and the data and analysis will be subpoenaed. This will be scrutinised by some respectable mathematicians, who know what they are talking about.
On a slightly O/T remark. I have developed medical equipment that involves digital signal processing. The regulatory requirements need a truck load of documentation (what color do you dream in …?) and every statement has to be justified. We are supposed to overturn our economies on the basis of adjusted data when the raw data has been lost?
A legal challenge is going to expose the quality of the data on which AGW is based.
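
The distinction drawn above between a time-integral mean and a simple arithmetic mean can be illustrated with a small sketch. It compares a trapezoidal time-weighted mean with the midpoint of the minimum and maximum readings. The function names and the five readings are made up for the example, and nothing here is taken from the HadCRUT processing code.

```python
def time_weighted_mean(times, temps):
    """Trapezoidal integral of temperature over time, divided by the elapsed time."""
    if len(times) != len(temps) or len(times) < 2:
        raise ValueError("need at least two matching (time, temperature) samples")
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (temps[i] + temps[i - 1]) * (times[i] - times[i - 1])
    return area / (times[-1] - times[0])

def min_max_mean(temps):
    """Midpoint of the lowest and highest reading."""
    return (min(temps) + max(temps)) / 2.0

# Hypothetical readings over one day (hours, degrees C).
hours = [0, 6, 12, 18, 24]
temps = [10.0, 8.0, 14.0, 20.0, 10.0]
print(time_weighted_mean(hours, temps), min_max_mean(temps))
```

For these made-up readings the two estimates differ by about a degree, which shows how the choice of "mean" can matter when the daily curve is not symmetric.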