This is a repost of two articles from John Graham-Cumming’s blog. I watched with interest earlier this month as he and a colleague identified what they thought to be a math error in the error calculation applied to grid cells. It appears now, through a journalistic backchannel, that the Met Office is taking the issue seriously.

What I found most interesting is that while the error he found may lead to slightly less uncertainty, the magnitude of the uncertainty (especially in homogenization) is quite large in the context of the AGW signal being sought. John asks in his post: “If you see an error in our working please let us know!” I’m sure WUWT readers can weigh in. – Anthony
The station errors in CRUTEM3 and HadCRUT3 are incorrect
I’m told by a BBC journalist that the Met Office has said through their press office that the errors pointed out by Ilya Goz and me have been confirmed. The station errors were being incorrectly calculated (almost certainly because of a bug in the software), and the Met Office is rechecking all the error data.
I haven’t heard directly from the Met Office yet; apparently they are waiting to write to me until they have rechecked their entire dataset.
The outcome is likely to be a small reduction in the error bars surrounding the temperature trend. The trend itself should stay the same, but the uncertainty about the trend will be slightly less.
===============================================
Something odd in the CRUTEM3 station errors
Out of the blue I got a comment on my blog about CRUTEM3 station errors. The commenter wanted to know if I’d tried to verify them: I said I hadn’t since not all the underlying data for CRUTEM3 had been released. The commenter (who I now know to be someone called Ilya Goz) correctly pointed out that although a subset had been released, for some years and some locations on the globe that subset was in fact the entire set of data and so the errors could be checked.
Ilya went on to say that he was having a hard time reproducing the Met Office’s numbers. I encouraged him to write a blog post with an example. He did that (and it looks like he had to create a blog to do it). Sitting in the departures lounge at SFO I read through his blog post and Brohan et al. Ilya’s reasoning seemed sound, his example was clear, and I checked his underlying data against that given by the Met Office.
The trouble was Ilya’s numbers didn’t match the Met Office’s. And his numbers weren’t off by a constant factor or constant difference. They followed a similar pattern to the Met Office’s, but they were not correct. At first I assumed Ilya was wrong, and so I checked and double-checked his calculations. His calculations looked right; the Met Office numbers looked wrong.
Then I wrote out the mathematics from the Brohan et al. paper and looked for where the error could be. And I found the source. I quickly emailed Ilya and boarded the plane to dream of CRUTEM and HadCRUT as I tried to sleep upright.
Read the details at JGC’s blog: Something odd in the CRUTEM3 station errors
E M Smith doesn’t need to do the work.
No one is saying he – or anyone – needs to do anything. The observation that assertions need to be backed up by evidence to be taken seriously is uncontroversial in the extreme.
before you say it, it only covers 48 Sites
Thank you. So you see the problem.
I know who I would rather believe.
Belief and science are completely orthogonal. This is another self-evident, uncontroversial premise.
E.M.Smith (Or you can pay my billing rate. $100 / hour for commercial operations. $200 / hour for “Climate Stability Deniers”. Discounts available for bulk purchase or for Friends of Anthony. And move to the head of the priority queue.)
You need the caveat of adjusting price based on attitude!
carrot eater (19:32:50) : This is saying that simply removing those stations, in itself, would ‘ensure’ a spurious warming, as the removed stations tended to be from ‘cooler’ locations. What Tamino has done shows that simply dropping those stations didn’t, in itself, ‘ensure’ any warming.
So this is just a coincidental correlation, similar to that of CO2 and Temperature, eh?
http://i27.tinypic.com/14b6tqo.jpg
If anybody really wants code to play with, instead of working it out for themselves, the ccc people have done their own version of the work. This is a true GISS emulation, with none of the differences from GISS that Tamino has. It’s global, not just NH.
http://clearclimatecode.org/the-1990s-station-dropout-does-not-have-a-warming-effect/
Paul Daniel Ash (08:20:42) :
Yes. The original SPPI document made a strong assertion. Forget about providing code; I cannot even find an analysis that backs up that assertion. This is why that document is attracting criticism.
@Paul Daniel Ash ‘The central point carrot eater made is a good one, though: there’s nothing stopping anyone who wants to from carrying out an analysis on their own. The question at hand is not Tamino per se, it is the validity of the temperature records.’
The devourer of carrots also stated that all the information was there for anyone to emulate the flute and even GISS. Point is, the point carrot makes ain’t a good one. The little flute thinks hi’self debunked something, on his blog, but doesn’t want to show the implementation of the equations, hence he begs the reader to have faith in his words instead of being able to judge for oneself. That’s not science. Neither is there any point in believing that two different people coding in different space and time will yield the same implementation of the equations, but carrot isn’t into computers like that so he don’t know better; the little flute should’ve, though, with all that he has claimed to be.
The point is for everyone to see and be able to judge for themselves. That the flute and carrot don’t have a problem with this is no wonder, since neither of them are scientists but believers who seem not to mind a one-sided scientific process: no peer review please, and don’t really look at my kind of raw data.
1DandyTroll (09:06:26) :
What equations could you possibly want?
The only equations he needed to provide, covering how he combined the stations, were already given here. He needed to give them because he’s doing it differently from anybody else.
http://tamino.wordpress.com/2010/02/08/combining-stations/
Everything else he did, you can figure out for yourself, just as he figured it out for himself instead of copying and pasting GISS code. If there’s something else you don’t understand, you need to build up a bit of background knowledge by reading previous papers on the topic. If, as you go along, there’s something major that you feel he didn’t describe, then you might ask him.
But in any case, the clear climate code guys did their own version, and provided the code. So perhaps we could get over the red herring of the code.
Steve Keohane (08:45:42) :
You seriously think a simple average of absolute temperatures means anything at all? No, it doesn’t. It’s meaningless.
The little flute thinks hi’self debunked something, on his blog, but doesn’t want to show the implementation of the equations, hence he begs the reader to have faith in his words instead of being able to judge for oneself.
This is the strangest reasoning, even separate from the Br’er Rabbit phrasing. Just using Tamino’s code to get Tamino’s results would show nothing, other than that Tamino’s code produces Tamino’s results. Doing one’s own analysis would be replicating, rather than merely repeating, the work.
And as carrot eater showed, downloadable code is available now for an analysis using GIStemp that gives the same results as Tamino’s: just go to http://clearclimatecode.org/the-1990s-station-dropout-does-not-have-a-warming-effect/
In any regard, Tamino has – as has been repeatedly noted – announced that he will release the R code he used. Skepticism is warranted, even suspicion, talk is cheap, etc. But the blithe assertion that Tamino is hiding his code is unwarranted at this point.
Paul Daniel Ash
carrot eater
Perhaps you can explain the logic behind this description:
Each new station is offset so that it has the same average value during the time of overlap with the reference.
Can you tell me what the “Offset” is and how you arrive at “the same average value” for different stations.
Neither of you has responded to my previous post about this thread:
http://wattsupwiththat.com/2010/02/26/a-new-paper-comparing-ncdc-rural-and-urban-us-surface-temperature-data/
which somewhat contradicts the work being done by Tamino.
A C Osborn (11:49:46) :
First off, that other thread has little to nothing to do with what Tamino is showing. He’s looking at the effects of station dropout and GISS adjustments, not the effect of UHI on USHCN stations. That other thread has problems of its own (why only 48 stations?), but that’s a separate issue.
The offset is referring to a method of combining anomalies. In climate, you work with anomalies, not absolute temperatures. So what’s important is how a station changes over time, not how hot or cold it is in absolute terms. So when combining stations to get an average at a certain location, you want to combine them in a way such that you combine the trends (warming or cooling). One way is to use these offsets. This is illustrated clearly in Figure 5 in Hansen/Lebedeff (1987), as well as the equations on the page before. http://pubs.giss.nasa.gov/abstracts/1987/Hansen_Lebedeff.html
Tamino took this method, and changed it by changing how the offsets are calculated. In his way, it doesn’t matter which station you start with. In GISS’s way, it matters slightly which one you start with.
These descriptions make sense, if you take the time to learn about the field by reading the literature. I don’t know anything about medicine; I would not expect to understand a random paper from JAMA if I just picked it up. Same goes with climate; once you educate yourself, then things like that will have meaning.
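[To make the offset idea concrete, here is a minimal sketch of a Hansen/Lebedeff-style combination. It is my own toy Python version, assuming equal weighting and numpy arrays on a common monthly time axis; it is not the GISS code or Tamino’s. – moderator]

```python
import numpy as np

def combine_with_offsets(stations):
    """Hansen/Lebedeff-style combination (a sketch, not GISS's code).

    stations: list of 1-D arrays on a common monthly time axis,
    with np.nan where a station has no data. The first series is the
    starting reference; each new station is shifted so its mean over
    the period of overlap matches the running average, then folded in.
    """
    total = np.nan_to_num(stations[0])               # running sum of shifted series
    count = (~np.isnan(stations[0])).astype(float)   # series covering each month
    for s in stations[1:]:
        have = ~np.isnan(s)
        overlap = have & (count > 0)
        if not overlap.any():
            continue  # no common months; real codes handle this case explicitly
        running_mean = total[overlap] / count[overlap]
        # offset = difference of means over the overlapping months
        offset = running_mean.mean() - s[overlap].mean()
        total = total + np.where(have, s + offset, 0.0)
        count = count + have
    with np.errstate(invalid="ignore", divide="ignore"):
        combined = np.where(count > 0, total / count, np.nan)
    return combined - np.nanmean(combined)           # recentre on zero at the end
```

With gap-free series, the choice of starting station makes no difference; with gaps it matters slightly, which is exactly the difference between GISS’s version and Tamino’s described above.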
Perhaps you can explain the logic behind this description:
Each new station is offset so that it has the same average value during the time of overlap with the reference.
Can you tell me what the “Offset” is and how you arrive at “the same average value” for different stations.
As far as “logic” goes, carrot eater understands this much better than I do, but in case she/he is gone and not coming back I will take a stab: all the stations in a particular grid box don’t necessarily have readings for each given time period. Computing offsets is a way of combining station data so that temperature readings can be averaged for that grid box.
Neither of you have responded to my previous post
Look again.
carrot eater (12:17:52) :
I thought that Tamino had combined raw data for his analysis, not anomalies. Sheesh, I need to go back to school…
carrot eater (12:17:52) :
Sorry, Tamino is working with Absolute temperatures and still using Offsets.
Doesn’t Tamino get his Data from NCDC?
Paul Daniel Ash (12:43:56) :
Tamino started with v2.mean. This holds raw data.
By raw, it is this: somebody at the weather service of the relevant country took the max and min temperatures for all the days of the month and found a mean temperature for the month. So by ‘raw’, it’s a monthly mean temperature. Somebody had to do a basic math operation to get that number, and they could conceivably mess up the addition and division, so that is a possible source of error.
From this raw data, you can then calculate anomalies using a variety of methods. The method Tamino used is described on his blog in the page I linked above. GISS, CRU and NCDC each use their own method, and each is different from Tamino’s.
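[For concreteness, here is a sketch of one common way to form anomalies from such raw monthly means, a baseline-normals approach; the 1961–90 base period and the function name are illustrative assumptions, not any one group’s exact method. – moderator]

```python
import numpy as np

def monthly_anomalies(temps, years, base=(1961, 1990)):
    """temps: (n_years, 12) array of raw monthly means (deg C);
    years: (n_years,) array of the corresponding calendar years.
    Subtracts each calendar month's average over the base period,
    removing the seasonal cycle and the station's absolute level."""
    in_base = (years >= base[0]) & (years <= base[1])
    normals = np.nanmean(temps[in_base], axis=0)  # 12 monthly normals
    return temps - normals                        # broadcasts across months
```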
“carrot eater (07:11:06) :
[…]
Dirk, I can assure you I’ve written more than a couple programs. We’re not talking about replicating every behavior to the 10th digit. That isn’t what reproducibility is about. We’re talking about getting basically the same results.”
Talk about error propagation. Talk about weird homogenization algorithms that take temperatures from the Amazon and shove them over to Bolivia. “Basically the same results”????? Are you kidding me????
For very trivial programs you’re right. You’re not right for the complexity we’re talking about.
A C Osborn (12:53:38) :
The very moment you add or subtract an offset, it is no longer an absolute temperature. I would call it an anomaly at that point; it just isn’t centered on zero.
What you do (well, you don’t have to, but GISS and Tamino do) is add whatever offset you need to each station to get them to overlap. See the figure and math in the 1987 paper. Then you can combine them. After you combine them, you can recenter the whole thing on zero.
You can experiment with this yourself. So long as there aren’t any station drops, it is mathematically equivalent to centering each station on zero first, and then combining them.
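[That equivalence is easy to verify with a toy example; synthetic, gap-free series with arbitrary numbers. – moderator]

```python
import numpy as np

rng = np.random.default_rng(0)
# three complete (gap-free) synthetic station series at different absolute levels
stations = [rng.normal(loc=m, scale=1.0, size=120) for m in (5.0, 9.0, 14.0)]

# method A: centre each station on zero, then average
a = np.mean([s - s.mean() for s in stations], axis=0)

# method B: offset each station to the first one's mean, average, then recentre
b = np.mean([s - s.mean() + stations[0].mean() for s in stations], axis=0)
b = b - b.mean()

print(np.allclose(a, b))  # True: identical when there are no station drops
```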
DirkH (13:47:00) :
The distance-weighted interpolation that GISS does is conceptually pretty simple. If you wanted to recreate it, you could. Tamino chose not to.
I know the GISS code looks like a mess, but conceptually, it’s actually pretty simple. But the ccc guys will have made it less messy to look at, by the time they’re done.
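[If anyone does want to recreate it: the weighting usually described for Hansen/Lebedeff (1987) is a linear taper from 1 at the grid point to 0 at a 1200 km cutoff. A toy sketch of that idea follows; my own simplification, not the GISS code. – moderator]

```python
import numpy as np

def distance_weight(d_km, radius_km=1200.0):
    """Linear taper: weight 1 at the grid point, falling to 0 at the cutoff."""
    return np.maximum(0.0, 1.0 - np.asarray(d_km) / radius_km)

def gridpoint_anomaly(station_anoms, station_dists_km):
    """Distance-weighted mean of station anomalies for one grid point."""
    w = distance_weight(station_dists_km)
    a = np.asarray(station_anoms)
    keep = w > 0                      # stations beyond the cutoff drop out
    return np.sum(w[keep] * a[keep]) / np.sum(w[keep])
```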
DirkH (13:47:00) :
“Talk about error propagation. Talk about weird homogenization algorithms that take temperatures from the Amazon and shove them over to Bolivia. “Basically the same results”????? Are you kidding me????”
Temperature or temperature anomalies? Big difference, isn’t it?
Stu (15:51:47) :
I don’t think people realise just how far anomalies can correlate. Further than you’d guess, before looking into it.
If somebody’s worried about Bolivia, then why not go back to the period where Bolivia data are in the GHCN, and see how well they correlate with the neighboring stations? If the answer is not well, then you could argue there is a sampling problem there.
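[That check is simple to run once you have two anomaly series on a common time axis; a hypothetical sketch. – moderator]

```python
import numpy as np

def anomaly_correlation(a, b):
    """Pearson correlation between two stations' anomaly series,
    computed over the months where both have data (np.nan elsewhere)."""
    both = ~np.isnan(a) & ~np.isnan(b)
    return np.corrcoef(a[both], b[both])[0, 1]
```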
@Paul Daniel Ash ‘This is the strangest reasoning, even separate from the Br’er Rabbit phrasing. Just using Tamino’s code to get Tamino’s results would show nothing, other than that Tamino’s code produces Tamino’s results. Doing one’s own analysis would be replicating, rather than merely repeating, the work.’
What it would show is whether the result has any integrity and isn’t f:ed up due to bugs or, worse, some creative but crappy implementation.
“And as carrot eater showed, downloadable code is available now for an analysis using GIStemp that gives the same results as Tamino’s: just go to http://clearclimatecode.org/the-1990s-station-dropout-does-not-have-a-warming-effect/”
Well, happy friggin compiling then. Personally I wouldn’t expect anything else, since the flute did what he could to mimic the original result while doing his best to disprove a critical result. He, obviously, succeeded in mimicking the original result and therefore claims he has now disproved the critical result. So it raises the question of what exactly he did, and how, so that one can judge whether it is valid or not.
“In any regard, Tamino has – as has been repeatedly noted – announced that he will release the R code he used. Skepticism is warranted, even suspicion, talk is cheap, etc. But the blithe assertion that Tamino is hiding his code is unwarranted at this point.”
Unwarranted, really? It seems to have taken some peer-to-peer pressure to drive his ego to a breaking point, but still he, for some reason, chose to try and save his ego by trying to get published. In what, the official peer-review press? For what? Trying to debunk a critic’s claim that was essentially made on a blog.
Yeah, that’s a rational process… maybe for someone trying to get out from under the rats in a sinking boat, oars and all, perhaps.
I am astonished at the amount of misinformation showing up here. Replication means exactly what it means: duplicating the work, yes with a fresh pair of eyes and in a different lab, but with all else under the same conditions, including the code. However, one can analyze the resultant data using different kinds of statistical calculations, graphing it differently, interpreting it differently, and suggesting further areas of study not suggested by the original author. But the bottom line is this: the way the data was obtained originally must be the same in the replication in order to prove that the work can be replicated.
There are two reasons for this.
1. One of them is designed to keep us from becoming cheaters: to keep us from fudging the data and not telling, or doing a secret little trick to the data that no one else knows about. Replication is designed to keep us honest.
2. The other reason is to uncover mistakes the original author didn’t catch. Remember the plot in the movie “Medicine Man”? He could not replicate the initial experiment. So a fresh pair of eyes discovered, while repeating the experiment one last time, that it was the ants in the sugar bowl used for the calibration sugar solution that were the source of the peak he couldn’t reproduce in all his replication experiments.
Publish, then release the code Tamino. It’s just the way research is done. Get used to it or stop playing scientist.
carrot eater (13:53:07) :
You may call it an anomaly at that point, but Tamino calls it a Temperature in his description of his latest work; only after adding the Offset does he then work out the anomalies.
Thank you for pointing out the link to http://pubs.giss.nasa.gov/abstracts/1987/Hansen_Lebedeff.html
Now I know why E M Smith is so dismissive of the methodology.
A C Osborn (03:39:44) :
It doesn’t matter what you call it; the point is, you are not averaging together absolute temperatures. If a station was colder than the others, it isn’t anymore once you add it to the pile. What you do preserve is its trend – whether it was warming or cooling. This understanding alone should tell you that it doesn’t matter if the station you’re adding is hot or cold; it matters what its trend was.
The method works absolutely fine when there are no gaps in the data. In fact, pretty much all the methods do. Even with gaps in the data, it works well enough. You can play with it yourself, and see what it takes to make it break.
It looks like EM Smith was re-inventing something like the First Difference method (see Peterson et al., JGR 103: 25,967–25,974 (1998)). Reading the literature before you start working on a topic can save you a lot of trouble, and spare you some blushes. But if he likes that better, you’ll see it won’t really affect the results; you’ll just be using what NOAA uses.
So perhaps you shouldn’t be so dismissive – different methods have strengths and weaknesses, but all roads lead to Rome in the end.
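[For reference, the First Difference method mentioned above amounts to averaging year-on-year changes across stations and then integrating them back into one series. A minimal sketch under that description follows; annual series are assumed to be aligned, and NOAA’s actual implementation has more safeguards. – moderator]

```python
import numpy as np

def first_difference_combine(stations):
    """stations: (n_stations, n_years) array of annual means, np.nan for gaps.
    Average the year-on-year changes across stations, then integrate the
    averaged changes back into a single anomaly-like series."""
    diffs = np.diff(stations, axis=1)        # per-station year-to-year change
    mean_diff = np.nanmean(diffs, axis=0)    # average change across stations
    # nancumsum treats years no station covers as zero change (a sketch-level choice)
    series = np.concatenate(([0.0], np.nancumsum(mean_diff)))
    return series - np.nanmean(series)       # recentre so it reads as anomalies
```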
Pamela Gray (22:03:22) :
He’s said he’ll release his code when he publishes. But you don’t need his code to check his work. There is nothing stopping you from writing your own, and seeing if you get the same results. If you or anybody interested starts today, you would probably have it done before his paper is published. That’s the whole point. If Tamino made some little math error someplace, then when you write your own code without looking at his, you probably won’t make that same error.
You don’t even have to follow the exact same methodology as him. If you use a slightly different methodology, and still can make the same conclusion, then you know the conclusion is not that sensitive to processing method. Which gives you confidence that it’s a reliable conclusion. If you have to cherry-pick some peculiar processing method to get the conclusion, then it isn’t a strong conclusion.
I’ve got a warrant for the arrest of that code. Just show me where the rascal is hiding.
============================
carrot eater (04:34:12) :
Pamela Gray (22:03:22) :
Has Tamino published a list of the Stations and Grid references for the boxes that he has used?
I didn’t see it.
He has also used the heavily manipulated GHCN data, not the NCDC “Quality Controlled” data, or better still the NCDC “Raw” data.
The excuses carrot eater brings up for not releasing code are astonishing. Imagine this: Researcher A writes a program, creates a graph with it, proving heating up of the globe. He doesn’t release the program but gives a verbal description of what he thinks his program does. (You wouldn’t believe how often people believe their program does a certain thing when it in fact does something completely different; actually this is a normal part of developing a program – finding and eliminating these discrepancies)
Researcher B replicates the program using the carrot-eater-approach and the verbal description given by Researcher A. He gets a different graph, showing cooling of the globe, publishes it and what now? (He also doesn’t release his code; it’s not necessary in carrot-eater-science. Verbal descriptions of what the program is supposed to do are accepted as sufficient by all researchers in carrot-eater-world.)
Do we conclude that Researcher A has lied? That Researcher B has lied? That one of them has made a mistake? Both?
Now, wouldn’t it be the obvious solution to treat the programs as part of the published result?
A program is, BTW, not functionally different from a mathematical proof or a mathematical argument. Following carrot eater’s logic, the mathematicians in his world would write papers with conclusions like “I just proved that the speed of light in vacuum is constant, but I can’t give you the equations because they’re mine.”
That would be a pretty bizarre world – or is it an accurate description of the reality of climate science?