Is the NULL default infinite hot?
January 31, 2010 by E.M.Smith
see his website “Musings from the Chiefio”
What to make of THIS bizarre anomaly map?
What Have I Done?
I was exploring another example of The Bolivia Effect, where an empty area became quite "hot" when the data were missing (Panama, posting soon), and that led to another couple of changed baselines that led to more 'interesting red' (1980 vs. the 1951-1980 baseline). I'm doing these examinations with a 250 km 'spread', as that tells me more about where the thermometers are located. The above graph, if done instead with a 1200 km spread or smoothing, has the white spread out to sea 1200 km, with smaller infinite red blobs in the middles of the oceans.
I thought it would be 'interesting' to step through parts of the baseline bit by bit to find out where it was "hot" and "cold". (Thinking of breaking it into decades… still to be tried…) Then I thought:
Well, you always need a baseline benchmark, even if you are ‘benchmarking the baseline’, so why not start with the “NULL” case of baseline equal to report period? It ought to be a simple all white land area with grey oceans for missing data.
Well, I was “A bit surprised” when I got a blood red ocean everywhere on the planet.
You can try it yourself at the NASA / GISS web site map making page.
In all fairness, the land does stay white (no anomaly against itself) and that’s a very good thing. But that Ocean!
ALL the ocean area with no data goes blood red and the scale shows it to be up to '9999' degrees C of anomaly.
“Houston, I think you have a problem”…
Why Don’t I Look In The Code
Well, the code NASA GISS publishes, and says is what they run, is not the code that is running here.
Yes, they are not publishing the real code. In the real code running on the GISS web page to make these anomaly maps, you can change the baseline and you can change the "spread" of each cell (hence the web page that lets you make these "what if" anomaly maps). In the code they publish, the "reach" of that spread is hard coded at 1200 km and the baseline period is hard coded at 1951-1980.
So I simply cannot do any debugging on this issue, because the code that produces these maps is not available.
But what I can say is pretty simple:
If a map that by definition can have no areas of unusual warmth (baseline = report period) has this happen, something is wrong.
I’d further speculate that that something could easily be what causes The Bolivia Effect where areas that are lacking in current data get rosy red blobs. Just done on a spectacular scale.
Further, I’d speculate that this might go a long way toward explaining the perpetual bright red in the Arctic (where there are no thermometers so no thermometer data). This “anomaly map” includes the HadCRUT SST anomaly map for ocean temperatures. The striking thing about this one is that those two bands of red at each pole sure look a lot like the ‘persistent polar warming’ we’ve been told to be so worried about. One can only wonder if there is some “bleed through” of these hypothetical warm spots when the ‘null data’ cells are averaged in with the ‘real data cells’ when making non-edge case maps. But without the code, it can only be a wonder:
The default 1200 km present date map for comparison:
I’m surprised nobody ever tried this particular ‘limit case’ before. Then again, experienced software developers know to test the ‘limit cases’ even if they do seem bizarre, since that’s where the most bugs live. And this sure looks like a bug to me.
A very hot bug…
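To make the "bleed through" worry above concrete, here is a minimal sketch (hypothetical numbers, plain Python, and emphatically not the GISS code) of what one unmasked 9999 'missing data' flag does to a grid-cell average:

# Hypothetical illustration only -- not GISS/GIStemp code.
real_anomalies = [0.2, -0.1, 0.4, 0.3]        # plausible cell anomalies, deg C
with_flag = real_anomalies + [9999.0]         # one missing-data sentinel left in

print(sum(real_anomalies) / len(real_anomalies))  # 0.2     -- what it should be
print(sum(with_flag) / len(with_flag))            # 1999.96 -- "blood red"

If anything like that happens anywhere between the data files and the map, a cell with mostly-missing neighbors would glow just the way the maps above do.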
Steve M. (07:22:30) :
Error
I cannot construct a plot from your input. Anomalies for any period with respect to itself is 0 by definition – no map needed
apparently the "problem" is fixed… and just in the last 10 minutes
Oh great. Well, if there was a problem and it was pointed out by someone, then I hope they did the right thing by putting up a change notice and thanking the person who pointed it out. Also, you can expect people to now start wondering if this "bug" (deference to carrick) has had any impact that is now hidden.
ps.. Python is cool
http://arstechnica.com/open-source/news/2008/01/openmoko-freerunner-a-win-for-phone-freedom.ars
Arrg. ok the dream behind that phone was cool.
Steve M. (07:22:30) :
Hey look! Mapping an 1881-1882 interval with an 1881-1883 base still shows the 179 deg W artifact in SH polar. So does 1881-1920 with 1881-1921. Likewise 1920/22, 1920/30. 1940-61 with 1940-1962 shows it very well. Seems like any periods similar enough to practically cancel out will yield the artifact.
Yup, it’s still in there.
rbroberg (08:28:05) : Looks like you one-upped me by finding the lowest anomalies as well. But I'm pleased that we both homed in on GMO IM.E.T. for the Dec 09 to Dec 98 difference.
Seriously, it took me all of five minutes to find that station.
@E.M. Smith:
Hope your cat is feeling better.
@carrot eater: Looks like you one-upped me by finding the lowest anomalies as well.
Yeah. The idea of looking at the data first seems obvious once you take off the 'it's all a fraud' hat. And EM's insistence of looking at only one end of the scale seems to reveal a certain explanatory bias as well.
aMINO aCIDS iN mETEORITES (18:45:04) :
“rbroberg (17:03:55) :
If there is no problem with what James Hansen and Gavin Schmidt are doing with the stations then they should switch from using the stations they kept to using the stations they excluded.
According to your lengthy arguments they will be the same.”
That is the best argument I have heard in a long time. If there was no bias or "cherry picking" involved, the excluded RAW data should show a warming trend also when compared to the same baseline as the data that was kept.
“And EM’s insistence of looking at only one end of the scale seems to reveal a certain explanatory bias as well.”
I think this is a very important thing to notice.
Gail Combs (10:54:56) :
"If there was no bias or "cherry picking" involved, the excluded RAW data should show a warming trend also when compared to the same baseline as the data that was kept."
Somebody would have to repeat the effort that was made when the GHCN was started, and track down, collect and organise all the unreported data since 1990. I do think it’s time for that to happen, but it was a big effort. That said, if somebody wanted to spot-check a certain region, they could probably gather that data from SYNOP reports and check it out.
In the meantime, the other obvious thing to do is to simply build one record from stations that continued, and a separate record from stations that stopped. That can be fairly easily done, and has indeed been done.
http://www.yaleclimatemediaforum.org/2010/01/kusi-noaa-nasa/
Second figure; the analysis would be improved by taking the spatial mean, but it’s a start. Before the terminated stations dropped off, they were doing about the same thing as the stations that continued.
Simply saying that something might have had an effect is not that great a contribution. One has to go into the data and actually see if it did have an effect.
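For anyone who wants to try that "continued vs. terminated" comparison on their own data pull, the shape of the calculation is roughly this (the station_records layout, the 1990 cutoff, and the lack of spatial weighting are all my simplifying assumptions, not a description of the linked analysis):

# Rough sketch: compare average trends of stations that kept reporting
# vs. stations that dropped out. No spatial (grid/area) weighting here,
# which is the improvement noted above.
import numpy as np

def trend_per_decade(years, temps):
    """Least-squares slope, converted to degrees per decade."""
    slope, _ = np.polyfit(np.asarray(years), np.asarray(temps), 1)
    return slope * 10.0

def compare_groups(station_records, cutoff_year=1990):
    """station_records: {station_id: (years, temps)} -- assumed format."""
    continued, terminated = [], []
    for years, temps in station_records.values():
        trend = trend_per_decade(years, temps)
        (continued if max(years) >= cutoff_year else terminated).append(trend)
    return np.mean(continued), np.mean(terminated)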
Red is used to show hot areas.
Using red to show areas with no data makes it look like there are more hot areas.
It is just “Jedi” mind tricks to bias the graphic.
Move along – nothing to see here.
after reading some of the commenters here (apparently they’re from RealClimate) I feel like I need a shower
carrot eater (12:12:30) :
This is what you said: blah, blah, blah, blah
” they were doing about the same thing as the stations that continued.”
The words “about the same thing” are relative. Relative terms are not allowed in science.
What is really happening is fudging to get a desired result.
Or is it sophistry to say “about the same thing”?
Who knows, because it’s all relative.
Harold Blue Tooth (14:01:37) :
If you want something more quantitative, then repeat the exercise, calculate the spatial means, and compute the difference in trends between continuous and terminated stations. And there you’ll have it.
If you are truly interested in this issue of station drop-off, you should be requesting the people leading that discussion to do that very analysis.
Alternatively, you could go back to the 1980s, use non-Bolivian stations to calculate the estimated temperature at some grid point within Bolivia, and then check against the actual observations. That’ll give you an idea of how well the interpolation works. This could probably be done in about a day, if somebody were interested in the matter.
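A sketch of what that interpolation spot-check could look like, with made-up distances and anomalies; the linear fall-off to zero at 1200 km is my reading of the published Hansen and Lebedeff style weighting, so treat it as an assumption:

# Estimate a grid point's anomaly from surrounding (non-local) stations,
# then compare with what was actually observed there. Numbers are invented,
# and at least one station is assumed to lie within the cutoff radius.
RADIUS_KM = 1200.0

def interpolated_anomaly(stations):
    """stations: list of (distance_km, anomaly) pairs; weight tapers to 0 at the cutoff."""
    pairs = [(1.0 - d / RADIUS_KM, a) for d, a in stations if d < RADIUS_KM]
    total = sum(w for w, _ in pairs)
    return sum(w * a for w, a in pairs) / total

surrounding = [(300.0, 0.4), (750.0, 0.1), (1100.0, -0.2)]   # hypothetical stations
observed = 0.15                                              # hypothetical local value

print(round(interpolated_anomaly(surrounding) - observed, 2))  # interpolation error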
OT, but back in 2006, when the election in Mexico happened, I had reason to be interested. I was able to watch the latter part of the results coming in. This was a supremely close election, with the eventual official margin being 0.58%. Lopez Obrador had a lead of about 2% when I started keeping track, with about 81% of the vote in. I refreshed the counter every 2-5 minutes. In every single refresh, the margin moved in favor of Calderon. I could not see what precincts or whatever, but the increases on both sides each time were VERY small. Yet in this election decided by barely 1/2%, every single small update was in favor of Calderon. At least while I was watching, which was pretty much right up until the end. For the final 19% of the vote (out of 40 million votes), Lopez Obrador was barely edged out, so that his 2% lead became a loss. That not ONE of those refreshes showed a gain in his favor reeked, to me, of the fix being in, of someone tampering with the results. I refreshed probably 100 times.
In a 50-50 vote, that is like flipping a coin 100 times and having it come up heads every time. What is 2 to the 100th power? …lol
When I read that every error was one way, I know it is like those coin flips. 2 to the nth power is a big number, after all.
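For scale, the arithmetic behind that coin-flip analogy (treating each refresh as a fair, independent coin, which is only a loose analogy for vote-count updates):

# 100 fair coin flips all landing the same way:
print(f"{2**100:.3e}")       # 1.268e+30
print(f"{0.5**100:.1e}")     # 7.9e-31, the probability of 100 straight heads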
Peter –
I DID see in the HADCRU emails that they DO use 9999 and 999 as values for missing data. It is in several places, and specifically in HARRY_READ_ME.txt.
As to whether GISS is using it, I can’t say, but it is IN the data. And CRU shared with GISS and NOAA.
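For what it's worth, the standard defensive habit when a dataset uses in-band flags like 999 or 9999 for "missing" is to convert them to a real missing marker at read time, before anything is averaged. A minimal sketch (the flag values are the ones mentioned above; the rest is hypothetical, not CRU or GISS code):

# Turn in-band "missing" flags into NaN before averaging -- sketch only.
import math

MISSING_FLAGS = {999.0, 9999.0}

def clean(value):
    v = float(value)
    return math.nan if v in MISSING_FLAGS else v

readings = [12.3, 9999, 11.8, 999]
valid = [v for v in map(clean, readings) if not math.isnan(v)]
print(sum(valid) / len(valid))   # 12.05, with the flags excluded rather than averaged in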
carrot eater (14:36:35) :
don’t need all that
I want James Hansen to use the stations he’s been dropping and drop the stations he’s been using—and completely open up what he is doing to everyone.
That's a more accurate way. Then there would be no fudging in the picture from the stations he's been using.
From the “Sweep the dirt under the rug” department. If you now try to do a “self vs self” anomaly map you get this message:
Surface Temperature Analysis: Maps
Simple reminder
Anomalies for any period with respect to itself is 0 by definition – no map needed
Guess it was easier to hide it than to actually fix the code. A reasonable short term HACK but hardly good QA.
“carrot eater (04:01:16) :
Please explain why you think the choice of baselines is important. It is not. All it does is change the absolute values of the anomalies. The trends, which is what’s important, do not change as you change the baseline.”
The 1961 – 1990 "baseline" is cherry picked and not nearly wide enough. Trends would be important if one did not remove 75% of the recording devices nor "add value" to the raw data, and then lose the raw data.
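For the narrow arithmetic point in the quoted comment, that shifting the baseline moves every anomaly by the same constant and so leaves the fitted trend unchanged, a quick check with made-up numbers is easy to run; whether the baseline choice matters for other reasons (missing data, station selection) is the separate argument being made in this thread:

# Made-up annual series: changing the baseline shifts the anomalies
# but not the least-squares trend.
import numpy as np

years = np.arange(1951, 2011)
temps = 14.0 + 0.01 * (years - 1951) + 0.1 * np.sin(years)   # toy data

def anomalies(series, yrs, start, end):
    return series - series[(yrs >= start) & (yrs <= end)].mean()

for base in [(1951, 1980), (1961, 1990)]:
    a = anomalies(temps, years, *base)
    slope_per_decade = np.polyfit(years, a, 1)[0] * 10
    print(base, round(a.mean(), 3), round(slope_per_decade, 3))  # mean shifts, slope doesn't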
rbroberg (09:48:21) :
@E.M. Smith:
Hope your cat is feeling better.
The cat passed away about 11am.
And EM’s insistence of looking at only one end of the scale seems to reveal a certain explanatory bias as well.
Pardon? I’m exploring ALL ends of the scale. This was an unexpected excursion on my way to benchmarking the baseline:
http://chiefio.wordpress.com/2010/02/02/giss-benchmarking-the-baseline/
I’d fully expected to get a white land / grey ocean map. That is what ought to have happened. To ignore an obvious bug in their web based maps would be to put the entire benchmark in doubt.
Need I point out a line of text from my comments earlier:
So we have here a bug announcing its presence. We do not know the degree to which it matters. Is it just a 'color mapping fault' in the web page? Or is it sporadically averaging a 9999 in with a few dozen boxes of real data and making things very red when they ought not to be? We don't know. But we DO know it must be looked into if you would trust the code to drive policy decisions.
Hardly a statement of expectation of any given outcome.
And if you bother to look into it, you will find that “edge cases” are the most important ones in debugging and QA. I always test the zero and infinity cases if at all possible. And I usually do start at zero since the infinity case often does not exist. So you start at zero and work your way up. In this case, zero is ‘baseline against baseline’.
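For what a "zero case" test like that might look like if the map code were testable (the function name and return format are hypothetical placeholders, since the actual code is not published):

# Hypothetical "limit case" test: baseline compared against itself.
import math

def check_null_case(make_anomaly_map):
    grid = make_anomaly_map(period=(1951, 1980), baseline=(1951, 1980))
    for cell in grid:
        # Cells with data must be exactly zero; cells without data must be
        # a real missing marker (here None), never a 9999 flag.
        assert cell is None or math.isclose(cell, 0.0, abs_tol=1e-9), cell
        assert cell != 9999.0, "missing-data flag leaked into the map"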
You will also find that if you want to test for the presence of bugs you must behave as though you expect them to be there. This is the most common failure of programmers in not catching bugs (and why the folks who do QA for a living are often specialists at it). You simply must look for broken things if you would ever find them. That does not mean you want them.
I managed the QA department for a C compiler tool chain for a few release cycles. The "can't ever happen" cases were a large percentage of the QA suite (and the frequency with which they caught bugs shows their worth). You just never ignore a bug until it's investigated and found benign; and you just never assume the code is good. It isn't. There are always bugs. The purpose is to make sure none of them are really "material" and stamp out any that are.
Sidebar: On one occasion, a programmer had finished a very nice input screen that carefully checked for out of bounds data and many other cases. The manager walked up, looked at the very pretty screen layout, read the “spec”. And proceeded to reach over and “mash” the keyboard. (He was simulating someone leaning on the keyboard with an arm/ elbow). The “system” crashed. Took me quite a while to figure out what very unexpected characters had caused the crash… but the final product DID pass QA (even the “key mash”).
That does not mean folks expect to do key mashing all day long. It means that that is how you find bugs and how you make “robust” code.
Not by hiding your bugs and not by assuming they do not matter.
E.M.Smith (17:15:16) :
I am sorry to hear about your cat. I lost one last winter. Such losses can be hard.
I agree with you about testing boundary conditions. No argument there.
I agree there was a display bug and that it now appears to be corrected.
I disagreed with your claim that the display bug for 9999 was "bleeding through as 'real data'."
You asked for an explanation of the +10C and higher anomalies when diffing 2009-2008 and 2009-1998.
I demonstrated that the endpoints for many of your maps, both high and low, were obviously matching the monthly averages for isolated high latitude stations.
But I haven't seen you yet disclaim or in any way moderate your position that the 'display bug' was 'bleeding through to the "real data".'
Do you still hold that view?
Edit for above:
I wrote: I disagreed with your claim that the display bug for 9999 was "bleeding through as 'real data'."
I should have written: I disagreed with your suspicion that the display bug for 9999 was "bleeding through as 'real data'."
An interesting observation was made by “boballab” here:
http://boballab.wordpress.com/2010/02/02/blood-red-oceans/
He noticed that the upper right hand corner of the graph gives the average anomaly for the map. For example, the present Dec 09 anomaly is 0.67 in that map. On the "Blood red oceans" one at the top, the average anomaly is given as 5344.1 C, so we now know that the Bogus Nines are used in some calculations.
We still don't know if this is just in the creation of the web page or not. (I could see a case where it was generated by the graphing package, though that would be a bit odd given that the GIStemp software also calculates a value; then again, the GISS code has done odd things before. If they do it both ways they could at least print them both as a crossfoot validation. A "check if equal" for those two values in the display code would have flagged this some time back, assuming the fault is in the graphing part.)
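A back-of-the-envelope check on that 5344.1 figure, assuming a simple unweighted mean over the displayed cells (which may not be how the page actually computes it): if the real anomalies average near zero, the only way to get a mean in the thousands is for a large fraction of cells to be carrying the 9999 flag.

# mean ~= f * 9999 when real anomalies average ~0 and a fraction f of
# cells hold the flag; solve for f given the displayed 5344.1:
print(round(5344.1 / 9999, 2))   # ~0.53, i.e. roughly half the cells

That is at least consistent with the flag being averaged in over the large no-data ocean area, rather than being a pure display quirk.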
BTW, to the folks posting that the GIStemp code is available for download at the GISS link: As I’ve pointed out before, that is SOME code, but not THIS code. There is no graphics presentation module in the code (unless they added it on at the last update) and you don’t get to play with all the knobs…
Unfortunately, with the only visibility of this bug now swept under the rug, we may never find out what it really is.
Espen (02:14:49) :
I asked for 250 km smoothing, and they did it, except around the south pole where there seems to be a circle of 1000 km radius? Why?
Fascinating… I would GUESS (in all caps since some folks here don’t seem to see words like “speculate”, “suspect”, “expect”, “line of investigation” etc. unless really really loud…) that the way GIStemp calculates boxes is from the “top down” and the end ‘box’ may be an ‘everything left over’ that happens a bit too soon. If I get a chance I’ll look at the code that is published and see if it does that or if this is another one that is ‘between the code and the screen’…
On the question of “does the baseline matter” the benchmark I did of the baseline says “yes it does”. How much is for another more detailed study, but the benchmark showed that the baseline has features (very unusual features like a hot/cold antarctic pair) that “bleed through” into the present year reports. I’ll have to add a “1971-1990” decade graph to the presentation for the non-GIStemp folks…
kadaka (03:41:38) : Experimenting further, I tried the other end, using 1880 to 1910, which generated the following error message: I cannot construct a plot from your input. Time interval must begin in 1880 or later. Yup, it thinks "1880" is not 1880 or later.
Most likely you are set to the “year” that wraps from Nov to Oct (the default) rather than the Jan to Dec that is at the very top of the list. So the code is trying to use 1879 Nov & Dec to fill out the “year” that is mostly in 1880. Yeah, a ‘human factors’ issue in the interface design. But if you go out of your way to choose the J-D year it will work.
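A tiny sketch of the window arithmetic being described (the month conventions are my reading of the map page's 'year' options, not taken from any code):

# A "year" that does not start in January reaches back into the prior
# calendar year, so a start year of 1880 needs 1879 data.
def earliest_needed_year(start_year, first_month):
    """first_month: 1 for a Jan-Dec year, 11 for a Nov-Oct style year."""
    return start_year if first_month == 1 else start_year - 1

print(earliest_needed_year(1880, first_month=1))    # 1880 -- accepted
print(earliest_needed_year(1880, first_month=11))   # 1879 -- "must begin in 1880 or later"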
rbroberg (08:28:05) : It is difficult to understand why you stubbornly cling to your suspicion (if you still do).
A few decades of debugging code and running a code QA department and being Build Master for another product (web based integrated email, web, etc. server w/ DHCP services and routing). After a while you learn that you NEVER know what the bug really is until the code is FIXED and passes QA. (And even then you keep your ‘suspects’ list in case the bug comes back…)
Dismissing “suspects” too early is a major reason for not finding a bug. Yeah, you rank them in probability order. Yeah, you take the most likely (and sometimes the easiest to prove / disprove) and whack on them first. But you NEVER ‘dismiss’ a suspicion until the Fat QA Suite sings… Too many times thinking “This time for sure!”… and watching everyone else do the same thing.
So as of now my “suspects” list would be headed by a “web graphics” package that we can’t look at. But everything ELSE stays on the list until the end. (And yes, ranked from highest to: “you’ve got to be kidding, no way”… but it stays on the suspect list.)
SIDEBAR: I once was writing a cost accounting package. Tests fine. Add new FUNCTION and function call. Old code breaks. Hours later, change old code name from FOO to FOONEW preparing to try new idea. Code works… Leave it alone, go back to New Function BAR. Add a few lines. FOONEW breaks. Change name back to FOO. It works… Called the vendor. Compiler bug made their compiler sensitive to names of functions sometimes…
SIDEBAR2: I once wrote code of the form:
If A goto ALAB
if NOT-A goto WAYDown
Print “You can’t get here – Compiler error”
and had it print …
So after a few of those, you stop assuming that you EVER know you can dismiss out of hand anything as ‘not the problem’. That, and running QA on a compiler tool chain for a while 😉
Further, it is a bit of the cops dictum. If you suspect and look, you may catch the problem. If you look and all is well, fine. But if you never suspect, you will never look, and you will never catch the problem. So suspect first, then look. Then keep suspecting until the court closes the case. And maybe keep an eye on the guy even if the case gets dismissed.
(Yeah, I’ve hung out with cops a lot of my life and had some amount of ‘police training’ – directing cars is fun, marching in formation not so much, when to draw your gun and fire is spooky [ everyone killed a couple of innocent people in the film when they did ‘stupid but innocent’ things.] Film was designed to teach you that. To teach you to suspect always, but leap to conclusions never.)
It has served me well over the years. I find bugs others miss. I spend a bit more time investigating some dead ends some times, but more often get to the ‘real problem’ faster than the ‘one line fix’ guys. I also tend to write really solid code.
Basically, you have a search tree. Don’t prune it too fast and don’t erase it when you take a path. Keep an eye on the other paths and be ready to jump there if some new data shows up. But don’t erase that other path…
FWIW I have the same “suspicion” about every bit of code I write.
I think it is more solid than most, but at the drop of a hat I’d start tearing it apart if it did something even a little bit unexpected.
So evidence is FINE and it helps change the weighting on the search tree, but it does not erase a branch. Just puts a toll booth on it 😉
Everyone that is saying that ‘silly’ ‘extreme’ data values are used to report errors could not be more wrong. Or at least, has been wrong since about 1990 when professional programmers started taking Y2K seriously. All professional programmers know not to pass errors or messages in data. Ergo, this software is either ancient or written by incompetents.
“George (23:50:00) :
Everyone that is saying that ’silly’ ‘extreme’ data values are used to report errors could not be more wrong. Or at least, has been wrong since about 1990 when professional programmers started taking Y2K seriously. All professional programmers know not to pass errors or messages in data. Ergo, this software is either ancient or written by incompetents.”
What was serious about Y2K? I mean, countries like Romania (with nuclear power stations) and Ethiopia (OK, to be fair, Ethiopia had their Y2K in 2007 in our calendar) didn't have the resources to deal with the "problem", which didn't happen (in countries which did not "invest" in the Y2K "problem").
Well, some people made a lot of money out of that "scare".
George:
I must not be a professional programmer.
#define EOF -1

int c;
while ((c = getchar()) != EOF) {
    // do something
}

That Ritchie guy was a real slacker, wasn't he?
The Y2K problem had nothing to do with passing out-of-band data, by the way. It was a schlock mistake that would have gotten points taken off if it had appeared in first-semester programming code. I know this because I was taking points off students' projects for errors like this well before the Y2K scare.
Actually, I've always taken Y2K as a model for how people come to exaggerate climate change impact. The worst scientists hide out in the climate change impact area (the good ones are in the physical science part, which is mostly unrelated). The same went for Y2K: I turned down a great-paying job involving Y2K. So did most other competent people I know.