Is the NULL default infinite hot?
January 31, 2010 by E.M.Smith
see his website “Musings from the Chiefio”
What to make of THIS bizarre anomaly map?
What Have I Done?
I was exploring another example of The Bolivia Effect where an empty area became quite “hot” when the data were missing (Panama, posting soon) and that led to another couple of changed baselines that led to more ‘interesting red’ (1980 vs 1951-1980 baseline). I’m doing these examinations with a 250 km ’spread’ as that tells me more about where the thermometers are located. The above graph, if done instead with a 1200 km spread or smoothing, has the white spread out to sea 1200 km with smaller infinite red blobs in the middles of the oceans.
I thought it would be ‘interesting’ to step through parts of the baseline bit by bit to find out where it was “hot” and “cold”. (Thinking of breaking it into decades…. still to be tried…) When I thought:
Well, you always need a baseline benchmark, even if you are ‘benchmarking the baseline’, so why not start with the “NULL” case of baseline equal to report period? It ought to be a simple all white land area with grey oceans for missing data.
Well, I was “A bit surprised” when I got a blood red ocean everywhere on the planet.
You can try it yourself at the NASA / GISS web site map making page.
In all fairness, the land does stay white (no anomaly against itself) and that’s a very good thing. But that Ocean!
ALL the ocean area with no data goes blood red and the scale shows it to be up to ‘9999′ degrees C of anomaly.
“Houston, I think you have a problem”…
Why Don’t I Look In The Code
Well, the code NASA GISS publishes, and says is what they run, is not the code that is actually running here.
Yes, they are not publishing the real code. In the real code running on the GISS web page to make these anomaly maps, you can change the baseline and you can change the “spread” of each cell. (Thus the web page that lets you make these “what if” anomaly maps). In the code they publish, the “reach” of that spread is hard coded at 1200 km and the baseline period is hard coded at 1951-1980.
So I simply can not do any debugging on this issue, because the code that produces these maps is not available.
But what I can say is pretty simple:
If a map that by definition can contain no unusual warmth (baseline = report period) shows this, something is wrong.
I’d further speculate that that something could easily be what causes The Bolivia Effect where areas that are lacking in current data get rosy red blobs. Just done on a spectacular scale.
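The suspected failure mode is easy to sketch. This is my own illustrative code, not GISS's (theirs is unavailable, as noted below): if even one consumer of the gridded data forgets to filter the 9999 "missing data" flag before averaging, a single flagged cell swamps the real temperatures and produces exactly this kind of spurious "hot" anomaly.

```python
# Hypothetical sketch (my code, not GISS's): how a 9999 "no data" flag
# corrupts a grid-cell anomaly if one code path forgets to filter it.
MISSING = 9999.0

def cell_anomaly(report, baseline):
    """Correct handling: drop the flag, or propagate it if nothing is left."""
    valid_r = [t for t in report if t != MISSING]
    valid_b = [t for t in baseline if t != MISSING]
    if not valid_r or not valid_b:
        return MISSING                      # stays a flag; plot as grey
    return sum(valid_r) / len(valid_r) - sum(valid_b) / len(valid_b)

def buggy_cell_anomaly(report, baseline):
    """Buggy handling: the flag is averaged in as if it were a temperature."""
    return sum(report) / len(report) - sum(baseline) / len(baseline)

baseline = [1.5, 1.7, 1.6]                  # degrees C; made-up numbers
partial = [2.1, MISSING, 1.8]               # one month of data dropped out

good = cell_anomaly(partial, baseline)      # about 0.35 C, sane
bad = buggy_cell_anomaly(partial, baseline) # thousands of degrees, "blood red"
```

With all-missing input (the ocean case), the correct version returns the flag itself, which the plotting step should map to grey; the buggy version happily returns a number.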
Further, I’d speculate that this might go a long way toward explaining the perpetual bright red in the Arctic (where there are no thermometers so no thermometer data). This “anomaly map” includes the HadCRUT SST anomaly map for ocean temperatures. The striking thing about this one is that those two bands of red at each pole sure look a lot like the ‘persistent polar warming’ we’ve been told to be so worried about. One can only wonder if there is some “bleed through” of these hypothetical warm spots when the ‘null data’ cells are averaged in with the ‘real data cells’ when making non-edge case maps. But without the code, it can only be a wonder:
The default 1200 km present date map for comparison:
I’m surprised nobody ever tried this particular ‘limit case’ before. Then again, experienced software developers know to test the ‘limit cases’ even if they do seem bizarre, since that’s where the most bugs live. And this sure looks like a bug to me.
A very hot bug…
Phil M (17:51:17) :
Why are you using python???
Phil M (18:16:31) :
Phil,
I asked for “high arctic temps” not temperature trends from a graph in Wikipedia.
Do you have a temperature set that shows high temps in the Arctic?
Did you see this?
“Arctic temperatures above 80°N are the lowest in six years”
http://wattsupwiththat.com/2010/01/23/arctic-temperatures-above-80%C2%B0n-are-the-lowest-in-six-years/
Carrick (18:14:45) :
Yup, I was so giddy with excitement getting my hands on this program in a language I can work with, I just blurted out the first thing I saw.
I guess I’m biased toward “-9999”; in all the times I’ve worked with satellite/climate data, that’s been the null value.
aMINO aCIDS iN mETEORITES (18:23:26) :
“Why are you using python???”
Because it’s free and open source and rapidly becoming a common platform for scientific computing.
rbroberg (17:03:55) :
If there is no problem with what James Hansen and Gavin Schmidt are doing with the stations then they should switch from using the stations they kept to using the stations they excluded.
According to your lengthy arguments they will be the same.
Phil M (18:38:38) :
see this above:
steven mosher (15:34:16) :
E.M.Smith (17:06:39) :
“Since it is so easy, I look forward to your report after you have done the work. Me? I have dinner to make, a sick cat to tend, Open Office to install on 2 platforms, comments on MY blog to moderate, …”
In about ten minutes of spot-checking:
First station I tried: Goose, Newfoundland.
http://data.giss.nasa.gov/work/gistemp/STATIONS//tmp.403718160005.2.1/station.txt
is 8.6 C warmer in Dec 09 than Dec 08.
Let’s look for other stations in red splotches in Dec 09, compared to Dec 08
Egedesminde, Greenland 5.1 C
Fort Chimo, Canada. 10 C
Looks like I found your 10 C difference between Dec 08 and Dec 09. Third station I tried. Hence, the range of the colorbar.
Let’s see what else we find.
Danmarkshavn, Greenland. 2.7 C
Godthab Nuuk: 5 C
Inukjuak Quebec: 6.6 C
Coral Harbor: 8.6 C
So I’ve found a bunch of stations that are between 5 and 10 C warmer in Dec 09 compared to Dec 08.
Do you want me to repeat the exercise for the red splotches in Dec 09 compared to Dec 98, or do you get the idea?
aMINO aCIDS iN mETEORITES (18:34:23) :
“Do you have a temperature set that shows high temps in the Arctic?”
From a previous WUWT post:
http://i43.tinypic.com/1zp1q8j.jpg
And here’s 2009:
http://usera.imagecave.com/climatestuff/2009_full.jpg
E.M.Smith (17:45:31) :
carrot eater (17:02:59) :
There is no bug here. Just the Arctic Oscillation event of Dec2009/Jan2010.
Oh, here’s a nice one. The entire winter of 2006 vs 1998. 13 C ‘hotter’.
http://data.giss.nasa.gov/cgi-bin/gistemp/do_nmap.py?year_last=2009&month_last=12&sat=4&sst=0&type=anoms&mean_gen=1203&year1=2006&year2=2006&base1=1998&base2=1998&radius=250&pol=reg
Don’t think that is related to last December’s AO “event”…
Oh, and I’ll save you some time. 2004, 2005, 2007, and 2008 have very similar patterns.
My GUESS at this point is that: it is much more likely to have ‘data drop outs’ in the dead of arctic winter and that GIStemp is not handling a ‘no data’ case properly. It shows up dramatically when the 1951-1980 baseline is compared to self (so lots of cells with zero anomaly to be handled). But in arctic winter there are similar ‘zero anomaly’ cases (possibly from a NULL vs NULL case as a hypothesis) that bias the result. Again: This is a working hypothesis GUESS for guiding debugging and NOT a conclusion.
The conclusion comes after you recompile the fixed code and the problem is gone AND the code passes the QA suite… and maybe even ‘hand process’ a few grid / boxes just to validate that it’s really doing exactly what it ought to be doing.
BTW, the ‘SH Winter” case does not show as much red at all. Given that GIStemp fills boxes and grids “from the top down” I’d further speculate that the N.H. is more prone to whatever the issue is due to being filled first and with less opportunity to benefit from what has ‘gone before’. (The alternative hypothesis would be that the Antarctic thermometers are less prone to drop outs. Easy cross check on that idea.)
Again: All this is speculative. It is what a programmer does to focus their first dive into the code, looking for what the problem really is. Often it gets you to a fix in minutes instead of hours. When it works, it’s great. When it doesn’t, you move on to the next likely idea.
For me, that ‘next likely idea’ would be the odd way GIStemp manipulates data to make seasonal buckets. If this only shows up in NH Winter (or does so much more strongly then) I’d examine things that are different about the way each season is handled. There is this odd ‘reach back to last year’ to start a new winter that could be ‘the issue’. Especially when stations tend to end at year boundaries. (Don’t know why, but thermometer lives are often bounded by the year end / start.)
This graph:
http://data.giss.nasa.gov/cgi-bin/gistemp/do_nmap.py?year_last=2009&month_last=12&sat=4&sst=0&type=anoms&mean_gen=0303&year1=2009&year2=2009&base1=1998&base2=1998&radius=250&pol=reg
of March April May 2009 vs 1998 has a very ruby red Antarctic with that odd little red boxes look and a +7.9 C top.
2005 is similar. Both have a very cold peninsula too. So there are some interesting things to investigate here.
Oddly, 2005 FALL has a ruby red Antarctic and Siberia:
http://data.giss.nasa.gov/cgi-bin/gistemp/do_nmap.py?year_last=2009&month_last=12&sat=4&sst=0&type=anoms&mean_gen=0903&year1=2005&year2=2005&base1=1998&base2=1998&radius=250&pol=reg
with a 12.5 C hot anomaly upper bound. 2007 and 2008 are similar. This quarter also butts up against the odd ‘seasonal wrap’ that GIStemp does. So is the problem there? Or is this just an odd coincidence? No way to know. Need to go into the code to find out or need to do hand calculations for some of the boxes for comparison.
But I think it’s pretty clear that when we’ve been steadily and profoundly lower than that 1998 peak, and the graphs say we were burning up hot in comparison, the graphs have issues…
Hadley’s CRUTEM3 uses -99.0 for no data: 241,331 null data points out of 1,476,540, or 16.34% of their monthly “average” temps from 1521 stations. Well, Vostok got down to −89.2 °C, so they had a 9.8 degree buffer between real-world and NULL data! OBTW, I wrote a Perl script that loads the CRUTEM3 dataset into a PostgreSQL relational database. It seems to have worked properly, but I haven’t checked well enough to be positive; the gzipped SQL dump sits at 8.6 MiB. Someone expressed an interest in one of these, so I figure we could connect if they are still interested.
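The safest pattern for a loader like that is to convert the −99.0 flag to SQL NULL at insert time, so no aggregate can ever touch it. Here is a rough sketch of that idea in Python with SQLite (the commenter’s actual loader was Perl into PostgreSQL; the station rows below are invented for illustration):

```python
import sqlite3

SENTINEL = -99.0  # CRUTEM3's "no data" value, per the comment above

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly (station TEXT, year INT, month INT, temp REAL)")

# Invented example rows; a real loader would parse the CRUTEM3 station files.
rows = [("010010", 2009, 1, -15.2),
        ("010010", 2009, 2, SENTINEL),   # missing month
        ("010010", 2009, 3, -8.7)]

# Map the sentinel to NULL (Python None) before it ever reaches the table.
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?, ?, ?)",
    [(s, y, m, None if t == SENTINEL else t) for s, y, m, t in rows])

# AVG() skips NULL, so the missing month cannot drag the mean toward -99.
(avg,) = conn.execute("SELECT AVG(temp) FROM monthly").fetchone()
# avg is (-15.2 + -8.7) / 2 = -11.95
```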
Carrick (17:36:22) : The color red is a bug?
Um, no.
It does what it needs to do, namely flag missing regions of the ocean. Would you prefer green perhaps?
It’s not the color red that’s the big issue, it is the 9999 C attached to that color… Something is just computing nutty values and running with them.
BTW, the “missing region” flag color is supposed to be grey…
EM Smith:
You do know what out-of-band data signifies, right?
It’s obviously an error code, as a quick perusal of the code reveals.
EM Smith:
I think we can concentrate on something more important than having the same color for ocean and land for missing regions.
Seriously, if I had made that code error, I would have left it this way too.
E.M.Smith (16:48:35) :
This is a fun game, after all. Let’s say I want to find the biggest difference between Dec 09 and Dec 98. There are lots of red splotches on the map, and the colorbar has poor resolution. So I’ll download the gridded data and have a look.
Scrolling past all the 9999s for missing data, and I find that I should be looking at some islands north of Russia. I try some station called Gmo Im.E.T, and I get:
Dec 09 is 12.3 C warmer than Dec 98. First try.
Trev (15:02:27),
CO2 varies a lot. Some folks may argue with Beck’s 90,000+ direct chemical measurements accurate to, IIRC, within ± 2.5%.
http://www.biomind.de/realCO2/realCO2-1.htm
The 1940’s weren’t the only time to have high CO2. In the early 1800’s the same thing occurred. This website is controversial, but it’s hard to argue that scientists with reputations, like J.S. Haldane, would all conspire to misread tens of thousands of chemical CO2 measurements. What for?
The planet emits and absorbs varying amounts of carbon dioxide, far in excess of any puny amount that humans emit, and we still don’t know the exact mechanism.
CO2 doesn’t have nearly the effect on temperature that the AGW crowd has been claiming. CO2 spikes in the early 1800’s and the early to mid 1940’s were not followed by temperature spikes. Therefore, CO2 has little effect if any on the temperature. QED
Carrick (19:20:26) : It’s obviously an error code, as a quick perusal of the code reveals.
That is leaping to a conclusion. I would say it is LIKELY a missing data flag (though it could also be any of several other types of failures). But yes, I would suspect most that a ‘missing data flag’ is not being properly handled and is bleeding through as ‘real data’. In the limit case of “no anomalies” it runs wild and we get the Ruby Red World Oceans. In the non-limit case there is an open issue as to ‘does it bleed or not?’. In either case, it’s a bug.
Carrick (19:21:42) : EM Smith:BTW, the “missing region” flag color is supposed to be grey…
I think we can concentrate on something more important than having the same color for ocean and land for missing regions.
Um, no, we can’t. It means that the “missing regions” code is broken in some way. The code knows to assign ‘grey’ to missing regions and screws it up (badly in some cases). “White” is not ‘missing region’; it is ‘no anomaly’. So we have here a bug announcing its presence. We do not know the degree to which it matters. Is it just a ‘color mapping fault’ in the web page? Or is it sporadically averaging a 9999 in with a few dozen boxes of real data and making things very red when they ought not to be? We don’t know. But we DO know it must be looked into if you would trust the code to drive policy decisions.
Seriously, if I had made that code error, I would have left it this way too.
And your code would also be unsuited to policy decisions.
9999 is the equivalent of a FITS file NAN. It means no data exists. In a microprocessor, it is a NOP. No operation. Do not read, nothing to do.
So, the Arctic is a NULL, NAN, NOP, 9999 or -9999 as the case may be.
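The NaN analogy can be made concrete: an IEEE NaN poisons any arithmetic it touches, so a forgotten check fails loudly, while an in-band sentinel like 9999 silently produces a confident-looking wrong number. A small sketch of my own, purely illustrative:

```python
import math

MISSING = 9999.0
vals = [2.0, MISSING, 4.0]

# NaN propagates: forget to mask it and the mean is NaN -- obviously broken,
# and any downstream plot or average stays broken too.
as_nan = [float("nan") if v == MISSING else v for v in vals]
nan_mean = sum(as_nan) / len(as_nan)        # NaN

# A sentinel does not protect itself: forget the check and you just get
# a plausible-looking wrong answer.
sentinel_mean = sum(vals) / len(vals)       # (2 + 9999 + 4) / 3 = 3335.0
```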
Now, let’s have some fun.
I will subtract the maps and see what gives.
Be right back.
I just love image processing.
Here’s the #2 map subtracted from the #3 map:
http://www.robertb.darkhorizons.org/ghcn_giss_1200km_anom11_2009_2009_1951_1980.jpg
At least the Arctic is now gray. NAN, NOP, -9999
The Antarctic is still beet red, which is another problem.
I looked at several variations of the same year baseline and in every case I tried where the base period is one year and the time interval is all or part of the same year, then NULL grid points become 9999 and the oceans flow with blood.
I assume this bug is in the code that generates the images, not in the code that sausagifies the station data. In which case it’s an interesting bit of coding error and doesn’t amount to much. Just drop Jim and Reto a note and you’ll get a gracious thank you in reply.
Of course if the error lies deeper in the sausagification routines, then it merits a bit of investigation to root it out.
Earle: That sounds right to me.
Make any map that works. Look at the gridded data. All the grey areas for ‘no data’ have 9999 as the temp anomaly.
Probably the code that makes the images is supposed to turn ‘9999’ into grey, but for some odd reason, it messes up if you choose the baseline to be the same as the observed time interval.
But in any case, it’s quite obvious that there is no leakage of the no-data flag into the real data. Whatever anomalies EM Smith thinks are weird, you can quickly find the corresponding stations. You can find them almost certainly in less time than it took to write the original article.
“Using out-of-band values to signify an error is pretty common in computer programming, by the way”
This may be true in research or academic programming, but in my experience it is not true in commercial and business software. NULL is a construct: it has no value, is not aggregated, and is not numeric. The only statement that can be made is whether a variable is null or not.
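That distinction is easy to demonstrate: a SQL NULL is skipped by aggregates and cannot even be matched with the equals operator, whereas a sentinel is just another number. A quick illustration using SQLite from Python:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (temp REAL)")
conn.executemany("INSERT INTO t VALUES (?)", [(1.0,), (3.0,), (None,)])

# Aggregates ignore NULL entirely: AVG is 2.0, not something skewed,
# and COUNT(col) counts only non-NULL values while COUNT(*) counts rows.
avg, n_rows, n_temps = conn.execute(
    "SELECT AVG(temp), COUNT(*), COUNT(temp) FROM t").fetchone()
# avg == 2.0, n_rows == 3, n_temps == 2

# NULL is not equal to anything, not even itself; only IS NULL finds it.
eq_matches = conn.execute(
    "SELECT COUNT(*) FROM t WHERE temp = NULL").fetchone()[0]
is_matches = conn.execute(
    "SELECT COUNT(*) FROM t WHERE temp IS NULL").fetchone()[0]
# eq_matches == 0, is_matches == 1
```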
Phil M (18:56:09) :
Did you see this?
http://wattsupwiththat.com/2010/01/23/arctic-temperatures-above-80%C2%B0n-are-the-lowest-in-six-years/
REPLY: FYI, true then for the moment, but they’ve already rebounded – A
Again I’ll have to say: the stations that have been excluded from the GISS set should be used and the ones that have been used should be excluded.
This would address some of the doubts.
Apparently there are a few people here arguing there is no difference between the two sets. They are not saying that directly but their arguments say it.
The stations dropped should have the same anomaly and temperature result as the ones used—if all GISS apologists are right they should, shouldn’t they?
“Iren (16:09:25) :
What they leave out is that they’re only counting since 1961! Apparently, 1890 was the end of our warmest decade.”
Because the Australian BoM uses the global average between 1961–1990 in all its comparisons. So very plump cherry picking going on at the BoM.
OT, but I caught the tail end of a news cast here. Apparently, two teenagers have been charged with starting one of the major bush fires in Victoria in Jan/Feb 2009. Global warming at work, I see.
Patrick Davis (21:09:13) :
The choice of baseline is completely irrelevant to the ranking of years or decades.