Guest post by Willis Eschenbach
Inspired by this thread on the lack of data in the Arctic Ocean, I looked into how GISS creates data when there is no data.
GISS is the Goddard Institute for Space Studies, a part of NASA. The Director of GISS is Dr. James Hansen. Dr. Hansen is an impartial scientist who thinks people who don’t believe in his apocalyptic visions of the future should be put on trial for “high crimes against humanity”. GISS produces a surface temperature record called GISTEMP. Here is their record of the temperature anomaly for Dec-Jan-Feb 2010:
Figure 1. GISS temperature anomalies DJF 2010. Grey areas are where there is no temperature data.
Now, what’s wrong with this picture?
The oddity about the picture is that we are given temperature data where none exists. We have very little temperature data for the Arctic Ocean, for example. Yet the GISS map shows radical heating in the Arctic Ocean. How do they do that?
The procedure is laid out in a 1987 paper by Hansen and Lebedeff. In that paper, they note that annual temperature changes are well correlated over large distances, out to 1200 kilometres (~750 miles).
(“Correlation” is a mathematical measure of the similarity of two datasets. Its value ranges from zero, meaning not similar at all, to plus or minus one, indicating totally similar. A negative value means the datasets are related, but when one goes up the other goes down.)
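As a concrete illustration (mine, not from the original post), here is a minimal Python sketch of computing a correlation with numpy:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])   # roughly 2*x, so the two move together

r = np.corrcoef(x, y)[0, 1]   # Pearson correlation, between -1 and +1
print(round(r, 3))            # close to +1
```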
Based on Hansen and Lebedeff’s finding of a good correlation (+0.5 or greater) out to 1200 km from a given temperature station, GISS shows us the presumed temperature trends within 1200 km of the coastline stations and 1200 km of the island stations. Areas outside of this are shown in gray. This 1200 km radius allows them to show the “temperature trend” of the entire Arctic Ocean, as shown in Figure 1. This gets around the problem of the very poor coverage in the Arctic Ocean. Here is a small part of the problem, the coverage of the section of the Arctic Ocean north of 80° North:
Figure 2. Temperature stations around 80° north. Circles around the stations are 250 km (~ 150 miles) in diameter. Note that the circle at 80°N is about 1200 km in radius, the size out to which Hansen says we can extrapolate temperature trends.
Can we really assume that a single station is representative of such a large area? Look at Fig. 1: despite the lack of data, trends are given for all of the Arctic Ocean. Here is a bigger view, showing the entire Arctic Ocean.
Figure 3. Temperature stations around the Arctic Ocean. Circles around the stations are 250 km (~ 150 miles) in diameter. Note that the area north of 80°N (yellow circle) is about three times the land area of the state of Alaska.
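To make the 1200 km coverage rule concrete, here is a minimal sketch of how such a mask might be computed (my own illustration with made-up station coordinates; this is not GISS’s actual code):

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))

# Hypothetical station locations (lat, lon); not the real station list
stations = [(80.6, 58.0), (78.9, 11.9), (71.3, -156.8)]

def covered(lat, lon, radius_km=1200.0):
    """True if a grid point lies within radius_km of any station."""
    return any(haversine_km(lat, lon, slat, slon) <= radius_km
               for slat, slon in stations)

print(covered(85.0, 0.0))   # a point deep in the Arctic Ocean
```

With a handful of coastal and island stations, nearly every point in the Arctic Ocean passes this test, which is how the grey areas get filled in.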
What Drs. Hansen and Lebedeff didn’t notice in 1987, and no one seems to have noticed since then, is that there is a big problem with their finding about the correlation of widely separated stations. This is shown by the following graph:
Figure 4. Five pseudo temperature records. Note the differences in the shapes of the records, and the differences in the trends of the records.
Curiously, these pseudo temperature records, despite their obvious differences, are all very similar in one way: correlation. The correlation between each pseudo temperature record and every other pseudo temperature record is above 90%.
Figure 5. Correlation between the pseudo temperature datasets shown in Fig. 4
The inescapable conclusion from this is that high correlations between datasets do not mean that their trends are similar.
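Here is a minimal Python sketch (mine, not from the post) of how to build series like the pseudo records above: shared year-to-year wiggles keep the pairwise correlations high, while the added linear trends differ widely.

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.arange(20)
wiggles = np.cumsum(rng.normal(0, 1, 20))   # shared interannual variability

# Five pseudo records: identical wiggles, very different linear trends
records = [wiggles + slope * t for slope in (0.0, 0.05, 0.10, 0.15, 0.20)]

for i in range(len(records)):
    for j in range(i + 1, len(records)):
        r = np.corrcoef(records[i], records[j])[0, 1]
        print(f"records {i} vs {j}: r = {r:+.2f}")
```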
OK, I can hear you thinking, “Yeah, right, for some imaginary short 20-year pseudo temperature datasets you can find some wild data with different trends. But what about real 50-year temperature datasets like Hansen and Lebedeff used?”
Glad you asked … here are nineteen 50-year temperature datasets from Alaska. All of them have a correlation with Anchorage greater than 0.5 (max 0.94, min 0.51, avg 0.75). All are within about 500 miles of Anchorage. Figure 6 shows their trends:
Figure 6. Temperature trends of Alaskan stations. Photo is of Pioneer Park, Fairbanks.
As you can see, the trends range from about one degree in fifty years to nearly three degrees in fifty years. Despite this huge range in trends (the largest is roughly three times the smallest), all of them have a good correlation (greater than +0.5) with Anchorage. This clearly shows that good correlation between temperature datasets means nothing about their corresponding trends.
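For readers who want to repeat the Alaska check, this is the kind of calculation involved (a sketch only; the CSV name and layout are hypothetical, and the post’s data came from GISS station records):

```python
import numpy as np
import pandas as pd

# Hypothetical layout: one column of annual mean temperatures per station
df = pd.read_csv("alaska_annual_means.csv")   # hypothetical file
anchor = df["Anchorage"].to_numpy()
years = np.arange(len(anchor))

for name in df.columns:
    s = df[name].to_numpy()
    r = np.corrcoef(anchor, s)[0, 1]          # correlation with Anchorage
    trend = np.polyfit(years, s, 1)[0] * 50   # degrees per 50 years
    print(f"{name:20s} r = {r:+.2f}   trend = {trend:+.2f} C/50yr")
```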
Finally, as far as I know, this extrapolation procedure is unique to James Hansen and GISTEMP. It is not used by the other creators of global or regional datasets, such as CRU, NCDC, or USHCN. As Kevin Trenberth stated in the CRU emails regarding the discrepancy between GISTEMP and the other datasets (emphasis mine):
My understanding is that the biggest source of this discrepancy [between global temperature datasets] is the way the Arctic is analyzed. We know that the sea ice was at record low values, 22% lower than the previous low in 2005. Some sea temperatures and air temperatures were as much as 7C above normal. But most places there is no conventional data. In NASA [GISTEMP] they extrapolate and build in the high temperatures in the Arctic. In the other records they do not. They use only the data available and the rest is missing.
No data available? No problem, just build in some high temperatures …
Conclusion?
Hansen and Lebedeff were correct that the annual temperature datasets of widely separated temperature stations tend to be well correlated. However, they were incorrect in thinking that this applies to the trends of the well correlated temperature datasets. The trends may not be similar at all. As a result, extrapolating trends out to 1200 km from a given temperature station is an invalid procedure with no mathematical foundation.
[Update 1] Fred N. pointed out below that GISS shows a polar view of the same data. Note the claimed coverage of the entirety of the Arctic Ocean. Thanks.
[Update 2] JAE pointed out below that Figure 1 did not show trends, but anomalies. boballab pointed me to the map of the actual trends. My thanks to both. Here’s the relevant map:
[Map of GISS temperature trends, 1200 km smoothing]
Willis
The change in anomaly from one time to another is independent of the choice of the base period used to calculate the normal. However, the actual value of the anomaly at any given time is highly sensitive to the choice of the base period, and it is the anomaly (relative to the 1951-1980 base period) that is being shown by Hansen, is it not?
Do you know why the whole of a given temperature record is not averaged to give the normal from which anomalies are calculated? After all, Mann was able to get the hockey stick partly by using a base period shorter than the record, which strikes me as analogous.
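A minimal numeric sketch of the first point above, using made-up numbers: the change in anomaly between two times does not depend on the base period, even though the anomaly values themselves do.

```python
import numpy as np

temps = np.array([10.0, 10.4, 10.1, 10.9, 11.2, 11.5])  # made-up annual means

anom_a = temps - temps[:3].mean()   # anomalies vs. a short early base period
anom_b = temps - temps.mean()       # anomalies vs. the whole record

# The year-to-year *changes* in anomaly are identical either way
print(np.allclose(np.diff(anom_a), np.diff(anom_b)))    # True
```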
DMI calculates an area-unweighted temperature on a 5-degree grid using ECMWF data. So there are more measurement points per unit area as you move towards the pole, and thus their temperature is not a true 80N-90N mean temperature. However, for example, the DMI January 2010 anomaly is about 5.7 degrees C colder than the GISS anomaly when compared to the 1958-2002 base period. February 2010 also differs by several degrees in the same direction.
Different weighting methods should not lead to such big differences. It is quite certain that GISS is too warm in 80N-90N. According to those responsible for the DMI data, they will convert their data to a true mean temperature in the near future. Then a real comparison with GISS can be made (and the result is fairly obvious).
When comparing to HadCRUT, GISS has much warmer global anomalies in January and February 2010 when the same base period is used. This difference comes from the grey areas, which are not included in HadCRUT.
I’ve never thought the anomaly was unimportant, or the enemy. I just don’t like the idea of being told to not pay attention to the real temperatures behind the anomaly curtain.
anna v (03:53:32), thanks for your perseverance:
I swear I can’t make that site do a !@#$% thing … every time it just downloads a file that says:
I tried it with both Safari and Firefox, I get nothing.
When I go to the page in the message, I get nothing, 404, missing page.
The site also says that the files are in an ftp folder at ftp://isccp.giss.nasa.gov/pub/data/surface. Of course, being NASA, when you go there and click on the “surface” folder, they’ve misspelled it so it gives a “file not found”. Then when you figure that out and can actually get to the individual data files, they are misspelled as well. And when you hack all the way through that, they are in IEEE binary format. My tax dollars at work. What else can I try?
OK, new plan. I just downloaded the Google Chrome browser, and I got that to work. I downloaded and averaged the equal-area dataset for SAT and SST (surface air and skin temperatures). That gave me an average air temperature of 288.4K, and a skin temperature of 287.9K. This is a difference of half a degree. So it looks like my intuition was about right.
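A minimal sketch of the kind of averaging described, assuming the file is a flat array of big-endian 32-bit IEEE floats (the filename and fill-value convention here are assumptions, not the documented ISCCP layout):

```python
import numpy as np

sat = np.fromfile("sat_equal_area.bin", dtype=">f4")  # hypothetical filename
sat = sat[sat > 0]        # drop fill values, assuming fills are non-positive
print(sat.mean())         # K; an unweighted mean is fine for an equal-area grid
```

Because the grid cells are equal-area, a simple unweighted mean needs no latitude weighting.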
Finally, they say that they are giving the “Surface Air Temperature” and the “Surface Skin Temperature”. In their definition of the variables, they say:
OK, fair enough. And how about for Surface Air Temperature? … well … for that they have no definition at all. None. They give definitions for 18 different variables, but not for that one. So what are they using?
I ask because this is a page giving satellite derived values for different variables, and I know of no satellite product that can give us the surface air temperature. The closest that you can get is the MSU lower troposphere product. So I haven’t a clue what those numbers they quote represent. If you do, please let me know.
Let me say, however, that I agree entirely with your closing statement that “talking of radiation budgets with values of 1 and 2 and 4 watts/m^2 is futile”, not because of not using the correct temperatures in the study as you say (although that is a factor), but because we don’t know any of the values to that level of accuracy.
This is why in my other thread about climate sensitivity (which is where this discussion really belongs), I am using what I clearly identify as estimates, such as “about 150 W/m2” and “~ 20°C”. We don’t know any of these numbers to any great accuracy.
We now return you to your regularly scheduled programming, featuring extrapolating temperatures 1200 km into the unknown …
w.
bradley13 (02:00:15)
As someone pointed out upstream, this has been done. Let me see if I can find it … OK, it’s here.
Correction to earlier reply: DMI uses a 0.5-degree grid, NOT a 5-degree grid.
Leone (10:05:17), thanks for the information.
Hmmmm … I couldn’t find where the DMI stations are located. From what you say, it sounds like they are using reanalysis output rather than station data. Is this the case? Reanalysis uses a computer model to fill in the blank spots rather than the 1200 km extrapolation used by GISS. However … it’s still not data; it is the output of a computer model.
Again, subject to ??? if they are using reanalysis data.
True. Even with GISS we can see the difference by looking at the 250 km smoothing versus the 1200 km smoothing. Here’s that comparison, showing the trends by latitude:
[Chart: temperature trends by latitude, 250 km vs 1200 km smoothing]
Amino Acids in Meteorites (20:31:11) :
It was only your line of comment on cigarette smoke that was snipped. Anyone could see that.
Your comments related to this thread are still there. Even though you do not agree with the writer of this thread, you were not deleted for it.
———-
You are missing two completely deleted comments:
Willis Eschenbach (17:18:35) :
Anu (20:46:33) :
These comments followed directly from the line, which is still retained in this guest post:
The Director of GISS is Dr. James Hansen. Dr. Hansen is an impartial scientist who thinks people who don’t believe in his apocalyptic visions of the future should be put on trial for “high crimes against humanity”.
I pointed out that Dr. Hansen was referring only to the CEOs of large fossil fuel companies, and the comments followed from that and from the comparison to the lawsuit against the large tobacco companies.
Public relations and the public perception of science is a topic interesting to me, but I can see why it might be considered “off topic”.
Enough said.
Yes, I believe Willis would not delete a comment of mine just because we disagree. We have disagreed in the past, but his anger makes him argue better, not dirtier.
———-
This would not be the case at RealClimate. Comments that are not in agreement with the writers there are customarily deleted even though they are on topic.
It would be hard to know what has been deleted, unless it was your comment that was deleted, or you happen to see it before the Moderator gets to it (which I’ve seen on The Guardian, for instance).
I think almost all opinions should be allowed on the Web – it’s not like it’s a waste of paper, such as Letters to the Editor in a newspaper. Obvious distractions, like Viagra ads or psychotic rants, should be deleted, but there are other ways to handle “unpopular” comments, such as show/hide on YouTube.
If RealClimate does as you say (I’ve only seen a few articles there, and never commented), I’m against the practice. I would treat them as I do all sites:
first Moderator deletion – 1 day ban from commenting
second deletion – 1 week ban
third deletion – 1 month ban
fourth deletion – 1 year ban (effectively, lifetime ban, since I’ve never gone back)
Anu (15:54:29) :
Anthony snipped both yours and mine because it was heading for cigarette wars, and thus way off-topic. Fair enough, I have no complaint.
Thanks, Anu. I view the snipping of any on-topic scientific comment as a high crime, and never do it no matter how much I disagree with it. Neither, as far as I know, does Anthony.
Realclimate is famous for censoring scientific questions and statements that don’t agree with the party line. Take a look here. I wrote a peer-reviewed article on the subject available here.
FTA: ” In that paper, they note that annual temperature changes are well correlated over a large distance, out to 1200 kilometres (~750 miles).”
Which temperatures? Temperatures at the equator? Temperatures in Asia? If they haven’t compared temperature records over a lengthy period at the Arctic, how can they know what correlations exist there? The poles are precisely where you would expect markedly different behavior than anywhere else.
Another questionable tactic in all these plots is the range of colors used. A lay person looking at these pictures would naturally think that red is a lot different from blue. In reality, the whole map should be differing shades of blue.
Re: Willis Eschenbach (Mar 27 12:50),
Thanks for doing the average, and thanks for digging out the definition of “Surface Skin Temperature”:
This parameter represents the solid surface physical temperature. As part of the ISCCP cloud analysis, the clear-sky infrared (wavelength of about 11 microns) brightness temperature is estimated at intervals of 30 km and 3 hr. The surface skin temperature is retrieved from these values by correcting for atmospheric emission and absorption of radiation, using data for the atmospheric temperature and humidity profiles, and for the fact that the surface infrared emissivity is less than one. The values of surface emissivity used are shown in the Narrowband Infrared Emissivity dataset. Because these values are determined under clear conditions, they will over-estimate the average daytime – summertime maximum temperatures and underestimate the average nighttime – wintertime minimum temperatures.
Seems to me that air current motions are not accounted for, so they are not really getting the surface temperature. Another $%^& computer program :(.
For example, if you look at the Arctic temperatures at http://ocean.dmi.dk/arctic/meant80n.uk.php, you see huge variations that at this season can only be air motions. Estimating the surface temperature from these air temperatures (brightness at kilometer heights) cannot help but be off. Another example: summer in Greece, where you cannot walk barefoot on the rock, yet the air, which comes on seasonal winds from Siberia that cool it, is 36C.
But you are right, this belongs to the sensitivity thread which is too many pages removed :(.
Bart (18:43:43)
Read the Hansen paper I cited in the article. They show the correlations by latitude band. Arctic temperatures are correlated greater than 0.5 out to 1200 km … which means nothing about the trends.
Willis Eschenbach (23:00:06) :
“…which means nothing about the trends.”
I would say “means little which could be useful in projecting trends…,” (in my profession, a 0.5 correlation coefficient would be interpreted as “not very well correlated”), if it were true in the first place, but I have significant doubts about even that.
As usual, key information is absent. But, it appears to me that the higher latitude stations are likely located along similar latitude lines. I would fully expect that correlations would be less sensitive to longitude than they are to latitude, and that relatively high correlation among readings from neighboring stations at similar latitude would hardly be surprising. But, that does not imply that you could just draw a ring around each station and say everything in that ring has the same correlation as readings in the lateral direction. I would expect that, at the very least, contours of constant correlation would be elliptical.
It’s kind of like, when they came out and said “we are 90% certain that global warming is blah, blah, blah.” Even a high school kid knows that, when you make up a number, you have to add some decimal places to make the grader think you at least had some basis for it, that some actual calculations were involved. If they had said “we are 91.7% certain,” it would have had a bigger impact. Here, they should have used ellipses.
Indeed, Figure 3 of the Hansen paper shows several points at 1200 km which have very low and even negative correlation. I suspect those are readings between stations with significantly different latitude. And, needless to say, the extrapolation over the pole involves varying the latitude quite significantly.
I’m not any kind of expert on atmospheric physics, but I would imagine that mixing of the atmosphere would be rapidly changing as you get closer and closer to the pole. I do not think it likely that points located 11 deg or more from the pole would be representative of what is happening there.
Willis Eschenbach: “Hmmmm … I couldn’t find where DMI stations are located. From what you say, it sounds like they are using reanalysis output rather than station data. Is this the case?”
ECMWF is a forecasting model. I don’t know exactly what kind of data is used as input, but I suppose that every kind of data that can be obtained is used. Thus the DMI data product can be regarded as the best knowledge that we have about 80N-90N temperatures.
Comparing the winter months from 2001 onward with GISS, several cases of similar divergence can be found. The probability of divergence increases with the years, and the direction is always the same: GISS shows warmer. I hope that DMI is able to publish true mean temperatures as soon as possible. Maybe they will hurry more if many people request it…
GISS’s grey-area handling is truly worth a closer look, because much of the warming is calculated from there. If DMI shows different results, the whole GISS grey-area handling can be questioned.
Willis,
You appear to be unwilling to test whether the extrapolation to 1200 km is skilful. If extrapolations to 1200 km lack skill, despite the strength of the correlation between climate stations this far apart, then you have the basis for a manuscript criticising GISTEMP (and you would be a hero to most of the readers here). On the other hand, if the extrapolations are skilful, then you owe James Hansen and coworkers an apology. Is it the risk of this latter eventuality that prevents you from trying to prove your argument?
This isn’t a complicated analysis to run, it doesn’t require a mainframe, and the data is all public domain. What’s stopping you?
100 lines of my R code says you owe Hansen an apology. Extrapolations from stations >60N are skilful (median station has positive RE statistic) at 1200km (indeed out to over 2000km – which was as far as I tested).
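For readers unfamiliar with the RE (reduction of error) statistic, one common definition looks like this (a sketch only; this is not Telford’s actual R code, which is not shown):

```python
import numpy as np

def reduction_of_error(obs, pred):
    """RE skill score: positive means pred beats simply
    using the mean of the observations as the prediction."""
    obs, pred = np.asarray(obs), np.asarray(pred)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
```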
Richard Telford (16:00:17)
I tested the relationship between correlation and trends in the head post. As you can see, despite the fact that the correlations are in the range Hansen used (greater than 0.5), the Alaskan trends are all over the board.
So it’s not clear what you are talking about. How is that not a test of whether the extrapolation is skillful?
Richard, you are not following the story. I never said that the correlations are not large out to long distances. They are. I said that the correlation means nothing about the trend. See how well your trends are correlated, and report back to us.
Because if, as you say, “extrapolations … are skillful … out to over 2000 km”, why screw around? We could take one station in the Arctic and cover the entire Arctic Ocean; think of the money we’ll save closing the others. Waste of time, really …
PS – I love people who say I shouldn’t study this, I should study that, or I shouldn’t run this analysis, I should run that analysis. Truth is, I analyze what is of interest to me. I stop when I have found out what is happening.
Having looked at theoretical “pseudo-temperatures”, and having seen how extremely poorly the good correlation of the Alaska stations is reflected in their trends, that’s all I care to do. I picked those stations at random because they had long records, no cherry picking, didn’t throw any stations out. If correlated stations in Alaska do that badly, I’m not interested in wasting my time on a detailed 1,200 station analysis. This is particularly true when I have shown five pseudo temperature records that are all correlated more than 0.90, yet have hugely different trends.
So Richard, you are more than welcome to run the analysis as far as you wish. I have shown that both in theory and in practice, correlation and trends don’t have much to do with each other. That’s enough for me. If you find a reliable way to use one temperature station to reliably predict the trend of some other station 2,000 kms away, let us know how. Until then, I’m happy with the amount I’ve done.
Finally, if you want to get some traction, post your code and let some people play with it. From your description I can’t tell what you’ve done … what, for example, is a “median station”?
Thanks,
w.
Leone (01:54:20)
If that’s the best we have, we’re in big trouble. Take a look at the last day temperature for one year and the first day temperature of the following year. For example, the year 1999 ends at about 246K. The next year, 2000, starts up at 262K, a full 16K greater.
How unusual is this? The standard deviation of the day-to-day changes over the last 15 years of the record is 1.0K. The biggest daily change in that time (ignoring year-end changes) is 5.9K.
From the last day of one year to the first day of the next, on the other hand, the standard deviation of the one-day change is 6K, more than the biggest change in the rest of the data. And six of the fifteen years have a last-day-to-first-day change of more than 5K.
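The check described above can be scripted in a few lines (a sketch; the filename and column names are hypothetical):

```python
import numpy as np
import pandas as pd

df = pd.read_csv("dmi_daily_80N.csv", parse_dates=["date"])  # hypothetical file
temps = df["temp_K"].to_numpy()
diffs = np.diff(temps)

# Separate ordinary day-to-day steps from year-boundary steps (into Jan 1)
boundary = (df["date"].dt.dayofyear.to_numpy()[1:] == 1)
print("within-year std of daily change:", diffs[~boundary].std())
print("year-boundary std of daily change:", diffs[boundary].std())
```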
This is all too typical in the climate model sphere. They come up with some brilliant program, and run it … but then they don’t error check it in any adequate fashion, and we end up with garbage. So at this point, we can’t trust the DMI data at all. There’s something seriously wrong at the changeover of the years, and we don’t know how far it goes. Might be trivial and insignificant, might be big, we don’t know …
Hi Willis,
Great analysis.
The correlation between data x_i and y_i is maximal, 1, precisely when the deviation of each x_i from its mean is proportional, by the same positive constant, to the deviation of the corresponding y_i from its mean.
This means that you can take any data x_i and turn it into y_i by:
- adding the same quantity to each x_i
- multiplying each x_i by the same positive amount
and the correlation will still be 1.
Take e.g. 1, 2, 3, 4.
Add -10 to each to get -9, -8, -7, -6.
Multiply each by 100 to get -900, -800, -700, -600.
The correlation of the latter with 1, 2, 3, 4 is still 1.
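The same check in a couple of lines of Python (numpy assumed):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = (x - 10.0) * 100.0            # shift by -10, then scale by +100

print(np.corrcoef(x, y)[0, 1])    # 1.0: the correlation is unchanged
```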
That leaves A LOT of leeway for making up functions with perfect correlation…
That’s what was exploited by Hansen: it’s voodoo math – or rather good math used for voodoo purposes.
And a psychological point: Hansen’s aim isn’t only skewing the trend. The scary red-colored map is a goal in itself…