GISScapades

Guest post by Willis Eschenbach

Inspired by this thread on the lack of data in the Arctic Ocean, I looked into how GISS creates data when there is no data.

GISS is the Goddard Institute for Space Studies, a part of NASA. The Director of GISS is Dr. James Hansen. Dr. Hansen is an impartial scientist who thinks people who don’t believe in his apocalyptic visions of the future should be put on trial for “high crimes against humanity”. GISS produces a surface temperature record called GISTEMP. Here is their record of the temperature anomaly for Dec-Jan-Feb 2010:

Figure 1. GISS temperature anomalies DJF 2010. Grey areas are where there is no temperature data.

Now, what’s wrong with this picture?

The oddity about the picture is that we are given temperature data where none exists. We have very little temperature data for the Arctic Ocean, for example. Yet the GISS map shows radical heating in the Arctic Ocean. How do they do that?

The procedure is the one laid out in a 1987 paper by Hansen and Lebedeff. In that paper, they note that annual temperature changes are well correlated over large distances, out to 1200 kilometres (~750 miles).

(“Correlation” is a mathematical measure of the similarity of two datasets. Its value ranges from zero, meaning no linear relationship at all, to plus or minus one, indicating a perfect linear relationship. A negative value means the datasets are similar, but when one goes up the other goes down.)
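As a minimal illustration, here is how the correlation of two short, made-up anomaly series would be computed (the numbers are invented for the example):

```python
import numpy as np

# Two made-up annual temperature anomaly series (degrees C)
a = np.array([0.1, 0.3, 0.2, 0.5, 0.4, 0.6])
b = np.array([0.0, 0.2, 0.3, 0.4, 0.5, 0.5])

# Pearson correlation: covariance divided by the product of the standard deviations
r = np.corrcoef(a, b)[0, 1]
print(f"correlation: {r:.2f}")   # both series rise together, so r is close to +1
```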

Based on Hansen and Lebedeff’s finding of a good correlation (+0.5 or greater) out to 1200 km from a given temperature station, GISS shows us the presumed temperature trends within 1200 km of the coastline stations and 1200 km of the island stations. Areas outside of this are shown in gray. This 1200 km radius allows them to show the “temperature trend” of the entire Arctic Ocean, as shown in Figure 1. This gets around the problem of the very poor coverage of the Arctic Ocean. Here is a small part of the problem, the coverage of the section of the Arctic Ocean north of 80° North:
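For the curious, the 1987 paper combines the anomalies of nearby stations using a weight that falls off linearly from 1 at the station to 0 at 1200 km. A minimal sketch of that weighting, with hypothetical distances and anomalies (the real GISTEMP code also handles reference periods and the combination of overlapping station records):

```python
def weight(distance_km, limit_km=1200.0):
    """Hansen-Lebedeff style weight: 1 at the station, falling to 0 at the limit."""
    return max(0.0, 1.0 - distance_km / limit_km)

def estimate_anomaly(stations):
    """Distance-weighted average of (distance_km, anomaly) pairs."""
    num = sum(weight(d) * a for d, a in stations)
    den = sum(weight(d) for d, a in stations)
    return num / den if den > 0 else None

# Hypothetical Arctic grid point with three stations within 1200 km
print(estimate_anomaly([(200, 1.5), (800, 0.9), (1100, 2.0)]))
```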

Figure 2. Temperature stations around 80° north. Circles around the stations are 250 km (~ 150 miles) in diameter. Note that the circle at 80°N is about 1200 km in radius, the size out to which Hansen says we can extrapolate temperature trends.

Can we really assume that a single station is representative of such a large area? Look at Fig. 1: despite the lack of data, trends are given for all of the Arctic Ocean. Here is a bigger view, showing the entire Arctic Ocean.

Figure 3. Temperature stations around the Arctic Ocean. Circles around the stations are 250 km (~ 150 miles) in diameter. Note that the area north of 80°N (yellow circle) is about three times the land area of the state of Alaska.

What Drs. Hansen and Lebedeff didn’t notice in 1987, and no one seems to have noticed since then, is that there is a big problem with their finding about the correlation of widely separated stations. This is shown by the following graph:

Figure 4. Five pseudo temperature records. Note the differences in the shapes of the records, and the differences in the trends of the records.

Curiously, these pseudo temperature records, despite their obvious differences, are all very similar in one way — correlation. The correlation between each pseudo temperature record and every other pseudo temperature record is above 0.90.

Figure 5. Correlation between the pseudo temperature datasets shown in Fig. 4

The inescapable conclusion from this is that high correlations between datasets do not mean that their trends are similar.
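The effect is easy to reproduce: add different linear trends to copies of one shared “weather” wiggle. The copies remain well correlated with one another even though their 50-year trends differ by a factor of six (a synthetic sketch, not the actual pseudo records of Fig. 4):

```python
import numpy as np

years = np.arange(50)
wiggle = np.cos(np.linspace(0, 6 * np.pi, 50))   # shared "weather" signal

# Copies of the same wiggle with different total trends (degrees C per 50 years)
series = {slope: wiggle + slope * years / 49 for slope in (0.5, 1.5, 3.0)}

for slope, s in series.items():
    r = np.corrcoef(series[0.5], s)[0, 1]
    print(f"trend {slope} degC/50yr, correlation with first series: {r:.2f}")
```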

OK, I can hear you thinking, “Yeah, right, for some short imaginary 20-year pseudo temperature datasets you can find some wild data that will have different trends. But what about real 50-year-long temperature datasets like Hansen and Lebedeff used?”

Glad you asked … here are nineteen 50-year-long temperature datasets from Alaska. All of them have a correlation with Anchorage greater than 0.5 (max 0.94, min 0.51, avg 0.75). All are within about 500 miles of Anchorage. Figure 6 shows their trends:

Figure 6. Temperature trends of Alaskan stations. Photo is of Pioneer Park, Fairbanks.

As you can see, the trends range from about one degree in fifty years to nearly three degrees in fifty years. Despite this huge ~300% range in trends, all of them have a good correlation (greater than +0.5) with Anchorage. This clearly shows that good correlation between temperature datasets tells us nothing about their corresponding trends.

Finally, as far as I know, this extrapolation procedure is unique to James Hansen and GISTEMP. It is not used by the other creators of global or regional datasets, such as CRU, NCDC, or USHCN. As Kevin Trenberth stated in the CRU emails regarding the discrepancy between GISTEMP and the other datasets (emphasis mine):

My understanding is that the biggest source of this discrepancy [between global temperature datasets] is the way the Arctic is analyzed. We know that the sea ice was at record low values, 22% lower than the previous low in 2005. Some sea temperatures and air temperatures were as much as 7C above normal. But most places there is no conventional data. In NASA [GISTEMP] they extrapolate and build in the high temperatures in the Arctic. In the other records they do not. They use only the data available and the rest is missing.

No data available? No problem, just build in some high temperatures …

Conclusion?

Hansen and Lebedeff were correct that the annual temperature datasets of widely separated temperature stations tend to be well correlated. However, they were incorrect in thinking that this applies to the trends of the well correlated temperature datasets. Their trends may not be similar at all. As a result, extrapolating trends out to 1200 km from a given temperature station is an invalid procedure which does not have any mathematical foundation.

[Update 1] Fred N. pointed out below that GISS shows a polar view of the same data. Note the claimed coverage of the entirety of the Arctic Ocean. Thanks.

[Update 2] JAE pointed out below that Figure 1 did not show trends, but anomalies. boballab pointed me to the map of the actual trends. My thanks to both. Here’s the relevant map:

218 Comments
David
March 25, 2010 3:42 pm

Fascinating as ever Willis. If this is NASA’s idea of accurate data, perhaps it is just as well that they cancelled the moon landing programme.

Paul
March 25, 2010 3:44 pm

How did GISS measure temperature in 1951–1980, when there weren’t any weather stations at the North Pole, and we didn’t have satellites?

jaypan
March 25, 2010 3:52 pm

Good stuff.
To bring it forward, I strongly agree with vboring.
What satellite data are available and what are they saying?

David Alan Evans
March 25, 2010 3:54 pm

Temperature alone is a stupid metric anyway.
BTW, by the GISTemp method, Aberdeen can influence the northern Med and central Sweden.
DaveE.

Richard Telford
March 25, 2010 3:54 pm

This appears to be a case of “Willis doesn’t believe it, therefore it’s not true”. Hardly an adequate basis for evaluating the method. The sort of procedure used by GIStemp, using the correlation structure in the data to fill in the gaps, is not dissimilar to the geostatistical tools used by mining companies to estimate how much reserves there are from scattered data. Rather than dreaming up examples where you don’t think (but don’t bother testing) the method will work, there are several ways you could test the method. I know this would run the risk of finding out that Hansen had done something correct, but it would raise this post above the level of argument from personal incredulity. For example, you could try cross-validating the data – omit a site and test how well its temperature anomaly can be reconstructed from the neighbouring sites using the GIStemp procedure. If the reconstructions have little skill, then you have a post worth writing.

pwl
March 25, 2010 4:01 pm

Hansen’s approach to science: “No data available? No problem, just build in some high temperatures …”
In high school science classes we learned how important REAL DATA is in science. Hansen would fail those classes had he suggested doing what he has published in papers: the fabrication of data.

Charles Wadsack
March 25, 2010 4:04 pm

What exactly does the DMI Polar Temperature site measure? They’ve got more than 50 years of data. To my very novice eye, it appears that 2010 to date is average. Can anyone explain why it ended 2009 at 245 K and began 2010 at about 252 K?

JAE
March 25, 2010 4:06 pm

?? I don’t get it. Fig. 1 shows the anomaly, not trends. Isn’t the problem simply that the 1200 km “weighting” is not representative?

March 25, 2010 4:11 pm

So basically they are just guessing, and it is not at all a surprise that they always guess up!!!

JAE
March 25, 2010 4:13 pm

Richard Telford (15:54:25) :
“This appears to be a case of “Willis doesn’t believe it, therefore its not true”. Hardly an adaquate basis for evaluating the method. ”
Here’s another possible basis for evaluating the method (at least showing that something is wrong): The Arctic sea ice continues to increase in area (see previous post), which seems to me to cast some serious doubt on all those bright red anomalies up there.

François GM
March 25, 2010 4:14 pm

Again, this shows that the notion of a global temperature is not credible.
I think we should look at the average of trends of all stations for which we have reliable (unadjusted or UHI-adjusted) data over time, rather than look at the trend of a global temperature for which we have much inconsistent, adjusted or interpolated data over time.

rbateman
March 25, 2010 4:16 pm

paul (15:14:10) :
You’ve noticed the glaring errors in Hansen’s GISS anomaly maps too.
He must have something in his code that runs hot anomalies over what should be natural gradations.
Just another big fat error with GISS.

NickB.
March 25, 2010 4:17 pm

Doug Badgero,
That’s a helluva point… Do they homogenize the temps they extrapolate across the entire arctic?

March 25, 2010 4:24 pm

Anyone who has done some serious data analysis by regression methods in the industrial world will understand the problem posed by blind reliance on values of RSqd as an indicator of the practical value or worth of a correlation. RSqd is a measure of /linear/ correlation. If you use it to judge any aspect of the relationship between two variables, you are implicitly accepting that this relationship is fundamentally linear. Unless you display the individual data pairs on a plot, together with the fitted line – presumably computed by least squares – and also the confidence intervals for both the line and for any future individual observation, at an acceptable probability level, you will have no idea at all of the practical worth of the relationship.
This cannot be stated often enough. Enlightenment may come only by working through some numbers and producing appropriate graphical displays. I urge anyone who intends to comment on statistical correlation to take the trouble to go through the mechanics (i.e. arithmetic) of computing a linear correlation coefficient (and its square), and to study the prediction capability of the correlation.
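A toy illustration of that point: a relationship can be perfectly deterministic and yet have a linear correlation coefficient of essentially zero, which is why looking at the plot matters:

```python
import numpy as np

x = np.linspace(-1, 1, 11)
y = x ** 2                    # y is completely determined by x...
r = np.corrcoef(x, y)[0, 1]
print(f"r = {r:.2f}")         # ...yet the linear correlation is essentially zero
```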

Anu
March 25, 2010 4:32 pm

Dr. Hansen is an impartial scientist who thinks people who don’t believe in his apocalyptic visions of the future should be put on trial for “high crimes against humanity”
He doesn’t care what you think, just the CEOs of large fossil fuel energy companies that are actively fighting the science. [snip]
Let me know when you confirm that CRU just fills in the missing data with the planetary average anomaly – clearly an inferior approach. Also, that 1987 paper I showed you also shows how they use multiple stations that are within 1200 km to get a weighted, best guesstimate. If there are six stations in the Arctic with temperature anomalies ranging from 0.1 °C to 0.15°C that month, a guesstimate of 0.125 °C for an area 1000 km away, with no direct measurements, is better than a 0.02 °C temperature anomaly which might be the planetary average that month.
No data available? No problem, just build in some high temperatures …
If all the closest stations had high temperature anomalies, that’s a better guesstimate than the average of the entire planet. See:
http://www.cru.uea.ac.uk/cru/data/temperature/
(14:50:54) :
Why not infill using the satellite data?

Exactly, that’s what GISS does for ocean data now. That 1987 paper was describing how they dealt with the temperature dataset starting in 1880 that had very sparse coverage of some parts of the planet for many decades.
GISTEMP uses NOAA data for ocean temperatures, see:
http://data.giss.nasa.gov/gistemp/sources/gistemp.html
where they explicitly mention using:
http://ftp.emc.ncep.noaa.gov cmb/sst/oimonth_v2 Reynolds 11/1981-present
Here is the background info on how they use NOAA satellite data for ocean surface temperatures using a complicated method called “optimum interpolation”, cross-checked with in situ measurements by ships and buoys, and how they calculate surface temperatures of ocean covered by sea ice: Happy reading.
http://www.emc.ncep.noaa.gov/research/cmb/sst_analysis/#_cch2_1007145286
http://www.ncdc.noaa.gov/oa/climate/research/sst/oi-daily.php
http://www.ncdc.noaa.gov/oa/climate/research/sst/papers/whats-new-v2.pdf
ftp://ftp.emc.ncep.noaa.gov/cmb/sst/papers/oiv2pap/oiv2.pdf
Satellites only cover up to 82.5 °N (given their orbital parameters), so there is still a small hole at the top of the world that doesn’t have much coverage, save the occasional Russian icebreaker in summer. Hmm, how should we interpolate this tiny patch of ocean?
How about ignore all the closest measurements in the high Arctic, and give it the planetary average?

ScottR
March 25, 2010 4:36 pm

It has always been a mystery to me why the Goddard Institute for Space Studies builds a “premier” temperature data set that eschews data from space satellites.
Instead, they use data taken from 4 feet off the asphalt, extrapolate it (i.e. fake it) thousands of miles away from any ground stations, and then massage it (i.e. fake it) so much that it doesn’t really matter what the original data was that they started from.
Then they use it to determine the fate of the world.
Can we at least agree that any organization with “Space Studies” in its name should not be responsible for a ground temperature data record? Where are James Hansen’s rockets anyway? Poor Robert Goddard must be spinning.
Maybe the good Dr. Hansen should change his vocation: “Professor Marvel, Acclaimed by the Crown Heads of Europe — Let Him Read Your Past, Present, and Future In His Crystal Ball — Also Juggling and Sleight of Hand”
Oh wait, that IS his vocation already.
(Apologies to the shade of Frank Morgan…)

Curiousgeorge
March 25, 2010 4:39 pm

Wanna know a secret? Governments don’t really give a rat’s fat hairy behind about CO2, AGW or the rest of that bs. If they did, they wouldn’t play games like this: http://www.cnn.com/2010/WORLD/europe/03/25/russia.uk.intercepts/index.html?hpt=C1
How much CO2 do two TU160’s, and two Tornado’s emit whilst chasing each other around the sky for a couple hours? And why is it that the USA seems to bear the brunt of criticism for AGW, etc. ? Let’s get real, and understand this AGW BS is an entertaining sideshow for public consumption, and bears no relation to what’s really going on. Same old song, same old dance.
“Britain’s Ministry of Defence released images it said were taken earlier this month of two Russian Tu-160 bombers — known as Blackjacks by NATO forces — as they entered UK airspace near the Outer Hebrides islands off Scotland’s northwest coast.
It said the March 10 incident, which resulted in crystal clear images of the planes against clear blue skies and a dramatic sunset, was one of many intercepts carried out by British Royal Air Force crews in just over 12 months.
“This is not an unusual incident, and many people may be surprised to know that our crews have successfully scrambled to intercept Russian aircraft on more than 20 occasions since the start of 2009,” Wing Cdr. Mark Gorringe, of the RAF’s 111 Squadron, said in a statement.
The RAF said two of its Tornado fighter jets from its base at Leuchars, on Scotland’s east coast, were dispatched to tail the Russian Blackjacks as they approached the western Isle of Lewis.”

March 25, 2010 4:43 pm

An experiment proposal — Question: why can’t an experiment be run with land-based stations, say choosing stations on a 750-mile circle, to show whether the claimed correlation holds? Wouldn’t that be something like doing real science, by putting forth a theory and then running an experiment to verify it? USA stations would seem ideal. Maybe even choose multiple experiments with multiple ‘ring’ choices to see if they match.
Wouldn’t the Arctic experience the same weather discrepancies that a “chosen ring” of normal land-based stations would?
Seems like a lot of the ground observation datasets exhibit a large amount of wishful thinking and little experimental science. And don’t we have huge computers which could do all this data computation/reduction in a flash, assuming you hire other than CRU type people to do the software. In fact high end PCs should give it a good run for the money in accomplishing the tasks.

subtlety.leads.to.confusion
March 25, 2010 4:44 pm

Great article Willis …
That arctic hotspot is quite impressive!
But wait a second, aren’t all the global temperature analyses done based on 5×5 grid cells?
And isn’t it true that the farther from the equator one goes, the smaller the physical area of each grid cell becomes?
And then, if you fill in some high temperature numbers in some high latitude cells, those numbers will be over-represented in the subsequent summation process?
Shouldn’t each grid cell be weighted by latitude?
Using the actual width of the middle of the cell versus the width at the equator would be a reasonable approximation.
Perhaps this is being done somewhere in the code, but I have never heard mention of it.
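For reference, gridded analyses generally do weight each cell by its area, which is approximately proportional to the cosine of the cell's central latitude (essentially the "width of the middle of the cell" approximation suggested above). A minimal sketch with invented cell anomalies:

```python
import math

def cell_weight(lat_deg):
    """Area weight of a grid cell centered at lat_deg, relative to the equator."""
    return math.cos(math.radians(lat_deg))

# Hypothetical (latitude, anomaly) pairs for three grid cells
cells = [(0, 0.2), (45, 0.4), (85, 3.0)]

total_w = sum(cell_weight(lat) for lat, _ in cells)
weighted = sum(cell_weight(lat) * anom for lat, anom in cells) / total_w
unweighted = sum(anom for _, anom in cells) / len(cells)
print(f"unweighted mean: {unweighted:.2f}, area-weighted mean: {weighted:.2f}")
```

Without the cosine weighting, the hot high-latitude cell dominates the average; with it, that cell counts for the small slice of the globe it actually covers.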