Rewriting History, Time and Time Again

A guest post by John Goetz

In February I wrote a post asking How much Estimation is too much Estimation? I pointed out that a large number of station records contained estimates for the annual average. Furthermore, the number of stations used to calculate the annual average had been dropping precipitously for the past 20 years. One was left to wonder just how accurate the reported global average really was and how meaningful rankings of the warmest years had become.

One question that popped into my mind back then was whether or not – with all of the estimation going on – the historical record was static. One could reasonably expect that the record is static. After all, once an estimate for a given year is calculated there is no reason to change it, correct? That would be true if your estimate did not rely on new data added to the record, in particular temperatures collected at a future date. But in the case of GISStemp, this is exactly what is done.

Last September I noted that an estimate of a seasonal or quarterly temperature when one month is missing from the record depends heavily on averages for all three months in that quarter. This can be expressed by the following equation, where {m}_{a}, {m}_{b}, {m}_{c} are the months in the quarter (in no particular order) and one of the three, {m}_{a}, is missing:

{T}_{q,n} = \frac{1}{3}{\overline{T}}_{{m}_{a},N} + \frac{1}{2}\left({T}_{{m}_{b},n} + {T}_{{m}_{c},n}\right) - \frac{1}{6}\left({\overline{T}}_{{m}_{b},N} + {\overline{T}}_{{m}_{c},N}\right)

In the above, T is temperature, q is the given quarter, n is the given year, and N is all years of the record.
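
One way to read the equation, offered purely as an algebraic rearrangement and not as a description of the GISS code: it is the plain three-month mean in which the missing month is replaced by its long-term average plus the mean anomaly of the other two months. The symbol \hat{T} for the filled-in month is introduced here only for illustration.

```latex
\hat{T}_{m_a,n} = \overline{T}_{m_a,N} + \frac{1}{2}\left[\left(T_{m_b,n} - \overline{T}_{m_b,N}\right) + \left(T_{m_c,n} - \overline{T}_{m_c,N}\right)\right]

T_{q,n} = \frac{1}{3}\left(\hat{T}_{m_a,n} + T_{m_b,n} + T_{m_c,n}\right)
```

Substituting the first line into the second and collecting terms gives the coefficients 1/3, 1/2 and 1/6 of the original equation.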

One can readily see that as new temperatures are added to the record, the average monthly temperatures will change. Because those average monthly temperatures change, the estimated quarterly temperatures will change, as will the estimated annual averages.
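
To make that concrete, here is a minimal Python sketch of the quarterly equation applied to a tiny station record. The station values, the function names, and the use of None for a missing month are all made up for illustration; this is not the GISStemp code, only the equation above applied literally.

```python
from statistics import mean

def estimate_quarter(t_b, t_c, mean_a, mean_b, mean_c):
    """Quarterly estimate when month a is missing, per the equation above."""
    return mean_a / 3 + (t_b + t_c) / 2 - (mean_b + mean_c) / 6

def monthly_means(record):
    """Long-term mean of each of the three months over all years with a value."""
    columns = zip(*record.values())
    return [mean(v for v in col if v is not None) for col in columns]

def estimate_for(record, year):
    """Estimate the quarter for `year`, whose first month is missing."""
    _, t_b, t_c = record[year]
    mean_a, mean_b, mean_c = monthly_means(record)
    return estimate_quarter(t_b, t_c, mean_a, mean_b, mean_c)

# Hypothetical temperatures (deg C) for months a, b, c; None marks a missing month.
record = {
    1990: (None, 2.0, 6.0),
    1991: (-1.0, 1.0, 5.0),
    1992: (-3.0, 3.0, 7.0),
}
before = estimate_for(record, 1990)

# Add one later year to the record and re-estimate the same 1990 quarter.
extended = dict(record)
extended[1993] = (2.0, 2.0, 6.0)
after = estimate_for(extended, 1990)

print(f"1990 estimate: {before:.2f} -> {after:.2f}")
```

Nothing measured in 1990 changed between the two calls; only the long-term mean for the missing month moved when 1993 was appended, and the 1990 estimate moved with it.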

Interestingly, application of the “bias method” used to combine a station’s scribal records can have a ripple effect all the way back to the beginning of a station’s history. This is because the first annual average in every scribal record is estimated, and the bias method relies on the overlap between all years of record, estimated or not. Recall that annual averages are calculated from December of the prior year through November of the current year. However, all scribal records begin in January (well, I have not found one that does not begin in January), so that first winter average is estimated due to the missing December value. Thus, with the bias method, at least one of the two records contains estimated annual values.

Of course, it is fair to ask whether or not this ultimately has any effect on the global annual averages reported by GISS. One does not have to look very hard to find out that the answer is “yes”.

On March 29 I downloaded the GLB.Ts.txt file from GISS and compared it to a copy I had from late August 2007. I was surprised to find several hundred differences in monthly temperature. Intrigued, I decided to take a trip back in time via the “Way Back Machine”.

Here I found 32 versions of GLB.Ts.txt going back to September 24, 2005. I was a bit disappointed the record did not go back further, but was later surprised at how many historical changes can occur in a brief 2 1/2 years.

The first thing I did was eliminate versions where no changes to the data were made. I then compared the number of monthly differences between the remaining sequential records and built the following table. Here I show the “Prior” record compared to the next sequential record (referred to as “Current”). The number of changes made to the monthly record between Prior and Current is shown in the “Updates” column (this column does not count additions to the record; only changes to existing data are counted). The number of valid months contained in the Prior record is in the “Months” column. “Change” is simply Updates expressed as a percentage of Months.

gbl_table.gif
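
The comparison behind each row of the table can be sketched in a few lines of Python. This is an illustration under assumptions, not the script actually used: it supposes each version of the file has already been parsed into a dictionary keyed by (year, month), with anomalies in hundredths of a degree and None for months with no valid value. Apart from the Aug 2006 values, the numbers below are made up.

```python
def count_updates(prior, current):
    """Compare two versions of a monthly record.

    prior, current: dicts mapping (year, month) -> anomaly, with None
    for months that have no valid value.
    Returns (updates, months, percent_change), where additions to the
    record are not counted as updates.
    """
    valid = {key: value for key, value in prior.items() if value is not None}
    updates = sum(
        1
        for key, value in valid.items()
        if current.get(key) is not None and current[key] != value
    )
    months = len(valid)
    change = 100.0 * updates / months if months else 0.0
    return updates, months, change

# Hypothetical excerpt of two versions (hundredths of a degree C).
prior = {(2006, 6): 52, (2006, 7): 45, (2006, 8): 43, (2006, 9): None}
current = {(2006, 6): 52, (2006, 7): 46, (2006, 8): 70, (2006, 9): 60, (2006, 10): 58}

updates, months, change = count_updates(prior, current)
print(f"Updates={updates} Months={months} Change={change:.1f}%")
```

Here September and October 2006 are additions, so they do not count; only the two changed months out of three valid Prior months are reported.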

On average 20% of the historical record was modified 16 times in the last 2 1/2 years. The largest single jump was 0.27C. This occurred between the Oct 13, 2006 and Jan 15, 2007 records when Aug 2006 changed from an anomaly of +0.43C to +0.70C, a change of about 63%.

Wow.

The next question I had was “how often are the months within specific years modified?” As can be seen in the next chart, a surprising number of the earliest monthly averages are modified time and again.

gbl_yearly_changes.gif

I was surprised at how much of the pre-Y2K temperature record changed! My personal favorite change was between the August 16, 2007 file and the March 29, 2008 file. Suddenly, in the later file, the J-D annual temperature for 1880 could now be calculated. In all previous versions the temperature could not be determined.

But some will want to know only how this process affects the rankings for the top 10 warmest years. Because the history goes back to the middle of 2005, I explored this question only for the years before 2005. While the overall ranking from top to bottom does change from one record to the other, the top 10 prior to 2005 does not change much. However, the top two do exchange position frequently, as can be seen from the following table:

gbl_top10.gif
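
A ranking comparison of this kind can be sketched as follows. The function name, the tie-breaking rule, and every anomaly value below are assumptions made for illustration; they are not GISS figures and do not reproduce the table.

```python
def top_years(annual, n=10, before=2005):
    """Return the n warmest years prior to `before`, warmest first.

    annual: dict mapping year -> annual anomaly (None if unavailable).
    Ties are broken by earlier year first, an arbitrary choice for this sketch.
    """
    eligible = {
        year: anomaly
        for year, anomaly in annual.items()
        if year < before and anomaly is not None
    }
    return sorted(eligible, key=lambda year: (-eligible[year], year))[:n]

# Made-up anomalies for two versions of the same record (deg C).
version_a = {1998: 0.71, 2002: 0.68, 2003: 0.67, 2004: 0.60, 2005: 0.76}
version_b = {1998: 0.70, 2002: 0.71, 2003: 0.67, 2004: 0.60, 2005: 0.75}

print(top_years(version_a, n=2))
print(top_years(version_b, n=2))
```

With these invented numbers, shifts of a few hundredths of a degree are enough to swap the top two years while leaving the rest of the order alone.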

I will note that the overall trend in changes between now and Sep. 24, 2005 is very close to zero. If one compares the latest file with the one from Sep. 24, 2005, it can be seen that the earliest and latest years are adjusted lower today than in 2005, while the middle years are adjusted higher. However, this is pure coincidence. If one compares the file from Aug. 2007 with the latest file, it appears the earliest temperatures have been adjusted downward, leading to an overall upward trend. Surely other comparisons will yield a downward trend. It is by pure chance that we have selected two endpoint datasets that appear to have no effect on the trend.

In the meantime, will the real historical record please stand up?

35 Comments
AJ Abrams
April 9, 2008 6:15 am

PDM.
Look at March numbers for GISS. No it most certainly does NOT work and does NOT match other records. Time for them to fiddle with the numbers again.

Editor
April 9, 2008 8:08 am

jeez, Anthony: Yes, jeez did find data going back to 1999 and posted links on CA. I’ve already downloaded all of it but have not had a chance to crawl through it. I’m not sure it merits much more investigation because I don’t think we will learn any more than we already know, which is that the algorithm modifies the past when temperatures are added to the present. That should be enough to cause a double-take.
Thanks jeez!

Mike Bryant
April 9, 2008 8:34 am

#29 !

Francois
April 9, 2008 10:03 am

A bit off topic, but also about rewriting history. Recently I was looking for a nice picture of Antarctica’s temperature trend, from NASA, that I had seen somewhere. I found it here, but then I also found it… here!… One shows the trend from 1982-2004, the other between 1981-2007. I can’t figure out how we could have gone from cooling to warming by adding 4 years to the data…

Earle Williams
April 9, 2008 10:38 am

pdm,
It’s not the end that is of concern, it is the beginning. GISS matches the satellite record to some degree, true. But it rewrites history by applying adjustments to past temperatures. Look at the adjustments applied in the 100 years prior to the satellite record and you will see a remarkable similarity to the alleged increase in global temperature since the start of the industrial revolution.

Gary Gulrud
April 9, 2008 2:10 pm

“First, let me say that I personally don’t think fraud of any sort is involved.”
There is, after all, no controlling authority overseeing the ‘Science’.

pdm
April 9, 2008 2:16 pm

John:
Match is the wrong word. Looking back over the years of data it seems that GISS tells the same temperature story that RSS and the others do. Is that not the case? I say story because they don’t use the same baseline.
Having said that:
Earle:
I hear what you are saying regarding pre-satellite data. It is rather suspicious, to put it kindly. There isn’t anything to compare against.
Would you agree that in the satellite period the data tell the same story? If, in the future, the satellite history diverges then I would agree that the books are being cooked.

John Cross
April 10, 2008 10:12 am

OK, I had some spare time over lunch and used the Wayback link (thanks for providing the link) to get the data for September 24, 2005 and then used the current data (with the March value). Using Excel I subtracted the anomalies from the two data sets to look at the changes.
Most of the monthly data changes were on the order of +/- 1, or 0.01C. I also did the same with the annual data and again most of the annual changes were +/- 1 with a couple being +/- 2. So I agree that there are a significant number of changes, but I do not think the changes are large enough to be relevant. I seem to recall that the accuracy of the GISS is +/- 0.1C and more the further back you go. I am sure some of the people here will be able to supply a more accurate figure. I also did a comparison between the monthly anomalies and the annual ones to see if the annual changes were being caused by the monthly ones and, to within round-off error, they seemed to be.
A question further up caused me to look at the overall trends for the annual record (D-N). From the two cases above I plotted the annual values and, using the trend function in Excel (yeah, I know it’s for wimps, but I was in a hurry), I found the trend for one to be +0.5695 and for the other to be +0.5742. The R2 values were almost identical (0.6323 and 0.6341).
As a final check I plotted the annual values against each other. I figured this would give me a good idea of how close they were. A perfect match would have a slope of 1. The slope in this case was 1.006.
So I don’t think there is a “gotch-ya” hiding there. However I do tend to agree with the people who say it is a strange way to calculate a missing value. If the data series was fairly flat (no trend) then it might make sense. But I don’t see any rationale for doing it this way.
John

Editor
April 10, 2008 11:44 am

John Cross:
Correct, the magnitude of the fluctuations is not significant. It is significant enough, however, to cause changes in the ever-so-important temperature rankings we have all come to know and love. I found that surprising – it was not something I expected to see.
Something else I did not expect to see was how much a specific year’s average can change over time. Take the “hottest year on record”, 1998. On Oct. 13, 1999 the anomaly was 65. On Jan. 26, 2001 it had risen to 66. On Feb. 11, 2002 it was 67. By Apr. 19, 2003 it was 70, which is where it stands today after a brief interlude back at 69 in mid-2006. That’s an almost 10% drift upward, due solely (as it appears) to more recent temperatures being added to the record. One would think an agency as august as NASA would be able to develop a more stable methodology. It just looks sloppy.
So if there is a “gotcha” it is that we have found their process to add additional uncertainty beyond that already present in the data.

Alan Wilkinson
July 11, 2008 1:24 am

This whole process seems bizarre and rather worse than sloppy.
Surely you should not estimate missing data in order to evaluate a trend? The missing data increases the uncertainty of your evaluation and cannot be replaced by estimates without loss of both data integrity and scientific integrity.