Guest post by John Goetz
The GISStemp Step 1 code combines “scribal records” (multiple temperature records collected at presumably the same station) into a single, continuous record. There are multiple detailed posts on Climate Audit (including this one) that describe the Step 1 process, known affectionately as The Bias Method.
On the surface this seems like a reasonable concept, and the description of the algorithm in HL87 makes complete sense. In simple terms, HL87 says that:
- The longest available record is compared with the next longest record, and the period of overlap between the two records is identified.
- The average temperature during the period of overlap is calculated for each record.
- The difference between the average temperature for the longer record and the shorter record is calculated, and that difference (a bias) is added to all temperatures of the shorter record, bringing it in line with the longer record.
- The two records can now be combined as one, and the process repeats for additional records.
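To make the steps above concrete, here is a minimal Python sketch of the method as HL87 describes it in simple terms. It is not the GISStemp code. The function name `combine_records` and the dictionary layout are my own, and the choice to keep the longer record's value wherever both records have data is a simplifying assumption, since the summary above does not say how overlapping values are merged.

```python
def combine_records(longer, shorter):
    """Merge two records given as {(year, month): temperature} dicts.

    Returns the combined record and the bias added to the shorter record.
    """
    # Period of overlap: the months present in both records.
    overlap = sorted(set(longer) & set(shorter))
    if not overlap:
        raise ValueError("records do not overlap")

    # Plain average of each record over the overlap period.
    mean_long = sum(longer[key] for key in overlap) / len(overlap)
    mean_short = sum(shorter[key] for key in overlap) / len(overlap)

    # The bias brings the shorter record in line with the longer one.
    bias = mean_long - mean_short
    combined = {key: temp + bias for key, temp in shorter.items()}

    # Assumption: where both records have data, keep the longer record's value.
    combined.update(longer)
    return combined, bias


# Invented numbers, for illustration only.
longer = {(1987, 1): 1.0, (1987, 2): 2.0, (1987, 3): 3.0}
shorter = {(1987, 2): 1.5, (1987, 3): 2.5, (1987, 4): 4.0}
combined, bias = combine_records(longer, shorter)
print(bias, combined[(1987, 4)])
```

Run the same function on two records whose overlapping months are identical and the bias is exactly zero, which is the result one would expect.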
In looking at numerous stations with multiple records, more often than not the temperatures during the period of overlap are identical, so one would expect the bias to be zero. However, we often see a slight bias existing in the GISS results for such stations, and over the course of combining multiple records, that bias can be several tenths of a degree.
This was one of Steve McIntyre’s many puzzles, and we eventually figured out why we were getting bias when two records with identical overlap periods were combined: GISStemp estimates the averages during the overlap period.
GISStemp does not take the monthly data during the overlap period and simply average it. Instead, it calculates seasonal averages from monthly averages (for example, winter is Dec-Jan-Feb), and then it calculates annual averages from the four seasonal averages. If a single monthly average is missing, the seasonal average is estimated. This estimate is based on historical data found in the individual scribal record. If two records are missing the same data point (say, March 1989), but one record covers 1900 – 1990 and the other 1987 – 2009, they will each produce a different estimate for March, 1989. All other data points might match during the period of overlap, but a bias will be introduced nonetheless.
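The effect is easy to reproduce in a few lines of Python. This is a sketch under stated assumptions, not the GISStemp routine: I assume a missing month is filled with that record's own long-term mean for the same calendar month, which is one simple way to estimate from "historical data found in the individual scribal record". The real estimation rule may well differ. All names and temperatures are invented.

```python
# Seasons as (year offset, month); winter borrows December from the previous year.
SEASONS = [
    [(-1, 12), (0, 1), (0, 2)],   # DJF
    [(0, 3), (0, 4), (0, 5)],     # MAM
    [(0, 6), (0, 7), (0, 8)],     # JJA
    [(0, 9), (0, 10), (0, 11)],   # SON
]


def monthly_climatology(record):
    """Mean of each calendar month over the whole record."""
    by_month = {}
    for (year, month), temp in record.items():
        by_month.setdefault(month, []).append(temp)
    return {month: sum(v) / len(v) for month, v in by_month.items()}


def annual_mean(record, year):
    """Annual mean built from four seasonal means.

    Assumed rule: a missing month is replaced by this record's own
    climatological mean for that calendar month.
    """
    clim = monthly_climatology(record)
    seasonal = []
    for months in SEASONS:
        values = [record.get((year + dy, m), clim[m]) for dy, m in months]
        seasonal.append(sum(values) / len(values))
    return sum(seasonal) / len(seasonal)


# Invented seasonal cycle, January through December.
CYCLE = [0.0, 1.0, 5.0, 10.0, 15.0, 20.0, 22.0, 21.0, 16.0, 10.0, 5.0, 1.0]


def make_record(first_year, last_year, cool_before=None):
    """Build a record; years before `cool_before` run 1 degree cooler."""
    record = {}
    for year in range(first_year, last_year + 1):
        offset = -1.0 if cool_before is not None and year < cool_before else 0.0
        for month in range(1, 13):
            record[(year, month)] = CYCLE[month - 1] + offset
    record.pop((1989, 3), None)  # both records lack March 1989
    return record


old = make_record(1985, 1990, cool_before=1987)  # longer history, cooler early years
new = make_record(1987, 1990)                    # identical to `old` from 1987 on

diff_1988 = annual_mean(old, 1988) - annual_mean(new, 1988)
diff_1989 = annual_mean(old, 1989) - annual_mean(new, 1989)
print(round(diff_1988, 4), round(diff_1989, 4))
```

With these made-up numbers the 1988 annual means agree exactly, while the 1989 means differ by about 0.03 degrees, even though every month the two records actually report in the overlap period is identical. The only difference is the history each record draws on to estimate March 1989.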
The GISS algorithm forces at least one estimation to always occur. The records used begin with January data, but the winter season includes the previous December. That December data point is always missing from the first year of a scribal record, which means the first winter season and the first annual temperature in each scribal record are estimated. Thus, if two records overlap from January 1987 through December 1990 (a common occurrence), and all overlapping temperatures are identical, a bias will still be applied because the 1987 annual temperature for the newer record will be estimated.
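Here is a small Python sketch of the missing-December case. As before, it assumes the missing December is filled with the record's own mean December, which is my stand-in for whatever GISStemp actually does. The function names and temperatures are invented, and only the three winter months are stored to keep the example short.

```python
def december_mean(record):
    """Mean of all Decembers present in the record."""
    decembers = [temp for (year, month), temp in record.items() if month == 12]
    return sum(decembers) / len(decembers)


def winter_mean(record, year):
    """Dec-Jan-Feb mean for `year`, using December of the previous year.

    Assumed rule: if that December is absent, substitute the record's
    own mean December.
    """
    december = record.get((year - 1, 12), december_mean(record))
    return (december + record[(year, 1)] + record[(year, 2)]) / 3


def make_winter_record(first_year, last_year):
    """Invented data: Dec 1.0, Jan 0.0, Feb 1.0, except a cold Dec 1986."""
    record = {}
    for year in range(first_year, last_year + 1):
        record[(year, 1)] = 0.0
        record[(year, 2)] = 1.0
        record[(year, 12)] = -3.0 if year == 1986 else 1.0
    return record


old = make_winter_record(1980, 1990)  # contains the real December 1986
new = make_winter_record(1987, 1990)  # starts in January 1987

gap_1987 = winter_mean(old, 1987) - winter_mean(new, 1987)
gap_1988 = winter_mean(old, 1988) - winter_mean(new, 1988)
print(round(gap_1987, 4), round(gap_1988, 4))
```

In this toy case the two records agree on every month from January 1987 onward, yet their 1987 winter means differ by about 1.33 degrees because the newer record has to estimate December 1986. Since winter is one of four seasons, a quarter of that gap carries into the 1987 annual mean. The 1988 winters match, because December 1987 is present in both records.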
Obviously, the bias could go either way: it could warm or cool the older records. With a large enough sample size, one would expect the average bias to be near zero. So what does the average bias really look like? Using the GISStemp logs from June, 2009, the average bias on a yearly basis across 7006 scribal records was: