by John Goetz
On September 15, 2008, Anthony DePalma of the New York Times wrote an article about the Mohonk Lakes USHCN weather station titled Weather History Offers Insight Into Global Warming. This article claimed, in part, that the average annual temperature has risen 2.7 degrees in 112 years at this station. What struck me about the article was the rather quaint description of the manner in which temperatures are recorded, which I have excerpted here (emphasis mine):
Mr. Huth opened the weather station, a louvered box about the size of a suitcase, and leaned in. He checked the high and low temperatures of the day on a pair of official Weather Service thermometers and then manually reset them…
If the procedure seems old-fashioned, that is just as it is intended. The temperatures that Mr. Huth recorded that day were the 41,152nd daily readings at this station, each taken exactly the same way. “Sometimes it feels like I’ve done most of them myself,” said Mr. Huth, who is one of only five people to have served as official weather observer at this station since the first reading was taken on Jan. 1, 1896.
That extremely limited number of observers greatly enhances the reliability, and therefore the value, of the data. Other weather stations have operated longer, but few match Mohonk’s consistency and reliability. “The quality of their observations is second to none on a number of counts,” said Raymond G. O’Keefe, a meteorologist at the National Weather Service office in Albany. “They’re very precise, they keep great records and they’ve done it for a very long time.”
Mohonk’s data stands apart from that of most other cooperative weather observers in other respects as well. The station has never been moved, and the resort, along with the area immediately surrounding the box, has hardly changed over time.
Clearly the data collected at this site is of the highest quality. Five observers committed to their work. No station moves. No equipment changes according to Mr. Huth (in contrast to the NOAA MMS records). Attention to detail unparalleled elsewhere. A truly Norman Rockwell image of dedication.
After reading the article, I wondered what happened to Mr. Huth’s data, and the data collected by the four observers who preceded him. What I learned is that NOAA doesn’t quite trust the data meticulously collected by Mr. Huth and his predecessors. Neither does GISS trust the data NOAA hands it. Following is a description of what is done with the data.
Let’s begin with the process of getting the data to NOAA:
Mr. Huth and other observers like him record their data in a “B91 Form”, which is submitted to NOAA every month. These forms can be downloaded for free from the NOAA website. Current B91 forms show the day’s minimum and maximum temperature as well as the time of observation. Older records often include multiple readings of temperature throughout the day. The month’s record of daily temperatures is added to each station’s historical record of daily temperatures, which can be downloaded from NOAA’s FTP site here.
The B91 form for Mohonk Lake is hand-written, and temperatures are recorded in Farenheit. Transcribing the data to the electronic daily record introduces an opportunity for error, but I spot-checked a number of B91 forms – converting degrees F to tenths of degree C – and found no errors. Kudos to the NOAA transcriptionists.
Next comes the first phase of NOAA adjustments.
The pristine data from Mohonk Lake are subject to a number of quality control and homogeneity testing and adjustment procedures. First, data is checked against a number of quality control tests, primarily to eliminate gross transcription errors. Next, monthly averages are calculated from the TMIN and TMAX values. This is straightforward when both values exist for all days in a month, but in the case of Mohonk Lake there are a number of months early in the record with several missing TMIN and/or TMAX values. Nevertheless, NOAA seems capable of creating an average temperature for many of those months. The result is referred to as the “Areal data”.
The Areal data are stored in a file called hcn_doe_mean_data, which can be found here. Even though the daily data files are updated frequently, hcn_doe_mean_data has not been updated in nearly a year. The Areal data also seem to be stored in the GHCN v2.mean file, which can be found here on NOAA’s FTP site. This is the case for Mohonk Lake.
Of course, more NOAA adjustments are needed.
The Areal data is adjusted for time of observation and stored as a seperate entry in hcn_doe_mean_data. TOB adjustment is briefly described here. Following the TOB adjustment, the series is tested for homogeneity. This procedure evaluates non-climatic discontinuities (artificial changepoints) in a station’s temperature caused by random changes to a station such as equipment relocations and changes. The version 2 algorithm looks at up to 40 highly-correlated series from nearby stations. The result of this homogenization is then passed on to FILNET which creates estimates for missing data. The output of FILNET is stored as a seperate entry in hcn_doe_mean_data.
Now GISS wants to use the data, but the NOAA adjustments are not quite what they are looking for. So what do they do? They estimate the NOAA adjustments and back them out!
GISS now takes both v2.mean and hcn_doe_mean_data, and lops off any record before 1880. GISS will also look at only the FILNET data from hcn_doe_mean_data. Temperatures in F are converted and scaled to 0.1C.
This is where things get bizarre.
For each of the twelve months in a calendar year, GISS looks at the ten most recent years in common between the two data sets. For each month in those ten most recent years it takes the difference between the FILNET temperature and the v2.mean temperature, and averages them. Then, GISS goes through the entire FILNET record and subtracts the monthly offset from each monthly temperature.
It appears to me that what GISS is attempting to do is remove the corrections done by NOAA from the USHCN data. Standing back to look at the forest through the trees, GISS appears to be trying to recreate the Areal data, failing to recognize that v2.mean is the Areal data, and that hcn_doe_mean_data also contains the Areal data.
Here is a plot of the difference between the monthly raw data from Mohonk Lake and the data GISS creates in GISTEMP STEP0 (yes, I am well aware that in this case it appears the GISS process slightly cools the record). Units on the left are 0.1C.
Even supposedly pristine data cannot escape the adjustment process.