Guest essay by John Goetz
As noted in an earlier post, the monthly raw averages for USHCN data are calculated even when up to nine days are missing from the daily records. Those monthly averages are usually not discarded by the USHCN quality-control and adjustment models, although the final values are almost always estimated as a result of that process.
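As an illustration, the nine-missing-days rule can be sketched in a few lines of Python. This is only a sketch of the rule as stated above, not NCDC's actual code; the function name and the threshold parameter are my own.

```python
# Hypothetical sketch of the "up to nine missing days" rule described above.
# Daily values are daily means in deg C; None marks a missing day.

def monthly_raw_average(daily_means, max_missing=9):
    """Return the mean of the available days, or None if too many are missing."""
    missing = sum(1 for v in daily_means if v is None)
    if missing > max_missing:
        return None  # month is discarded rather than averaged
    present = [v for v in daily_means if v is not None]
    return sum(present) / len(present)

# Example: a 28-day month with 3 missing days still yields an average.
feb = [5.0] * 25 + [None] * 3
print(monthly_raw_average(feb))  # 5.0
```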
The daily USHCN temperature record collected by NCDC contains daily maximum (TMAX) and minimum (TMIN) temperatures for each station in the network (ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/hcn/). In some cases, measurements for a particular day were not recorded and are shown as -9999 in the TMAX record, the TMIN record, or both. In other cases, a measurement was recorded but failed one of a number of quality-control checks.
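Those .dly files use a fixed-width layout that is straightforward to parse. The sketch below follows the field layout in the dataset's readme (station ID, year, month, element, then 31 value-and-flag groups, with values in tenths of a degree C and -9999 for missing); the station ID in the example is invented.

```python
# Minimal reader for one line of a GHCN-Daily .dly file, based on the
# fixed-width layout in the dataset's readme. Each line holds one
# station/year/month/element, followed by 31 day slots of 8 characters:
# a 5-character value (tenths of deg C, -9999 = missing) and three flags.

def parse_dly_line(line):
    station = line[0:11]
    year = int(line[11:15])
    month = int(line[15:17])
    element = line[17:21]          # e.g. "TMAX" or "TMIN"
    days = []
    for d in range(31):
        base = 21 + d * 8
        value = int(line[base:base + 5])
        qflag = line[base + 6]     # quality flag, e.g. "I" for a failed
                                   # internal consistency check
        days.append((None if value == -9999 else value / 10.0, qflag))
    return station, year, month, element, days

# Synthetic example line (invented station ID): day 1 has TMAX 28.7C with
# an "I" quality flag; the remaining 30 slots are missing.
line = "USX00000001" + "1929" + "02" + "TMAX" + "  287 I " + "-9999   " * 30
_, year, _, element, days = parse_dly_line(line)
print(year, element, days[0])
```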
I was curious as to how often different quality-control checks failed, so I wrote a program to cull through the daily files to learn more. I happened to have a very small number of USHCN daily records already downloaded for another purpose, so I used them to debug the software.
I quickly noticed that my code was counting more consistency-check failures in the daily record for Muleshoe, TX than were indicated by the “I” flag in the station’s corresponding USHCN monthly record. The daily record, for example, flagged the minimum value on February 6 and 7, 1929, and the maximum value on February 7 and 8. My code counted that as three failed days, but the monthly raw data for Muleshoe indicated only two.
Regardless of how many failures should have been counted, it was clear from the daily record why they were flagged. The minimum temperature for February 6 was higher than the maximum temperature for February 7, which is an impossibility. The same was true for February 7th relative to the 8th.
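A check of this kind can be sketched as follows. This is a simplified illustration of the impossibility described above, not NCDC's (or my program's) actual implementation; the function name and return convention are invented for the example.

```python
# Sketch of the cross-day consistency test: the minimum temperature for one
# day cannot exceed the maximum of the following day, because the two
# 24-hour observation windows overlap or abut in time. Temperatures are in
# deg C; None marks a missing value. Returns the flagged pairs of day indices.

def cross_day_failures(tmin, tmax):
    flagged = []
    for d in range(len(tmin) - 1):
        if tmin[d] is not None and tmax[d + 1] is not None and tmin[d] > tmax[d + 1]:
            flagged.append((d, d + 1))  # both days are implicated
    return flagged

# Muleshoe-style example: TMIN on day 0 (3.9C) exceeds TMAX on day 1 (1.1C).
tmin = [3.9, -2.0, -1.0]
tmax = [10.0, 1.1, 5.0]
print(cross_day_failures(tmin, tmax))  # [(0, 1)]
```

Note that each failure implicates two days, which is one plausible reason my count of failed days and the monthly record's count could differ.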
I noticed there were quite a few errors like this in the Muleshoe daily record, spanning many years. I wondered how the station observer(s) could make such a mistake repeatedly. It was time to turn to the B-91 observation form to see if it could shed any light on the matter.
The B-91 form obtained from http://www.ncdc.noaa.gov/IPS/coop/coop.html is linked below. After converting the temperatures to Celsius the problem became apparent. The first temperature (43) appears to have been scratched out. The last temperature in that column (39) has a faint arrow pointing to it from a lower line labelled “1*”. The “*” is a note that states “Enter maximum temperature of first day of following month”.
It appeared that whoever transcribed this manual record into electronic form thought that the observer intended to scratch out the first temperature and replace it with the one below, and thus shifted the maximum values up one day for the entire month.
To determine the observer’s intent, the B-91 for March, 1929 was examined to see if the first maximum temperature was 39, as indicated by the “1*” line on the February form. Not only was the first maximum temperature 39, it appeared to be scratched out with the same marking. Although the scratch marking appeared on the March form, that record was transcribed correctly. A quick check of the January, 1929 B-91 showed the same scratch marks over the first temperature.
The scratch marks appear in other forms as well. October, 1941 looked interesting because neither of the failed quality checks had an obvious cause. The flagged temperatures were not unusual for that time of year or relative to the temperatures the day before and after. Upon opening the B-91, the same “scratch out” artifact was visible over the first maximum temperature entry! Sure enough, the maximum temperatures were shifted in the same manner as February, 1929. As a result, two colder days were discarded from the average temperature calculation.
Because the markings were similar, it appeared they were transferred to multiple forms when they lay piled in a stack, probably because the forms were carbon copies. This likely would have happened after they were submitted, because on the 1941 form the observer did scratch out temperatures and was clear where the replacements were written.
Impact of the Errors
In the case of February, 1929, one maximum temperature was incorrect, and all three days flagged as failing the quality check were excluded from the monthly average calculation. The unadjusted average reflected in the electronic record was 0.8C, whereas the paper record yields 0.24C, just over half a degree cooler. The time-of-observation estimate was 1.41C. The homogenization model decided that a monthly value could not be computed from the daily data and discarded it, infilling the month instead with an estimate of 0.12C computed from values at surrounding stations. While that was not a bad estimate, the question is whether it would have been 0.12C had the transcription been correct. Furthermore, because the month was infilled, GHCN did not include it.
In the case of October, 1941, the unadjusted average reflected in the electronic record was 2.56C, whereas the paper record gives 2.44C. The TOB model estimated the average as 3.05C. Homogenization estimated the temperature at 2.65C. That value was retained by GHCN.
Only recently have we had the ability to collect and report climate data automatically, without the intervention of humans. Much of the temperature record we have was collected and reported manually. When humans are involved, errors can and do occur. I was actually impressed with the records I saw from Muleshoe because the observers corrected errors and noted observation times that were outside the norm at the station. My impression was that the observers at that station tried to be as accurate as possible. I have looked through B-91 forms at other stations where no such corrections or notations were made. Some of those stations were located at people’s homes. Is it reasonable to believe that the observers never missed a 7 AM observation for any reason, such as a holiday or vacation, for years on end? That they always wrote their observation down the first time correctly?
The observers are just one human component. With respect to Muleshoe, the people who transcribed the record into electronic form clearly misinterpreted what was written, and for good reason. Taken by themselves, the forms appeared to have corrections. The people doing data entry likely did so many years ago with no training as to what common errors might occur in the record or the transcription process.
But the transcribers did make mistakes. In other records I have seen digits transposed. While transposing a 27 to a 72 is likely to be caught by a quality control check, transposing a 23 to 32 probably won’t be caught. Incorrectly entering 20 instead of -20 can get a whole month’s worth of useful data tossed out by the automatic checkers. That data could be salvaged by a thorough re-examination of the paper record.
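A toy example shows why a simple plausibility check behaves this way. The bounds below are hypothetical limits for a winter month, not NCDC's actual thresholds, and the function is mine, not part of any NCDC code.

```python
# Illustrative sketch (not NCDC's actual algorithm) of why some transcription
# errors are caught by a simple plausibility check and others are not.
# The bounds are hypothetical limits for a winter month, in deg C.

def in_plausible_range(value_c, lo=-35.0, hi=45.0):
    return lo <= value_c <= hi

print(in_plausible_range(72.0))   # False: a 27 -> 72 transposition is flagged
print(in_plausible_range(32.0))   # True: a 23 -> 32 transposition slips through
print(in_plausible_range(20.0))   # True: 20 entered for -20 also passes,
                                  # even though it is 40 degrees off
```

A sign error like 20 for -20 instead tends to trip cross-checks against neighboring days or stations, which is how an otherwise good month can end up discarded wholesale.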
Now expand that to the rest of the world. I think we have done as good a job as could be expected in this country, but it is not perfect. Can we say the same about the rest of the world? I’ve seen a multitude of justifications for the adjustments made to the US data, but little explanation of why the rest of the world is adjusted half as frequently.