Can Temperature Adjustments Right a Wrong?

Guest Post by John Goetz

Adjustments to temperature data continue to receive attention in the mainstream media and science blogs. Zeke Hausfather wrote an instructive post on the Climate Etc. blog last month explaining the rationale behind the Time of Observation (TOBS) adjustment. Mr. Hausfather pointed to the U.S. Climate Reference Network (CRN) as a source of fairly pristine data that can be used to analyze TOBS. In examining the CRN data, there is no doubt that the time of observation affects the minimum, maximum, and average temperature recorded on a given day. Also, changing the TOBS one or more times during a station’s history can affect the station’s temperature trend.

Temperature adjustments have bothered me not because they are made, but because there is a broad assumption that they skillfully fix a problem. Somehow, climate scientists are capable of adjusting oranges into apples. However, when adjustments are made to temperature data – whether to correct for TOBS, missing data entries, incorrect data logging, etc. – we are no longer left with data. We are left instead with a model of the original data. As with all models, there is a question of how accurately that model reflects reality.

After reading Mr. Hausfather’s post, I wondered how well the TOBS adjustments correct the presumably flawed raw temperature data. In the process of searching for an answer, I came to the (preliminary) conclusion that TOBS and other adjustments do nothing to bring temperature data into clearer focus, and certainly not with the certainty needed to calculate global temperature trends to the nearest hundredth of a degree C.

The CRN station in Kingston, RI is a good place to examine the efficacy of the TOBS adjustment. This is because it is one of several CRN pairs around the country. Kingston 1 NW and Kingston 1 W are CRN stations located in Rhode Island and separated by just under 1400 meters. Also, a USHCN station that NOAA adjusts for TOBS and later homogenizes is located about 50 meters from Kingston 1 NW. The locations of the stations can be seen on the following Google Earth image. Photos of the two CRN sites follow – Kingston 1 W on top and Kingston 1 NW on the bottom (both courtesy NCDC).

Kingston_Stations
Locations of Kingston, RI USHCN and CRN Stations
Kingston CRN 1 W
Kingston CRN 1 NW

The following images are of the Kingston USHCN site from the Surface Stations Project. The project assigned the station a class 2 rating for the time period in question, 2003 – 2014. Stations with a class 1 or class 2 rating are regarded as producing reliable data (see the Climate Reference Network Rating Guide – adopted from NCDC Climate Reference Network Handbook, 2002, specifications for siting (section 2.2.1) of NOAA’s new Climate Reference Network). Only 11% of the stations surveyed by the project received a class 1 or 2 rating, so the Kingston USHCN site is one of the few regarded as producing reliable data. Ground level images by Gary Boden, aerial images captured by Evan Jones.

KINGSTON_ RI_ Proximity
Google Earth image showing locations of USHCN station monitoring equipment.
KINGSTON_ RI_ Measurement_001
Expanded Google Earth image of USHCN station. Note the location of the Kingston 1 NW CRN station in the upper right-center.
Kingston_looking_south
Kingston USHCN station facing south.
Kingston_looking_north
Kingston USHCN station facing north. The Kingston 1 NW CRN station can be seen in the background.

CRN data can be downloaded here. Downloading is cumbersome because each year of data is stored in a separate directory, and each file represents a different station. Fortunately, the file names are descriptive, showing the state and station name, so locating the two stations used in this analysis is straightforward. After downloading each year’s worth of data for a given station, the files must be concatenated into a single file for analysis.

USHCN data can be downloaded here. The raw, TOBS, and homogenized (52i) files must be downloaded and unzipped into their own directories. All data for a station is found in a single file in the unzipped directories. The Kingston USHCN data has a file name that begins with USH00374266.
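The download-and-concatenate step can be scripted. A minimal sketch in Python; the `concatenate_crn_years` helper and the embedded file-name pattern are illustrative assumptions, since the exact layout on the FTP server may differ:

```python
import glob
import os

def concatenate_crn_years(base_dir, station_key, out_path):
    """Concatenate one station's yearly CRN files into a single file.

    Assumes the layout described above: one sub-directory per year,
    one file per station per year, with the state and station name
    embedded in the file name (e.g. ...RI_Kingston_1_NW...).
    """
    pattern = os.path.join(base_dir, "*", "*" + station_key + "*")
    paths = sorted(glob.glob(pattern))  # sorted => chronological order
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as f:
                out.write(f.read())
    return len(paths)  # number of yearly files combined
```

Run once per station (e.g. with `station_key="RI_Kingston_1_NW"`) to produce one multi-year file per station for the analysis.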

Comparison of Kingston 1 NW and Kingston 1 W Temperatures

Both Kingston CRN stations began recording data in December 2001. However, that month’s records were incomplete (more than 20% of the possible data missing). In 2002, Kingston 1 NW reported incomplete information for May, October, and November, while Kingston 1 W had incomplete information for July. Because of this, CRN data from 2001 and 2002 are not included in the analysis.

The following chart shows the difference in temperature measurements between Kingston 1 NW and Kingston 1 W. The temperatures were determined by taking the average of the prior 24-hour minimum and maximum temperatures recorded at midnight. The y-axis is shown in degrees C times 100. The gray range shown centered at 0 degrees C is 1 degree F tall (+/- 0.5 degrees F). I put this range in all of the charts because it is a familiar measure to US readers and helps put the magnitude of differences in perspective.

Figure_1
The y-axis units are degrees C times 100. The gray band has a y-dimension of 1 degree F, centered on 0.
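The differencing behind this chart can be expressed compactly. A sketch, assuming each day’s record is a list of 24 hourly readings in degrees C; the function names are mine, not NOAA’s:

```python
def daily_minmax_mean(hourly_temps):
    """Midpoint of the prior 24-hour min and max, i.e. what a
    midnight-observed min/max thermometer day yields."""
    return (min(hourly_temps) + max(hourly_temps)) / 2.0

def monthly_diff_x100(days_a, days_b):
    """Monthly-mean difference between two stations, scaled to the
    chart convention of degrees C times 100. Each argument is a list
    of 24-element hourly lists, one per day of the month."""
    mean_a = sum(daily_minmax_mean(d) for d in days_a) / len(days_a)
    mean_b = sum(daily_minmax_mean(d) for d in days_b) / len(days_b)
    return (mean_a - mean_b) * 100.0
```

On this scale the gray +/- 0.5 degree F band spans roughly +/- 28 units, since 0.5 degrees F is about 0.28 degrees C.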

Given the tight proximity of the two stations, I expected their records to track closely. I found it somewhat surprising that 22 of the months – or 15% – differed by the equivalent of half a degree F or more. This makes me wonder how meaningful (not to say accurate) homogenization algorithms are, particularly ones that make adjustments using stations up to 1,200 km away. With this kind of variability occurring less than a mile apart, does it make sense to homogenize a station using data from 50 or 100 miles away?

Comparison of Kingston 1 NW and Kingston 1 W Data Logging

A partial cause of the differences is interruptions in data collection. Despite the high-tech equipment deployed at the two sites, interruptions occurred. Referring to the previous figure, the red dots indicate months when 24 or more hours of data were not collected. The interruptions were not continuous; they represent a few hours of missing data here and a few there. The two temperature outliers appear to be largely due to 79 and 68 hours of missing data, respectively. However, not all differences can be attributed to missing data.

In the period from 2003 through 2014, the two stations recorded temperatures during a minimum of 89% of the monthly hours, and most months had more than 95% of the hours logged. The chart above shows that calculating a monthly average when 10-11% of the data are missing can produce a result of questionable accuracy. However, NOAA will calculate a monthly average for GHCN stations missing up to nine days’ worth of data (see the DMFLAG description in ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2.5/readme.txt). Depending on the month’s length, GHCN averages will therefore be calculated despite up to roughly a third of the data being missing.
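The two completeness rules in play can be made explicit. A sketch: the 80% threshold encodes this post’s screening convention (months with more than 20% of hourly data missing are dropped), and the nine-day rule paraphrases the DMFLAG description, so treat both functions as illustrations rather than official NOAA code:

```python
def crn_month_is_usable(hours_logged, hours_in_month):
    """This analysis's screen: keep a month only when no more than
    20% of its possible hourly readings are missing."""
    return hours_logged / hours_in_month >= 0.8

def ghcn_would_average(days_reported, days_in_month, max_missing_days=9):
    """The GHCN practice described above: a monthly mean is still
    computed with up to nine daily values missing."""
    return (days_in_month - days_reported) <= max_missing_days
```

For a 28-day February, nine missing days is about a third of the month, which is the point made above.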

Comparison of Kingston USHCN and CRN Data

To test the skill of the TOBS adjustment NOAA applied to the Kingston USHCN site, a synthetic TOBS adjustment for the CRN site was calculated. The B91 forms for Kingston USHCN during 2003-2014 show a 4:30 PM observation time. Therefore, a synthetic CRN 4:30 PM observation was created by averaging the 4:00 PM and 5:00 PM observation data. The difference between the USHCN raw data and the synthetic CRN 4:30 PM observation is shown in the following figure. Despite a separation of approximately 50 meters, the two stations produce very different results. Note that 2014 data are not included, because the 2014 USHCN data were incomplete at the time of download.

Figure_2
The y-axis units are degrees C times 100. The gray band has a y-dimension of 1 degree F, centered on 0.
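The synthetic 4:30 PM observation day can be sketched as follows, assuming 48 consecutive hourly readings (two midnight-to-midnight calendar days) as input. Interpolating the 4:30 PM endpoints as the mean of the 4:00 and 5:00 PM readings follows the construction described above; the function itself is my illustration:

```python
def minmax_mean_430pm(hourly_48):
    """Min/max midpoint for an 'observation day' ending at 4:30 PM,
    built from 48 hourly readings: index 0 is midnight of day one,
    index 16 is 4 PM of day one, index 40 is 4 PM of day two."""
    start_430 = (hourly_48[16] + hourly_48[17]) / 2.0  # 4:30 PM day 1
    end_430 = (hourly_48[40] + hourly_48[41]) / 2.0    # 4:30 PM day 2
    window = [start_430] + hourly_48[17:41] + [end_430]
    return (min(window) + max(window)) / 2.0
```

Differencing monthly means of this series against the USHCN raw values gives the comparison plotted above.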

Although the data at the time of observation are very different, perhaps the adjustment from midnight (TOBS) is similar. The following figure shows the TOBS adjustment amount for the Kingston USHCN station minus the TOBS adjustment for the synthetic CRN 4:30 PM data. The USHCN TOBS adjustment amount was calculated by subtracting the USHCN raw data from the USHCN TOBS data. The CRN TOBS adjustment amount was calculated by subtracting the synthetic CRN 4:30 PM data from the CRN midnight observations. As can be seen in the following figure, the TOBS adjustments to the USHCN data are very different from what the CRN data would warrant.

Figure_3
The y-axis units are degrees C times 100. The gray band has a y-dimension of 1 degree F, centered on 0.
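The two adjustment amounts and their difference reduce to simple arithmetic over monthly series (plain lists of degrees C here); `adjustment_gap` is my name for the quantity plotted in the figure above:

```python
def adjustment_amount(base_series, adjusted_series):
    """Adjustment applied per month: adjusted value minus base value."""
    return [adj - base for base, adj in zip(base_series, adjusted_series)]

def adjustment_gap(ushcn_raw, ushcn_tobs, crn_430pm, crn_midnight):
    """USHCN TOBS adjustment (TOBS minus raw) minus the adjustment the
    CRN data warrant (midnight minus synthetic 4:30 PM), per month.
    Values near zero would indicate a skillful adjustment."""
    applied = adjustment_amount(ushcn_raw, ushcn_tobs)
    warranted = adjustment_amount(crn_430pm, crn_midnight)
    return [a - w for a, w in zip(applied, warranted)]
```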

Adjustment Skill

The best test of adjustment skill is to take the homogenized data for the Kingston USHCN station and compare it to the midnight minimum / maximum temperature data collected from the Kingston CRN 1 NW station located approximately 50 meters away. This is shown in the following figure. Given the differences between the homogenized data from the USHCN station and the measured data from the nearby CRN station, it does not appear that the combined TOBS and homogenization adjustments produced a result that reflected real temperature data at this location.

Figure_4
The y-axis units are degrees C times 100. The gray band has a y-dimension of 1 degree F, centered on 0.

Accuracy of Midnight TOBS

Whether the minimum and maximum temperatures are read at midnight or some other time, they represent just two samples used to calculate a daily average. The most accurate method of calculating the daily average temperature would be to sample continuously, and calculate an average over all samples collected during the day. The CRN stations sample temperature once every hour, so 24 samples are collected each day. Averaging the 24 samples collected during the day will give a more accurate measure of the day’s average temperature than simply looking at the minimum and maximum for the past 24 hours. This topic was covered in great detail by Lance Wallace in a guest post two and a half years ago. It is well worth another read.
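The gap between the two methods is easy to demonstrate with a toy day. The 24-hour profile below is invented purely to show that a brief extreme skews the min/max midpoint; the sign of the difference at a real station depends on the shape of its diurnal cycle:

```python
def hourly_daily_mean(hourly):
    """Average of all 24 hourly samples: the better estimate."""
    return sum(hourly) / len(hourly)

def minmax_daily_mean(hourly):
    """Midpoint of the day's min and max: the traditional method."""
    return (min(hourly) + max(hourly)) / 2.0

# A day with a long cool night and a short warm afternoon: the
# min/max midpoint over-weights the brief 20-degree peak.
toy_day = [10.0] * 20 + [18.0, 20.0, 18.0, 12.0]
bias = minmax_daily_mean(toy_day) - hourly_daily_mean(toy_day)
```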

The following chart shows the difference between the daily average calculated from the CRN hourly temperatures and the midnight minimum/maximum average. The chart indicates that the hourly temperatures tend to produce a higher daily average at this station.

Figure_5
The y-axis units are degrees C times 100. The gray band has a y-dimension of 1 degree F, centered on 0.

Discussion

Automated methods to adjust raw temperature data collected from USHCN stations (and by extension, GHCN stations) are intended to improve the accuracy of regional and global temperature calculations to, in part, better monitor trends in temperature change. However, such adjustments show questionable skill in correcting the presumed flaws in the raw data. When comparing the raw data and adjustments from a USHCN station to a nearby CRN station, no improvement is apparent. It could be argued that the adjustments degraded the results. Furthermore, additional uncertainty is introduced when monthly averages are computed from incomplete data. This uncertainty is propagated when adjustments are later made to the data.

A Note on Cherry-Picking

Some will undoubtedly claim that I cherry-picked the data to make a point, and they will be correct. I specifically looked for the closest-possible CRN and USHCN station pairs, with the USHCN station having a class 1 or 2 rating. My assumption was that their differences would be minimized. The fact that a second CRN station was located less than a mile away cemented the decision to analyze this location. If anyone is able to locate a CRN and USHCN pair closer than 50 meters, I will gladly analyze their differences.

================================

Edited to add clarification on y-axis units and meaning of the gray band to the description of each figure.

Edited to add links to the CRN and USHCN source data on the NCDC FTP site.


276 Comments
Steve Garcia
March 7, 2015 7:44 pm

Wow. I’ve been saying it for about 14 years now:
The science just isn’t there. Not good enough.
As to those who do this sort of stuff, who are these bozos, and what box of Cracker Jacks did they get their science decrees out of?
I mean, this is really BAD science.
I used to say that when they change over a station for TOBS or change of instruments or change of location they should run a year with BOT old and new and adjust the new one till it matches the old.
Now it appears that even THAT won’t give continuous data that is good enough to pee on.
And THEN add in the proxy data that is presumed to be a good fit. (And don’t forget the Divergence Problem in tree rings there…)
GIGO. GIGO. GIGO. GIGO.

Brandon Gates
Reply to  Steve Garcia
March 8, 2015 12:21 am

Steve Garcia,

As to those who do this sort of stuff, who are these bozos, and what box of Cracker Jacks did they get their science decrees out of?

Apparently not the box you got yours out of: the one with a time machine as the prize, by which you can travel back and retroactively do the changeover procedure you describe. I’m sorry that the laws of physics don’t conform to the GIGO principle, but here in the real world we must use the data we’ve got. There’s no other option.

knr
Reply to  Brandon Gates
March 8, 2015 5:04 am

Well the ‘its better than nothing ‘ approach does not change the fact that is often poor at best , especially when great claims , which are supposed to go unquestioned , and great demands are made on the back of this ‘better then nothing’ data .
Lets take an example your trying to measure something that changes by 0.1 but you can only measure accurately to 1.0 , you bases of its ‘better than nothing ‘ means so we can make claims about the nature of the changes but thinking about for a second shows this is nonsense .
You cannot magic garbage data into good by saying that its all we got while GI is garbage in , so yes GIGO very much applies .
Climate ‘science’ very big problems with knowing the reality of past climate should be reason for concern and
caution not claims of ‘settled science’ at all.

RACookPE1978
Editor
Reply to  Brandon Gates
March 8, 2015 8:40 am

Gates:

I’m sorry that the laws of physics don’t conform to the GIGO principle, but here in the real world we must use the data we’ve got. There’s no other option.

BUT! The laws of computer programming approximate and inaccurate computer models of real world physical processes that are only approximated in the computer simulations DO follow exactly the Garbage In Garbage Out model. Precisely and every time. GIGO. No “science” in there at all.

mpainter
Reply to  Brandon Gates
March 8, 2015 7:41 pm

There is an option, Gates: do not use questionable data. But this option does not appeal to the climatologist/climateers.
No statistically significant warming for 25 years. AGW. RIP.

Brandon Gates
Reply to  Brandon Gates
March 8, 2015 8:20 pm

mpainter,

There is an option, Gates: do not use questionable data.

Somewhere along the way, someone has to determine what data are questionable or not. How do you propose one look at raw surface data from a single station and tell if it is reliable or not?

Brandon Gates
Reply to  Brandon Gates
March 8, 2015 8:46 pm

RACookPE1978,

… inaccurate computer models of real world physical processes …

You’re changing the subject for some reason. Very well. Which physical processes have been incorrectly modelled, and how do you know they’re incorrect if the observational data themselves are suspect?

RACookPE1978
Editor
Reply to  Brandon Gates
March 8, 2015 9:05 pm

A valid question.
We cannot. Which is why “climate change” is NOT a Catastrophic-we-must-harm-billions-now-by-restricting-energy-immediately!!!!! problem. We have centuries to figure out the right data, and what is “weather” …

Brandon Gates
Reply to  Brandon Gates
March 8, 2015 8:53 pm

knr,

Well the ‘its better than nothing ‘ approach does not change the fact that is often poor at best , especially when great claims , which are supposed to go unquestioned , and great demands are made on the back of this ‘better then nothing’ data.

The guy typing this message never said anything about not questioning claims. On that note, who is it that is saying you should not question conclusions? Specific citation if you please.
And yes, I will reiterate: I am not aware of any “better” data from the past. Are you? I can’t change that reality for the both of us as much as I’d like to. I’m perfectly happy to spend more money on better observation going forward, because, like you, I don’t “settle” for better than nothing if, and only if, it can be helped.

Brandon Gates
Reply to  Brandon Gates
March 8, 2015 8:55 pm

m p a i n t e r,
My replies to you are consistently being eaten by WordPress, creative spelling of your handle as a test to see whether that makes a difference.
[visible, not in pending queue. .mod]

Brandon Gates
Reply to  Brandon Gates
March 8, 2015 8:58 pm

Looks like that might essplain it. Anywho, similar question to you: how do you propose to determine what are questionable data and what are not? “Don’t use questionable data” doesn’t mean anything unless someone defines what “questionable” means. See the problem yet?

knr
Reply to  Brandon Gates
March 9, 2015 1:38 pm

‘I am not aware of any “better” data from the past.’
So in what way does that create the magic process by which poor data becomes good ?
If you had no data at all , would be OK to just make it up and apply the same magic to say its valid .
Its really simply to valid measure something you have to match certain parameters for taken measurements , if you cannot match them they you cannot produce a valid measurement.
What you got is ‘a guess ‘ no matter how much computing power you throw at it , and if your ‘guessing ‘ your in no position to make grand statements, especially about accuracy, based on the guess .
Now that is not usual , indeed ideas like error bars are designed to cope with such issues .
But within ‘settled’ climate ‘science’ with seen an active movement against this very idea to meet the political demands of the area . And so time again we seen great claims for precision in data that are simply not justified by the means of collecting the data .
The idea can should spend trillions and make large changes to society on the back of ‘better than nothing data ‘ should be regarded as a joke.

Brandon Gates
Reply to  Steve Garcia
March 9, 2015 5:08 pm

knr,

So in what way does that create the magic process by which poor data becomes good?

You tell me, that’s your argument not mine. How do you know the data are “poor” to begin with, hmmm?

If you had no data at all, would be OK to just make it up and apply the same magic to say its valid.

No, I would consider that fraudulent.

Its really simply to valid measure something you have to match certain parameters for taken measurements, if you cannot match them they you cannot produce a valid measurement.

When doing carpentry or lab bench chemistry, yes, it’s a cinch compared to the scope of the system we’re discussing here.

What you got is ‘a guess ‘ no matter how much computing power you throw at it …

I prefer the word estimate myself because it appropriately describes the rigor and thought that went into the process, but why quibble over semantics.

… and if your ‘guessing ‘ your in no position to make grand statements, especially about accuracy, based on the guess .

That’s a big if … we wouldn’t want to be making bad assumptions here now, would we?

Now that is not usual , indeed ideas like error bars are designed to cope with such issues .

Mmm, error bars don’t fix anything either. Nor are they just magically drawn on the plot. All they do is communicate an estimate — or a guess if you will — about the level of uncertainty in the data. And those estimates are still only as “good” as the human beings doing the analysis. Your speech here has solved nothing. The data are still “bad”. The human bias has not been removed — they can still “cheat” on the estimates of the error.

But within ‘settled’ climate ‘science’ with seen an active movement against this very idea to meet the political demands of the area .

ROFL!!! I thought the whole idea was to take the political motives OUT of the science, not demand that researchers accede to political agendas. Yet more evidence that my own assumptions about things are quite often wrong, I ‘spose.

And so time again we seen great claims for precision in data that are simply not justified by the means of collecting the data .

How do you propose improving data which were collected in the past?

The idea can should spend trillions and make large changes to society on the back of ‘better than nothing data ‘ should be regarded as a joke.

By and large, the world is a funny place that way. “Shoot first, ask questions later” is about as pure a survival instinct as I can think of. Why mess with success?
Perhaps you need to do some reevaluation about how the real world works, because unreasonable fantasies about pristine data — or failing that, slapping magically derived error bars around the cruddy stuff — are the quickest way to learn nothing about anything. One tends to wonder if that’s the whole point.
As for me, I’m all for spending the money to improve our observations, and estimates derived from them. Better instruments, more of them, more coverage … you name it. What say you? A few more billion on instrumentation and the resources to process and analyze it sound right to you?

March 8, 2015 7:06 pm

You pick as many regional stations as you can find with long station histories, and you run with that. That’s your temperature history.
All this nonsense trying to reconstruct temperature using some sort of model of temperature history is a problem not in need of a solution. We’re talking about fractions of 1C. Step back. It’s unimportant. These people should not have jobs and if one can’t find useful things for them to do, they should be let off.

mpainter
Reply to  Will Nitschke
March 8, 2015 7:31 pm

Ditto
If the data is questionable, discard it. Don’t infill, don’t homogenize, don’t adjust for tobs or whatever. Fabrication is fabrication, no matter how plausible the rationale.
Why is that principle universally ignored by the temperature data keepers?

Reply to  mpainter
March 8, 2015 7:59 pm

@mpainter
“If the data is questionable, discard it.”
No you don’t, that’s just another format of bias.
You can remove a bad value. Which for the temperature data I’ve used is +/-199 or larger absolute value. Which I think most would agree are not likely actual temperatures on earth. Other than that, the rest of your comment holds.

March 9, 2015 6:31 am

Anybody still reading this thread?
If so, please help me out with this (sorry if it was already discussed above and I missed it):
In a 24 hour period, midnight to midnight local time at a station, we get a Tmax of 75 degrees F and a Tmin of 55 degrees F.
What is the average temperature for that station on that day?
Answer: somewhere between 55 and 75 degrees F. We do not have enough information to know what the daily average might have been. If, for example, a cold or warm front came through in the first or last 4 hours of the day it may have changed the actual daily average by 10 degrees or more.
I can see where Tmax/Tmin can help provide long term trends, but how can they be used to obtain a global average surface temperature?

Reply to  JohnWho
March 9, 2015 9:25 am

JohnWho commented

If so, please help me out with this (sorry if it was already discussed above and I missed it):
In a 24 hour period, midnight to midnight local time at a station, we get a Tmax of 75 degrees F and a Tmin of 55 degrees F.
What is the average temperature for that station on that day?
Answer: somewhere between 55 and 75 degrees F. We do not have enough information to know what the daily average might have been. If, for example, a cold or warm front came through in the first or last 4 hours of the day it may have changed the actual daily average by 10 degrees or more.
I can see where Tmax/Tmin can help provide long term trends, but how can they be used to obtain a global average surface temperature?

I can tell you in the global summary of days data set I get from the NCDC data server, Temp is listed as the mean temp for that day, I have run an average of min and max temp and tested for differences between that result and the mean temp field, and there wasn’t any.
Mean temp = (Min + Max) / 2, at least in the GSoD data.

Reply to  Mi Cro
March 9, 2015 11:32 am

Thanks for the response, Mi Cro.
If that is true, then in my opinion, “that ain’t right”.
Maybe the comment I often see here, regarding the “global average temperatures” is correct.
Oh, the comment:
It’s worse than we thought!.
Just sayin’…

Reply to  JohnWho
March 9, 2015 9:38 am

Let me add one more comment.
I think using min and max temp to compare daily warming to nightly cooling, and to calculate a daily rate of change at each station, and then looking at the how the daily rate of change changes through the year is very useful, and while the data isn’t very good, I think the rate of change info can provide a unique view of the station data, one which BTW doesn’t show any loss of cooling at a minimum of since 1950.
I have lots of data at the url in my name in this post.

Reply to  Mi Cro
March 9, 2015 11:35 am

Perhaps, as long as the station isn’t moved and TOBS remains the same at the station, and it isn’t “adjusted” to match other, uh, “nearby” stations, and the station is properly sited, and the proper siting has remained the same over the years, and it isn’t now nor ever was in a UHI, and …, well, you get the idea.

Reply to  JohnWho
March 9, 2015 12:18 pm

JohnWho commented on

Perhaps, as long as the station isn’t moved and TOBS remains the same at the station, and it isn’t “adjusted” to match other, uh, “nearby” stations, and the station is properly sited, and the proper siting has remained the same over the years, and it isn’t now nor ever was in a UHI, and …, well, you get the idea.

Of course, it just seems such as waste of good information when you filter out the dynamic response to the surface in the time frame that should matter, DWIR happens at the speed of light.
Here the Annual average of the day to day change in both max and min temps. A station has to have both min and max to be included, and when you look at smaller areas you see the swing in min temps are from specific areas and at different dates. (degrees F)
http://www.science20.com/sites/all/modules/author_gallery/uploads/1871094542-global.png
Here’s the slope(rate) of temp change over the year, by year from spring to fall, and fall to spring.
http://www.science20.com/sites/all/modules/author_gallery/uploads/543663916-global.png

Ivan
March 9, 2015 9:47 am

the discussion is on TOBS. Is anyone aware what happened to Watts et al (2012) paper? it was withdrawn because of TOBS issues. Almost three years passed and the final version is not yet out…

1sky1
March 10, 2015 3:04 pm

The essence of the so-called “TOBS bias” of daily extreme readings is that the temperature AT TIME OF RESET of max/min thermometers is occasionally mistaken for the diurnal (midnight to midnight) extreme. This occurs only when Treset is either greater than the following diurnal Tmax (afternoon reset) or less than the following Tmin (morning reset). The empirical TOBS “adjustment” based on determining the extremes of HOURLY readings fails to address this problem.
