Spiking temperatures in the USHCN – an artifact of late data reporting

Correcting and Calculating the Size of Adjustments in the USHCN

By Anthony Watts and Zeke Hausfather

A recent WUWT post included a figure which showed the difference between raw and fully adjusted data in the United States Historical Climatology Network (USHCN). The figure used in that WUWT post came from Steven Goddard's website, and in addition to the delta from adjustments over the last century, it included a large spike of over 1 degree F for the first three months of 2014. That spike struck some as unrealistic, but others, knowing how much adjustment goes into producing the final temperature record, weren't surprised at all. This essay is about finding the true reason behind that spike.

[Figure: 2014_USHCN_raw-vs-adjusted – difference between raw and fully adjusted USHCN data, showing the early-2014 spike]

One commenter on that WUWT thread, Chip Knappenberger, said he didn’t see anything amiss when plotting the same data in other ways, and wondered in an email to Anthony Watts if the spike was real or not.

Anthony replied to Knappenberger via email that he thought it was related to late data reporting, and later repeated the same comment in an email to Zeke Hausfather, while simultaneously posting it to Nick Stokes' blog; Stokes had also been looking into the spike.

This spike at the end may be related to the “late data” problem we see with GHCN/GISS and NCDC’s “state of the climate” reports. They publish the numbers ahead of dataset completeness, and they have warmer values, because I’m betting a lot of the rural stations come in later, by mail, rather than the weathercoder touch tone entries. Lot of older observers in USHCN, and I’ve met dozens. They don’t like the weathercoder touch-tone entry because they say it is easy to make mistakes.

And, having tried it myself a couple of times, and being a young agile whippersnapper, I screw it up too.

The USHCN data seems to show completed data where there is no corresponding raw monthly station data (since it isn’t in yet) which may be generated by infilling/processing….resulting in that spike. Or it could be a bug in Goddard’s coding of some sorts. I just don’t see it since I have the code. I’ve given it to Zeke to see what he makes of it.

Yes the USHCN 1 and USHCN 2.5 have different processes, resulting in different offsets. The one thing common to all of it though is that it cools the past, and many people don’t see that as a justifiable or even an honest adjustment.

It may shrink as monthly values come in.

Watts had asked Goddard for his code to reproduce that plot, and he kindly provided it. It consists of a C++ program to ingest the USHCN raw and finalized data and average it to create annual values, plus an Excel spreadsheet to compare the two resultant data sets. Upon first inspection, Watts couldn’t see anything obviously wrong with it, nor could Knappenberger. Watts also shared the code with Hausfather.

After Watts sent him the email regarding the late-reporting issue, Hausfather investigated the idea, ran several tests, and created plots demonstrating how the spike was created by that late-reporting problem. Stokes came to the same conclusion after Watts' comment on his blog.

Hausfather, in the email exchange with Watts on the reporting issue wrote:

Goddard appears just to average all the stations' readings for each year in each dataset, which will cause issues since you aren't converting things into anomalies or doing any sort of gridding/spatial weighting. I suspect the remaining difference between his results and those of Nick/myself are due to that. Not using anomalies would also explain the spike, as some stations not reporting could significantly skew absolute temps because of baseline differences due to elevation, etc.

From that discussion came the idea to do this joint essay.

To figure out the best way to estimate the effect of adjustments, we look at four different methods (a short code sketch contrasting the first two follows the list):

1. The All Absolute Approach – Taking absolute temperatures from all USHCN stations, averaging them for each year for raw and adjusted series, and taking the difference for each year (the method Steven Goddard used).

2. The Common Absolute Approach – Same as the All Absolute Approach, but discarding any station-months where either the raw or the adjusted series is missing.

3. The All Gridded Anomaly Approach – Converting absolute temperatures into anomalies relative to a 1961-1990 baseline period, gridding the stations in 2.5×3.5 lat/lon grid cells, applying a land mask, averaging the anomalies for each grid cell for each month, calculating the average temperature for the whole contiguous U.S. by an area-weighted average of all grid cells for each month, averaging monthly values by year, and taking the difference each year between the resulting raw and adjusted series.

4. The Common Gridded Anomaly Approach – Same as the All Gridded Anomaly Approach, but discarding any station-months where either the raw or the adjusted series is missing.
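
To make the distinction between the first two methods concrete, here is a minimal Python sketch. It assumes the raw and adjusted data have already been read into dictionaries keyed by (station, year, month) with temperatures in degrees F; the function and variable names are illustrative only, and this is not Goddard's C++ code or NCDC's software.

```python
from collections import defaultdict

def annual_means(data):
    """Average every available station-month value by calendar year."""
    by_year = defaultdict(list)
    for (station, year, month), temp in data.items():
        by_year[year].append(temp)
    return {yr: sum(vals) / len(vals) for yr, vals in by_year.items()}

def adjustment_all_absolute(raw, adjusted):
    """Method 1 (All Absolute): subtract the raw annual average from the
    adjusted annual average, even when the two datasets contain different
    station-months (e.g. infilled final values with no raw counterpart)."""
    raw_means, adj_means = annual_means(raw), annual_means(adjusted)
    return {yr: adj_means[yr] - raw_means[yr]
            for yr in raw_means if yr in adj_means}

def adjustment_common_absolute(raw, adjusted):
    """Method 2 (Common Absolute): keep only station-months present in BOTH
    datasets, so missing or late raw reports cannot skew the comparison."""
    diffs_by_year = defaultdict(list)
    for key in raw.keys() & adjusted.keys():
        station, year, month = key
        diffs_by_year[year].append(adjusted[key] - raw[key])
    return {yr: sum(d) / len(d) for yr, d in diffs_by_year.items()}
```

When both datasets cover exactly the same station-months the two functions agree; they diverge precisely when one dataset contains station-months the other lacks, which is what the infilled final data do in early 2014.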

The results of each approach are shown in the figure below; note that the spike is reproduced by method #1, the "All Absolute" approach:

[Figure: USHCN-Adjustments-by-Method-Year – difference between raw and adjusted USHCN data by method, annual values]

The latter three approaches all find fairly similar results; the third method (the All Gridded Anomaly Approach) probably best reflects the difference between "official" raw and adjusted records, as it replicates the method NCDC uses in generating the official U.S. temperatures (via anomalies and gridding) and includes the effect of infilling.

The All Absolute Approach used by Goddard gives a somewhat biased impression of what is actually happening: using absolute temperatures when the raw and adjusted series don't have the same stations reporting each month introduces errors due to differing station temperatures (caused by elevation and similar factors). Using anomalies avoids this issue by looking at each station's difference from its own mean, rather than its absolute temperature. This is the same reason anomalies, rather than absolutes, are used in creating regional temperature records: anomalies cope with a changing station composition.
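
As a rough illustration of the anomaly-and-gridding idea behind method 3, here is a sketch. The 1961-1990 climatology, the 2.5×3.5 degree cells, and the cosine-latitude cell weighting are simplifications for illustration; the land mask and infilling steps are omitted, and none of this is NCDC's actual code.

```python
import math
from collections import defaultdict

BASE_START, BASE_END = 1961, 1990  # baseline period used in this post

def monthly_climatology(data):
    """Per-station, per-calendar-month mean over the 1961-1990 baseline."""
    sums = defaultdict(lambda: [0.0, 0])
    for (station, year, month), temp in data.items():
        if BASE_START <= year <= BASE_END:
            acc = sums[(station, month)]
            acc[0] += temp
            acc[1] += 1
    return {k: total / n for k, (total, n) in sums.items() if n > 0}

def gridded_anomaly(data, latlon, clim, cell=(2.5, 3.5)):
    """Average station anomalies within each lat/lon cell, then combine the
    cells with a cos(latitude) weight as a stand-in for cell area.
    Returns {(year, month): mean anomaly}."""
    cells = defaultdict(lambda: defaultdict(list))
    for (station, year, month), temp in data.items():
        if (station, month) not in clim:
            continue                      # no baseline for this station-month
        lat, lon = latlon[station]
        key = (math.floor(lat / cell[0]), math.floor(lon / cell[1]))
        cells[(year, month)][key].append(temp - clim[(station, month)])
    result = {}
    for ym, grid in cells.items():
        num = den = 0.0
        for (ilat, _), anoms in grid.items():
            w = math.cos(math.radians((ilat + 0.5) * cell[0]))  # approx. cell area
            num += w * (sum(anoms) / len(anoms))
            den += w
        result[ym] = num / den
    return result
```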

The figure shown above also deals incorrectly with data from 2014. Because it treats the first four months of 2014 as if they were a complete year, it gives them more weight than other months and risks exaggerating the effect of incomplete reporting or any seasonal cycle in the adjustments. We can correct this problem by showing lagging 12-month averages rather than calendar-year values, as shown in the figure below. When we look at the data this way, the large spike in 2014 shown in the All Absolute Approach is much smaller.
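
A lagging 12-month average is straightforward to compute; below is a minimal sketch, assuming the monthly difference series has already been built as a chronologically sorted list of (year, month, value) tuples (a hypothetical structure, for illustration only). Because every point averages exactly twelve months, a partial year can never be weighted as if it were complete.

```python
def lagging_12m(series):
    """series: list of (year, month, value) in chronological order.
    Returns (year, month, mean of the most recent 12 values) for each
    month once a full 12-month window is available."""
    out, window = [], []
    for year, month, value in series:
        window.append(value)
        if len(window) > 12:
            window.pop(0)               # keep only the latest 12 months
        if len(window) == 12:
            out.append((year, month, sum(window) / 12.0))
    return out
```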

[Figure: USHCN-Adjustments-by-Method-12M-Smooth – difference between raw and adjusted USHCN data by method, lagging 12-month averages]

There is still a small spike in the last few months, likely due to incomplete reporting in April 2014, but it's much smaller than in the annual chart.

While Goddard's code and plot produced a mathematically correct result for the procedure he chose (#1, the All Absolute Approach, comparing absolute raw USHCN data with absolute finalized USHCN data), that procedure was not an appropriate way to measure the effect of adjustments. It allowed non-climatic differences between the two datasets, most likely caused by missing data (late reports), to create the spike artifact in the first four months of 2014, and by using absolute temperatures rather than anomalies it somewhat overstated the difference between adjusted and raw temperatures.


176 Comments
A C Osborn
May 10, 2014 11:31 am

How can anyone say Stephen Goddard is flat out wrong?
It does not matter that there may be an odd station error here or there when he is demonstrating that the so-called corrections to the data (which he ignores) are causing a greater than 1 degree cooling of the past and also that using incomplete latest-month data produces the "hottest month ever" syndrome.
Just about everybody can see and agrees that that is what is happening, so how is it "wrong"?

A C Osborn
May 10, 2014 11:33 am

Nobody has tried to answer Paul Homewood's question either: was he doing it wrong as well?
And all the others in the past that have shown data tampering on a massive scale in the name of “Correction”.

johnbuk
May 10, 2014 12:00 pm

The point JohnWho makes is key here –
“USHCN temperature records have been “corrected” to account for various historical changes in station location, instrumentation, and observing practice.”
It would be fine if our lords and masters were using the data as Anthony and Zeke are – ie in a genuine joint attempt to determine the optimum data.
But when it’s being used for out and out propaganda then I’d suggest Steve G blowing his whistle for the obvious foul is acceptable.
Sadly, until the "team" et al. release all data and are prepared to accept full discussion of the issues at hand (as indeed our two authors are doing), "playing fair" will be rather naive to say the least.
As a pleb tax-payer in the UK, having had to put up with complete BS for several years now and having my pocket picked with impunity, I'd say the burden of proof on the CAGW crowd has now reached quite a considerable level – and the chances of any of the existing criminals reaching that level are very close to zero.

Doug Jones
May 10, 2014 12:08 pm

AAARGH! I’m not colorblind, but looking at those charts I feel as if I am. Can you PLEASE please please increase the color saturation and brightness of the lines? They look like four shades of mud.

May 10, 2014 12:23 pm

I think there is little value in ground-based thermometers for the purpose of determining a global temperature. They can be locally useful if well placed, maintained and reported.

Ima
May 10, 2014 12:24 pm

Steven Mosher says:
May 10, 2014 at 10:12 am
Ima.
The reasons for adjustments are real.
The adjustments are validated
The adjustments have been investigated by skeptics and vindicated.

Steven: Perhaps you can address the following question as a way to confirm your comments:
Why did the temperature gauges that were used prior to 1940 overstate actual temperatures by approximately 1 or more degrees F? (If the raw data has been adjusted downward by over a degree, is this not stating that the original readings had overstated temperatures by a like amount?)
The encroachment of the UHI has been in the years afterwards. I would have thought that the temperature readings in these earlier years would have required less correction.
If there is a clear and reasonable explanation as to why historical temperature readings consistently overstated the actual temperature of that period of time, then this needs to be articulated so that people such as myself can comprehend.
If there is not a clear and reasonable explanation, then perhaps we need to step back and ask ourselves if the temperature adjustment process has not perhaps been compromised by some unintended infusion of bias or through methodology error.
Please help me to understand. This is all that I am asking.

May 10, 2014 12:27 pm

From my understanding of the BEST methodology, which I invite Mr. Mosher to correct if wrong, a station move is considered a new station, a station change in instrumentation is considered a new station, and a station change in TOBs is a new station. Then, after this chopping of the data into pieces that are internally consistent and comparable, anomalies are created, with no adjustment to any raw numbers (other than simple erroneous ones). The anomalies are then woven together into regional and global trends.
The BEST methodology strikes me as the right way to go about things, although it is still susceptible to a changing mix of site qualities. The USHCN method adds adjustments that are well-founded, but have errors of their own.

FundMe
May 10, 2014 12:30 pm

Steve Goddard
Is it possible to differentiate between reported (by the weather stations) and infilled data? If so, it should be possible to run the comparison between the absolute temps as reported (climate report) and the raw. In that way one should be able to compare only the stations that have already reported. I have a feeling the hockey stick will remain. I realize there is a confounding use of the word "reported"; what I am trying to say is just leave out the missing stations, if that is possible.

David Riser
May 10, 2014 12:42 pm

Anthony,
Not to pick, but Doug is right! The graph is pretty difficult to parse; I had to break out the glasses earlier. I wouldn't have said anything except for your reply. I am pretty sure messing with the monitor would not fix the issue.
v/r,
David Riser

Eric Barnes
May 10, 2014 1:00 pm

Yes. Issue a correction. It's a road well traveled by NOAA, NCDC, NASA, etc., who have issued corrections over the years to the point where nobody believes anything they publish. Raw data is missing, but they go ahead and publish a monthly climate report to justify their existence. (Monthly climate report, a nice oxymoron.)
They shouldn’t be expected to get the numbers right. They’re government employees after all.
.
” stevengoddard says:
May 10, 2014 at 10:28 am
NCDC reports absolute temperatures, not anomalies. Most of my comparisons are vs. GHCN HCN vs. NCDC. Once in a while I do the USHCN comparisons like this one.
REPLY: Yes, but it’s wrong, so learn from the mistake, issue a correction and move on. – Anthony

Zeke Hausfather
May 10, 2014 1:03 pm

Ima,
Back in the 1940s virtually all the stations used liquid-in-glass thermometers, which read about 0.6 degrees warmer in max temperatures (and about 0.2 degrees colder in min temperatures) than the new MMTS instruments introduced in the 1980s. This means that actual max temperatures (as measured by MMTS instruments) would have been ~0.6 degrees colder, and this contributes part of the reason for adjusting past temps downwards. Time of observation biases introduce similar warming, as shown in this figure from Menne et al 2009: http://stevengoddard.files.wordpress.com/2014/01/screenhunter_28-jan-18-13-25.gif
Here are changes in TOBs over time: http://rankexploits.com/musings/wp-content/uploads/2012/07/TOBs-adjustments.png
And a detailed discussion of MMTS biases: http://rankexploits.com/musings/2010/a-cooling-bias-due-to-mmts/
The combination of MMTS and TOBs drives the bulk of the downward adjustment in past mean temperatures. Min temperatures are actually adjusted down slightly via homogenization, likely due to detecting and correcting for some UHI bias.
UnfrozenCavemanMD,
Indeed, Berkeley doesn't technically "adjust" anything; we cut stations at breakpoints detected through neighbor comparisons and treat everything after a breakpoint as a new station. Interestingly enough, you end up with pretty much the same result as using NCDC's method: http://rankexploits.com/musings/wp-content/uploads/2013/01/USHCN-adjusted-raw-berkeley.png
FundMe,
That is exactly what our method 2 in this post does: only looking at station-months where both raw and adjusted series have readings. It nicely eliminates the "spike".
To other folks: Sorry if I’m slow to respond, I’m heading out on a camping trip and will have little if any internet access.
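
For readers following the arithmetic in the reply to Ima above, here is how the two approximate instrument offsets quoted there combine in the daily mean. The numbers are simply those quoted in the comment, and this back-of-the-envelope calculation is an illustration only, not NCDC's pairwise homogenization procedure.

```python
# Approximate offsets quoted above (LIG relative to MMTS), in degrees F.
lig_max_bias = +0.6   # LIG max readings run ~0.6 F warmer than MMTS
lig_min_bias = -0.2   # LIG min readings run ~0.2 F colder than MMTS

# Putting past LIG readings on the MMTS scale means removing those biases:
max_adjustment = -lig_max_bias        # past max adjusted down by ~0.6 F
min_adjustment = -lig_min_bias        # past min adjusted up by ~0.2 F

# Mean temperature is (max + min) / 2, so the net shift in past means:
mean_adjustment = (max_adjustment + min_adjustment) / 2
print(mean_adjustment)                # about -0.2: past means shift down ~0.2 F
```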

u.k.(us)
May 10, 2014 1:03 pm

-=NikFromNYC=- says:
May 10, 2014 at 8:43 am
============
Yep, I’ve been banned from Goddard’s site.
Maybe I deserved it ?

Zeke Hausfather
May 10, 2014 1:05 pm

Sorry, that last post should have read "Past min temperatures are actually adjusted up slightly via the non-TOBs part of homogenization, likely due to detecting and correcting for some UHI bias". It's somewhat confusing when down and up mean opposite things in terms of trend impact, depending on whether they happen in the present or in the past…

Nick Stokes
May 10, 2014 1:23 pm

I had noted on my blog a diagnosis here. The problem isn’t late notification by USHCN, or even not using anomalies, though with anomalies that would not arise. I didn’t use anomalies. The problem is just wrong methodology. You can’t subtract the average of a whole lot of adjusted readings from the average of a different whole lot of unadjusted readings and expect the difference to reflect the effect of adjustment, unless you have avoided other major reasons for difference. And those reasons are the disparate station/month times that have gone into the averages. It’s not apples to apples.
Here the dominant problem is that in 2014, all stations (1218) had final data for all four months. Raw station counts were 891 for Jan, 883 for Feb, 883 for Mar, and 645 for Apr – biased toward winter. To exaggerate, you're comparing winter raw with spring final. You'll get a big difference, but it's not due to adjustment.
But there can be biases due to different stations in the selection too. It’s just the wrong thing to do. You can do it right by forming differences first for months where stations have both raw and final data and averaging those. Then you know that adjustment is the reason for the difference.
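
Stokes' point can be seen in a toy calculation: assume zero actual adjustment, a cold month that averages 30 °F and a warm month that averages 55 °F, a raw dataset in which the warm month is under-reported (roughly the station counts quoted above), and a complete, infilled final dataset. The temperatures are invented for illustration; only the station counts come from the comment above.

```python
# Toy example of the point above: no adjustment at all, yet the
# "final average minus raw average" is non-zero because the two averages
# cover different station-months.
winter_temp, spring_temp = 30.0, 55.0          # hypothetical monthly means, F

raw_vals   = [winter_temp] * 891  + [spring_temp] * 645   # raw: spring under-reported
final_vals = [winter_temp] * 1218 + [spring_temp] * 1218  # final: infilled, complete

raw_avg   = sum(raw_vals) / len(raw_vals)      # ~40.5 F, weighted toward winter
final_avg = sum(final_vals) / len(final_vals)  # 42.5 F, equal weighting
print(final_avg - raw_avg)                     # ~2.0 F "spike" with zero adjustment

# Pairing first removes the artifact: for each station-month present in both
# datasets, final equals raw here (zero adjustment), so the mean difference is 0.
final_for_raw_months = list(raw_vals)          # final = raw where raw exists
paired = [f - r for f, r in zip(final_for_raw_months, raw_vals)]
print(sum(paired) / len(paired))               # 0.0
```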

DesertYote
May 10, 2014 1:31 pm

A 1 degree anomaly in Pumpkin Center does not have the same significance as a 1 degree anomaly in central Phoenix. Anomalies at a location need to be normalized – e.g. converted to sigmas of the measurements for that location.

David Riser
May 10, 2014 1:38 pm

Anthony,
Yes you can blow it up, but the colors are so close together and bland it makes it hard to read. You picked some colors that are particularly close together spectrum-wise. Many people have color perception issues that don't impact them in day-to-day reading etc. But the colors you picked are exactly those colors that cause most of the issues (blue and green, and brown and tan). I imagine that most folks don't have a lot of issues with it, because the 3 are so far apart from the one. Many others don't really care because quite frankly there isn't a lot of love for statistical abuse of the temperature record. A lot of folks look at what Steven Goddard did and say he has a valid point: he was forthright in providing code and data; he just disagrees with how folks interpret his graph. But the point is well made.
v/r,
David Riser

Latitude
May 10, 2014 1:38 pm

Can someone take a minute and explain to me how you can get an anomaly…
…and not know what the temperature is

Nick Stokes
May 10, 2014 1:40 pm

Zeke Hausfather says: May 10, 2014 at 7:52 am
“For folks arguing that we should be using absolute temperatures rather than anomalies: no, thats a bad idea, unless you want to limit your temperature estimates only to stations that have complete records over the timeframe you are looking at”

Zeke, it is a bad idea, but it’s what NOAA does for CONUS temperatures. I think it’s just because that’s how it has always been done. They come up with an average temp for the US of 54°F or whatever.
To do that, as you say, you must have complete records for every station/month, else you'll get artefacts like this spike. That's why, as part of their "final", they have included FILNET, which in effect interpolates all missing data. No-one else needs to do that, because they use anomalies instead.
In fact this works, and if you do it right, it gives a fair result. But it’s very trappy. Great caution is needed in dealing with averages of absolute temperatures. Here is another place where it goes wrong. If the stations aren’t the same, then you get differences due to their different situations. Here it was probably that USCRN stations are on average higher altitude. But it’s an apples/apples issue.

David Riser
May 10, 2014 1:41 pm

In case it's not clear, ROYGBIV: note green and blue are adjacent, and brown and tan both sit in the orange/yellow range. So maybe a Red Line, a Yellow line, Blue Line and a Violet Line would have been better. Just a thought for the future.
v/r,
David Riser

David Riser
May 10, 2014 1:42 pm

LOL, roger that! have fun, not really tryin to ruin your day!

Ima
May 10, 2014 1:45 pm

Zeke Hausfather says:
May 10, 2014 at 1:03 pm
Ima,
Back in the 1940s virtually all the stations used liquid-in-glass thermometers, which read about 0.6 degrees warmer in max temperatures (and about 0.2 degrees colder in min temperatures) than the new MMTS instruments introduced in the 1980s
My sources (Wikipedia yuk yuk) tell me that mercury thermometers are more precise than that:
“According to British Standards, correctly calibrated, used and maintained liquid-in-glass thermometers can achieve a measurement uncertainty of ±0.01 °C in the range 0 to 100 °C, and a larger uncertainty outside this range: ±0.05 °C up to 200 or down to −40 °C, ±0.2 °C up to 450 or down to −80 °C.[37]”
I’ve yet to be convinced that we have been measuring temperatures incorrectly for hundreds of years. If you have any further documentation I would appreciate it. I suppose this is why they call us skeptics.

Merovign
May 10, 2014 1:52 pm

So. All the adjustments are still going the same way. Eventually, the charts will say “yesterday it was near absolute zero, but tomorrow we will burn!”

David Riser
May 10, 2014 1:59 pm

Zeke,
Dad knew what he was doing; it is complete arrogance to say otherwise without proof. And using statistics to detect step changes in a chaotic system, and using close stations for pairwise comparison when differentials of tens of degrees are possible over very short distances (parts of a mile) due to land use, elevation, bodies of water and even vegetation, is not defensible.
v/r,
David Riser

Editor
May 10, 2014 1:59 pm

Thanks, Zeke.

Nick Stokes
May 10, 2014 2:05 pm

A C Osborn says: May 10, 2014 at 11:31 am
“How can anyone say Stephen Goddard is flat out wrong?”

It's just the wrong method. The thing he's graphing does not reflect only adjustment. The spike is the most obvious issue. It's predictable through the year. Here he has one in Illinois.
April has about 200 more stations (than earlier months) where final data has added estimated data. That's adding spring data to a mainly winter average, and it pushes up the average. If you keep doing SG-type plots through the year, you'll first see the spike diminish, because the proportional effect of the extra 200 fades. By about November, the spike will go negative. That's because the latest estimated data added will be autumn data, colder than the year-to-date average. It brings the average down. It's predictable, and reflects just seasonality.
