On 'denying' Hockey Sticks, USHCN data, and all that – part 1

Part 2 is now online here.

One of the things I am often accused of is “denying” the Mann hockey stick. And, by extension, the Romm Hockey stick that Mann seems to embrace with equal fervor.

While I don’t “deny” these things exist, I do dispute their validity as presented, and I’m not alone in that thinking. As many of you know, Steve McIntyre and Ross McKitrick, plus many others, have extensively debunked the statistics that went into the Mann hockey stick, showing where errors were made or, in some cases, known and simply ignored because it helped “the cause”.

The problem with hockey stick style graphs is that they are visually compelling, eliciting reactions like “whoa, there’s something going on there!” Yet oftentimes, when you look at the methodology behind the compelling visual, you’ll find things like “Mike’s Nature Trick”. The devil is always in the details, and you often have to dig very deep to find that devil.

Just a little over a month ago, this blog commented on the hockey stick shape in the USHCN data set which you can see here:

[Figure: 2014 USHCN raw vs. adjusted temperatures]

The graph above was generated by “Steven Goddard” on his blog, and it attracted quite a bit of excitement and attention.

At first glance it looks like something really dramatic happened to the data, but again, when you look at those devilish details, you find that the visual is simply an artifact of methodology. Different methods clearly give different results, and the “hockey stick” disappears when other methods are used.

[Figure: USHCN adjustments by method and year]

The graph above is courtesy of Zeke Hausfather, who co-wrote that blog entry with me. I should note that Zeke and I are sometimes polar opposites when it comes to the surface temperature record. However, in this case we found a point of agreement: the methodology gave a false hockey stick.

I wrote then:

While Goddard’s code and plot produced a mathematically correct result, the procedure he chose (#1 The All Absolute Approach) comparing absolute raw USHCN data and absolute finalized USHCN data, was not, and it allowed non-climatic differences between the two datasets, likely caused by missing data (late reports) to create the spike artifact in the first four months of 2014 and somewhat overstated the difference between adjusted and raw temperatures by using absolute temperatures rather than anomalies.
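
To see the mechanics of why that approach misbehaves, here is a toy sketch (Python, with invented station values – not Goddard’s code and not NCDC’s) of how comparing absolute averages of a complete “final” dataset against an incomplete “raw” dataset produces a spurious jump when some stations simply haven’t reported yet:

```python
# Toy illustration (invented numbers) of the "all absolute" comparison artifact.
import numpy as np

rng = np.random.default_rng(0)

n_stations = 100
years = np.arange(2000, 2015)

# Invented station climatologies: some stations simply sit warmer or cooler
# than others (elevation, latitude), which matters once the reporting mix changes.
station_mean = rng.normal(12.0, 5.0, n_stations)                      # deg C
true_temp = station_mean[:, None] + rng.normal(0, 1.0, (n_stations, len(years)))

raw = true_temp.copy()
final = true_temp + 0.3        # pretend a small, constant adjustment for the sketch

# Pretend a chunk of stations (here, the cooler ones) have not yet reported
# for the most recent year, so their raw values are missing (late paper forms),
# while "final" still carries a value for every station.
late = station_mean < np.percentile(station_mean, 40)
raw[late, -1] = np.nan

# "All Absolute Approach": average of final minus average of raw, per year.
diff = np.nanmean(final, axis=0) - np.nanmean(raw, axis=0)
print(diff.round(2))
# ~0.30 every year except the last, which departs sharply even though the
# adjustment itself never changed -- only the station mix did.
```

The adjustment in this toy never changes, yet the final value departs sharply from it, purely because the two averages are taken over different mixes of stations.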

Interestingly, “Goddard” replied in comments with a thank-you for helping to find the reason for this hockey stick shaped artifact. He wrote:

stevengoddard says:

http://wattsupwiththat.com/2014/05/10/spiking-temperatures-in-the-ushcn-an-artifact-of-late-data-reporting/#comment-1632952  May 10, 2014 at 7:59 am

Anthony,

Thanks for the explanation of what caused the spike.

The simplest approach of averaging all final minus all raw per year which I took shows the average adjustment per station year. More likely the adjustments should go the other direction due to UHI, which has been measured by the NWS as 8F in Phoenix and 4F in NYC.

Lesson learned. It seemed to me that was the end of the issue. Boy, was I wrong.

A couple of weeks later, Steven Goddard circulated by e-mail a new graph with a hockey stick shape, which you can see below. He wrote this message to me and a few others on the mailing list:

Here is something interesting. Almost half of USHCN data is now completely fake.

[Figure: Goddard’s graph (ScreenHunter 236, Jun 01 15:54)]

http://stevengoddard.wordpress.com/2014/06/01/more-than-40-of-ushcn-station-data-is-fabricated/

After reading his blog post I realized he had made a critical error, and I wrote back the following e-mail:

This claim: “More than 40% of USHCN final station data is now generated from stations which have no thermometer data.”

Is utterly bogus.

This kind of unsubstantiated claim is why some skeptics get called conspiracy theorists. If you can’t back it up to show that 40% of the USHCN has stopped reporting, then don’t publish it.

What I was objecting to was the claim that 40% of the USHCN network was missing – something I know from my own studies to be a false claim.

He replied back with a new graph, a strawman argument, and a new number:

The data is correct.

Since 1990, USHCN has lost about 30% of their stations, but they still report data for all of them. This graph is a count of valid monthly readings in their final and raw data sets.

[Figure: Goddard’s second graph – count of valid monthly readings in the raw and final datasets (ScreenHunter 237, Jun 01 16:10)]

The problem was, I was not disputing the data; I was disputing the claim that 40% of USHCN stations were missing and had “completely fake” data (his words). I knew that to be wrong, so I replied with a suggestion.

On Sun, Jun 1, 2014 at 5:13 PM, Anthony  wrote:

I have to leave for the rest of the day, but again I suggest you take this post down, or and the very least remove the title word “fabricated” and replace it with “loss” or something similar.
Not knowing what your method is exactly, I don’t know how you arrived at this, but I can tell you that what you plotted and the word “fabricated” don’t go together they way you envision.
Again, we’ve been working on USHCN for years, we would have noticed if that many stations were missing.
Anthony

Later, when I returned, I noted a change had been made to Goddard’s blog post. The word “fabrication” remained, but he had made a small change to the claim about stations, with no mention of it. Since I had opened a new browser window, I had the before and after of that change, which you can see below:

http://wattsupwiththat.files.wordpress.com/2014/06/goddard_before.png

http://wattsupwiththat.files.wordpress.com/2014/06/goddard_after.png

I thought it was rather disingenuous to make that change without noting it, but I started to dig a little deeper and realized that Goddard was doing the same thing he had been doing before, when we pointed out the false hockey stick artifact in the USHCN: he was performing a subtraction of the raw versus the final data.

I then knew for certain that his methodology wouldn’t hold up under scrutiny, but beyond some more private e-mail discussion trying to dissuade him from continuing down that path, I made no blog post or other writings about it.

Four days later, over at Lucia’s blog “The Blackboard”, Zeke Hausfather took note of the issue and wrote this post about it: How not to calculate temperature

Zeke writes:

The blogger Steven Goddard has been on a tear recently, castigating NCDC for making up “97% of warming since 1990″ by infilling missing data with “fake data”. The reality is much more mundane, and the dramatic findings are nothing other than an artifact of Goddard’s flawed methodology.

Goddard made two major errors in his analysis, which produced results showing a large bias due to infilling that doesn’t really exist. First, he is simply averaging absolute temperatures rather than using anomalies. Absolute temperatures work fine if and only if the composition of the station network remains unchanged over time. If the composition does change, you will often find that stations dropping out will result in climatological biases in the network due to differences in elevation and average temperatures that don’t necessarily reflect any real information on month-to-month or year-to-year variability. Lucia covered this well a few years back with a toy model, so I’d suggest people who are still confused about the subject to consult her spherical cow.

His second error is to not use any form of spatial weighting (e.g. gridding) when combining station records. While the USHCN network is fairly well distributed across the U.S., its not perfectly so, and some areas of the country have considerably more stations than others. Not gridding also can exacerbate the effect of station drop-out when the stations that drop out are not randomly distributed.

The way that NCDC, GISS, Hadley, myself, Nick Stokes, Chad, Tamino, Jeff Id/Roman M, and even Anthony Watts (in Fall et al) all calculate temperatures is by taking station data, translating it into anomalies by subtracting the long-term average for each month from each station (e.g. the 1961-1990 mean), assigning each station to a grid cell, averaging the anomalies of all stations in each gridcell for each month, and averaging all gridcells each month weighted by their respective land area. The details differ a bit between each group/person, but they produce largely the same results.

Now again, I’d like to point out that Zeke and I are often polar opposites when it comes to the surface temperature record, but I had to agree with him on this point: the methodology created the artifact. In order to properly produce a national temperature, gridding must be employed; using the raw data without gridding will create various artifacts.

For a constantly changing dataset such as GHCN/USHCN, spatial interpolation (gridding) would be required to produce a national average temperature; no doubt, gridding is a must. For a guaranteed-quality dataset where stations are kept in the same exposure and produce reliable data, such as the US Climate Reference Network (USCRN), you could in fact use the raw data as a national average and plot it. Since it is free of the issues that gridding solves, it would be meaningful as long as the stations all report, don’t move, aren’t encroached upon, and don’t change sensors – i.e. the design and production goals of USCRN.

Anomalies aren’t necessarily required; they are an option depending on what you want to present. For example, NCDC gives an absolute value for the national average temperature in its State of the Climate report each month; it also gives a baseline and the departure anomaly from that baseline for both CONUS and global temperature.
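
To make the anomaly-plus-gridding recipe Zeke describes above concrete, here is a minimal sketch in Python. It is not NCDC’s, GISS’s, or anyone else’s production code; the station dictionary layout, the 2.5° grid size, and the baseline-completeness rule are illustrative assumptions of mine:

```python
# Minimal sketch: station temps -> monthly anomalies -> grid-cell averages
# -> area-weighted mean. Data structures here are invented for illustration.
import numpy as np

def grid_cell(lat, lon, size=2.5):
    """Index of the size-degree grid cell containing (lat, lon)."""
    return (int(np.floor(lat / size)), int(np.floor(lon / size)))

def gridded_anomaly(stations, year, month, base=(1961, 1990), size=2.5):
    """
    stations: list of dicts with 'lat', 'lon' and 'temps', where 'temps'
              maps (year, month) -> monthly mean temperature in deg C.
    Returns the area-weighted mean anomaly for the requested month,
    or None if no station reported.
    """
    cells = {}  # (i, j) -> list of station anomalies for this month
    for st in stations:
        t = st['temps'].get((year, month))
        if t is None:
            continue                       # missing report: simply skip it
        # The station's own climatology for this calendar month over the base period.
        clim = [st['temps'][(y, month)] for y in range(base[0], base[1] + 1)
                if (y, month) in st['temps']]
        if len(clim) < 20:                 # require a reasonably complete baseline
            continue
        anom = t - np.mean(clim)
        cells.setdefault(grid_cell(st['lat'], st['lon'], size), []).append(anom)

    if not cells:
        return None
    # Weight each occupied cell by its approximate area, ~ cos(latitude).
    num = den = 0.0
    for (i, _), anoms in cells.items():
        w = np.cos(np.radians((i + 0.5) * size))
        num += w * np.mean(anoms)
        den += w
    return num / den
```

Because every station is first compared against its own climatology, and the grid cells are area-weighted, a cool mountain station dropping out of the network no longer drags the result around the way it does when absolute temperatures are simply averaged.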

Now let me qualify that by saying that I have known for a long time that NCDC uses infilling of data from surrounding stations as part of the process of producing a national temperature average. I don’t necessarily agree that their methodology is perfect, but it is a well-known issue, and what Goddard discovered was simply a back-door way of pointing out that the method exists. It wasn’t news to me or to many others who have followed the issue.

This is why you haven’t seen other prominent people in the climate debate (Spencer, Curry, McIntyre, Michaels, McKitrick), or even myself, make a big deal out of this hockey stick of data differences that Goddard has been pushing. If this were really an important finding, you can bet they and yours truly would be talking about it and providing support and analysis.

It’s also important to note that Goddard’s graph does not represent a complete loss of data from these stations. The differencing method that Goddard is using detects every missing data point from every station in the network. This could be as simple as one day of data missing in an entire month, or a string of days, or even an entire month, which is rare. Almost every station in the USHCN is missing some data at one time or another. One exception might be the station at Mohonk Lake, New York, which has a perfect record due to a dedicated observer, but has other problems related to siting.
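
For clarity, the tally behind a percentage like Goddard’s is roughly the following sketch (the dictionary layout is hypothetical; the point is that any final station-month lacking a raw counterpart gets counted, however small the underlying gap):

```python
# Rough sketch of the tally described above, over already-parsed data.
from collections import defaultdict

def estimated_share_by_year(raw, final):
    """
    raw, final: dicts mapping (station_id, year, month) -> value in deg C,
                with a key present only when that dataset contains a value.
    Returns {year: fraction of 'final' station-months with no raw counterpart}.
    """
    total = defaultdict(int)
    missing_raw = defaultdict(int)
    for (station_id, year, month) in final:
        total[year] += 1
        if (station_id, year, month) not in raw:
            missing_raw[year] += 1
    return {y: missing_raw[y] / total[y] for y in sorted(total)}
```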

If we were to throw out an entire month’s worth of observations because one day out of 31 is missing, chances are we’d have no national temperature average at all. So the method was created to fill in missing data from surrounding stations. In theory and in a perfect world this would be a good method, but as we know the world is a messy place, and so the method introduces some additional uncertainty.
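
As an illustration of the basic idea behind infilling – and this is a deliberately simplified sketch, not the FILNET or pairwise-homogenization code NCDC actually runs – a missing monthly value can be estimated by borrowing the anomaly from nearby stations and adding it to the target station’s own long-term mean, which preserves elevation and siting offsets:

```python
# Simplified infill sketch (not NCDC's actual algorithm): borrow the anomaly
# from neighbours and add it to the target station's own climatology.
import numpy as np

def infill(target_clim, neighbour_anoms, neighbour_dists_km):
    """
    target_clim       : long-term mean for the target station and month (deg C)
    neighbour_anoms   : anomalies reported by nearby stations for that month
    neighbour_dists_km: distances to those stations, used for weighting
    """
    w = 1.0 / (np.asarray(neighbour_dists_km) + 1.0)   # simple inverse-distance weights
    est_anom = np.sum(w * np.asarray(neighbour_anoms)) / np.sum(w)
    return target_clim + est_anom

# e.g. a station whose June mean is 21.4 C, with three neighbours running
# about +0.8 C above their own June means, gets filled in near 22.2 C.
print(round(infill(21.4, [0.9, 0.7, 0.8], [12, 25, 40]), 1))
```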

The National Cooperative Observer network, a.k.a. the co-op network, is a mishmash of widely different stations and equipment, and it is a much larger set of stations than the USHCN; the USHCN is a subset of that larger co-op network, which comprises some 8,000 stations around the United States. Some are stations in observers’ backyards or at their farms, some are at government entities like fire stations and ranger stations, and some are electronic ASOS systems at airports. The vast majority of stations are poorly sited, as we have documented using the surfacestations project; by our count 80% of the USHCN consists of poorly sited stations. The real problem is with the micro-site issues of the stations. This is something that is not effectively dealt with in any methodology used by NCDC. We’ll have more on that later, but I wanted to point out that no matter which dataset you look at (NCDC, GISS, HadCRUT, BEST), the problem of station siting bias remains and is not dealt with. For those who don’t know, NCDC provides the source data for the other interpretations of the surface temperature record, so they all have it. More on that later, perhaps in another blog post.

When it was first created, the co-op network was run entirely on paper forms called B-91s. The observer would write down the daily high and low temperatures, along with precipitation, for each day of the month and then mail the form in at the end of the month. An example B-91 form from Mohonk Lake, NY is shown below:

[Figure: example B-91 form from Mohonk Lake, NY]

Not all forms are so well maintained. Some B-91 forms have missing data, which can be due to the observer missing work, having an illness, or simply being lazy:

[Figure: B-91 form from Marysville with missing weekend data]

The form above is missing weekends because the secretary at the fire station doesn’t work on weekends and the firefighters aren’t required to fill in for her. I know this, having visited this station and interviewed the people involved.

So, in such an imperfect, “you get what you pay for” world of volunteer observers, you know from the get-go that you are going to have missing data, and so, in order to be able to use any of these records at all, a method had to be employed to deal with it, and that was infilling of data. This process has been in place for years, long before Goddard “discovered” it.

There was no nefarious intent here. NOAA/NCDC isn’t purposely trying to “fabricate” data as Goddard claims; they are simply trying to figure out a way to make use of the data at all. The word “fabrication” is the wrong word to use, as it implies the data is being plucked out of thin air. It isn’t – it is being gathered from nearby stations and used to create a reasonable estimate. Over short ranges one can reasonably expect daily weather (temperature at least, precipitation not so much) to be similar, assuming the stations are similarly sited and equipped, but that’s where another devil in the details exists.

Back when I started the surfacestations project, I noted that one long-period, well-sited station, Orland, was in a small sea of bad stations, and that its temperature diverged markedly from its neighbors, like the horrid Marysville fire station, where the MMTS thermometer was directly next to asphalt:

[Figure: poor siting at the Marysville fire station – MMTS sensor next to asphalt]

Orland is one of those stations that reports on paper at the end of the month. Marysville (shown above) reported daily using the touch-tone weathercoder, so its data was available by the end of each day.

What happens in the first runs of the NCDC CONUS temperature process is that they end up with mostly the airport ASOS stations and the weathercoder stations. The weathercoder-reporting stations tend to be more urban than rural, since a lot of observers don’t want to make long-distance phone calls. And so, in the case of missing station data in the early-in-the-month runs, we tend to get a collection of the more poorly sited stations. The FILNET process, designed to “fix” missing data, goes to work and starts infilling data.

A lot of the “good” stations don’t get included in the early runs, because the rural observers often opt for a paper form mailed in rather than the touch-tone weathercoder, and those stations have data infilled from many of the nearby ones, “polluting” the data.

And as we showed back in 2012, those stations have a much lower century-scale trend than the majority of stations in the surface network. In fact, by NOAA’s own siting standards, over 80% of the surface network is producing unacceptable data, and that data gets blended in.

Steve McIntyre noted that even in good stations like Orland, the data gets “polluted” by the process:

http://climateaudit.org/2009/06/29/orland-ca-and-the-new-adjustments/

So, imagine this going on for hundreds of stations, perhaps even thousands, early in the month.

To the uninitiated observer, this “revelation” by Goddard could look like NCDC is in fact “fabricating” data. Given the sorts of scandals that have happened recently with government data, such as the IRS “loss of e-mails”, the padding of jobs and economic reports, and other issues from the current administration, I can see why people would easily embrace the word “fabrication” when looking at NOAA/NCDC data. I get it. Expecting it because much of the rest of the government has issues doesn’t make it true, though.

What is really going on is that the FILNET algorithm, designed to fix a few stations that might be missing some data in the final analysis, is running a wholesale infill on early, incomplete data, which NCDC pushes out to their FTP site. The infilling becomes less and less as the month goes on and more data comes in.

But over time, observers have been less inclined to produce reports, and attrition in both the USHCN and the co-op network is something that I’ve known about for quite some time, having spoken with hundreds of observers. Many of the observers are older people, and some of the attrition is due to age, infirmity, and death. You can see what I’m speaking of by looking through the quarterly NOAA co-op newsletter seen here: http://www.nws.noaa.gov/om/coop/coop_newsletter.htm

NOAA often has trouble finding new observers to take the place of the ones they have lost, and so it isn’t a surprise that over time we would see the number of missing data points rise. Another factor is technology: many observers I spoke with wonder why they still even do the job when we have computers and electronics that can do it faster. I explained to them that their work is important because automation can never replace the human touch. I always thank them for their work.

The downside is that the USHCN is a very imperfect and heterogeneous network and will remain so; it isn’t “fixable” at an operational level, so statistical fixes are resorted to. That has both good and bad influences.

The newly commissioned USCRN will solve that with its new data-gathering system; some of its first data is now online for the public.

[Figure: USCRN average temperature, January 2004 – April 2014]

Source: NCDC National Temperature Index time series plotter

Since this is a VERY LONG post, it will be continued…in part 2

In part 2 I’ll talk about things that we disagree on and the things we can find a common ground on.

Part 2 is now online here.

174 Comments

Rud Istvan
June 25, 2014 1:29 pm

I wrote an essay about this after studying the matter as carefully as I could. Goddard is right about past cooling adjustments, mostly apparently inserted through homogenization. Easily provable either by specific locations (Reykjavik Iceland, Sulina Romania, and Darwin Australia were examples used), by state (California and Maine) by country (US, Australia, New Zealand) and by data set (NCDC, GISS, and HADCrut 4).
He is wrong in his recent posts about what is being done. And he should stop it, as it opens up ‘flat earth’ avenues of dismissal of the fact that records have been adjusted, and in the opposite direction from which UHI bias over time would properly have been handled.

Latitude
June 25, 2014 1:35 pm

talldave2 says:
June 25, 2014 at 1:11 pm
Except, again… from the published data, we already know something happened, don’t we? So when you show up and say “ignore the temperatures that were actually measured, use the anomaly!” you can forgive us for hearing it as “pay no attention to that man behind the curtain!”
=====
Dave thanks, that’s an even better example of what I was saying….
Zeke says NASA/GISS switched from NOAA raw data to homogenized data…..that was around 2000
….that switch, and switch alone, caused the warming trend prior to 2000
Conveniently, after 2000, the warming trend stopped
The only trend in warming was an adjustment to past temp history….
And people are basing their science on that garbage…….

AlexS
June 25, 2014 1:36 pm

“If we were to throw out an entire month’s worth of observations because one day out of 31 is missing, chances are we’d have no national temperature average at all. ”
This is one of the most dangerous affirmations made here.
It might force people to accept bad data just because a result is needed.

June 25, 2014 1:37 pm

BTW Nick apologies, I think I scrolled too fast and confused your post with Zeke’s. Mea culpa!
At any rate, unless the NSF wants to give me $750K instead of using it on a play about the perils of global warming (which is a totally objective, scientific, and unobjectionable use of taxpayer funds) I have to get back to the private sector here, so I’ll just leave you with this thought: is it really plausible that every economist is wrong about the US having a really cold Q1, and that we had record-late Great Lakes ice in a year that was relatively warm for the region? I think if we pull enough proxies together and look for correlations, we will probably find that the simple average, for all its many flaws, appears closer to reality than what is being published.

pouncer
June 25, 2014 1:41 pm

The P.R. aspect is to raise the issue during the same week that “data loss” takes the media spotlight, with “hard drive crashes” reportedly afflicting both the US Treasury /IRS and the US Environmental /EPA. The notion that temperature data might be missing, perhaps deliberately MADE to be missing, is more plausible this week than it would have been say last October.

June 25, 2014 1:50 pm

Talldave2,
Interestingly enough, my recent paper on UHI found results similar to Steve’s blog post in the raw data: ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/papers/hausfather-etal2013.pdf
Also, you may have missed the part in Steve’s post where he uses anomalies rather than absolute temperatures :-p

NikFromNYC
June 25, 2014 1:56 pm

Nick Stokes lawyerlike glossed as usual: “Marcott et al did use anomalies. But because they were dealing with long time periods during which a lot changed, the expected values were no longer near zero near the end of the time. So changing the mix of proxies, as some dropped out, did have a similar effect.”
A pattern of rent and fame seeking alarmist behavior exists that now in 2014 demands a much stronger condemnation of Marcott 2013. Marcott et al. *purposefully* re-dated proxies in order to get their alarmist paper published at all in top journal Science since without that re-dating there was no result at all except proxies that didn’t go anywhere and thus were likely not even valid proxies at all. Then a co-author himself described the completely spurious and thus completely fraudulent blade to New York Times reporter as a “super hockey stick” using a swoosh gesture. Then comes its promotion (and eventual post-headline news whitewashing) by hockey stick team PR machine at RealClimate.org, a site registered to a notorious junk science promoting PR firm, and a real media sensation resulted that helped propagate the emergency level funding of climate “science.”
http://youtu.be/PgnMuKuVXzU
This co-author of all people was very well aware that the blade was a mere artifact, so his promotion of it amounts to criminal fraud and should be prosecuted as such, just as this AIDS researcher is now facing criminal charges:
http://www.cbsnews.com/news/aids-researcher-charged-with-fraud-for-falsifying-data/
“Responding to a major case of research misconduct, federal prosecutors have taken the rare step of filing charges against a scientist after he admitted falsifying data that led to millions in grants and hopes of a breakthrough in AIDS vaccine research.”

pdtillman
June 25, 2014 2:05 pm

Heh. A pleasure to read a WUWT comment thread that’s mostly about science, with only a bit of political cant.
Thanks, Anthony, for looking into this.

JFD
June 25, 2014 2:07 pm

Nick says: I agree that noise at the end of the series should not have been allowed to cause a distraction. But it does not undermine the paper.
———-
Nick, work such as Marcott’s is being used by governments such as the USA to declare war on fossil fuels. Experts may be able to ignore the bad parts of Marcott’s paper, but governments are not peopled by experts in climatology or experts in much of anything to do with science. Marcott should have withdrawn his paper, corrected his errors, rewritten the paper and resubmitted it. Your defense of those with moral ineptitude such as Marcott does not help the world.

Rob
June 25, 2014 2:07 pm

Wow!
Great post.
Hang on to all original data folks.
Counterfeit is everywhere!

June 25, 2014 2:09 pm

Steve McIntyre says:
June 25, 2014 at 12:58 pm
Further to Zeke and Nick Stokes comments above acknowledging the similarity of Goddard’s error to Marcott’s error, there is, of course, a major difference.
##############################
Steve we have over the years fought against bad methods.
Can you help by simply explaining to people why Goddard’s method of averaging absolute temperatures is simply wrong?
1 this issue does not need to be confused with discussions of Marcott
2. this issue does not need to be confused with discussions of adjustments.
Perhaps you and McKitrick and Spencer can join Anthony in a forthright criticism of Goddard’s approach. It’s a bad method. It gives spurious results.
Enough with the silence of the skeptical lambs.

Latitude
June 25, 2014 2:10 pm

“The only trend in warming was an adjustment to past temp history….”
correction: I’m falling trap to their words too….
The only trend in warming was from throwing out the old temp data prior to 1999…
…and using a totally different set of data
That’s not called an adjustment…

Editor
June 25, 2014 2:17 pm

Zeke Hausfather – I’m interested in your comment : “Its worth mentioning again that infilling (as done by NCDC) has virtually no effect on the trends in temperatures over time.“. If we know what the temperature trend is without infilling, then we don’t need to infill in order to get the trend. If we don’t know what the temperature trend is without infilling, then we can’t know that infilling “has virtually no effect” on the trend.

June 25, 2014 2:18 pm

“If we were to throw out an entire month’s worth of observations because one day out of 31 is missing, chances are we’d have no national temperature average at all. So the method was created to fill in missing data from surrounding stations. In theory and in a perfect world this would be a good method, but as we know the world is a messy place, and so the method introduces some additional uncertainty.”
This is a bit of an exaggeration.
You can construct a US average and a global average using daily data where you require 100% data coverage.
See our daily data product.
Then you can also test how many days of drop-out you can have without affecting the result.
Then you can test how good an infilling routine is.

NikFromNYC
June 25, 2014 2:20 pm

JeffC taunted: “How about you refute Cook with your own study?”
Well we do have two published papers already doing that, first one by Legates, Soon, Briggs & Monckton that clarified his 97% claim as really being 0.2% when you use the proper IPCC consensus definition of half of recent warming being anthropogenic, and now we have Richard Tol’s overall debunking of his entire methodology as being that of an unscientific hack. What else do you want? The meteorologist study? That showed 48% skepticism towards climate alarm. And already the American Physical Society has filled half of its new climate statement committee with skeptics replacing activists and might that open the dam once and for all? You bet it will. Was it not enough to expose Cook’s intent to widely promote his finding of consensus *before* he did the actual analysis?! I think that’s what’s known as “being busted!” And are we really losing the debate? In the last few years, three major Western countries have severely turned away from climate alarm, as has fully half of the American political machine, with Canada, Britain and Australia turning sharply to the right in general in a backlash against climate alarm and related liberal excesses.
“To achieve this goal, we mustn’t fall into the trap of spending too much time on analysis and too little time on promotion. As we do the analysis, would be good to have the marketing plan percolating along as well.” – John Cook, who was also busted early on by Tol for using the bizarre boutique search term “global climate change” to cherry pick studies of the effects of climate change by everyday naturalists while omitting studies of its possible attribution to mankind because no mainstream climatologist uses Cook’s phrase rather than “global warming” or “climate change” and absolutely no skeptic would use such a weird term when they are trying to expose global warming claims as being false, because no skeptic accepts the silly term “climate change” since it represents propaganda over clarity.

June 25, 2014 2:28 pm

Mike Jonas,
It’s pretty simple to calculate. Average all the anomalies excluding any infilled values. Average all the anomalies including any infilled values. Compare the two results.

Tom In Indy
June 25, 2014 2:28 pm

REPLY: See the note about Part 2. I have actual money earning work I must do today -Anthony
Sorry, I should have made it clear that I was asking for a discussion of that particular chart to be included in part 2. I didn’t expect you to spend time on anything new in this thread. I am on vacation from my actual money-earning work and have no excuse for not taking the time to consider that you might think I was expecting an immediate response.

June 25, 2014 2:28 pm

From the various comments made here today; I get the distinct impression that the data sets are a mess and that it would be all too easy for those working on the temperature data sets to “see what they want to see and disregard the rest”. We have gotten to the point where we are arguing about what words to use when describing the “adjustments”. It looks like the anomaly game makes it fairly easy to hide any “fiddling” or to hide plain old observational bias: especially if we change the baseline when we need to do so. And the people working on these data sets are biased — never think they are not.
But on a slightly different note (very related though) is this post by the Scottish Skeptic:
http://scottishsceptic.wordpress.com/2014/06/25/my-best-estimate-less-than-1c-warming-if-co2-level-is-doubled/

GeologyJim
June 25, 2014 2:33 pm

Question for all:
From Anthony’s work on stations and temp measurements, I’ve learned many things that cast doubt on the climatological value of these data, regardless if they are stated as measured values or anomalies
One huge problem being overlooked in this discussion (I believe) is that of the “daily average temperature”, calculated by adding the daily high value to the daily low value and dividing by 2.
Daily highs and daily lows should have different long-term trends if, as Anthony has shown, the overnight lows are creeping upward due to “urbanization”. Calculating a daily average mashes together the distinct data recorded as highs and lows. Reducing the number of stations in the calculation adds a warming bias (more stations retained where more people live). Spatial averaging (gridding) just smears the warm-biased data over larger areas and enables NASA to produce really scary looking global anomaly maps with warm-biased color schemes.
The averages seem to show upward trends over time, but much of that seems due to rising overnight low values – – not to rising daytime high temperatures. Does anyone really doubt this?
Record high temps are still much more frequent in the 1930s-1940s than today
I suggest if one were to separately calculate trends based on daily high values vs daily low values, the upward trend would clearly be shown by the daily-low data, and not shown by the daily-high data.

June 25, 2014 2:36 pm

Perhaps you and McKitrick and Spencer can join Anthony in a forthright criticism of Goddard’s approach… Enough with the silence of the skeptical lambs
Translation: “I am a climate alarmist, and I enjoy infighting among skeptics. So, boys, let’s you and him fight!”
We are all better off constantly pointing out that the alarmist crowd has ownership of total failure. Not one of their numerous CAGW predictions has happened. Not one of them predicted that global warming would stop for nearly two decades. No model predicted that, either.
Now their tactic seems to be to try to foment dissension among any skeptics who have slightly different opinions or methods. Sort of like the Dems arranging for McCain to be the R nominee, isn’t it?

Jonathan Abbott
June 25, 2014 3:08 pm

Thanks for a very well written and necessary post. Goddard is a crank and cranks on either side should be exposed as such. There are all sorts of problems with the temperature records but Goddard helps nobody but himself.

Eliza
June 25, 2014 3:08 pm

I think this site is really falling for it. Has anyone actually visited the NOAA or NCDC monthly report sites??? It’s all about the warmest 100th month in the last 1000 years etc. They are totally biased toward the AGW agenda. In regards to Goddard’s site, all I know is that nearly all of his analyses of past temperature adjustments, especially USA ones, have been spot on. The one in contention is just one of many. For example, adjustments of USA temperatures re 1934 v. current warmest etc. All the articles and data from the past are not faked. No wonder the warmistas are SH@ scared of him, as he keeps a meticulous record of ALL data and articles. The likes of Zeke etc. and Mosher, who are simply computer geek warmist trolls in my view (visit their sites), basically live off or love modeling, the curse of climate science, as they shall find out no doubt in coming years. LOL. By the way, Goddard’s contention seems to be completely supported by John Coleman etc. Just check his show on USA temperature data tampering. BTW what about BOM, New Zealand data tampering? Of course Goddard is right and you are way wrong. My respect for this site dived dramatically when Watts, as an excuse, posted at Lucia’s site the fact that Goddard was a pseudonym (who gives a royal ####, it’s the data stupid!). You’ve lost me anyway until you get the message that AGW is over, finito, it ain’t happening etc., and you are basically feeding the warmist trolls here

Stephen Rasey
June 25, 2014 3:09 pm

The word “fabrication” is the wrong word to use, as it implies the data is being plucked out of thin air. It isn’t – it is being gathered from nearby stations and used to create a reasonable estimate.
OK, change “fabrication” to “estimation by infill”.
The trend of the chart raises eyebrows, and it should.
Why should year 2005 have more infill than 1995?
Remember, infill is only to replace gaps in raw data with estimates (who knows how good) from nearby stations (of unknown quality). Why the heck should there be more gaps in 2005 data than in all of 1990–1999?
I think Goddard has a point about the drop off of raw data (blue) compared to a constant number of stations (red). If anything, our data coverage should be getting better as attention and money get drawn to CAGW. Good ol’ peer review might get to the bottom of it. But it needs addressing.

Patrick B
June 25, 2014 3:10 pm

Infills or whatever you call it – IT IS NOT DATA – in science, data is information you collect from observation. Anything after that is not DATA – it may be derived from data but no one trained in science would call it data. Just one more example of corruption of science and language in climate “studies”.

David Riser
June 25, 2014 3:38 pm

While Mr. Goddard may be a bit over the top at times, his basic message is sound and it provides some interesting thoughts on the data. Of particular interest is the fact that his methodology leads to similar artifacts to what happens when creators of global temperature data sets do the same thing. Frequently you get a different anomaly number a few months later after they add in some more data and do a bit more mashing of the numbers, rarely do they ever bring the change to light.
I think it’s useful to discuss the issue but I also applaud Mr. Goddard for his efforts. The fact that he listens to feedback is the sign of someone actually interested in science. His focus is different and his suspicions of motive have some basis in fact. I would be more inclined to caution him about being over the top if the major data set producers were a bit more open about method and over-time changes. Specifically I would like to see the code that produces some of this homogenization. I suspect that Mr. Hansen’s original work is very biased.
On a different note I am looking forward to Anthony’s release of his latest surface station work.
v/r,
David Riser