On 'denying' Hockey Sticks, USHCN data, and all that – part 1

Part 2 is now online here.

One of the things I am often accused of is “denying” the Mann hockey stick. And, by extension, the Romm Hockey stick that Mann seems to embrace with equal fervor.

While I don’t “deny” that these things exist, I do dispute their validity as presented, and I’m not alone in that thinking. As many of you know, Steve McIntyre and Ross McKitrick, plus many others, have extensively debunked the statistics that went into the Mann hockey stick, showing where errors were made or, in some cases, were known and simply ignored because it helped “the cause”.

The problem with hockey-stick-style graphs is that they are visually compelling, eliciting reactions like “whoa, there’s something going on there!” Yet oftentimes, when you look at the methodology behind the compelling visual, you’ll find things like “Mike’s Nature Trick”. The devil is always in the details, and you often have to dig very deep to find that devil.

Just a little over a month ago, this blog commented on the hockey stick shape in the USHCN data set which you can see here:

[Figure: USHCN raw vs. adjusted temperatures, 2014]

The graph above was generated by “Steven Goddard” on his blog, and it drew quite a bit of excitement and attention.

At first glance it looks like something really dramatic happened to the data, but again, when you look at those devilish details, you find that the visual is simply an artifact of methodology. Different methods clearly give different results, and the “hockey stick” disappears when other methods are used.

[Figure: USHCN adjustments by method and year]

The graph above is courtesy of Zeke Hausfather, who co-wrote that blog entry with me. I should note that Zeke and I are sometimes polar opposites when it comes to the surface temperature record. However, in this case we found a point of agreement: the methodology gave a false hockey stick.

I wrote then:

While Goddard’s code and plot produced a mathematically correct result, the procedure he chose (#1 The All Absolute Approach) comparing absolute raw USHCN data and absolute finalized USHCN data, was not, and it allowed non-climatic differences between the two datasets, likely caused by missing data (late reports) to create the spike artifact in the first four months of 2014 and somewhat overstated the difference between adjusted and raw temperatures by using absolute temperatures rather than anomalies.
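To make the mechanism concrete, here is a toy sketch in Python with made-up numbers (an illustration of the principle only; it is not Goddard’s code and not real USHCN data):

warm_lowland = [15.0, 15.2, 14.8, 15.1]    # hypothetical long-term station means, deg C
cool_mountain = [5.0, 5.3, 4.9, 5.2]

# "Final" dataset: every station has a value (missing ones get infilled later).
all_stations = warm_lowland + cool_mountain
final_avg = sum(all_stations) / len(all_stations)

# "Raw" dataset early in the month: the cool, rural, mail-in stations have not
# reported yet, so only the warm stations are present.
raw_avg = sum(warm_lowland) / len(warm_lowland)

print(round(final_avg - raw_avg, 2))       # about -5 deg C of pure artifact

No station changed and nothing was adjusted; the difference comes entirely from which stations happen to be in each average.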

Interestingly, “Goddard” replied in comments with a thank-you for helping to find the reason for this hockey-stick-shaped artifact. He wrote:

stevengoddard says:

http://wattsupwiththat.com/2014/05/10/spiking-temperatures-in-the-ushcn-an-artifact-of-late-data-reporting/#comment-1632952  May 10, 2014 at 7:59 am

Anthony,

Thanks for the explanation of what caused the spike.

The simplest approach of averaging all final minus all raw per year which I took shows the average adjustment per station year. More likely the adjustments should go the other direction due to UHI, which has been measured by the NWS as 8F in Phoenix and 4F in NYC.

Lesson learned. It seemed to me that was the end of the issue. Boy, was I wrong.

A couple of weeks later, Steven Goddard circulated by e-mail a new graph with a hockey stick shape, which you can see below. He wrote this message to me and a few others on the mailing list:

Here is something interesting. Almost half of USHCN data is now completely fake.

[Figure: Goddard’s graph circulated by e-mail (ScreenHunter 236, June 1)]

http://stevengoddard.wordpress.com/2014/06/01/more-than-40-of-ushcn-station-data-is-fabricated/

After reading his blog post I realized he had made a critical error, and I wrote back the following e-mail:

This claim: “More than 40% of USHCN final station data is now generated from stations which have no thermometer data.”

Is utterly bogus.

This kind of unsubstantiated claim is why some skeptics get called conspiracy theorists. If you can’t back it up to show that 40% of the USHCN has stopped reporting, then don’t publish it.

What I was objecting to was the claim that 40% of the USHCN network was missing – something I know from my own studies to be false.

He replied with a new graph, a strawman argument, and a new number:

The data is correct.

Since 1990, USHCN has lost about 30% of their stations, but they still report data for all of them. This graph is a count of valid monthly readings in their final and raw data sets.

[Figure: Goddard’s count of valid monthly readings in the USHCN final and raw datasets (ScreenHunter 237, June 1)]

The problem was, I was not disputing the data; I was disputing the claim that 40% of USHCN stations were missing and had “completely fake” data (his words). I knew that to be wrong, so I replied with a suggestion.

On Sun, Jun 1, 2014 at 5:13 PM, Anthony  wrote:

I have to leave for the rest of the day, but again I suggest you take this post down, or at the very least remove the title word “fabricated” and replace it with “loss” or something similar.
Not knowing what your method is exactly, I don’t know how you arrived at this, but I can tell you that what you plotted and the word “fabricated” don’t go together the way you envision.
Again, we’ve been working on USHCN for years, we would have noticed if that many stations were missing.
Anthony

Later, when I returned, I noted a change had been made to Goddard’s blog post. The word “fabrication” remained, but he had made a small, unannounced change to the claim about stations. Since I still had the page open in another browser window, I had the before and after of that change, which you can see below:

http://wattsupwiththat.files.wordpress.com/2014/06/goddard_before.png

http://wattsupwiththat.files.wordpress.com/2014/06/goddard_after.png

I thought it was rather disingenuous to make that change without noting it, but I started to dig a little deeper and realized that Goddard was doing the same thing he had done before, when we pointed out the false hockey stick artifact in the USHCN: he was performing a subtraction of raw versus final data.

I then knew for certain that his methodology wouldn’t hold up under scrutiny, but beyond doing some more private e-mail discussion trying to dissuade him from continuing down that path, I made no blog post or other writings about it.

Four days later, over at Lucia’s blog “The Blackboard”, Zeke Hausfather took note of the issue and wrote this post about it: How not to calculate temperature

Zeke writes:

The blogger Steven Goddard has been on a tear recently, castigating NCDC for making up “97% of warming since 1990″ by infilling missing data with “fake data”. The reality is much more mundane, and the dramatic findings are nothing other than an artifact of Goddard’s flawed methodology.

Goddard made two major errors in his analysis, which produced results showing a large bias due to infilling that doesn’t really exist. First, he is simply averaging absolute temperatures rather than using anomalies. Absolute temperatures work fine if and only if the composition of the station network remains unchanged over time. If the composition does change, you will often find that stations dropping out will result in climatological biases in the network due to differences in elevation and average temperatures that don’t necessarily reflect any real information on month-to-month or year-to-year variability. Lucia covered this well a few years back with a toy model, so I’d suggest people who are still confused about the subject to consult her spherical cow.

His second error is to not use any form of spatial weighting (e.g. gridding) when combining station records. While the USHCN network is fairly well distributed across the U.S., its not perfectly so, and some areas of the country have considerably more stations than others. Not gridding also can exacerbate the effect of station drop-out when the stations that drop out are not randomly distributed.

The way that NCDC, GISS, Hadley, myself, Nick Stokes, Chad, Tamino, Jeff Id/Roman M, and even Anthony Watts (in Fall et al) all calculate temperatures is by taking station data, translating it into anomalies by subtracting the long-term average for each month from each station (e.g. the 1961-1990 mean), assigning each station to a grid cell, averaging the anomalies of all stations in each gridcell for each month, and averaging all gridcells each month weighted by their respective land area. The details differ a bit between each group/person, but they produce largely the same results.

Now again, I’d like to point out that Zeke and I are often polar opposites when it comes to the surface temperature record, but I had to agree with him on this point: the methodology created the artifact. In order to properly produce a national temperature, gridding must be employed; using the raw data without gridding will create various artifacts.
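For readers who want to see the shape of the anomaly-plus-gridding recipe Zeke describes, here is a minimal sketch for a single month (Python; the function name and data layout are my own assumptions, not NCDC’s or anyone else’s actual code):

from collections import defaultdict

def national_anomaly(month_temps, baseline, cell_of, cell_area):
    # month_temps: {station_id: absolute temperature for one calendar month}
    # baseline:    {station_id: that station's 1961-1990 mean for the same month}
    # cell_of:     {station_id: grid cell identifier}
    # cell_area:   {cell identifier: land area used as the weight}
    cells = defaultdict(list)
    for stn, temp in month_temps.items():
        cells[cell_of[stn]].append(temp - baseline[stn])   # station anomaly
    total_area = sum(cell_area[c] for c in cells)
    return sum((sum(a) / len(a)) * cell_area[c]            # cell mean anomaly
               for c, a in cells.items()) / total_area     # area-weighted average

Because each station is compared only to its own long-term mean, a station dropping out moves the result far less than it does in a straight average of absolute temperatures.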

For a constantly changing dataset such as GHCN/USHCN, spatial interpolation (gridding) is required to produce a national average temperature; there is no doubt that gridding is a must. For a guaranteed-quality dataset, where stations are kept in the same exposure and produce reliable data, such as the US Climate Reference Network (USCRN), you could in fact use the raw data as a national average and plot it. Since such a network is free of the issues that gridding solves, the result would be meaningful as long as the stations all report, don’t move, aren’t encroached upon, and don’t change sensors – i.e. the design and production goals of USCRN.

Anomalies aren’t necessarily required; they are an option, depending on what you want to present. For example, NCDC gives an absolute value for the national average temperature in their State of the Climate report each month; they also give a baseline and the departure anomaly from that baseline for both CONUS and global temperature.

Now let me qualify that by saying that I have known for a long time that NCDC uses infilling of data from surrounding stations as part of the process of producing a national temperature average. I don’t consider their methodology perfect, but it is a well-known issue, and what Goddard discovered was simply a back-door way of pointing out that the method exists. It wasn’t news to me or to many others who have followed the issue.

This is why you haven’t seen other prominent people in the climate debate (Spencer, Curry, McIntyre, Michaels, McKitrick), or even myself, make a big deal out of this hockey stick of data differences that Goddard has been pushing. If this were really an important finding, you can bet they and yours truly would be talking about it and providing support and analysis.

It’s also important to note that Goddard’s graph does not represent a complete loss of data from these stations. The differencing method that Goddard is using detects every missing data point from every station in the network. This could be as simple as one day of data missing in an entire month, a string of days, or even an entire month, which is rare. Almost every station in the USHCN is missing some data at one time or another. One exception might be the station at Mohonk Lake, New York, which has a perfect record due to a dedicated observer, but it has other problems related to siting.
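For what it’s worth, the kind of raw-versus-final count being discussed can be sketched in a few lines (Python, with hypothetical data structures; this is not Goddard’s actual code):

MISSING = -9999   # missing-value marker used in the USHCN monthly files

def count_estimated(raw, final):
    # raw, final: {(station_id, year, month): value or MISSING}
    estimated = sum(1 for key, val in final.items()
                    if val != MISSING and raw.get(key, MISSING) == MISSING)
    reported = sum(1 for val in final.values() if val != MISSING)
    return estimated, reported   # estimated / reported gives the fraction infilled

A count like this is of station-months, so by itself it says nothing about whether a station is missing a single report or has stopped reporting entirely.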

If we were to throw out an entire month’s worth of observations because one day out of 31 is missing, chances are we’d have no national temperature average at all. So the method was created to fill in missing data from surrounding stations. In theory and in a perfect world this would be a good method, but as we know the world is a messy place, and so the method introduces some additional uncertainty.
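One plausible form of such infilling, sketched below, is to take the missing station’s own long-term mean for the month and add a distance-weighted average of the anomalies reported by its neighbors (a rough illustration of the idea, not the actual FILNET code):

def infill_estimate(target, month, climatology, neighbor_anoms, weight):
    # climatology:    {(station_id, month): that station's long-term mean for the month}
    # neighbor_anoms: {neighbor_id: anomaly reported this month by a nearby station}
    # weight:         {neighbor_id: weight, e.g. inverse distance to the target}
    wsum = sum(weight[n] for n in neighbor_anoms)
    est_anom = sum(anom * weight[n] for n, anom in neighbor_anoms.items()) / wsum
    return climatology[(target, month)] + est_anom   # estimated absolute value

The estimate inherits whatever biases the neighbors carry, siting problems included, which is part of the messiness referred to above.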

The National Cooperative Observer network, a.k.a. co-op, is a mishmash of widely different stations and equipment. The co-op network, comprising some 8,000 stations around the United States, is a much larger set of stations than the USHCN, which is a subset of it. Some are stations in observers’ backyards or at their farms, some are at government entities like fire stations and ranger stations, and some are electronic ASOS systems at airports. The vast majority of stations are poorly sited, as we have documented with the surfacestations project; by our count, 80% of the USHCN consists of poorly sited stations. The real problem is with the micro-site issues of the stations. This is something that is not effectively dealt with in any methodology used by NCDC. We’ll have more on that later, but I wanted to point out that no matter which data set you look at (NCDC, GISS, HadCRUT, BEST), the problem of station siting bias remains and is not dealt with. For those who don’t know, NCDC provides the source data for the other interpretations of the surface temperature record, so they all have it. More on that later, perhaps in another blog post.

When it was first created, the co-op network’s reporting was done entirely on paper forms called B-91s. The observer would write down the daily high and low temperatures, along with precipitation, for each day of the month, and then at the end of the month mail the form in. An example B-91 form from Mohonk Lake, NY is shown below:

[Figure: example B-91 observation form from Mohonk Lake, NY]

Not all forms are so well maintained. Some B-91 forms have missing data, which can be due to the observer missing work, having an illness, or simply being lazy:

[Figure: B-91 form from Marysville, CA, with weekend data missing]

The form above is missing weekends because the secretary at the fire station doesn’t work on weekends and the firefighters aren’t required to fill in for her. I know this from having visited the station and interviewing the people involved.

So, in such an imperfect, “you get what you pay for” world of volunteer observers, you know from the get-go that you are going to have missing data, and so, in order to be able to use any of these records at all, a method had to be employed to deal with it: infilling of data. This has been done for years, long before Goddard “discovered” it.

There was no nefarious intent here. NOAA/NCDC isn’t purposely trying to “fabricate” data as Goddard claims; they are simply trying to figure out a way to make use of the data at all. “Fabrication” is the wrong word, as it implies the data is being plucked out of thin air. It isn’t – it is being gathered from nearby stations and used to create a reasonable estimate. Over short ranges one can reasonably expect daily weather (temperature at least, precipitation not so much) to be similar, assuming the stations are similarly sited and equipped, but that’s where another devil in the details exists.

Back when I started the surfacestations project, I noted that one long-period, well-sited station, Orland, was in a small sea of bad stations, and that its temperature diverged markedly from its neighbors, like the horrid Marysville fire station, where the MMTS thermometer was directly next to asphalt:

[Figure: poor siting at the Marysville fire station, with the MMTS thermometer directly next to asphalt]

Orland is one of those stations that reports on paper at the end of the month. Marysville (shown above) reported daily using the touch-tone weathercoder, so its data was available by the end of each day.

What happens in the first runs of the NCDC CONUS temperature process is that they end up with mostly the airport ASOS stations and the weathercoder stations. The weathercoder-reporting stations tend to be more urban than rural, since a lot of observers don’t want to make long-distance phone calls. And so, when station data is missing from the early-in-the-month runs, we tend to get a collection of the more poorly sited stations. The FILNET process, designed to “fix” missing data, goes to work and starts infilling data.

A lot of the “good” stations don’t get included in the early runs, because the rural observers often opt for a paper form mailed in rather than the touch-tone weathercoder, and those stations have data infilled from many of the nearby ones, “polluting” the data.

And as we showed back in 2012, those stations have a much lower century-scale trend than the majority of stations in the surface network. In fact, by NOAA’s own siting standards, over 80% of the surface network is producing unacceptable data, and that data gets blended in.

Steve McIntyre noted that even in good stations like Orland, the data gets “polluted” by the process:

http://climateaudit.org/2009/06/29/orland-ca-and-the-new-adjustments/

So, imagine this going on for hundreds of stations, perhaps even thousands, early in the month.

To the uninitiated observer, this “revelation” by Goddard could look like NCDC is in fact “fabricating” data. Given the sorts of scandals that have happened recently with government data, such as the IRS “loss of e-mails”, the padding of jobs and economic reports, and other issues from the current administration, I can see why people would readily embrace the word “fabrication” when looking at NOAA/NCDC data. I get it. Expecting it because much of the rest of the government has issues doesn’t make it true, though.

What is really going on is that the FILNET algorithm, designed to fix a few stations that might be missing some data in the final analysis, is running a wholesale infill on early, incomplete data, which NCDC pushes out to their FTP site. The amount of infilling gets less and less as the month goes on, as more data comes in.

But over time, observers have been less inclined to produce reports, and attrition in both the USHCN and the co-op network is something that I’ve known about for quite some time, having spoken with hundreds of observers. Many of the observers are older people, and some of the attrition is due to age, infirmity, and death. You can see what I’m speaking of by looking through the quarterly NOAA co-op newsletter, seen here: http://www.nws.noaa.gov/om/coop/coop_newsletter.htm

NOAA often has trouble finding new observers to take the place of the ones they have lost, so it isn’t a surprise that over time we would see the number of missing data points rise. Another factor is technology; many observers I spoke with wonder why they still even do the job when we have computers and electronics that can do it faster. I explained to them that their work is important because automation can never replace the human touch. I always thank them for their work.

The downside is that the USHCN is a very imperfect and heterogeneous network and will remain so; it isn’t “fixable” at an operational level, so statistical fixes are resorted to. That has both good and bad influences.

The newly commissioned USCRN will solve that with its new data-gathering system; some of its first data is now online for the public.

[Figure: USCRN average temperature, January 2004 – April 2014]

Source: NCDC National Temperature Index time series plotter

Since this is a VERY LONG post, it will be continued…in part 2

In part 2 I’ll talk about things that we disagree on and the things we can find a common ground on.

Part 2 is now online here.

174 Comments
copernicus34
June 25, 2014 10:20 am

Thanks Mr Watts, important

June 25, 2014 10:27 am

Interesting post, and I look forward to part 2 in the near future.
I don’t understand a few things and hope you will address it in the future post. First, I don’t understand how the data sets always seem to make adjustments that warm the present and cool the past. In fact, I don’t understand the idea of changing the past (say 1940 or whatever) at all. Unless we have time machines and go back to read the thermometers better, why these changes?
The other big thing I fail to understand is reporting the temperature to hundredths of a degree. I understand that we humans have trouble reading an old time thermometer much better than to the nearest degree and that the electronic gadgets might report to tenths of a degree (accurately?). Am I wrong on that? If I am right, how can you average a bunch of temps that are only good to the nearest degree and get an answer to the nearest hundredth? Did they change measurement rules since I was in school?
I have a few more questions but those two are the ones that baffle me the most.
— Mark

Latitude
June 25, 2014 10:27 am

you guys realize you’re arguing over a fraction of a degree……you can’t even see it on a thermometer
..and debating “science” on a metric so small…..no one should really care

Steve McIntyre
June 25, 2014 10:34 am

Anthony, it looks to me like Goddard’s artifact is almost exactly equivalent in methodology to Marcott’s artifact spike – this is a much more exact comparison than Mann. Marcott’s artifact also arose from data drop-out.
However, rather than conceding the criticism, Marcott et al have failed to issue a corrigendum and their result has been widely cited.

Resourceguy
June 25, 2014 10:36 am

Thanks

Mark Bofill
June 25, 2014 10:46 am

Wait, what am I looking at here?
Ah! I remember what that’s called now. Integrity.
Don’t see examples of that all that often in the climate blogosphere.
Thanks Anthony.

NikFromNYC
June 25, 2014 10:47 am

Goddard willfully sponsors a hostile and utterly reason averse and pure tribal culture on his very high traffic skeptical blog where about a dozen political fanatics are cheerled on by a half dozen tag along crackpots who all pile on anybody who offers constructive criticism. His blog alone is responsible for the continuing and very successful negative stereotyping of mainstream skepticism by a highly funded alarmist PR machine. His overpolitization of climate model skepticism results in a great inertia by harshly alienating mostly liberal academic scientists and big city professionals who also lean left but who might otherwise be open to reason. I live two blocks from NASA GISS above Tom’s Diner, just above the extremely liberal Upper West Side and my main hassle in stating facts and showing official data plots is online extremism being pointed out by Al Gore’s activist crowd along with John Cook’s more sophisticated obfuscation crowd. Goddard’s regular conspiracy theory about CIA drug use to brainwash school kids into shooting incidents in order to disarm conservatives in preparation for concentration camps for conservatives is something skeptics should stop ignoring and start actively shunning. His blog is the crack house of skepticism.

Bloke down the pub
June 25, 2014 10:51 am

Am I being cynical to suggest that if the adjustments were making the temperature trend appear lower rather than higher, that a new method would have been introduced by now?

NikFromNYC
June 25, 2014 10:57 am

Backwards emphasis typo: “I explained to them that their work is important because the human touch can never replace automation.”
REPLY: Yes, a sign of my fatigue. Thanks – Anthony

Bloke down the pub
June 25, 2014 11:01 am

Anthony, you mention the uscrn and say it will fix the problem. Isn’t the number of sites on this network going to be limited? While it should give a much better indication of the true temp trend, won’t it be of limited use for infilling data? Until the ushcn is upgraded to uscrn standards surely it’s better to rely on satellite records?
REPLY: As far as I know, it was never intended to be used for data infilling, only to stand on its own merit. – Anthony

John Slayton
June 25, 2014 11:07 am

…the human touch can never replace automation.
Au contraire…
: > )
REPLY: Refresh, fixed while you were reading – Anthony

Dougmanxx
June 25, 2014 11:18 am

You can see plenty of stations that have “-9999” data points in the “raw” data. Those “-9999” data points auto-magically go away in the final. This process is called “infilling”, and yes, it’s completely bogus. I know it’s bogus because those “-9999” data points rarely go away in the “raw” data, indicating to me, that “data” was never actually reported for them, but “data” was created via “infilling”. Does some actual “data” come in late? Sure, but not the vast majority. So creating “data” where there is none is acceptable in science? Sorry, but I’ve lost a lot of respect for this blog if you support THAT kind of nonsense.
REPLY: Some actual data does come in quite late, sometimes months later. Remember that the B-91’s are mailed in. Then they have to be transcribed, error checked, and if possible errors recovered from handwriting mistakes (like a 9 looking like a 4) then they run the whole thing through a sanity check and then post the updated data. Some data never gets sent in until months later. I recall one observer I spoke with who told me he only mailed his B-91’s every 6 months to save on postage. When I pointed out how important his station was (He had no idea he was part of USHCN) he vowed to start doing it every month.
You have to watch very carefully to know which data points are latecomers and which are infills. Sometimes, infills in the final get retracted when actual B-91 form data arrives months late.
-Anthony

James Strom
June 25, 2014 11:19 am

I believe that adjustment and infilling are unavoidable. For example, after a point it would be irresponsible not to introduce new and more accurate measuring technologies, but then you have the problem of calibrating the old and new records for compatibility. But in each case of recalibration the possibility of error increases.
Steve Goddard has been going on about two different things. One is the quantity of “fabricated” data, which you address here, and the other is the comparison of raw and adjusted temperatures, in which the adjusted temperatures always seem to show a stronger warming trend. It would be useful for you to address this second type of claim; in fact, it would be useful to have a systematic audit of the weather services to ensure that their records and adjustment methods are unbiased.

john robertson
June 25, 2014 11:47 am

Ok so Goddard is over the top.
Good of you to do a sanity check.
I however doubt the concept of his, Goddards, claims being a weapon to discredit all sceptics.
There is no sceptic central is there?
The PR hacks of the C.C.C will do their normal smears no matter what.
It is all they have left.
WUWT’s evenhanded and honest approach to what little empirical data we have is all that is needed.
Now the graph posted does make my mind boggle.
Goddard exaggerates the corruption. Makes a 3 degree F adjustment appear.
But the correction posted shows an adjustment of 0.8 to 1 F.This is odious enough.
Given the total warming claimed Team IPCC ™ was less than the error bars of this historical data, are any “adjustments justified?

June 25, 2014 11:50 am

Sorry Anthony, you’re wrong on a couple points here.
This claim: “More than 40% of USHCN final station data is now generated from stations which have no thermometer data.” Is utterly bogus.
No, this statement is literally correct, if easy to misread. I have downloaded the files myself, it is a fairly trivial exercise in programming and I found 43% of the final (i.e. latest month) station data has the marker indicating the data was missing. (BTW, their data handling is ridiculously antiquated. It’s easy for me to write file-handling routines for hundreds of files, but non-programmers would probably prefer to get the data in Access or something.)
Can that statement be misinterpreted? Sure. And the UK article does, in fact, misinterpret it to mean “almost half the data is missing.” That’s only true for the last month, of course, not for the data set as a whole.
Now, a few caveats:
1) My guess is that this trend exists at least partly because some results trickle in late.
2) Steve tends to present these things in the least favorable light
OTOH, Steve seems to be correct that there is a warming trend in these adjustments and that using anomalies doesn’t work very well because they have essentially cooled the baseline.
Re the first graph: again, this is presenting the evidence in the least favorable light, but it is not wrong. Yes, the fact there is less data in the last point gives it much higher volatility, which should be taken into account, but it’s still interesting.
Obviously you (Anthony) have made tremendous contributions and I respect that you always strive to be very, very clear. At the same time, there’s real smoke here, even if Steve is kind of blowing it in the direction he favors 🙂

June 25, 2014 11:55 am

Some actual data does come in quite late, sometimes months later. Remember that the B-91′s are mailed in
Ah good, I was hoping someone knew this. As I said I suspected it.
For fun I will run the analysis again next month to prove the data is arriving and take a swing at an expected correction rate.

Tom In Indy
June 25, 2014 11:56 am

Anthony,
Could you also comment on this graph comparing “raw data” to “raw data with 30% of observations randomly removed”?
The implication is that infilling is unnecessary, so why do it in the first place. On the other hand, if the actual missing observations are not random, but have systematic component, then any infilling program should account for the systematic bias. Maybe it does, just asking.
REPLY: See the note about Part 2. I have actual money earning work I must do today -Anthony

Woodshedder
June 25, 2014 12:03 pm

Thanks Anthony. This was exactly what I was looking for.

June 25, 2014 12:11 pm

Its worth noting that the difference between the impact of adjustments found using Goddard’s method (the red line) and the other three methods in the second figure in the post is the effect that Goddard is incorrectly attributing to infilling.
http://wattsupwiththat.files.wordpress.com/2014/05/ushcn-adjustments-by-method-year1.png
It is actually an artifact introduced due to the fact that decreased station data availability in recent years has changed the underlying sampled climatology; e.g. the average elevation and latitude of stations in the network. NCDC’s infilled data tries to avoid this by infilling through adding the long-term station climatology to the spatially weighted average of anomalies from nearby stations.
Of course, if you use anomalies the underlying climatologies are held constant, and you are only looking at changes over time, so it avoids this problem.
As Steve McIntyre pointed out, this is pretty much the exact same issue that Marcott used to generate their spurious blade, as well as the issue that E.M. Smith used to harp on back in the days of the “march of the thermometers” meme.
In fact, infilling has no real effect on the changes in temperatures over time, as shown in Menne et al 2009: http://i81.photobucket.com/albums/j237/hausfath/ScreenShot2014-06-24at80004PM_zps20cf7fe9.png
Again, no one is questioning that there are adjustments in the USHCN record. These adjustments for things like TOBs and MMTS transitions are responding to real biases, but the way they are implemented is a valid point of discussion. Unfortunately Goddard is just confusing the argument by using an inappropriate method that conflates changes in station composition with adjustments.

Nick Stokes
June 25, 2014 12:11 pm

Steve McIntyre says: June 25, 2014 at 10:34 am
“Anthony, it looks to me like Goddard’s artifact is almost exactly equivalent in methodology to Marcott’s artifact spike”

Well, here’s something – I think this is a good post, and I think there is a Marcott analogy.
What Goddard does is to try to show the effect of adjustment by averaging a whole lot of adjusted readings and subtracting the average of a lot of raw readings. While many of the adjusted readings correspond to the time/place of raw readings, a lot don’t. The result reflects the extent to which those extra adjusted readings were from places that are naturally warmer/cooler, because of location or season.
It’s like trying to decide the effect of fertilizer by just comparing a treated field with an untreated. The treated field might just be better for other reasons. You have to be careful.
One way of being careful here is to use anomalies. An easy way of showing the problems with Goddard’s methods is to just repeat the arithmetic with longterm raw averages instead of monthly data. If the result reflects the differences between stations in the sets, you’ll get much the same answer. I showed here that you do. That’s not adjustment.
If you use anomalies, then the corresponding longterm averages (of the anomalies) should be near zero. There are no longer big differences in mean station properties to confound the results.
Marcott et al did use anomalies. But because they were dealing with long time periods during which a lot changed, the expected values were no longer near zero near the end of the time. So changing the mix of proxies, as some dropped out, did have a similar effect.
As Anthony says, you can’t always get perfect matching between datasets when you make comparisons. Months will have days missing etc. You have to compromise, carefully, or you’ll have no useful data left. Using anomalies is an important part of that. But sometimes it needs even more.

June 25, 2014 12:15 pm

All this talk of temp data set problems, data quality, and abuse-misuse of methodology is completely lost on all but the expert, it is arcane to the public.
But what will not be arcane empirical data is the when the average family experience multiple relentlessly cold winters with skyrocketing utility bills to stay warm, just so the Alarmists can say they are saving the world from CAGW. That will be the real world data points that folks like Mann, Holdren, and their co-conspirators can’t hide, as the Liberals try to shut down ever more carbon-fueled power plants.

June 25, 2014 12:19 pm

Another simple example of why Goddard’s method will lead you astray:
The graph linked below shows raw temperatures for the US from two sources: stations with almost no missing data (< 5 years) and all stations, calculated using Goddard's averaged absolute method and rebaselined to the 1990-2000 period.
http://i81.photobucket.com/albums/j237/hausfath/CompleteandAllStationsUSHCNRaw_zps5074f10e.png
Stations with complete records have higher trends than all stations. This is because declining temperature of all stations in Goddard's approach isn't due to any real cooling, but is simply an artifact of changing underlying climatologies.

JFD
June 25, 2014 12:27 pm

Interesting exposure of how some mathematical averaging methods can give incorrect answers. Thanks, Anthony. Two points:
1. Goddard’s graph does show creeping upward warming bias in the way USCRN handles the raw temperature measurements. This could also be due to using a faulty method as well. If not, then you and the other climate professionals need to support Goddard and find a way to expose this to the public.
2. You use the words “absolute temperature” to mean measured temperature. Absolute temperature has an exact meaning and is expressed in degrees Rankine or degrees Kelvin. If possible, my suggestion is use the words “measured temperature” or perhaps “actual temperature” to avoid miscomprehension.

Latitude
June 25, 2014 12:28 pm

Zeke, if you’re going to use raw temps from NOAA (USHCN)..”Stations with complete records have higher trends than all stations”..you need to explain this
http://stevengoddard.wordpress.com/2014/06/23/noaanasa-dramatically-altered-us-temperatures-after-the-year-2000/

flyfisher
June 25, 2014 12:29 pm

Thanks Anthony. This clears up a lot with the graphs from Goddard. The problem I still have is that when you consider: a. differences in record-keeping/sending spread among hundreds of different sites/people, b. switches to different types of measuring devices, c. conditions/placement differences of each measuring type, d. gridding to interpolate data…I have a very difficult time believing that ANY of the data generated using these methods is worth a damn. The error bars after all is said and done I’d imagine are so large as to make any type of reasonable analysis of temperature patterns utterly useless. As far as I’m concerned I’d have to think that satellite data is the only thing that should be considered since you have fewer ‘cooks contributing to the overall burgoo’.

Dougmanxx
June 25, 2014 12:29 pm

You have to watch very carefully to know which data points are latecomers and which are infills. Sometimes, infills get retracted when data arrives months late.
-Anthony
Helpfully, in the raw station data, they include an “E” character before the “infilled” data. It’s a simple matter to pull out how many instances of this there are, from the raw data. As Talldave2 says above, there are way too many. Our friend NikFromNY posted an example at Goddard’s site which I thought was comical, trying to show how wrong Steve was. He posted 5 years of data from a random site in New York State. He must not have “quality checked” it, because fully 20% of the monthly data was infilled. In the parlance of GISS: estimated. What I might call: made up. And none of that data was recent, so it wasn’t waiting on any kind of “late report”. The record is chock full of just that: made up data. And it’s becoming more and more prevalent as the record gets longer. The record is awful, I’m not sure why anyone claims otherwise. I’m not sure why you would defend it.

jmrSudbury
June 25, 2014 12:31 pm

Here is one example. In Aug 2005, USH00021514 stopped publishing data save two months (June 2006 and April 2007) that have measurements. Save those two same months, the final tavg file has estimates from Aug 2005 until May 2014. What is the point in estimating values beyond 2007? Why not just drop it from the final data set as it is no longer a reporting station?

Jim S
June 25, 2014 12:33 pm

There are applied sciences/engineering that “fill in data” or interpolate data, but those disciplines rely on continued feedback to make adjustments to their assumptions. They are using data and models to solve falsifiable problems. Data is NOT being used to generate/design the actual MODELS themselves. The models in other professions have been validated and verified independently of any given set of data. This is why it is wrong to “fabricate” data if the goal is to create climate models for use in geoengineering (i.e. limit CO2 outputs to X to affect temperature changes of Y, etc.).
It’s like meteorology. If a meteorologist wants to guess tomorrows temperature, fine. But let’s not confuse meteorology/climate science models with engineering models.

more soylent green!
June 25, 2014 12:39 pm

What else is wrong here – phone, paper, transcribed — seriously, what century are we living in? I guess we know the government can’t build websites, so I shouldn’t be so surprised.

June 25, 2014 12:40 pm

NCDC’s infilled data tries to avoid this by infilling through adding the long-term station climatology to the spatially weighted average of anomalies from nearby stations.
Which very conveniently adds a warming trend. I smell some fertilizer, all right 🙂
One way of being careful here is to use anomalies.
One way of avoiding detection by anomaly is to cool the baseline. Oh look, you did that! Interesting you use “careful” in the context of justifying adjustments that are adding a confirmation-bias-friendly warming trend that is helping to drive trillion-dollar climate policy. Normally one would be very “careful” to avoid that sort of thing.
an artifact of changing underlying climatologies.
And thanks to the discovery of these “changing underlying climatologies” we just happened to add warming to the 20th century since 2001.

June 25, 2014 12:48 pm

Latitude,
There is a very simple explanation. NASA GISS used to use raw USHCN data. They switched at some point in the last decade to using homogenized USHCN data.
Again, no one is questioning that there are adjustments in the USHCN record. These adjustments for things like TOBs and MMTS transitions are responding to real biases, but the way they are implemented is a valid point of discussion. Unfortunately Goddard is just confusing the argument by using an inappropriate method that conflates changes in station composition with adjustments.

Alf
June 25, 2014 12:48 pm

Time for the truth. Either I quit following Steve’s bog or he is vindicated and his ideas given more prominence

Dougmanxx
June 25, 2014 12:50 pm

So why not put this whole silly exercise in futility to bed? It can be done simply and with information that already exists. Goddard claims that past data is being tampered with. Fine. To show how legitimate the calculations are, from now on simply include the “average temperature” used to calculate the “anomalies” for every year. If there’s nothing to what Goddard says, those “average temperatures” will be the same for the past month-after-month-after-month (understanding that many current ones will change a bit due to “late reports”, as Anthony pointed out to me) and we can all say:”Gosh that Goddard guy nearly had me bamboozled!” Why not? Everyone knows the reasons to use “anomalies”, we get the idea behind showing how temps change in a way that is transparent to changes in recording instruments, siting, etcetcetc… So…. why not include that tiny little bit of information, it must exist, in order for an “anomaly” to be calculated? So: do it. Prove Goddard wrong with facts. Do it, so we can look at the temps for every month in 1957 or 1969 or 1928 or 1936 or any other year and see they are the same today, as they will be 2 years from now. Do that. It’s simple, and it puts to rest any of these arguments.

June 25, 2014 12:52 pm

talldave2,
You seem somewhat confused about exactly what anomalies mean. Using anomalies can preclude calculating absolute temperatures (though averaging absolute temperatures, or even spatially interpolating them, generally does not do a good job of calculating “true” absolute temperatures due to poor sampling of elevation fields and similar factors). Using anomalies generally cannot bias a trend. So claiming that anomalies would somehow hide cooling or exaggerate warming is misguided. Using absolutes can bias a trend if the composition of the station network is changing over time.
As I discussed in a recent Blackboard post, using anomalies ironically reduces global land warming by 50% compared to Goddard’s averaged absolute method: http://rankexploits.com/musings/2014/how-not-to-calculate-temperatures-part-2/

June 25, 2014 12:53 pm

Stations with complete records have higher trends than all stations
I don’t think anyone would be surprised by that, as everyone knows we are losing more rural data than urban, and McIntyre showed the urban stations have a higher trend. What’s problematic is that you’re smearing the uncorrected UHI across those lost stations.
Unfortunately Goddard is just confusing the arguement by using an inappropriate method that conflates changes in station composition with adjustments.
The method seems appropriate if the goal is find out what effect those changes in station composition are having on the trend.

June 25, 2014 12:55 pm

Its worth mentioning again that infilling (as done by NCDC) has virtually no effect on the trends in temperatures over time. It only impacts our estimates of absolute temperatures if we calculate them by averaging all the stations together.
Here is the difference between USHCN homogenized data without infilling (TOBs + PHA) and with infilling (TOBs + PHA + FILNET). The differences are miniscule: http://rankexploits.com/musings/wp-content/uploads/2014/06/USHCN-infilled-noninfilled.png

Robert of Ottawa
June 25, 2014 12:56 pm

NikFromNYC, you cannot criticize Goddard for “over-politicizing” global warming. It’s the Warmistas, the Lysenkoists, that have made it a political issue.

June 25, 2014 12:58 pm

talldave2,
Steve McIntyre mentioned just a few posts up that Goddard’s method will result in spurious artifacts when the composition of stations is changing over time. In this particular case, you can avoid most of these artifacts by using spatial gridding, anomalies, or ideally both.

Steve McIntyre
June 25, 2014 12:58 pm

Further to Zeke and Nick Stokes comments above acknowledging the similarity of Goddard’s error to Marcott’s error, there is, of course, a major difference. Marcott’s error occurred in one of the two leading science journals and was not detected by peer reviewers. Even after the error was pointed out, Marcott and associates did not issue a corrigendum or retraction. Worse, because Marcott and associates failed to issue a corrigendum or retraction and because it was accepted just at the IPCC deadline, it was cited on multiple occasions by IPCC AR5 without IPCC reviewers having an opportunity to point out the stupid error.

Latitude
June 25, 2014 12:58 pm

These adjustments for things like TOBs and MMTS transitions are responding to real biases, but they way they are implemented is a valid point of discussion
Zeke, so in your opinion switching from raw data to homogenized data….is justification for changing a cooling trend…into a warming trend
NOAA changed sometime after 2000……1999 was not the dark ages

June 25, 2014 1:00 pm

So claiming that anomalies would somehow hide cooling or exaggerate warming is misguided.
This again? http://stevengoddard.wordpress.com/2014/06/24/no-you-dont-want-to-use-anomalies/

Latitude
June 25, 2014 1:05 pm

“Its worth mentioning again that infilling (as done by NCDC) has virtually no effect on the trends in temperatures over time.”
Well of course, and who cares…..they had already cooled to past to show a warming trend that didn’t exist….
…Prior to 2000, they used raw data that showed a cooling trend….
…after 2000, they changed to homogenized data…which instantly showed a past warming trend
After 2000 the past warming trend stopped showing up on the satellite temperature record…..and temps went flat
Take away the change from raw data to homogenized data…..and there’s no warming trend at all

Steve McIntyre
June 25, 2014 1:05 pm

Zeke wrote: “Steve McIntyre mentioned just a few posts up that Goddard’s method will result in spurious artifacts when the composition of stations is changing over time. In this particular case, you can avoid most of these artifacts by using spatial gridding, anomalies, or ideally both.”
Zeke, I commented on Marcott’s method. I didn’t directly comment on Goddard’s method as I haven’t examined it. Based on Anthony’s description, I observed that its artifact spike appeared to arise from a similar phenomenon as Marcott’s. As you observe, there are a variety of ways of allowing for changing station composition. It seems to me that mixed effects methods deal with the statistical problem more directly than anomalies, but in most cases, I doubt that there is much difference.
In Marcott’s case, because he took anomalies at 6000BP and there were only a few modern series, his results were an artifact – a phenomenon that is all too common in Team climate science.

KNR
June 25, 2014 1:09 pm

One of things that marks of AGW has have more religion than science outlook is the endless need to defend someone which has entered its dogma, no matter how poor its quality or even how much it disagrees with reality . The stick and 97% are classical examples of that, both are poor from top to bottom but are are regarded as unquestionable and unchallengeable by the AGW hard core.
In science a challenge is welcome, for that is often how progress is made , remember its ‘critical review ‘ which should be done and questions ,even if they dum ones , are what you expect. Its politics and religion that makes claims to ‘divine unchangeable truth ‘ science should always be willing to ‘prove it’

June 25, 2014 1:11 pm

Yes, but that’s a bit like saying “Todd’s often drunker than Jim, let’s have Jim drive us home” — that’s not such a great plan if you already know Jim’s passed out in the corner.
When you don’t use the anomalies, it immediately becomes obvious that the baseline has cooled. When you use the anomalies, it looks like nothing much happened.
Except, again… from the published data, we already know something happened, don’t we? So when you show up and say “ignore the temperatures that were actually measured, use the anomaly!” you can forgive us for hearing it as “pay no attention to that man behind the curtain!”

June 25, 2014 1:17 pm

The graph above is courtesy of Zeke Hausfather Who co-wrote that blog entry with me.

=====================================================
Small typo below the second graph. Shouldn’t that be Dr. Who? 😎

Nick Stokes
June 25, 2014 1:20 pm

Steve McIntyre says: June 25, 2014 at 12:58 pm
“Further to Zeke and Nick Stokes comments above acknowledging the similarity of Goddard’s error to Marcott’s error, there is, of course, a major difference. Marcott’s error occurred in one of the two leading science journals and was not detected by peer reviewers.”

There’s another major difference. Marcott’s paper was not about temperatures in the twentieth century. It is obvious that poorly resolved proxies are not adding to our already extensive knowledge there. Marcott’s paper was about temperatures in the Holocene, and there it made a major contribution.
I agree that noise at the end of the series should not have been allowed to cause a distraction. But it does not undermine the paper.

JeffC
June 25, 2014 1:22 pm

So what % of the data is made up ? he says 40% … you say he’s wrong but don’t put out your own number … seems like you don’t like the word “fabricate” … so what ? I don’t like the statistics mumbo jumbo you spout here … I put up with it …
you may be getting alittle full of your nitpicking abilities …
How about getting something published so that we can push back against this AGW nonsense ?
How about you refute Cook with your own study ?
Otherwise WUWT is turning into a niche corner of very intelligent posts that refute study after study but in the end never do a damn thing to stop the AGW crowd … your are preaching to the choir and getting stomped in the press and the publications …
{You could say the same exact thing about “Steve Goddard”. Anthony has in fact published some papers. there was also an inspector general review of NCDC due to his surface stations project. Can you cite anything like that from “Goddard”? -mod}

June 25, 2014 1:24 pm

BTW here is McIntyre’s UHI paper, quite elegant in its simplicity. Major league sports doesn’t cause UHI, but is strongly associated with more urban areas.
http://climateaudit.org/2007/08/04/1859/
Hansen 2010 (iirc) tried to rebut this using Google Nightlights, but a quick check revealed it was not precise enough for that use.
And as I recall Tony also later found a warming trend.
Lance makes a great point above — there are undoubtedly real biases that introduce spurious cooling, but there seems to be a much greater incentive to remove those as opposed to those that induce spurious warming — so much so that the warming trend in the past keeps increasing. And in a signal as noisy as this, it’s way too easy to find the trend you want.

Rud Istvan
June 25, 2014 1:29 pm

I wrote an essay about this after studying the matter as carefully as I could. Goddard is right about past cooling adjustments, mostly apparently inserted through homogenization. Easily provable either by specific locations (Reykjavik Iceland, Sulina Romania, and Darwin Australia were examples used), by state (California and Maine) by country (US, Australia, New Zealand) and by data set (NCDC, GISS, and HADCrut 4).
He is wrong in his recent posts about what is being done. And should stop it, as it opens up ‘flat earth’ avenues of dismissal of the fact that records have been adjusted, and in the opposite direction from which UHI bias over time would properly have been handled.

Latitude
June 25, 2014 1:35 pm

talldave2 says:
June 25, 2014 at 1:11 pm
Except, again… from the published data, we already know something happened, don’t we? So when you show up and say “ignore the temperatures that were actually measured, use the anomaly!” you can forgive us for hearing it as “pay no attention to that man behind the curtain!”
=====
Dave thanks, that’s an even better example of what I was saying….
Zeke says NASA/GISS switched from NOAA raw data to homogenized data…..that was around 2000
….that switch, and switch alone, caused the warming trend prior to 2000
Conveniently, after 2000, the warming trend stopped
The only trend in warming was an adjustment to past temp history….
And people are basing their science on that garbage…….

AlexS
June 25, 2014 1:36 pm

“If we were to throw out an entire month’s worth of observations because one day out of 31 is missing, chances are we’d have no national temperature average at all. ”
This is one the most dangerous affirmations done here.
It might force people to accept bad data just because it needs a result.

June 25, 2014 1:37 pm

BTW Nick apologies, I think I scrolled too fast and confused your post with Zeke’s. Mea culpa!
At any rate, unless the NSF wants to give me $750K instead of using it on a play about the perils of global warming (which is a totally obejctive, scientific, and unobjectionable use of taxpayer funds) I have to get back to the private sector here, so I’ll just leave you with this thought: is it really plausible that every economist is wrong about the US having a really cold Q1, and that we had record-late Great Lakes ice in a year that was relatively warm for the region? I think if we pull enough proxies together and look for correlations, we will probably find that the simple average, for all its many flaws, appears closer to reality than what is being published.

pouncer
June 25, 2014 1:41 pm

The P.R. aspect is to raise the issue during the same week that “data loss” takes the media spotlight, with “hard drive crashes” reportedly afflicting both the US Treasury /IRS and the US Environmental /EPA. The notion that temperature data might be missing, perhaps deliberately MADE to be missing, is more plausible this week than it would have been say last October.

June 25, 2014 1:50 pm

Talldave2,
Interestingly enough, my recent paper on UHI found results similar to Steve’s blog post in the raw data: ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/papers/hausfather-etal2013.pdf
Also, you may have missed the part in Steve’s post where he uses anomalies rather than absolute temperatures :-p

NikFromNYC
June 25, 2014 1:56 pm

Nick Stokes lawyerlike glossed as usual: “Marcott et al did use anomalies. But because they were dealing with long time periods during which a lot changed, the expected values were no longer near zero near the end of the time. So changing the mix of proxies, as some dropped out, did have a similar effect.”
A pattern of rent and fame seeking alarmist behavior exists that now in 2014 demands a much stronger condemnation of Marcott 2013. Marcott et al. *purposefully* re-dated proxies in order to get their alarmist paper published at all in top journal Science since without that re-dating there was no result at all except proxies that didn’t go anywhere and thus were likely not even valid proxies at all. Then a co-author himself described the completely spurious and thus completely fraudulent blade to New York Times reporter as a “super hockey stick” using a swoosh gesture. Then comes its promotion (and eventual post-headline news whitewashing) by hockey stick team PR machine at RealClimate.org, a site registered to a notorious junk science promoting PR firm, and a real media sensation resulted that helped propagate the emergency level funding of climate “science.”
http://youtu.be/PgnMuKuVXzU
This co-author of all people was very well aware that the blade was a mere artifact so his promotion of it amounts to criminal fraud and should be prosecuted as such, just like this AIDs researcher is now facing criminal charges:
http://www.cbsnews.com/news/aids-researcher-charged-with-fraud-for-falsifying-data/
“Responding to a major case of research misconduct, federal prosecutors have taken the rare step of filing charges against a scientist after he admitted falsifying data that led to millions in grants and hopes of a breakthrough in AIDS vaccine research.”

pdtillman
June 25, 2014 2:05 pm

Heh. A pleasure to read a WUWT comment thread that’s mostly about science, with only a bit of political cant.
Thanks, Anthony, for looking into this.

JFD
June 25, 2014 2:07 pm

Nick says: I agree that noise at the end of the series should not have been allowed to cause a distraction. But it does not undermine the paper.
———-
Nick, work such as Marcott’s is being used by governments such as the USA to declare war on fossil fuels. Experts may be able ignore the bad parts of Marcott’s paper, but governments are not peopled by experts in climatology or experts for much of anything to do with science. Marcott should have withdrawn his paper, corrected his errors, rewritten the paper and resubmitted it. Your defense of those with moral ineptitude such as Marcott does not help the world.

Rob
June 25, 2014 2:07 pm

Wow!
Great post.
Hang on to all original data folks.
Counterfeit is everywhere!

June 25, 2014 2:09 pm

Steve McIntyre says:
June 25, 2014 at 12:58 pm
Further to Zeke and Nick Stokes comments above acknowledging the similarity of Goddard’s error to Marcott’s error, there is, of course, a major difference.
##############################
Steve, we have over the years fought against bad methods.
Can you help by simply explaining to people why Goddard’s method of averaging absolute temperatures is simply wrong?
1. this issue does not need to be confused with discussions of Marcott
2. this issue does not need to be confused with discussions of adjustments.
Perhaps you and McKitrick and Spencer can join Anthony in a forthright criticism of Goddard’s approach. It’s a bad method. It gives spurious results.
Enough with the silence of the skeptical lambs.

Latitude
June 25, 2014 2:10 pm

“The only trend in warming was an adjustment to past temp history….”
correction: I’m falling into the trap of using their words too….
The only trend in warming was from throwing out the old temp data prior to 1999…
…and using a totally different set of data
That’s not called an adjustment…

Editor
June 25, 2014 2:17 pm

Zeke Hausfather – I’m interested in your comment : “Its worth mentioning again that infilling (as done by NCDC) has virtually no effect on the trends in temperatures over time.“. If we know what the temperature trend is without infilling, then we don’t need to infill in order to get the trend. If we don’t know what the temperature trend is without infilling, then we can’t know that infilling “has virtually no effect” on the trend.

June 25, 2014 2:18 pm

“If we were to throw out an entire month’s worth of observations because one day out of 31 is missing, chances are we’d have no national temperature average at all. So the method was created to fill in missing data from surrounding stations. In theory and in a perfect world this would be a good method, but as we know the world is a messy place, and so the method introduces some additional uncertainty.”
This is a bit of an exaggeration.
You can construct a US average and a global average using daily data where you require 100% data coverage.
See our daily data product.
Then you can also test how many days of drop-out you can have without affecting the result.
Then you can test how good an infilling routine is.
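A minimal sketch of the drop-out test described above, using entirely synthetic daily data (the station values, drop-out fractions, and the simple averaging are all invented for illustration; this is not NCDC's method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "complete" daily record: 10 stations x 365 days (all values invented).
n_stations, n_days = 10, 365
seasonal = 15 + 10 * np.sin(2 * np.pi * np.arange(n_days) / 365)
daily = seasonal + rng.normal(0, 2, size=(n_stations, n_days))

full_average = daily.mean()   # the 100%-coverage benchmark

# Randomly drop an increasing fraction of station-days and see how far
# a simple average of whatever remains drifts from the benchmark.
for drop_frac in (0.05, 0.10, 0.25, 0.50):
    keep = rng.random(daily.shape) > drop_frac
    degraded_average = daily[keep].mean()
    print(f"{drop_frac:.0%} dropped: average shifts by {degraded_average - full_average:+.3f} C")
```

The point is only that one can measure, rather than assert, how much missing coverage moves a simple average.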

NikFromNYC
June 25, 2014 2:20 pm

JeffC taunted: “How about you refute Cook with your own study?”
Well we do have two published papers already doing that, first one by Legates, Soon, Briggs & Monckton that clarified his 97% claim as really being 0.2% when you use the proper IPCC consensus definition of half of recent warming being anthropogenic, and now we have Richard Tol’s overall debunking of his entire methodology as being that of an unscientific hack. What else do you want? The meteorologist study? That showed 48% skepticism towards climate alarm. And already the American Physical Society has filled half of its new climate statement committee with skeptics replacing activists and might that open the dam once and for all? You bet it will. Was it not enough to expose Cook’s intent to widely promote his finding of consensus *before* he did the actual analysis?! I think that’s what’s known as “being busted!” And are we really losing the debate? In the last few years, three major Western countries have severely turned away from climate alarm, as has fully half of the American political machine, with Canada, Britain and Australia turning sharply to the right in general in a backlash against climate alarm and related liberal excesses.
“To achieve this goal, we mustn’t fall into the trap of spending too much time on analysis and too little time on promotion. As we do the analysis, would be good to have the marketing plan percolating along as well.” – John Cook, who was also busted early on by Tol for using the bizarre boutique search term “global climate change” to cherry-pick studies of the effects of climate change by everyday naturalists while omitting studies of its possible attribution to mankind. No mainstream climatologist uses Cook’s phrase rather than “global warming” or “climate change”, and absolutely no skeptic would use such a weird term when trying to expose global warming claims as false, because no skeptic accepts the silly term “climate change”, since it represents propaganda over clarity.

June 25, 2014 2:28 pm

Mike Jonas,
It’s pretty simple to calculate. Average all the anomalies excluding any infilled values. Average all the anomalies including any infilled values. Compare the two results.
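A minimal sketch of that comparison, assuming a hypothetical long-format table of station anomalies with a flag marking infilled values (the column names and the demo data are invented):

```python
import numpy as np
import pandas as pd

def trend_per_decade(annual_series):
    """Least-squares slope of an annual series, expressed per decade."""
    years = annual_series.index.values.astype(float)
    return np.polyfit(years, annual_series.values, 1)[0] * 10

def infill_effect_on_trend(df):
    """Trend with infilled values included vs. excluded.

    df is assumed to have columns: year, anomaly, is_infilled.
    """
    with_infill = df.groupby("year")["anomaly"].mean()
    without_infill = df[~df["is_infilled"]].groupby("year")["anomaly"].mean()
    return trend_per_decade(with_infill), trend_per_decade(without_infill)

# Made-up demo: a 0.1 C/decade signal with 10% of values flagged as infilled.
rng = np.random.default_rng(1)
years = np.repeat(np.arange(1950, 2014), 100)
demo = pd.DataFrame({
    "year": years,
    "anomaly": 0.01 * (years - 1950) + rng.normal(0, 0.5, years.size),
    "is_infilled": rng.random(years.size) < 0.1,
})
print(infill_effect_on_trend(demo))
```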

Tom In Indy
June 25, 2014 2:28 pm

REPLY: See the note about Part 2. I have actual money earning work I must do today -Anthony
Sorry, I should have made it clear that I was asking for a discussion of that particular chart to be included in part 2. I didn’t expect you to spend time on anything new in this thread. I am on vacation from my actual money-earning work and have no excuse for not taking the time to consider that you might think I was expecting an immediate response.

June 25, 2014 2:28 pm

From the various comments made here today; I get the distinct impression that the data sets are a mess and that it would be all too easy for those working on the temperature data sets to “see what they want to see and disregard the rest”. We have gotten to the point where we are arguing about what words to use when describing the “adjustments”. It looks like the anomaly game makes it fairly easy to hide any “fiddling” or to hide plain old observational bias: especially if we change the baseline when we need to do so. And the people working on these data sets are biased — never think they are not.
But on a slightly different note (very related though) is this post by the Scottish Skeptic:
http://scottishsceptic.wordpress.com/2014/06/25/my-best-estimate-less-than-1c-warming-if-co2-level-is-doubled/

GeologyJim
June 25, 2014 2:33 pm

Question for all:
From Anthony’s work on stations and temp measurements, I’ve learned many things that cast doubt on the climatological value of these data, regardless of whether they are stated as measured values or anomalies.
One huge problem being overlooked in this discussion (I believe) is that of the “daily average temperature”, calculated by adding the daily high value to the daily low value and dividing by 2.
Daily highs and daily lows should have different long-term trends if, as Anthony has shown, the overnight lows are creeping upward due to “urbanization”. Calculating a daily average mashes together the distinct data recorded as highs and lows. Reducing the number of stations in the calculation adds a warming bias (more stations retained where more people live). Spatial averaging (gridding) just smears the warm-biased data over larger areas and enables NASA to produce really scary looking global anomaly maps with warm-biased color schemes.
The averages seem to show upward trends over time, but much of that seems due to rising overnight low values – – not to rising daytime high temperatures. Does anyone really doubt this?
Record high temps are still much more frequent in the 1930s-1940s than today.
I suggest if one were to separately calculate trends based on daily high values vs daily low values, the upward trend would clearly be shown by the daily-low data, and not shown by the daily-high data.
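A minimal sketch of the split-trend check suggested above, using invented annual series for one hypothetical station (the numbers are made up purely to show the mechanics of fitting separate Tmax, Tmin and Tmean trends):

```python
import numpy as np

def trend_per_decade(years, values):
    """Least-squares slope in degrees per decade."""
    return np.polyfit(years, values, 1)[0] * 10

# Invented annual means of daily Tmax and Tmin for one hypothetical station:
# lows drift upward faster than highs, as the comment suggests can happen.
years = np.arange(1950, 2014)
rng = np.random.default_rng(2)
tmin = 5.0 + 0.020 * (years - 1950) + rng.normal(0, 0.3, years.size)
tmax = 20.0 + 0.005 * (years - 1950) + rng.normal(0, 0.3, years.size)
tmean = (tmax + tmin) / 2          # the conventional (high + low) / 2 construction

for name, series in (("Tmin", tmin), ("Tmax", tmax), ("Tmean", tmean)):
    print(f"{name}: {trend_per_decade(years, series):+.3f} C/decade")
```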

June 25, 2014 2:36 pm

perhaps, You and McKittrick and Spencer can join Anthony in a forthright criticism of Goddards approach… Enough with the silence of the skeptical lambs
Translation: “I am a climate alarmist, and I enjoy infighting among skeptics. So, boys, let’s you and him fight!”
We are all better off constantly pointing out that the alarmist crowd has ownership of total failure. Not one of their numerous CAGW predictions has happened. Not one of them predicted that global warming would stop for nearly two decades. No model predicted that, either.
Now their tactic seems to be to try to foment dissension among any skeptics who have slightly different opinions or methods. Sort of like the Dems arranging for McCain to be the R nominee, isn’t it?

June 25, 2014 3:08 pm

Thanks for a very well written and necessary post. Goddard is a crank and cranks on either side should be exposed as such. There are all sorts of problems with the temperature records but Goddard helps nobody but himself.

Eliza
June 25, 2014 3:08 pm

I think this site is really falling for it. Has anyone actually visited the NOAA or NCDC monthly report sites??? It’s all about the warmest 100th month in the last 1000 years etc. They are totally biased toward the AGW agenda. In regards to Goddard’s site, all I know is that nearly all his analysis of past temperature adjustments, especially USA ones, has been spot on. The one in contention is just one of many. For example, adjustments of USA temperatures re 1934 v. current warmest etc. All the articles and data from the past are not faked. No wonder the warmistas are SH@@@ scared of him as he keeps a meticulous record of ALL data and articles. The likes of Zeke etc. and Mosher, who are simply computer geek warmist trolls in my view (visit their sites), basically live off or love modeling, the curse of climate science, as they shall find out no doubt in coming years. LOL. By the way, Goddard’s contention seems to be completely supported by John Coleman etc. Just check his show on USA temperature data tampering. BTW what about BOM, New Zealand data tampering? Of course Goddard is right and you are way wrong. My respect for this site dived dramatically when Watts as an excuse posted at Lucia’s site the fact that Goddard was a pseudonym (who gives a royal ####, it’s the data stupid!). You’ve lost me anyway until you get the message that AGW is over, finito, it ain’t happening etc., and you are basically feeding the warmist trolls here

June 25, 2014 3:09 pm

The word “fabrication” is the wrong word to use, as it implies the data is being plucked out of thin air. It isn’t – it is being gathered from nearby stations and used to create a reasonable estimate.
Ok, change “fabrication” to “estimation by infill”.
The trend of the chart raises eyebrows, and it should.
Why should year 2005 have more infill than 1995?
Remember, infill is only to replace gaps in raw data with estimates (who knows how good) from nearby stations (of unknown quality). Why the heck should there be more gaps in 2005 data than in all of 1990-1999?
I think Goddard has a point about the drop-off of raw data (blue) compared to a constant number of stations (red). If anything, our data coverage should be getting better as attention and money get drawn to CAGW. Good ol’ peer review might get to the bottom of it. But it needs addressing.

Patrick B
June 25, 2014 3:10 pm

Infills or whatever you call it – IT IS NOT DATA – in science, data is information you collect from observation. Anything after that is not DATA – it may be derived from data but no one trained in science would call it data. Just one more example of corruption of science and language in climate “studies”.

David Riser
June 25, 2014 3:38 pm

While Mr. Goddard may be a bit over the top at times, his basic message is sound and it provides some interesting thoughts on the data. Of particular interest is the fact that his methodology leads to artifacts similar to what happens when the creators of global temperature data sets do the same thing. Frequently you get a different anomaly number a few months later, after they add in some more data and do a bit more mashing of the numbers, and rarely do they ever bring the change to light.
I think it’s useful to discuss the issue, but I also applaud Mr. Goddard for his efforts. The fact that he listens to feedback is the sign of someone actually interested in science. His focus is different and his suspicions of motive have some basis in fact. I would be more inclined to caution him about being over the top if the major data set producers were a bit more open about method and over-time changes. Specifically, I would like to see the code that produces some of this homogenization. I suspect that Mr. Hansen’s original work is very biased.
On a different note, I am looking forward to Anthony’s release of his latest surface station work.
v/r,
David Riser

Alexej Buergin
June 25, 2014 3:49 pm

Can anybody tell me which version of the temperatures in Reykjavik in the year 1940 is the correct one (maybe the two Tonys can agree at least on this one)?

Mike T
June 25, 2014 4:09 pm

more soylent green! says:
June 25, 2014 at 12:39 pm
What else is wrong here – phone, paper, transcribed — seriously, what century are we living in? I guess we know the government can’t build websites, so I shouldn’t be so surprised.
You have to remember that most of these observers are volunteers, and may not be comfortable with “new technology”. Also, means to transmit data in real time cost money, unless some way can be found to do it online (which, after a fashion, is what is happening in Australia).
An earlier comment was made about averages. I’m not a statistician, but from what I remember, means are always calculated to one decimal place more than the data being averaged. Since meteorological instruments such as thermometers are read to one decimal place (at least in this country, where we use Celsius), the mean would be to two. I guess US stations that may be read to one degree F would be averaged as if the reading was “.0”, I don’t know. It does seem counter-intuitive to come up with a mean figure which is more precise than the actual readings. In the case of monthly averages, they would be calculated to two decimal places and then rounded (for coding purposes, to a whole number if required), rounded to the odd, so 28.5C would become 29C (a small sketch of that tie-breaking rule appears after this comment).
I have a concern about older data being massaged, especially if they end up colder than present readings, since there is a difference between manually read and electronic thermometers, in that maximum thermometers read slightly lower max temps than electronic probes: if any modifications were required, I’d have thought that older temps should be warmed slightly, not cooled, to compare with modern instrumentation.
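A small sketch of the round-half-to-odd tie-breaking rule described in the comment above; this only illustrates the stated rule, and is not a claim about how any agency actually codes its monthly means:

```python
def round_half_to_odd(x):
    """Round to the nearest integer, breaking exact .5 ties toward the odd integer."""
    lower = int(x // 1)
    frac = x - lower
    if abs(frac - 0.5) < 1e-9:          # exact half: pick whichever neighbour is odd
        return lower if lower % 2 == 1 else lower + 1
    return round(x)

print(round_half_to_odd(28.5))   # 29, as in the comment's example
print(round_half_to_odd(27.5))   # 27
print(round_half_to_odd(28.4))   # 28
```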

NikFromNYC
June 25, 2014 4:25 pm

David Riser muses: “While Mr. Goddard may be a bit over the top at times, his basic message is sound….”
No, his basic message is that the government is using the CIA to brainwash children into killing their classmates in order to promote gun control.
MUST I REPEAT THAT AGAIN?
His main commenters include a convicted daughter/son rapist who continually spams his site nearly every day of the week, the *second* most popular skeptical blog, with bizarre World War Three scenarios about nuclear science conspiracies and an iron sun, and another of his main commenters publishes books about fractal coastlines containing animal figures that match star constellations in a way that proves that ancient gods played sandbox with Earth in order to get our attention. His other message is that mainstream *skeptics* are conspiring to enable climate alarm. He is a classic paranoid egomaniac who is acting like a guru for a parade of freaks that now includes as his bulldog one of the most politically divisive right wing activists of all time, Jim Robinson of FreeRepublic.com whose forum is regularly used to negatively stereotype conservatives as being bigots, since most of his commenters are exactly that, as revealed by a Twitter account that archives the best of the worst there:
https://twitter.com/FreeRepublicTXT
I have never been so viciously attacked as by Goddard’s new pet Robinson who repeatedly dubbed me “-=NikFromBellevue=-” without any moderation by Goddard. When I defended myself, Goddard banned me outright as others taunted me for not “daring” to reply, all for asking for a few before/after station plots as Goddard claimed data was being fabricated. I wanted to clarify his claim so that perhaps I could create one of my infographics about it, but his claim evaporated instead, as it did before, except this time he got serious media attention for his hacked together claim, setting skepticism back profoundly, making us no better than hockey stick promoters except we lack enough academic affiliation to support us.
What a terrible fiasco this Goddard fool has become.
-=NikFromNYC=-, Ph.D. in carbon chemistry (Columbia/Harvard)

Latitude
June 25, 2014 4:32 pm

Nik…you have been thread bombing Goddard’s site for months….Goddard asked you for over a week to please stop thread bombing….he warned you for days that he would ban you if you didn’t stop….you ramped it up
and he finally banned you
..now take his pictures off your mirror, go out and get some air

Latitude
June 25, 2014 4:36 pm

Alexej Buergin says:
June 25, 2014 at 3:49 pm
Can anybody tell me which version of the temperatures in Reykjavik in the year 1940 is the correct one (maybe the two Tonys can agree at least on this one)?
=====
It’s somewhere between Arctic and tropical…..
The animation below flashes between GISS V2 and V3
http://stevengoddard.files.wordpress.com/2012/05/iceland-1.gif

James Strom
June 25, 2014 4:40 pm

talldave2 says:
June 25, 2014 at 1:11 pm
Thanks for that link, which I repeat:
http://stevengoddard.wordpress.com/2013/12/21/thirteen-years-of-nasa-data-tampering-in-six-seconds/
That’s four adjustments of a data set, with the warming trend increasing each time. It would seem NASA has a lot of explaining to do, though I wouldn’t a priori rule out some valid reason. I would also point out that the underlying data is in the form of anomalies, so you don’t always need absolute temperatures to expose questionable adjustments.

June 25, 2014 4:47 pm

… now take his pictures off your mirror, go out and get some air
That was funny; enjoyed that line. I am reminded of the crack by Mencken that you could drag an idiot through the university and confer a Ph.D on him, but he will still be an idiot. (from memory and not an exact quote)

Scott Scarborough
June 25, 2014 4:58 pm

Goddard often quotes Hansen in 1999 as saying that there has been no global warming in the US and then shows a GISS plot from 1999 showing no warming. Then Goddard shows a GISS plot from 2003 where the historical data has changed and there is a record of warming. Is this true or not? If it is true, how do you explain it? (Goddard apparently has an explanation but you don’t like it. What is your explanation?).

June 25, 2014 5:23 pm

“Goddard often quotes Hansen in 1999 as saying that there has been no global warming in the US and then shows a GISS plot from 1999 showing no warming. Then Goddard shows a GISS plot from 2003 where the historical data has changed and there is a record of warming. Is this true or not?”
I started to follow AGW science and politics on a daily basis circa 2007. Back in those ancient times one curiosity we were all aware of was that the US land temps did not show any warming trends.
The explanation for this, no doubt, is that adjustments have corrected errors and biases in earlier records.

David Riser
June 25, 2014 5:28 pm

Nik,
Calm down; when you rant it kind of ruins your message. You’re obviously upset and exaggerating a bit. The internet is pretty full of nuts and they spill out all over; Mr. Goddard is hardly a nut. He does pretty good work and he communicates, if you’re civil. It’s kind of hard to paint any particular blog based on who a few commenters are; in the US you are allowed your opinion and free speech is mostly protected.
I would suggest not commenting when you’re emotionally compromised; give it some time and write calmly and deliberately. So for example you spout off about Mr. Goddard’s basic message. I don’t doubt that there are people who do believe what you ascribe as Mr. Goddard’s basic message, but I am reasonably sure that nothing of the sort was authored by Steve Goddard. So trying to tie a lie to someone ruins your own credibility. Using various search techniques with the words that you ascribe to Steve Goddard, you come across some gun control discussions, which is a valid political topic with many reasonable people on both sides. You can find a few articles about brainwashing children about CAGW. Honestly the alarmists are trying to brainwash everyone; fortunately their track record is pretty bad and somehow climate alarmism is at the bottom of the worry totem pole. Nothing, though, about school shootings or any other recombination of your ascribed basic message.
Painting everyone, particularly the author, posting/commenting on a blog based on a few individuals who may post there is not valid. It is a common straw man argument, so I really am not going to go into details on all the silliness that involves. Blogs have to moderate and ban, primarily when you violate the stated rules of the blog, which is a basic right of the author: to make and hold the rules sacred. Sometimes they get overrun but mostly they don’t.
The fastest way to be banned at most sites is being uncivil and/or attacking. Soooo I am not surprised you have been banned by Steve Goddard, based on your frequent comments about Steve Goddard.
Anyhow Nik have a nice night.
v/r,
David Riser

Jimmy Finley
June 25, 2014 6:17 pm

So, it seems the answer is to fire the people who get paid far more than they are worth to mess with these data. Who cares what a bunch of ill-sited or otherwise compromised thermometers say (or don’t say) about the temperature? If one is going to do it, do it right, and this mish-mash of crap is not right. And when I see Hansen’s chart from the 90’s, now misshaped to make the 1930s – you know, that time when a majority of the US temperature records were set – look like some period of the Little Ice Age, my blood boils. Burn the house down. If it is so important, rebuild it. Until then, I am sick and tired of seeing Anthony Watts in bed with Zeke/Mosh (who else was named in one screed above?) defending a bullshit agglomeration of data, that then lends itself to manipulation by people who are, at the very best, there to ingratiate themselves with the warmists, and at the worst, are fueling the fires that will burn away the freedoms and defense against tyrannical rule that the IPCC, the United Nations as a whole, and this American administration long to impose. Enough!
REPLY: Mr. Finley, first, calm down before you blow a blood vessel. Second, this isn’t about “Anthony Watts in bed with Zeke/Mosh (who else was named in one screed above?)”. It is about what is true and correct, and in my opinion “Goddard” is not true nor correct in this instance, and he has made it harder for climate skeptics to be taken seriously with this particular claim. There are plenty of real, quantifiable, verifiable issues with the surface temperature record that can be criticized (see part 2 when it is published), and if this was one of them you can bet I’d be out there talking about it in a positive way. As it stands, all I can do is muster my personal integrity to say that this particular claim is wrong, and why. Note Steve McIntyre’s comments upstream as well.
Feel free to be as upset as you wish, but I call them as I see them. – Anthony

Latitude
June 25, 2014 6:27 pm

“I was disputing the claim that 40% of USHCN stations were missing and had “completely fake” data (his words)”
“The differencing method that Goddard is using detects every missing data point from every station in the network. This could be as simple as one day of data missing in an entire month, or a string of days, or even an entire month which is rare. Almost every station in the USHCN at one time or another is missing some data”
=====
…would you say that at any given moment, 40% of the stations are missing data?
If 40% of the stations always report late, or don’t report at all…then at any given moment, 40% of the stations would have missing data…………Goddard has made it clear that he considers infilling “completely fake” data…I tend to agree

NikFromNYC
June 25, 2014 7:17 pm

Now the lefty political blogs are running with this fiasco that Goddard provided the fuel for, making Gavin my lazy lights-out-by-nine neighbor even more quotably famous:
http://www.politifact.com/punditfact/statements/2014/jun/25/steve-doocy/foxs-doocy-nasa-fudged-data-make-case-global-warmi/
Latitude, posting comments with real thought-out content in no way amounts to “thread bombing,” nor has Goddard ever even suggested the term. I have no book to sell, no blog to promote, no crackpot theory whatsoever to publicize. I added value there as far as I could over the years and tried to explain to the crowd an Upper West Side perspective in which whole scientific bodies are still sincerely but somewhat bizarrely promoting climate alarm, especially locating a very relevant Climategate e-mail about adjusting away the 1940s “blip” that supported Goddard’s legitimate claim that midcentury cooling that led to a new ice age scare has been erased.
What *actually* happened, gossip-wise, was that after his initial adjustment hockey stick correction, Goddard launched into a face-saving doubling down on a new zombie station claim and so I asked for details of his overall flowchart of procedure in deriving it. For this I was attacked en masse and then banned when I publicly concluded that Goddard has become a sensationalistic charlatan. Thread bombing is the stuff of agenda-laden promoters whereas my only agenda is making the best infographics to educate laypersons in skepticism and help expose a perceived fraud. Instead of a blog of insider compatriots and chummy buddies I just use an iPhone to reach out to worldwide news sites, and lately youth culture icon VICE magazine where moderation is slight, unlike for most lefty rags.
I used to be much more active, online, once spending a flurry of 16 hours a day for a solid eight months, around 2010, after I wanted to see some justice in this world finally. I both enjoyed Goddard’s blog as an emotional pit stop and then suffered it very much as I actually did real outreach into more liberal territory after my earlier blanketing of conservative blogs. I’m surrounded daily by thousands of evil climate alarmists and find that they are merely concerned dupes instead of klimate komrads after all, and I’m afraid the only fanaticism you will accurately label me with is fanatical normality given my old school science background along with Midwestern American roots.
Until skeptics loudly condemn a registered child rapist crackpot as the main commenter on the second most popular skeptical blog that features child killing conspiracy theories, you yourselves stand convicted of antisocial fanaticism that is willfully divorced from civil society.
You think his blog is just a few plots? No, it’s CIA plots! Please imagine what a layperson sees there on first and second inspection, for that is its cultural effect, one that will strongly reflect upon all skeptics.
The local West Side Rag online newspaper has *not* censored even my most impassioned condemnations of climate alarm. On the other hand, Goddard banned me even early on, rationing my account to a single post a day, then with no announcement arbitrarily banned it for a week or two at a time. Witness the truly pathetic tribalism there, cheering it all on, as if such a *gross* and demonstrably incompetent hack now represents the peak genius of skeptical thought.
I say no.
“It is what Zola calls triomphe de la médiocrité. Snobs, nobodies, take the place of workers, thinkers, artists; and it isn’t even noticed. The public, yes, one part of it is dissatisfied, but material grandeur also finds applause; however, do not forget that this is merely a straw fire, and that those who applaud generally do so only because it has become the fashion. But on the day after the banquet, there will be a void a silence and indifference after all that noise.” – Vincent van Gogh (letter to Theo van Gogh, 1882)

Owen
June 25, 2014 7:19 pm

Every word that comes from Obama’s mouth regarding the climate is a lie. The people in charge of the data are doctoring it to support Obama. If they don’t they won’t have a job. Steve Goddard is on to something. To dismiss him as a crank is the same type of tactic the Climate Liars use to dismiss the skeptics. I thought we were better than that !

NikFromNYC
June 25, 2014 7:30 pm

Anthony is successfully herding lions and tigers and bulls and snakes and sloths and wolves and sheep and prized poodles. He seems to grok the very core of NYC: that different folk in love make a better beehive. The alternative is war, which sounds fun but never is.

braddles
June 25, 2014 7:35 pm

Using ‘infilled’ data for statistical purposes, including averages, is extremely dubious. Infilling is not data, and it will create a false impression of solidity or reliability in a dataset. Statistical inferences from data with significant infilling should not be trusted.

June 25, 2014 7:38 pm

Anthony, you should double check Zeke’s work.
Using USHCN Final Tavg dated v2.5.0.20140622
July 2012 – 880 Stations have data without the E for Estimated flag.
There are 1218 stations.
27% of the July 2012 stations are missing data.
July of 1895 has 472 stations reporting real (non-Estimated) data.
61% of the July 1895 stations are missing data.
Now remember, I am only looking at the monthly records. Monthly records avoid the E flag if there are enough daily values. It doesn’t mean there is data for every day.
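A minimal sketch of that kind of flag count, assuming the monthly files have already been parsed into a long table with one row per reported station-month (the column names and demo rows are invented; the only thing taken from the discussion above is the convention that an ‘E’ flag marks an estimated value):

```python
import pandas as pd

def estimated_share_by_year(df):
    """Fraction of reported monthly values per year that carry the 'E' (estimated) flag.

    df is assumed to have columns: station, year, month, value, flags,
    with missing months already dropped.
    """
    has_e = df["flags"].str.contains("E", na=False)
    return has_e.groupby(df["year"]).mean()

# Tiny made-up example: two stations, two years, some months flagged E.
demo = pd.DataFrame({
    "station": ["A", "A", "B", "B", "A", "B"],
    "year":    [2012, 2012, 2012, 2012, 1895, 1895],
    "month":   [7, 8, 7, 8, 7, 7],
    "value":   [25.1, 24.8, 22.3, 21.9, 23.0, 20.5],
    "flags":   ["", "E", "", "", "E", ""],
})
print(estimated_share_by_year(demo))   # 1895: 0.5, 2012: 0.25
```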

Jimmy Finley
June 25, 2014 7:42 pm

Finley says:
June 25, 2014 at 6:17 pm… in response to Anthony Watts: “…Feel free to be as upset as you wish, but I call them as I see them. – Anthony…” And I do respect that. This is the ONE place I look to see “Watts Up” regarding climate issues. I sometimes get caught up in the moment, and this was one of those times. I have little idea who Goddard is, but I do know what shoddy data is, and even shoddier handling of it. We “deniers” are being held to ransom by the charts these charlatans keep “revising and updating”. The best thing for us is to dispense with them; they are at best severely compromised to be used as “the surface temperature of the United States” (whatever that means), and at worst, basically what I described above: ammo for evildoers.
I await your Part 2, and I hope it offers some concept of how to get out of this data trap, which is being used far more effectively than the words of some little-known blogger to undermine skeptical arguments.
And, I retract my derogatory words about you. I know better than that, and you deserve my apology.
It was sub-45 degrees here this morning, for about the 25th time since June 1. Maybe this is the “summer that didn’t happen” a la 1812 or 1815.
REPLY: Apology accepted, and thank you. We all have our moments – Anthony

NikFromNYC
June 25, 2014 7:51 pm

Owen, if I dare:
(A) Every word that comes from Obama’s mouth regarding the climate is a lie.
You falsely attribute intellectual omnipotence to a mere affirmative action promoted community organizer who is following the policy statements of whole scientific bodies, responsibly.
(B) The people in charge of the data are doctoring it to support Obama. If they don’t they won’t have a job. Steve Goddard is on to something.
No, Goddard is making a scene to try to vastly oversimplify human foibles, divorced from History, now exposed to mass media influence and sensationalism.
(C) To dismiss him as a crank is the same type of tactic the Climate Liars use to dismiss the skeptics. I thought we were better than that!
We are better than them only to the extent that we derate and put on probation the skeptical Michael Mann known as Goddard.

Psalmon
June 25, 2014 7:51 pm

Your blog no longer gets my vote. Shameful AW. Shameful.

Darren Potter
June 25, 2014 8:03 pm

Various parties are not comparing Apples to Apples.
To determine if there has been historic change in Temperature, only weather stations that existed 225 years ago and still exist should be compared over time, using both those stations’ raw data and their data corrected for U.H.I. in a way that passes scientific and statistical scrutiny.
The idea of comparing a few dozen weather stations of the past, to thousands of weather stations of recent decades, to a couple hundred weather stations of the present to determine change in Temperature over the years is, well, ManBearPig. The idea of creating weather stations / data to fill in historical gaps, or to make for better distribution throughout the world, is total HokeySchtik.

Reg Nelson
June 25, 2014 8:03 pm

This debate seems to me to be an exercise in futility.
Goddard could arguably be misguided or biased, and may have used flawed logic, but he is absolutely correct, there is missing data.
More than two thirds of the surface of the Earth is covered by oceans (71% according to the NOAA), and there is essentially no surface temperature data of any historic length or scientific significance for an area that represents, by far, the majority of the planet.
Debating whether the data from a weather station in Kalamazoo or Timbuktu is missing or needs to be adjusted; or about grid and spatial distribution is a fool’s game.
The elephant in the room trumps all.

RokShox
June 25, 2014 8:06 pm

talldave2 says @ June 25, 2014 at 11:55 am
“For fun I will run the analysis again next month to prove the data is arriving and take a swing at an expected correction rate.”
You don’t have to wait. Just look at Goddard’s curve. As of June 2014, the data for 2011 is still missing 25% of the actual measured data. That’s 2.5 years after the end of 2011.
And attributing nefarious intent when Goddard makes a minor change from “stations” to “data” in a figure caption is over the top. That the missing information is “data” is implicit in the fact that the ordinate on his graph is station-months.
Also, every month we get headlines about “hottest ever month in the US”, and it is obvious that these claims are based on data that NOAA must know is incomplete and warm-biased. It is fair for Goddard to point out, at least in the context of these hyped claims, that the claims are based on such incomplete and biased data.
Finally, the fact that the fast-reporting sites are urban and warm-biased suggests that the UHI adjustment is inadequate.

June 25, 2014 8:07 pm

Zeke: “The way that NCDC, GISS, Hadley, myself, Nick Stokes, Chad, Tamino, Jeff Id/Roman M, and even Anthony Watts (in Fall et al) all calculate temperatures is by taking station data, translating it into anomalies by subtracting the long-term average for each month from each station (e.g. the 1961-1990 mean)”
There are 51 stations that had 360 values without an E flag from 1961-1990.
That means only 51 out of 1218 stations have relatively complete data to use as a baseline.
WY MORAN 5 WNW USH00486440 is one of the 51
WY NEWCASTLE USH00486660 is one that failed; only 61 months of the 260 had an E flag.
And I just looked at the E flag. There are lots of other flags.
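A minimal sketch of the anomaly construction Zeke describes (per-station, per-calendar-month baseline subtraction over 1961-1990), using a hypothetical long-format table; the column names and demo numbers are invented:

```python
import pandas as pd

def to_anomalies(df, base_start=1961, base_end=1990):
    """Subtract each station's same-calendar-month mean over the baseline period.

    df is assumed to have columns: station, year, month, temp (degrees C).
    Stations without any baseline data for a given month end up with NaN anomalies;
    a real analysis has to decide how to handle those, which is the point of the
    51-out-of-1218 count above.
    """
    in_base = (df["year"] >= base_start) & (df["year"] <= base_end)
    base = (df[in_base].groupby(["station", "month"])["temp"]
            .mean().rename("baseline").reset_index())
    out = df.merge(base, on=["station", "month"], how="left")
    out["anomaly"] = out["temp"] - out["baseline"]
    return out

# Tiny worked example with made-up numbers:
demo = pd.DataFrame({
    "station": ["A"] * 4,
    "year":    [1961, 1990, 2013, 2013],
    "month":   [1, 1, 1, 2],
    "temp":    [-2.0, 0.0, 1.5, 3.0],
})
print(to_anomalies(demo)[["year", "month", "temp", "anomaly"]])
```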

NikFromNYC
June 25, 2014 8:10 pm

[snip let’s not go there, I don’t want to derail the thread with these wild off topic things. You don’t like Goddard’s style or content, I get it, but please move on to more relevant topics – Anthony]

Darren Potter
June 25, 2014 8:17 pm

“… in order to be able to use any of these at all, a method had to be employed to deal with it, and that was infilling of data. ”
One issue is that the “infilling of data” lacked proper notation in the databases, and a well documented description of how the infilled values were created.
On the flip side, the keepers of the GHCN data started dropping data from selected weather stations that had those records in previous releases, with a propensity for the dropped stations to be located in colder locations.

June 25, 2014 8:23 pm

[snip – this wildly off-topic stuff started by Nik has no place here, sorry – Anthony]

Darren Potter
June 25, 2014 8:23 pm

NikFromNYC says: “O.K. this is now just everyday boring.” “EXHIBIT A: “Using drugs and hypnosis”
You do know the website you’re posting the above quote on deals with the alarmism (shams) of anthropogenic global warming?

June 25, 2014 8:24 pm

Anthony, I wonder if NikFromNYC knows anything at all?

An Inquirer
June 25, 2014 8:41 pm

Anthony, I do follow your explanations, and I understand your reasoning, but I believe that important points are being missed in this discussion. #1) Alarms should ring if the adjustments are greater than the trend that supposedly is worrisome. The reason for adjustment is understandable, but the degree and even the direction of the change is questionable. Therefore, to base policy on such an adjusted series is perilous. #2) Natural phenomena do not match the outcome of the adjustments. If the adjustments were reasonable, then the Great Lakes ice would have been at a typical boring level. If the adjustments were reasonable, we should consistently be seeing all-time highs for states. If the adjustments were reasonable, we would be seeing lakes drying up like they did in the 1930s. (I know that some alarmists point out a lake in Georgia and a lake in Texas that were at record low levels during their recent droughts — but these lakes did not exist in the 30s — nor in the droughts of the 50s.)

NikFromNYC
June 25, 2014 8:45 pm

[snip – please see the note above, this is wildly off-topic – Anthony]

NikFromNYC
June 25, 2014 8:50 pm

You are right, Anthony, simply, activists be damned.

Darren Potter
June 25, 2014 9:14 pm

sunshinehours1 says: “There are 51 stations that had 360 values without an E flag from 1961-1990.” “That means only 51 out of 1218 stations have relatively complete data to use as a baseline.”
51 Apples to compare to 51 Apples.
Now back that up to the small number of Apples that existed back in 1742 and still exist today…

RossP
June 25, 2014 9:28 pm

I have cut and pasted this from Steve’s blog
” For example in 2013 there were 95,004 final USHCN monthly temperature readings, which were derived from 70,970 raw monthly temperature readings – which means there were 34% more final readings than actual underlying data. This year about 40% of the monthly final data is being reported for stations with no corresponding raw data – i.e. fabricated. ”
Can someone please explain to this simpleton what is wrong with the claim from Steve. ( I’m not interested in whether the word fabricated is right or wrong –personally I’d use a stronger word)

Frank K.
June 25, 2014 9:32 pm

Zeke H. – I remember in a thread a while back that you had access to the NCDC software that does the adjustments of the raw data (particularly TOBS). I tried the link you provided but it was a dead link (apparently). Could you please provide us with a valid link to this software? I would like to examine the methods they use for processing the data. Thanks in advance.

Nick Stokes
June 25, 2014 9:55 pm

RossP says: June 25, 2014 at 9:28 pm
‘I have cut and pasted this from Steve’s blog
” For example in 2013 there were 95,004 final USHCN monthly temperature readings, which were derived from 70,970 raw monthly temperature readings – which means there were 34% more final readings than actual underlying data. This year about 40% of the monthly final data is being reported for stations with no corresponding raw data – i.e. fabricated. ”
Can someone please explain to this simpleton’

Could we start with you or SG explaining how 1218 stations can generate 95004 monthly readings in a year?

NikFromNYC
June 25, 2014 10:03 pm

That’s why you have to move here, you can’t get it online. The social network. True humanity. Normalcy.
Other opinions BANNED.
[Reply: Oh, stop it! ~mod.]

Bill Illis
June 25, 2014 10:09 pm

I propose we freeze the historical temperature record prior to 2011.
No more adjustments to the old records. None. Freeze them as they are.
Maybe problems remain but I see no reason why 1903 temperatures continue to get adjusted down every month – at least 3 of the 12 months in 1903 are adjusted down every single month.
If we are dealing with 1,000 stations in 1903 and 6,000 today, there is absolutely no math involving those stations that results in the 1903 record going down every month. The TOBs adjustment should have been completely implemented years ago. What homogeneity adjustment can drop 1903 temperatures now? The base period ended 14 years ago. 1903 happened 110 years ago. It should be completely “done” now.
Write your congressman to pass a new law freezing the historical temperature record of ALL stations prior to 2011.

NikFromNYC
June 25, 2014 10:16 pm

Chain gangs. All of you, pathetically. Low light darkness. But Goddard is God. I see. Yeah, you thus earn priesthood, you buffoons. Now you win all debate, hicks and cranks.

RossP
June 25, 2014 10:40 pm

Nick Stokes
You are right with your question. Steve G admits he made an error but the percentage difference does not change.
“Good catch. I was accidentally doing cumulative for the year. The correct numbers for 2013 are 14616 and 10863 Percentage doesn’t change.”
Corrected figures
“For example in 2013 there were 14,613 final USHCN monthly temperature readings, which were derived from 10,868 raw monthly temperature readings – which means there were 34% more final readings than actual underlying data. This year about 40% of the monthly final data is being reported for stations with no corresponding raw data – i.e. fabricated.”
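A quick arithmetic check on the corrected figures quoted above (this only checks the quoted numbers against each other, not where they came from):

```python
final_readings = 14613   # final monthly values for 2013, as quoted above
raw_readings   = 10868   # raw monthly values for 2013, as quoted above
stations       = 1218    # USHCN station count

print(stations * 12)                       # 14616 possible station-months in one year
print(final_readings / raw_readings - 1)   # ~0.345, i.e. ~34% more final readings than raw
print(1 - raw_readings / final_readings)   # ~0.256, share of 2013 final values with no raw counterpart
```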

Sleepalot
June 25, 2014 11:08 pm

@ NikFromNYC Dude, seriously, you need to chill out.

mobihci
June 26, 2014 12:21 am

I will take your set of always-positive adjustments and raise you – here in Australia we don’t mess around!
We have our new top quality ACORN data showing the daily Tmin HIGHER than the daily Tmax in more than a thousand records. Supercharged adjustments!
And the solution to this problem is apparently to just “set both the max and min equal to the mean of the two.” Haha.
http://www.warwickhughes.com/blog/?p=3000
From memory these guys (BOM) used to say that infilling and altering data with site changes etc. would all sort themselves out, with some going up and some going down, which is a fair enough assumption if there are grids forming the new data. Is that what happens though? My guess is this is the point being made by Goddard.

June 26, 2014 12:53 am

I can’t get by this graph:
http://www.ncdc.noaa.gov/img/climate/research/ushcn/ts.ushcn_anom25_diffs_urb-raw_pg.gif
I assume the trend continues and that similar corrections have been/are being applied to global data. The current Watts/Goddard squabble doesn’t change that.

June 26, 2014 1:02 am

Anthony, in part 2, please address the issue of adjusting previous direct measurements. That does not make any sense to me, and it seems to be a common concern. Also, is there any way to adjust the confidence intervals to reflect the fact that the actual number of measured data points is lower than it would appear? That should send the uncertainty up quite a bit.
NikFromNYC, Please calm the hell down. This is not an issue to hyperventilate over. I understand Goddard has been a jackass to you, but let’s stick to the data and what it says. Even rat bastards can hit the truth now and then.

RokShox
June 26, 2014 1:03 am

Re: Steve Case
Interesting question: What method does NOAA itself use to produce that plot?

Stephen Richards
June 26, 2014 1:25 am

REPLY: Some actual data does come in quite late, sometimes months later. Remember that the B-91′s are mailed in.
You must have some really slow posties in the US. Some of the replaced data from 2013 is still missing.
You are on the wrong tack here Anthony. Change tack to the Steve Mc piste.

richard verney
June 26, 2014 1:40 am

Bill Illis says:
June 25, 2014 at 10:09 pm
//////////////
And let’s restore them to what they were before this hysteria took off. E.g., let’s restore them to what the record said back in 1980.
There should never be adjustments to raw data, merely caveats identifying issues with that data.
If TOB has changed, merely mention that TOB has changed. Any adjustment for TOB is a guess, maybe an educated guess, but a guess nonetheless.
The problem is that it appears to a large extent that ‘we’ are merely interpreting the effects of the adjustments that we have made to the temperature record these past 30 or so years. If those adjustments are bad, then we have destroyed/bastardised the record and it then tells us nothing of importance.

June 26, 2014 2:27 am

“… Maybe problems remain but I see no reason why 1903 temperatures continue to get adjusted down every month – at least 3 of the 12 months in 1903 are adjusted down every single month. …” — Bill Illis
I think that Steve Goddard has made that point over and over. There is no honest reason to change data in the past. There is especially no reason to change data in the past when it always cools the past and warms the present. This is obviously dishonest handling of the data. I bet they are even destroying the original records so that the data set could never be restored to original. (anyone know about that?)
I would also point out that Steve Goddard made plain that he was using a nom de plume to protect his work conditions a long time ago. Why are attack dogs now claiming that this is somehow dishonest? You people know that “Mark Twain” was not his real name don’t ya? Was Samuel Langhorne Clemens a fraud?
As a final note; it is very obvious that the keepers of the data sets are warmists who use every trick to warm the present and cool the past. Do any of you people really trust these alarmists to honestly adjust the temperature records? (especially the ones in the distant past!) Come on now; do you really want us to just trust them?

Nick Stokes
June 26, 2014 2:33 am

Steve Case says: June 26, 2014 at 12:53 am
“I assume the trend continues and that similar corrections have been/are being applied to global data.”

No. TOBS is an issue with the volunteer observers of the US Coop. In ROW, observers are usually employees who are given instructions.
richard verney says: June 26, 2014 at 1:40 am
“If those adjustments are bad, then we have destroyed/basterdised the record and it then tells us nothing of importance.”

No record has been destroyed. Unadjusted records are published along with adjusted.
The purpose of the adjustment is to make a consistent record for calculating a regional or global average. We looked a few days ago at Wellington, where the station moved in 1928 from Thorndon at sea level to Kelburn at 128 m. So there is a record since 1856, but a drop in 1928 which is not due to climate, and has no place in a global average. No-one is saying that Kelburn or Thorndon records were wrong, but to use the whole record, you have to adjust for consistency.

charles nelson
June 26, 2014 3:16 am

As I write this, Steven Goddard has, as the top item on his blog, a simple animation which flashes between two graphs which have been overlaid. The graphs are allegedly from the same Government organisation and illustrate the same data. They show temperature changes since 1880, and they have been substantially altered between earlier and later versions, with the later graph showing a steeper warming trend.
Now there are only three possible explanations for this.
1. Steve Goddard has faked these graphs.
2. The graphs are real and they are evidence of scientific malpractice…or
3. These graphs are real but the alterations are justifiable.
Now this isn’t quantum physics, there is no uncertainty principle, there is no cat in a box waiting to find out if it’s dead or alive, there’s no hedging and no fudge.
This is real, straightforward science.
1. 2. or 3. ?

David Walton
June 26, 2014 3:43 am

Thank you Anthony.

Nick Stokes
June 26, 2014 3:48 am

charles nelson says: June 26, 2014 at 3:16 am
“1. 2. or 3. ?”

3.

charles nelson
June 26, 2014 4:19 am

@ Nick Stokes.
Thank you.
How about one paragraph which explains why?

NikFromNYC
June 26, 2014 4:25 am

[snip . . think about it . . mod]

Frank de Jong
June 26, 2014 4:36 am

Anthony,
I can see how the infilling makes sense to have “at least something” to publish, even when many stations haven’t reported in. However, it gives an overly confident representation of the truth. I would say that the proper way of handling missing data is to present a result based on the available (raw) data plus an accompanying (large) error bar.
I’ve made this point before, but I think it’s time that (climate) scientists start to consistently give error bars whenever they are presenting results. Not only does it give important background on how to interpret results, but it also forces anyone wishing to present a result, to start thinking about how errors in their methodology propagate.
In the present case, grid averages could still be calculated, but the “incomplete” months should simply have a larger error bar. It’s probably not too hard to find a first rough estimate for the error bar by comparing historical “early results” with “final results”. If I understand your article correctly, the error bar will probably be larger on the bottom than on the top, as the values usually get lower as more data comes in.
Frank
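A minimal sketch of the rough error-bar estimate suggested above: compare historical “early” monthly values with the eventual “final” values and use percentiles of the revisions as an asymmetric uncertainty band (the data here are made up, and the function name is just illustrative):

```python
import numpy as np

def revision_band(early, final, lo_pct=2.5, hi_pct=97.5):
    """Percentile bounds on how much an early monthly value moves once late reports arrive.

    early, final: matched arrays for the same months, one computed when the month
    first closed and one after all reports were in.
    """
    revision = np.asarray(final) - np.asarray(early)
    return np.percentile(revision, lo_pct), np.percentile(revision, hi_pct)

# Made-up illustration: revisions that mostly pull the early value downward,
# which would give a larger error bar on the bottom than on the top.
rng = np.random.default_rng(3)
early = rng.normal(0.0, 0.3, 200)
final = early + rng.normal(-0.05, 0.08, 200)
lo, hi = revision_band(early, final)
print(f"early value likely to move by {lo:+.2f} to {hi:+.2f} C once reports are complete")
```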

NikFromNYC
June 26, 2014 4:36 am

“Your comment is awaiting moderation.”
…always…delayed.
Almost as fun as Homeland Security.
Go to the big city, guys.
That’s your homework.
Stop hating some bogeyman. Such posturing jest only makes you foolish and impotent, more so than usual.
I understand Steve Mosher better now. He hates you because you are simply more hateable than nice fossil hippies who throw money at him. Really though, what alternative do you offer? Child rapists and utter crackpots? Yup!
Here today you do exactly that. You defend the crack house in your own hood, shamefully, mostly unaware, but loudly so.

June 26, 2014 4:37 am

Mods: As my comment that is being held in moderation (I hope, I don’t even see it now) is very similar to charles nelson’s comments I would like to know what got it moderated in the first place. It seems I can make several comments and then all of a sudden they start going to moderation … and just sit there. Why? What words are we not allowed to type here? Thanks for any guidance.
@ charles nelson I posted a comment comparable with your concerns. There is no way that we should change the past. Regardless of what the warmist troll says about your #3, there is no justification to cool the past over and over and over and over again. It is pure fraud to do so.
[your comment contained the word “fraud” which is why it got caught in moderation. Scam and most of its brethren will do the same, as will NAZI etc. It is an automated process and requires one of us to read it and approve it. Also any post containing “Anthony” will be treated like that as the assumption by the software is that you are specifically addressing our host and so wish a response from him. 99% of the time that isn’t the case but it does slow you down particularly when mods are thin on the ground like now. . .mod]

NikFromNYC
June 26, 2014 4:49 am

[snip . . Nik, calm down. You use swear words, NAZI and a lot of intemperate language which is why you are in moderation. Think a bit about how you are saying things because the content is not an issue. Read the comment rules and try and stick to them, please. . . mod]

Latitude
June 26, 2014 4:53 am

The preceding public service announcement was brought to you by AstraZeneca…
..makers of Seroquel

Latitude
June 26, 2014 6:00 am

The global warming scare did not start in the year 2000….
We can all agree that the temp history prior to the year 2000…was adjusted down
We can all agree that after the year 2000….temperatures stopped rising
…now, can we all agree that all of global warming is based on a temp history that was adjusted down around the year 2000?
Prior to the year 2000, we were told the temp history was accurate, and they based their claim of global warming on that temp history……. Can we now say global warming was based on a temp history that needed to be adjusted much later?

ferdberple
June 26, 2014 6:23 am

The implication is that infilling is unnecessary
===========
I also have a problem with infilling. gridding is supposed to resolve the problem with too many or too few stations. so if stations are missing, there should be no reason to infill. gridding should take care of this.
the problem I see with gridding and infilling is whether it is 2-D or 3-D. If you are only considering lat and long of the stations, without considering the elevation, you will introduce bias due to the lapse rate.
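A minimal sketch of the elevation point: reducing station temperatures to a common reference height with a constant lapse rate before gridding. The 6.5 C per km figure is the standard-atmosphere value; actual gridding products may handle elevation quite differently:

```python
LAPSE_RATE_C_PER_M = 0.0065   # standard-atmosphere lapse rate, ~6.5 C per km

def reduce_to_reference_elevation(temp_c, station_elev_m, ref_elev_m=0.0):
    """Shift a station temperature to a reference elevation assuming a constant lapse rate."""
    return temp_c + LAPSE_RATE_C_PER_M * (station_elev_m - ref_elev_m)

# A valley station at 200 m and a mountain station at 1400 m, both reading 10 C:
print(reduce_to_reference_elevation(10.0, 200))    # 11.3 C at sea level
print(reduce_to_reference_elevation(10.0, 1400))   # 19.1 C at sea level
```

Mixing the two raw readings without such a correction would bias any grid cell toward whichever elevation happens to report.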

JustAnotherPoster
June 26, 2014 6:27 am

@NickStokes
“No record has been destroyed. Unadjusted records are published along with adjusted.”
When GISS or NASA publish headlines like “May: Hottest May Ever”
http://www.slate.com/blogs/future_tense/2014/06/18/nasa_may_2014_was_the_hottest_may_in_recorded_history.html
Followed by…
Please note May 2014 GISTEMP LOTI numbers preliminary due to a glitch with Chinese CLIMAT data. Update to follow”
https://twitter.com/ClimateOfGavin
There is no P.S. These are adjusted or estimated temperatures.
It’s so disingenuous it’s unbelievable.

Dave
June 26, 2014 6:33 am

I have been a loyal WUWT reader for years but frankly I have a few requirements that WUWT is no longer meeting. Data is measured values, period. Anthony is straying further and further from this absolute requirement of science.
I further find it depressing how Anthony can produce a long boring response like this and completely miss the point. The point is that all adjustments produce warming, to the point of turning cooling trends into warming trends.
Anthony and Zeke are treating Steve like he is the enemy. Anthony has the most highly visited site in the climate field but has to go around to other blogs to character assassinate a fellow skeptic.
I am very disappointed in WUWT and my visit counts will henceforth show it.
REPLY: Maybe you missed the title and the end note about this long response?
It is unfortunate that you are making a knee-jerk decision before reading part 2 coming up today. In that you’ll see why I’m taking the position I have, what areas we can agree on, and how to solve the issue.
If you don’t want to wait for part 2, then I’d say as a reader you aren’t meeting my requirements. Cheers – Anthony

June 26, 2014 6:37 am

Nick Stokes, Zeke,
Why does the old data keep changing from one day to the next?
http://sunshinehours.wordpress.com/2014/06/22/ushcn-2-5-omg-the-old-data-changes-every-day/
Jan 1998 VA NORFOLK INTL AP USH00446139
The station has no flags for Jan 1998.
On Jun 21 USHCN said tavg Final was 8.18C
On Jun 22 USHCN said tavg Final was 8.33C
Why … from one day to the next did the data go up .15C? I wonder what it is today?
Why do most of the adjustments go up?

ferdberple
June 26, 2014 6:39 am

2. The graphs are real and they are evidence of scientific malpractice…or
================
there is another alternative:
4. The graphs are real and they are evidence of a mathematical mistake, likely resulting from confirmation bias and the lack of experimental controls.
Human beings are notoriously bad at detecting errors that they create. Especially when the error is in the direction that confirms your subconscious beliefs. Thus, the need for double blind controls in experiments. There is little if any evidence that temperature data has been subject to any such controls.
One of the simplest ways to test for bias in the adjustments is to look for a trend. The adjustments should not introduce a trend. And they should have set off alarm bells with those doing the adjustments if they did.
The simple fact that the adjustments are showing a trend tells me there is a problem.

ferdberple
June 26, 2014 6:46 am

Why do most of the adjustments go up?
=================
that in a nutshell is the problem. if the adjustments are introducing a trend (cooling the past, warming the present), this is a problem.
Adjustments should be neutral (random), with no long term trend. If there is a trend, then this trend should be mathematically eliminated by apportioning the trend back into the adjustments. The trend in the adjustments should be mathematically zero, unless you have a bloody good reason why not. Not simply one that is plausible. It better be set in stone.
Otherwise, there is no way to know if the final result shows a real trend, or a trend due to adjustments.
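A minimal sketch of the bias check described above: compute the annual mean adjustment (final minus raw) and fit a trend to it; a trend-neutral set of adjustments should give a slope near zero (the data-frame layout and demo numbers are hypothetical):

```python
import numpy as np
import pandas as pd

def adjustment_trend(df):
    """Slope, in degrees per decade, of the annual mean (final - raw) adjustment.

    df is assumed to have columns: year, raw, final (matched station-month values).
    A slope near zero means the adjustments are trend-neutral; a clearly non-zero
    slope means the adjustments themselves add or remove warming.
    """
    annual = (df.assign(adjustment=df["final"] - df["raw"])
                .groupby("year")["adjustment"].mean())
    slope = np.polyfit(annual.index.values.astype(float), annual.values, 1)[0]
    return slope * 10

# Made-up example where adjustments drift upward over time:
rng = np.random.default_rng(4)
years = np.repeat(np.arange(1900, 2014), 50)
demo = pd.DataFrame({"year": years, "raw": rng.normal(12, 3, years.size)})
demo["final"] = demo["raw"] + 0.002 * (demo["year"] - 1900) + rng.normal(0, 0.1, years.size)
print(f"{adjustment_trend(demo):+.3f} C/decade in the adjustments")
```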

ferdberple
June 26, 2014 6:55 am

Different methods clearly give different results and the”hockey stick” disappears when other methods are used.
=================
Yet, all 4 methods show a net warming trend in the adjustments over time. This makes no sense mathematically. Adjustment errors should be random because the errors you are correcting should be random, so the adjustments should show no trend.
The fact that all 4 methods show a very similar trend indicates the trend is real and that the adjustments are mathematically faulty.

ferdberple
June 26, 2014 7:05 am

Anthony, your graph USHCN Adjustments by Method, Yearly is very clear.
While it shows that the 4 methods give somewhat different results, all 4 methods show a warming trend in the adjustments. Mathematically this should not be. The adjustments should be neutral over time, due to the random distribution of the errors.
It is the trend in the adjustments that is important. Not the differences between what Steve Goddard calculates and what you calculate, because both of you are showing that the trend in the adjustments exists and it is consistently biased.
I would counsel that both parties concentrate on this point: that the adjustments should not be showing a trend over time. The fact that one method shows a greater trend than the other is really not a big deal when all methods are showing that the trend lies in the same direction.

Dougmanxx
June 26, 2014 7:05 am

sunshinehours1 says:
June 26, 2014 at 6:37 am
To answer your question using the June 26th “data”:
On Jun 26 USHCN said tavg Final was 8.34C for VA NORFOLK INTL AP USH00446139
So January 1998 at that station just got .01 degrees warmer in the last 4 days. But there’s nothing wrong with the data….
What a complete farce.

Eliza
June 26, 2014 7:09 am

Wow, it appears that in the main, apart from Stokes etc., most of the postings above are in fact SUPPORTING Goddard, not WUWT. Take note. When you look at the WHOLE picture it's obvious that Goddard is correct; just look at all the "other" adjustments everywhere (BOM, NZ, GISS graphs), it's all over the place, and this whole post is just about that ONE quibble. Anyway, I back Goddard 100%; his contributions outnumber any other I've seen in the current AGW debate. So far all the current climate data is supporting his contention (no warming, AGW is Bull####). Take note. Take Five LOL.

Owen
June 26, 2014 7:13 am

Owen, if I dare:
(A) Every word that comes from Obama’s mouth regarding the climate is a lie.
You falsely attribute intellectual omnipotence to a mere affirmative action promoted community organizer who is following the policy statements of whole scientific bodies, responsibly.
Nik,
‘ following the policy statements of whole scientific bodies ‘ AHAHAHAHA.
Thanks for the laughs, man. Al Gore is your mentor, isn't he?

Alexej Buergin
June 26, 2014 7:25 am

Nick Stokes says:
June 26, 2014 at 3:48 am
charles nelson says: June 26, 2014 at 3:16 am
“1. 2. or 3. ?”
3.
So what was the average temperature in Reykjavik in 1940?
(And why do you think that the meteorologists from Iceland are very stupid?)

James Strom
June 26, 2014 7:27 am

Steve McIntyre says:
June 25, 2014 at 1:05 pm
I didn’t directly comment on Goddard’s method as I haven’t examined it.
____
Who would be better to examine all this stuff and put the dispute to rest?

June 26, 2014 8:07 am

In my Jun 22 2014 copy of USHCN monthly there were 1218 files so there should be 14,616 monthly values.
For 2013:
11,568 had a non-blank value. – 79%
9,384 did not have an E (for estimated) flag.- 64%
7,374 had no flag at all – 50%
For 1998:
14,208 had a non-blank value.
12,316 did not have an E (for estimated) flag.
6,702 had no flag at all
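For what it is worth, here is a minimal Python sketch of the tally above: for a chosen year, walk the unpacked monthly files and count how many station-months have a value and how many carry an "E" (estimated) flag. The fixed-width layout and the meaning of the flag column are assumptions taken from a reading of the USHCN v2.5 readme, and the directory and file-name pattern are hypothetical; check both before relying on the counts.

```python
# Count monthly slots, non-missing values, and E-flagged (estimated) values
# for one year across all station files in an unpacked USHCN monthly archive.
import glob

def count_flags(directory, year):
    total = present = estimated = 0
    for path in glob.glob(f"{directory}/*tavg*"):        # assumed file-name pattern
        with open(path) as f:
            for line in f:
                if line[12:16].strip() != str(year):
                    continue
                for m in range(12):
                    start = 16 + m * 9                   # assumed 9 chars per month
                    value = int(line[start:start + 6])
                    dmflag = line[start + 6:start + 7]   # 'E' marks an estimated value
                    total += 1
                    if value != -9999:
                        present += 1
                    if dmflag == "E":
                        estimated += 1
    return total, present, estimated

# Hypothetical directory name:
# print(count_flags("ushcn.v2.5.5.20140622", 2013))
```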

Eliza
June 26, 2014 8:18 am

From Goddard's site, with the raw data!
"The graph is quite accurate. For example in 2013 there were 14,613 final USHCN monthly temperature readings
ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2.5/ushcn.tavg.latest.FLs.52i.tar.gz, which were derived from 10,868 raw monthly temperature readings
ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2.5/ushcn.tavg.latest.raw.tar.gz
which means there were 34% more final readings than actual underlying data. This year about 40% of the monthly final data is being reported for stations with no corresponding raw data – i.e. fabricated."
What is it you don't get?
Sincerely

June 26, 2014 8:37 am

Talldave2,
Interestingly enough, my recent paper on UHI found results similar to Steve’s blog post in the raw data:
I look forward to the corrections that will restore the 20C negative US trend. 🙂
Also, you may have missed the part in Steve’s post where he uses anomalies rather than absolute temperatures :-p
Not relevant, as he wasn’t looking for something hidden in baseline cooling. Do you at least acknowledge that using anomalies will tend to hide baseline cooling?
It's worth mentioning again that infilling (as done by NCDC) has virtually no effect on the trends in temperatures over time
Not the homogenized temps, no, because you’re smearing them around anyway. That’s why you need to use the absolute, unhomogenized data. It’s like you’re mixing plain and chocolate milk together, pouring the result into cups marked PLAIN and CHOCOLATE, and then noting that removing the cups labelled CHOCOLATE doesn’t change the overall chocolate content of all your milk. Well, of course not!
This kind of 3-card monte is why Goddard insists on using actual recorded temperatures.

June 26, 2014 8:41 am

Sigh, I feel like we’ve had the same conversation way too many times now. But I thank Zeke for stopping by and engaging anyway, it’s more than anyone else seems willing to do.
I’ll reserve further comment until Tony’s part 2 and try to do something productive 🙂

Frank K.
June 26, 2014 8:51 am

Well, I guess Zeke is going to once again ignore my request for a valid link to the NCDC data processing software. That’s OK – whenever we get into these discussions, no one wants to give us the software source codes that NCDC is using so we can see exactly how they are arriving at their adjustments! [sigh].
And it doesn't matter what metrics GISS and others publish, since they start with the NCDC adjusted temperatures, and rightly point to them if there is a problem or glitch (unless of course the issue is with the crappy processing algorithm GISS is using…).

Latitude
June 26, 2014 8:55 am

Dave, that’s a great analogy…..
This claim: “More than 40% of USHCN final station data is now generated from stations which have no thermometer data.”
—–
He replied back with a new graph and the strawman argument and a new number:
—–
The data is correct.
Since 1990, USHCN has lost about 30% of their stations, but they still report data for all of them.
——-
UH, he was talking about two different things………..

June 26, 2014 9:04 am

The trouble with anomalies …
By my count, there are only 51 stations with 360 monthly values from 1961-1990 (Zeke’s baseline) that are not estimated.
It's all chocolate milk …

June 26, 2014 9:17 am

Nick Stokes, Zeke, Anthony.
What we need is an R script that creates Estimated values for any set of USHCN monthly values we choose to throw at it.
Just the Final Step to start with.
Then we can compare what USHCN “estimates” to reality by randomly removing stations.
That way we could actually see how "Estimating" works … or doesn't work. And run it for the next year or two.
And we need the old data to quit changing.
Transparency is important.
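The commenter asks for an R script; as a placeholder for the idea, here is a toy Python sketch of the leave-one-out check being proposed. It is emphatically not NCDC's pairwise/FILNET algorithm, just an illustration of the structure of the test: withhold a real station-month, infill it from the station's own climatology plus the mean anomaly of its neighbours, and compare the infilled value to what the thermometer actually recorded.

```python
# Toy leave-one-out check of "estimating": infill a withheld station-month from
# neighbour anomalies, then compare to the value actually recorded there.
import numpy as np

def estimate_month(target_climatology, neighbor_anomalies):
    """Infill = the target station's climatology plus the mean anomaly of its neighbours."""
    return target_climatology + float(np.mean(neighbor_anomalies))

def leave_one_out_error(recorded, target_climatology, neighbor_anomalies):
    """How far the infilled value lands from the withheld real reading."""
    return estimate_month(target_climatology, neighbor_anomalies) - recorded

# Made-up numbers: the station normally averages 2.0 C in January, its neighbours ran
# +0.4 C and +0.6 C above their own normals, and the thermometer actually read 2.3 C.
err = leave_one_out_error(recorded=2.3, target_climatology=2.0,
                          neighbor_anomalies=[0.4, 0.6])
print(f"infill error: {err:+.2f} C")   # +0.20 C in this made-up case
```

Run over many withheld station-months, the distribution of these errors is what would tell us how well "estimating" actually works, or doesn't.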

June 26, 2014 9:20 am

“I assume the trend continues and that similar corrections have been/are being applied to global data.”
Nick Stokes said at 2:33 am
No. TOBS is an issue with the volunteer observers of the US Coop. In ROW, observers are usually employees who are given instructions.
So the adjustments stopped at the U.S. border and the beginning of the new millennium. Interesting.
Sort of like sea level rise notching up a click or two when the altimetry satellites were launched in 1992

Samuel C Cogar
June 26, 2014 9:52 am

Anthony says:
The newly commissioned USCRN will solve that with its new data gathering system, some of its first data is now online for the public.
And cited said “first data” via the Contiguous US Average Temperature Anomaly graph.
Source: NCDC National Temperature Index time series plotter
——————–
I found that graph quite interesting given the fact that the greatest “anomaly” (+/- of the 0.00 grid line) for each of the years covered by said graph (2004 thru 2014) always occurred between the normally “coldest” of months of November and February inclusive.
None of said greatest “anomalies” occurred during the normally “hottest” of months of June and August inclusive.
Now if I am reading said graph correctly ….. then my question is:
What has the most effect on the US Yearly Average Surface Temperature calculations, ….. the “winter” (Nov/Feb inclusive) temperature anomalies ….. or the “summer” (June/Aug inclusive) temperature anomalies?
In other words, are the “warmer” winter temperatures “driving” the claimed increase in Average Surface Temperatures, ……. or are the “hotter” summer temperatures “driving” the claimed increase in Average Surface Temperatures?
Or, “in other words” as stated by:
GeologyJim on: June 25, 2014 at 2:33 pm
The averages seem to show upward trends over time, but much of that seems due to rising overnight low values – – not to rising daytime high temperatures. Does anyone really doubt this?
—————-
Just change the two (2) “noted” words in his above comment to read …. “The averages seem to show upward trends over time, but much of that seems due to rising winter time low values” …….. not to rising “summer time high temperatures”.
————
And my next question is, …… why is anyone getting excited or concerned about the adjusted, … infilled …. and/or interpolated data that is contained within the various temperature records of/for the past forty (40) years …. when they are, …. in actuality, …… like 500% more accurate than the adjusted, … infilled …. and/or interpolated data that is contained within the various temperature records from pre-1950?
It is my opinion that if one wishes to have more accurate Average Surface Temperature measurements, then it is a prerequisite that all Surface Measurement Stations be converted to "liquid anti-freeze" based measurements ….. whereby the "liquid" volume itself would do the "averaging" of all the daily/weekly variations in/of near-surface air temperatures.
It takes less time to do a job right … than it does to explain why you did it wrong

June 26, 2014 9:59 am

“No record has been destroyed”
Is that true? Didn’t CRU toss the originals because they “lacked storage space?”
Even if you believe that no recorded raw data is missing, it is abundantly clear that thermometers have been destroyed. There are fewer of them today than 30 years ago. Given the "save the Planet" peril believed by key people governing the disbursement of funds, you'd think we would be measuring with more thermometers, more frequently, than ever. So there is no doubt that records that should have been and could have been made have been destroyed. It is far easier to fabricate data than it is to actually measure it.
Finally, I return to BEST. (See WUWT, June 10, "Why Automatic Temperature Adjustments Don't Work", and my concurring comment.) The BEST process of slicing long-running records, preserving instrument drift and cutting out the recalibration, destroys records. It destroys recalibration information. It preserves noise from drift as signal. The scalpel destroys long records and replaces them with shorter segments, thereby attenuating and filtering out the lower-frequency components of the Fourier spectrum in the original data. Climate signal is ALL low frequency; it is weather that is high frequency. No amount of regional homogenization can preserve the low-frequency content destroyed in the slicing process. The most homogenization can do is "fabricate" low frequency in the final product. In this case, perhaps a better word than "fabricate" is "counterfeit": to make something that looks real, to use it as real, but which has little real intrinsic value. The coin looks good, but most of the gold has been removed from the alloy.

Steven Mosher
June 26, 2014 10:00 am

“Well, I guess Zeke is going to once again ignore my request for a valid link to the NCDC data processing software. That’s OK – whenever we get into these discussions, no one wants to give us the software source codes that NCDC is using so we can see exactly how they are arriving at their adjustments! [sigh].”
ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3/software/52i/
http://www1.ncdc.noaa.gov/pub/data/ghcn/v3/techreports/Technical%20Report%20NCDC%20No12-02-3.2.0-29Aug12.pdf
http://www.ncdc.noaa.gov/oa/climate/research/ushcn/#phas

Steven Mosher
June 26, 2014 10:07 am

“Reg Nelson says:
June 25, 2014 at 8:03 pm
This debate seems to me to be an exercise in futility.
Goddard could arguably be misguided or biased, and may have used flawed logic, but he is absolutely correct, there is missing data.”
################
When there is missing data there is ONE THING you can NEVER do:
average absolute temperatures.
If people can begin to see that, then we can have a real discussion about the OPTIONS one has when data is missing.
But one thing is clear: if there is missing data you cannot average absolute temperatures.
That is what Goddard does. It is wrong. It is the most wrong solution to the problem of missing data.
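A small synthetic illustration of the point being made here (made-up numbers, not USHCN data): neither station warms, but the cold one stops reporting halfway through, and the average of absolute temperatures manufactures a warming step that the average of anomalies does not.

```python
# Two stations with zero warming; the cold station stops reporting after 2004.
# Averaging absolutes creates a fake 5 C jump; averaging anomalies stays flat.
import numpy as np

years = np.arange(2000, 2010)
warm = np.full(len(years), 15.0)                     # constant 15 C
cold = np.full(len(years), 5.0)                      # constant 5 C
cold[5:] = np.nan                                    # missing data after 2004

absolute_mean = np.nanmean(np.vstack([warm, cold]), axis=0)

# Anomalies relative to each station's own 2000-2004 baseline.
warm_anom = warm - np.nanmean(warm[:5])
cold_anom = cold - np.nanmean(cold[:5])
anomaly_mean = np.nanmean(np.vstack([warm_anom, cold_anom]), axis=0)

print("absolute mean:", absolute_mean)   # 10, 10, 10, 10, 10, 15, 15, 15, 15, 15
print("anomaly mean: ", anomaly_mean)    # all zeros: no spurious warming
```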

Eliza
June 26, 2014 10:52 am

"It takes less time to do a job right … than it does to explain why you did it wrong" Excellent, this is why Zeke's and Stokes's and other modelers' replies go on for hours. LOL. Just joking, please take it lightly.

Solomon Green
June 26, 2014 11:15 am

I await part 2 but on one (small) point I must disagree with Anthony when he writes ” The word “fabrication” is the wrong word to use, as it implies the data is being plucked out of thin air. It isn’t – it is being gathered from nearby stations and used to create a reasonable estimate.” The word fabrication means exactly what it says – the data has been constructed and not observed.
It may be that the construction is soundly based, but it is still a construct. If, as suggested by RossP, Sunshinehours1 and Eliza, 40% of the data is fabricated, then the room for error is substantial, no matter how sound the estimation.

ferdberple
June 26, 2014 11:16 am

Didn’t CRU toss the originals because they “lacked storage space?”
=============
At 09:41 AM 2/2/2005, Phil Jones wrote:
Mike, I presume congratulations are in order – so congrats etc !
Just sent loads of station data to Scott. Make sure he documents everything better this time ! And don’t leave stuff lying around on ftp sites – you never know who is trawling them. The two MMs have been after the CRU station data for years. If they ever hear there is a Freedom of Information Act now in the UK, I think I’ll delete the file rather than send to anyone.

ferdberple
June 26, 2014 11:17 am

“The two MMs have been after the CRU station data for years. If they ever hear there is a Freedom of Information Act now in the UK, I think I’ll delete the file rather than send to anyone.”

June 26, 2014 11:21 am

For the past few weeks I've lurked at nearly all the sites doing the arguing on this subject, and it dawns on me that the arguments against Steve G's approach are tenuous. It seems that, for a variety of reasons, starting with the actual measured and unadjusted data is considered wrong, when the question is how the adjustments affect the unadjusted data.
But then when we see alternative solutions, they are to adjust the unadjusted data, or to ignore major portions of it away.
What am I missing in this discussion?

June 26, 2014 11:29 am

Mosher: “But one thing is clear. If there is missing data you cannot average absolute temperatures.”
But if only 51 stations have 360 monthly values for 1961-1990, you can’t do anomalies either. The “baseline” is contaminated by missing data.

FundMe
June 26, 2014 11:30 am

So do I have this right?
Every station has an individual climatology from which the anomalies are calculated.
When a station is missing data, it refers to the nearest station that has data, using the nearest station's climatology.
All the stations are then grid averaged (grid weighting included, of course).
OK, so now a whole lot of missing stations report in. However, with the new data the climatology reverts to its old self, so that as data is added the climatology is constantly shifting (for each station). This then triggers a change in the past record, because using the new un-infilled data changes the past climatology and therefore the anomaly calculations. The algorithm is built to constantly scan for problems, so the changes in reporting stations constantly alter the past.
This sounds crazy… a constantly shifting climatology data record… just weird. Someone please tell me I am wrong.
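Whether or not this is what NCDC's code actually does, the mechanism being described is easy to demonstrate with made-up numbers: if a station's climatology is recomputed whenever new or late data arrives, every past anomaly for that station shifts even though the past raw readings never change.

```python
# Toy demonstration of a shifting climatology (illustrative only, not NCDC's code):
# a late report joins the baseline, the baseline mean moves, and every past anomaly
# moves with it even though the historical raw values are untouched.
import numpy as np

raw = np.array([10.0, 11.0, 9.5, 10.5, 10.0])       # fixed historical readings

baseline_before = np.mean([10.0, 11.0, 9.5])         # climatology from three months
baseline_after = np.mean([10.0, 11.0, 9.5, 12.0])    # a late report is added

print("anomalies before late report:", raw - baseline_before)
print("anomalies after late report: ", raw - baseline_after)   # the whole past shifts down
```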

June 26, 2014 11:55 am

Oh, by the way Mosher. There is only ONE station with no data flags between 1961 and 1990.
USH00301012 BUFFALO NIAGARA INTL

Frank K.
June 26, 2014 1:54 pm

Steven Mosher says:
June 26, 2014 at 10:00 am
Link #1 is dead (does not work for me)…does it work for you? If so, please upload the software to a third party site.
Links #2 and #3 have no source code.
Sorry…

Jeff Alberts
June 27, 2014 7:45 am

Zeke: First, he is simply averaging absolute temperatures rather than using anomalies. Absolute temperatures work fine if and only if the composition of the station network remains unchanged over time.

Temperature is an intensive property; averaging intensive properties is completely physically meaningless. So no, averaging "absolute temperatures" or "anomalies" doesn't "work fine", unless your goal is something meaningless.

richard verney
June 28, 2014 2:05 am

Steven Mosher says:
June 26, 2014 at 10:07 am
“……..
################
when there is missing data there is ONE THING you can NEVER do.
average absolute temperatures……”
//////////////////////////////////
I sometimes wonder whether you read what you write before you submit your comments.
If there is missing data, there is missing data. It makes no difference whether the missing data is the absolute temperature or the anomaly. The underlying data is missing, simples.
When you fabricate the anomaly, you are in effect fabricating the absolute temperature, albeit you are entering it in your 'records' as an anomaly, not as an absolute temperature figure.
The important point is that every time there is missing data, the error in the record grows. Every time there is missing data, it becomes more and more uncertain what the observed and collected data is telling us. This becomes particularly problematic when we are trying to tease out a signal in the form of anomalies measured in tenths of a degree, in circumstances where the underlying uncertainty in the raw data is in whole degrees. The signal is smaller than the margin of error in the data stream.
I understand your argument when discussing the thermometer record, 'well, that's all we have', and whilst I accept that this forces us to make the best of a bad bunch, what is important is to make it absolutely clear how unreliable the record being discussed and interpreted actually is, and its true margins of error. The biggest problem when discussing these data sets is the less than honest (and I use that expression deliberately) detailing and recognition of the margin-of-error bandwidth, and the consequential over-reliance upon the certainty of what they are telling us. We are over-extrapolating the record beyond its reasonable bounds, and this should be faced up to.

Samuel C Cogar
June 28, 2014 4:45 am

Averages are abstract numbers. The sole purpose of “averages” is to obtain a “snap-shot” picture of a specific entity at a specific time. Thus, they only exist in “time & place” but can be or are represented by a numerical figure. They are neither concrete nor physical quantities …. and therefore should never be used to make “statements-of-fact” about anything except the calculated “average” itself.
Given the fact that “averages” are abstract numbers it matters little if the “number set” is in error or not, ….. the calculated “average” for said “number set” is still a correct figure.
Remember, ….. "averages" are akin to "boats", ….. they "rise & fall" with the changes in/of the "waves & tides". And "tsunamis", also. 🙂