Rumours of my death have been greatly exaggerated

Image: NOAA USHCN COOP station at Hanksville, UT, sited over a grave. Photo by surfacestations volunteer Juan Slayton

by Anthony Watts

There has been a lot of buzz about the Menne et al 2010 paper “On the reliability of the U.S. Surface Temperature Record”, which is NCDC’s response to the surfacestations.org project. One paid blogger even erroneously trumpeted the “death of UHI,” which is humorous because the project was a study of station siting issues, not UHI. Anybody who owns a car with a dashboard thermometer and commutes from country to city can tell you about UHI.

There are also claims that this paper is a “death blow” to the surfacestations project. I’m sure in some circles they believe that to be true. However, it is very important to point out that the Menne et al 2010 paper was based on an early version of the surfacestations.org data, at 43% of the network surveyed. The dataset Dr. Menne used was not quality controlled, contained errors in both station identification and rating, and was never intended for analysis. I had posted it so volunteers could keep track of which stations had been surveyed and avoid duplicating effort. When I discovered people were doing ad hoc analysis with it, I stopped updating it.

Our current dataset at 87% of the USHCN surveyed has been quality controlled.

There’s quite a backstory to all this.

Over the summer, Dr. Menne had invited me to co-author with him, and our team reciprocated with an offer to join us as well. We had an agreement in principle for participation, but when I asked for a formal letter of invitation, they refused, which seemed very odd to me. The only things they would provide were a receipt for my new data (at 80%) and an offer to “look into” archiving my station photographs with their existing database. They made it pretty clear that I’d have no significant role other than that of data provider. We also invited Dr. Menne to participate in our paper, but he declined.

The appearance of the Menne et al 2010 paper was a bit of a surprise, since I had been offered collaboration by NCDC’s director in the fall. In a typed letter dated 9/22/09, Tom Karl wrote to me:

“We at NOAA/NCDC seek a way forward to cooperate with you, and are interested in joint scientific inquiry. When more or better information is available, we will reanalyze and compare and contrast the results.”

“If working together cooperatively is of interest to you, please let us know.”

I discussed it with Dr. Pielke Sr. and the rest of the team, which took some time since not all were available due to travel and other obligations. We decided to reply to NCDC and accept the collaboration offer.

On November 10th, 2009, I sent a reply letter via Federal Express to Mr. Karl, advising him that we would like to collaborate, and offered to include NCDC in our paper. In that letter I also reiterated my concerns about the use of the preliminary surfacestations data (43% surveyed) that they had, and spelled out very specific reasons why I didn’t think the results would be representative or useful.

We all waited, but there was no reply from NCDC to our acceptance of the collaboration Mr. Karl had offered in his letter. Not even a “thank you, but no.”

Then we discovered that Dr. Menne’s group had submitted a paper to JGR Atmospheres using my preliminary data, and that it was in press. This was a shock to me, since I had been told it was normal procedure for the person who gathered the primary data a paper is based on to have some input in the journal’s review process.

NCDC uses data from one of the largest volunteer organizations in the world, the NOAA Cooperative Observer Network. Yet NCDC director Karl, by not bothering to reply to our letter about an offer he himself initiated, and the journal, by giving me no opportunity in the review process, extend what Dr. Roger Pielke Sr. calls “professional discourtesy” to my volunteers and my team’s work. See his weblog post on the subject:

Professional Discourtesy By The National Climate Data Center On The Menne Et Al 2010 paper

I will point out that Dr. Menne thanked me and the surfacestations volunteers in the Menne et al 2010 paper and, I hear through word of mouth, in a recent verbal presentation as well. For that I thank him. He has been gracious in his communications with me, but I think he also has to answer to the organization he works for, and that limited his ability to meet some of my requests, like a simple letter of invitation.

Political issues aside, the appearance of the Menne et al 2010 paper stops neither the surfacestations project nor the work I’m doing with the Pielke research group to produce a peer reviewed paper of our own. It does illustrate, though, that some people have been in a rush to get results. Texas State Climatologist John Nielsen-Gammon suggested way back at 33% of the network surveyed that we had a statistically large enough sample to produce an analysis. I begged to differ then, at 43%, and yes, even at 70% when I wrote my booklet “Is the US Surface Temperature Record Reliable?”, which contained no temperature analysis, only a census of stations by rating.

The problem is known as the “low hanging fruit” problem. This project was done on an ad hoc basis, with no specific roadmap for which stations to acquire. That was a consequence of the social networking (blogging) Dr. Pielke and I employed early in the project to recruit volunteers. What we ended up with was a lumpy, poorly spatially distributed dataset, because early volunteers would survey the stations closest to them, often near or within cities.

The urban stations were well represented in the early dataset, but the rural ones, where we believed the best siting existed, were poorly represented. So naturally, any study done early on, even with a “significant sample size,” would be biased toward urban stations. We also had a distribution problem within CONUS, with much of the Great Plains and upper Midwest not well represented.

This is why I’ve continued to collect what some might consider an unusually large sample, now at 87%. We’ve learned that there are so few well sited stations that the ones meeting the CRN1/CRN2 criteria (or NOAA’s 100 foot rule for COOPs) are just 10% of the whole network. See our current census:

When you have such a small percentage of well sited stations, it is obviously important to get a large sample size, which is exactly what I’ve done. Preliminary temperature analysis by the Pielke group of the data at 87% surveyed looks quite a bit different than it did at 43%.

It has been said by NCDC, in Menne et al “On the reliability of the U.S. surface temperature record” (in press) and in the June 2009 “Talking Points: related to ‘Is the U.S. Surface Temperature Record Reliable?’”, that station siting errors do not matter. However, I believe the way NCDC conducted the analysis gives a false impression because of the homogenization process used. As many readers know, the FILNET algorithm blends a lot of the data together to infill missing data. This means temperature data from both well sited and poorly sited stations gets combined in the infilling. The theory is that it all averages out, but when you see that 90% of the USHCN network doesn’t meet even the old NOAA 100 foot rule for COOPs, you realize this may not be the case.
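To make that concrete, here is a minimal sketch of distance-weighted infilling, the general style of approach FILNET-type algorithms take. To be clear, this is not NCDC’s actual code, and every station value and distance below is invented for illustration:

```python
# A minimal sketch of distance-weighted infilling, in the general spirit of
# FILNET-style methods. NOT NCDC's actual code; all values are invented.

def infill_missing(neighbors):
    """Estimate a missing monthly mean as the inverse-distance-weighted
    average of neighboring stations' readings."""
    weighted_sum = sum(temp / dist for temp, dist in neighbors)
    total_weight = sum(1.0 / dist for _, dist in neighbors)
    return weighted_sum / total_weight

# One well sited rural neighbor and two poorly sited urban ones (deg C, km):
neighbors = [
    (12.1, 40.0),  # CRN1 rural station, 40 km away
    (13.8, 25.0),  # CRN4 urban station, 25 km away
    (14.2, 30.0),  # CRN5 urban station, 30 km away
]

print(round(infill_missing(neighbors), 2))  # 13.5 -- pulled toward the warm urban sites
```

Because poorly sited stations outnumber well sited ones roughly nine to one, the infilled value ends up dominated by the “polluted” neighbors.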

Here’s a way to visualize the homogenization/FILNET process. Think of it like measuring water pollution. Below is a simple visual table of CRN station quality ratings and what they might look like as water pollution turbidity levels, rated 1 to 5 from best to worst:

[Images: five bowls of water, from clear (CRN1) to murky (CRN5), representing station quality as turbidity]

In homogenization, the data is weighted against nearby neighbors within a radius. So a station that starts out as a “1” data-wise might end up getting polluted with the data of nearby stations and take on a new value, say a weighted “2.5”. Even single stations can affect many other stations in the GISS and NOAA data homogenization methods carried out on US surface temperature data here and here.
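To put toy numbers on that “1 becomes 2.5” example: a sketch in which a clean station’s value is blended with an inverse-distance average of its neighbors. Every number here, including the 50/50 blend factor, is invented for illustration, not NCDC’s actual weighting scheme:

```python
# Toy version of the turbidity analogy; every number, including the 50/50
# blend factor, is invented for illustration.
station = 1.0                                        # pristine CRN1 "turbidity"
neighbors = [(4.0, 20.0), (3.0, 35.0), (5.0, 50.0)]  # (turbidity, distance km)

# Inverse-distance average of the neighbors...
neighbor_avg = sum(t / d for t, d in neighbors) / sum(1.0 / d for _, d in neighbors)

# ...blended 50/50 with the station's own value:
adjusted = 0.5 * station + 0.5 * neighbor_avg
print(round(adjusted, 1))  # 2.5 -- the clean "1" now reads as a "2.5"
```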

[Image: map of the US with stations shown as bowls of varying turbidity; one station on each coast is marked with a question mark]

In the map above, applying a homogenization smoothing that weights nearby stations by distance, what would you imagine the (turbidity) values of the stations with question marks would be? And how close would those two values be, for the east coast station in question and the west coast station in question? Each would be pulled toward a smoothed central average based on its neighboring stations.

Essentially, in my opinion, NCDC is comparing homogenized data to homogenized data, and thus there is not likely to be any large difference between “good” and “bad” stations in that data. All the differences have been smoothed out by homogenization (pollution) from neighboring stations!

The best way to compare the effect of siting between groups of stations is to use the “raw” data, before it has passed through the multitude of adjustments that NCDC performs. However, NCDC is apparently using homogenized data. So instead of comparing apples and oranges (poorly sited vs. well sited stations), they essentially just compare apples to apples (Granny Smith vs. Golden Delicious), between which there is little visual difference beyond a slight color change.

We saw this demonstrated in the ghost-authored Talking Points memo issued by NCDC in June 2009, in this graph:

http://wattsupwiththat.files.wordpress.com/2009/06/ncdc-surfacestations-rebuttal-graph.png

Referencing the above graph, Steve McIntyre suggested in his essay on the subject:

The red graphic for the “full data set” had, using the preferred terminology of climate science, a “remarkable similarity” to the NOAA 48 data set that I’d previously compared to the corresponding GISS data set here (which showed a strong trend of NOAA relative to GISS). Here’s a replot of that data – there are some key telltales evidencing that this has a common provenance to the red series in the Talking Points graphic.

When I looked at SHAP and FILNET adjustments a couple of years ago, one of my principal objections to these methods was that they adjusted “good” stations. After FILNET adjustment, stations looked a lot more similar than they did before. I’ll bet that the new USHCN adjustments have a similar effect and that the Talking Points memo compares adjusted versions of “good” stations to the overall average.

There are references in the new Menne et al 2010 paper to the new USHCN2 algorithm, and we’ve been told how it is supposed to be better. While it does catch undocumented station moves that USHCN1 did not, it still adjusts data at USHCN stations in odd ways, such as at this station in rural Wisconsin, and that is the crux of the problem.

USHCN station at Hancock Experiment Farm, WI

Or this one in Lincoln, IL, at the local NWS office, where they took great effort to site it well.

Lincoln, IL USHCN station, NWS office in background.

Thanks to Mike McMillan for the graphs comparing USHCN1 and USHCN2 data.

Notice the clear tendency, in the graphs comparing USHCN1 to USHCN2, to cool the early record while leaving recent levels unchanged or increasing them. The net result is either reduced cooling or enhanced warming not found in the raw data.

As for the Menne et al 2010 paper itself, I’m rather disturbed by their use of preliminary data at 43%, especially since I warned them that the dataset they had lifted from my website (placed there for volunteers to track what had been surveyed, never intended for analysis) had not been quality controlled at the time. There are also simply not enough good stations with sufficient spatial distribution at that sample size. They used it anyway and, amazingly, conducted their own secondary survey of those stations, comparing it to my non-quality-controlled data and implying that my 43% data wasn’t up to par. Well, of course it wasn’t! I told them about it, and why. We had to resurvey and re-rate a number of stations from early in the project.

This came about only because it took many volunteers some time to learn how to properly ID the stations. Even some small towns have 2-3 COOP stations nearby, and only one of them is “USHCN”. There’s no flag in the NCDC metadatabase that says “USHCN”; in fact, many volunteers were not even aware of their own station’s status. Nobody ever bothered to tell them. You’d think that if their stations were part of a special subset, somebody at NOAA/NCDC would have notified the COOP volunteers so they could exercise a higher level of diligence.

If doing an independent stations survey was important enough for NCDC to do to compare to my 43% data now for their paper, why didn’t they just do it in the first place?

I have one final note of interest on the station data, specifically the issue of MMTS thermometers and their tendency to be sited closer to buildings due to cabling issues.

Menne et al 2010 mentioned a “counterintuitive” cooling trend in some portions of the data. Interestingly enough, former California State Climatologist James Goodridge did an independent analysis (I wasn’t involved in the data crunching; it was a sole effort on his part) of COOP stations in California that had gone through modernization, switching from Stevenson Screens with mercury liquid-in-glass (LIG) thermometers to MMTS electronic thermometers. He sifted through about 500 COOPs in California and chose stations with at least 60 years of uninterrupted data, because, as we know, a station move can cause all sorts of issues. He used the “raw” data from these stations rather than adjusted data.

He writes:

Hi Anthony,

I found 58 temperature stations in California with data for 1949 to 2008 and where the thermometers had been changed to MMTS and the earlier parts were liquid in glass. The average for the earlier part was 59.17°F and the MMTS fraction averaged 60.07°F.

Jim

A 0.9°F (0.5°C) warmer offset due to modernization is significant, yet NCDC insists that the MMTS units test about 0.05°C cooler, and I believe they add that adjustment into the final data. Our experience shows the exact opposite should be done, and with a greater magnitude.
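For reference, here is the back-of-envelope arithmetic behind that offset, using only the two means Jim quotes (note that a temperature difference converts from Fahrenheit to Celsius by a factor of 5/9):

```python
# Back-of-envelope check of the offset quoted above.
lig_mean_f = 59.17   # liquid-in-glass era mean, deg F
mmts_mean_f = 60.07  # MMTS era mean, deg F

offset_f = mmts_mean_f - lig_mean_f  # 0.90 F
offset_c = offset_f * 5.0 / 9.0      # a temperature *difference* converts by 5/9
print(f"{offset_f:.2f} F = {offset_c:.2f} C")  # prints: 0.90 F = 0.50 C
```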

I hope to have this California study published here on WUWT with Jim soon.

I realize all of this isn’t a complete rebuttal to Menne et al 2010, but I want to save that, in more detail, for the possibility of placing a comment in the Journal of Geophysical Research.

When our paper with the most current data is completed (and hopefully accepted in a journal), we’ll let peer reviewed science do the comparison on data and methods, and we’ll see how it works out. Could I be wrong? I’m prepared for that possibility. But everything I’ve seen so far tells me I’m on the right track.


We currently have 87% of the network surveyed (1,067 stations out of 1,221), and it is quality controlled and checked. I feel that we now have enough of the better sited rural stations, in addition to the urban ones, to solve the “low hanging fruit” problem of the earlier portion of the project. Data at 87% looks a lot different than data at 43%.

The paper I’m writing with Dr. Pielke and others will make use of this better data, and we also use a different analysis procedure than NCDC used.

284 Comments
geo
January 28, 2010 7:59 am

Btw, regarding the guidelines developed by Meteo France… is there a nice readable article somewhere on how those were developed, and the science that went into doing so? In other words, why should we have confidence that those guidelines actually produce the results they say? If there isn’t such an article, perhaps inviting them to provide one would be appropriate?
REPLY: Yes, there’s a paper by Michel Leroy, who developed it.

From the USCRN manual:
The USCRN will use the classification scheme below to document the
“meteorological measurements representativity” at each site.
This scheme, described by Michel Leroy (1998), is being used by Meteo-France
to classify their network of approximately 550 stations. The classification
ranges from 1 to 5 for each measured parameter. The errors for the different
classes are estimated values.

Over at Climate Audit in a comment, Hu McCulloch has a translation of the French paper:
Hu McCulloch
RE 310, 313, I actually did a rough translation of Michel Leroy’s Meteo France site classification manual last year, which is the source of the 5 degree error figure for Class 5.

Here’s how to get one:
I still haven’t heard back from Leroy if it’s OK to release my translation. Perhaps he heard I’ve been keeping bad company…. Anyway, if anyone wants an unofficial copy of one or both, pls e-mail me at mcculloch (dot)2(at)osu(dot)edu.

Tenuc
January 28, 2010 8:20 am

DEK (21:05:41) :
“Facts have an amazing way of cutting through political BS. I look at the UAH lower troposphere data today and see significant warming over the year 2000. Warming. I do not care about your statistical follies. I look at the ground truth. What can be measured and recorded.”
I agree with you that facts are important, but at the end of the day we just don’t know what the global average temperature is, as none of the instruments climate scientists currently use are perfect, including the satellite MSUs.
Before climate science can advance we need to find a method to determine what’s happening which doesn’t rely on some calibrated proxy method or statistical sleight of hand. Strange that we don’t seem to be getting even the basics right, like good thermometer placement. It beggars belief!
Menne et al have behaved in a despicable manner – it will be good to see them roasted when the report using the extended surfacestations project data comes out. Revenge is a dish best served cold.

Rob
January 28, 2010 8:28 am

Phil M (22:27:21) :
Pat Frank (21:52:20) :
Whatever the nuance, two things are certain: 1) Anthony put the data on the internet 2) Someone downloaded it and used it. It wasn’t hacked or stolen.
His reluctance to release data seems at odds with accusations leveled at AGW climate scientists. Particularly since it is, in essence, an inventory of public property.

Anthony need not have released any of the data until he was ready.

Tim Clark
January 28, 2010 8:52 am

EW (02:50:11) :
Phil M (19:37:08) wrote:
“Anthony posted his data on the internet. Is there anything on the internet that you feel is not in the public domain?”

Why don’t you read their guidelines to see if they adhered to them?
http://history.nasa.gov/footnoteguide.html

Dan
January 28, 2010 9:08 am

Don’t play golf with Dr. Menne. Once he gets done homogenizing his score he’ll be able to card something under par even though he used a hundred strokes to get around.

Richard M
January 28, 2010 9:21 am

DEK (21:05:41) :
Facts have an amazing way of cutting through political BS. I look at the UAH lower troposphere data today and see significant warming over the year 2000.
What warming? As has been shown several times there is no warming this century. The warming all took place in the 1980s and 90s according to UAH.
Maybe you are just unaware that you need to compare like periods. We are currently in an El Nino situation. If you look closely at the post-1998 period you will see the temps balance almost perfectly when you compare El Ninos to El Ninos and La Ninas to La Ninas.
As for the rest of your rant, since it was based on a misunderstanding, I hope you will revisit your thought process.

Richard M
January 28, 2010 9:25 am

The skeptical science blog seems to have gone amusingly quiet.

JP
January 28, 2010 9:29 am

Found this among comments at skepticalscience. Is Dave right??
Kforestcat at 12:20 PM on 23 January, 2010
Gentlemen
You really ought to read the methods used before you gloat. The individual station anomaly measurements were based on each station’s “1971-2000 station mean”. See where the document states:
“Specifically, the unadjusted and adjusted monthly station values were converted to anomalies relative to the 1971–2000 station mean.”
In other words, the only thing this study measures is the difference in instrument error at each station. The absolute error occurring at individual stations because the station had not been properly located is not measured. A poor station with an absolute temperature error of +5 degrees C still has a bias error of +5 degrees C – no matter what variation occurs due to instrumentation type.
I’m a chemical engineer with the U.S. government and 20 years of research experience in various areas including environmental mitigation. If one of my PhDs came to me with this nonsense, I’d fire him on the spot.
—–
Another post:
Gentlemen
I’m fully aware of how anomaly data is used (having used it in my own research) and I know full well what can go awry in field experiments. We are talking about everyday instrument calibration and QA/QC – this is not rocket science. I firmly maintain the Menne 2010 paper is fundamentally flawed and entirely useless.
NASA’s individual station temperature readings are taken in absolute temperature (not as an anomaly as you have suggested). The temperature data is reduced to anomaly after the absolute temperature readings for a site are obtained. For example, see the station data for Orland (39.8 N, 122.2 W) obtained directly from NASA’s GISS web site. The temperatures are recorded as Annual Mean Temperature in degrees C – not as an anomaly as you have suggested. (Tried to attach a NASA GIF as a visual aid – but did not succeed.)
Bottom line: Menne has to have (and use) absolute temperature data to get the 1971-2000 mean temperature and then subtract the mean from the current temp to get the anomaly. We are back to the same problem – Menne is measuring instrument error – he is not measuring error resulting from improper instrument location. The Menne paper is absolutely useless for the stated purpose.
Anyone who actually collects field data (I have) knows they are going to immediately run into two fundamental problems when an instrument is improperly located: 1) they are not reading ambient air temperature, and 2) neither the temperature readings nor the anomaly can be corrected back to a true ambient because other factors are influencing the readings.
For example: Suppose we have placed our instrument in a parking lot. Say the mean 1971-2000 temperature well away from the parking lot is 85F; but the instrument is improperly reading a mean of 90F. Now on a given day, say the ambient temp is 93 but your instrument is reading 105F (picked up some radiant heat from a car). Ok our:
Actual anomaly is 93F – 85F = 8F;
Instrument anomaly is 105F – 90F = 15F.
The data is trash. There is simply no way to recover either the actual ambient temperatures or an accurate anomaly reading. What you are missing is that an improperly placed instrument is reading air temperatures & anomalies influenced by unnatural events.
The readings bear no relationship to either the actual temperature or the actual anomaly – the data’s no good, can’t be corrected, and will not be used by a reputable researcher.
Finally, it’s not entirely surprising that Menne finds a downward bias in his individual anomaly readings at poorly situated sites. Because: 1) a poorly located instrument produces a higher mean temperature; hence, the anomaly will appear lower; and 2) generally there’s a limit to how hot an improperly placed instrument will get (i.e. mixing of unnaturally heated air with ambient air will tend to cool the instrument – so the apparent temperature rise is lower than one might expect).
Had Menne (NASA) actually measured both absolute temperature and calculated anomaly data using instrumentation at properly set up sites, within say a couple of hundred feet of the poor sites, as a proper standard to measure the bias against – our conversation would be different.
As it stands Menne’s data is useless nonsense and not really worth serious discussion.
Dave
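REPLY: For anyone who wants to replay Dave’s parking-lot arithmetic above, here it is as a quick Python sketch; all the numbers are his hypotheticals, not measured data. – Anthony

```python
# Dave's parking-lot hypothetical, replayed; all numbers are his, not data.
true_mean_f = 85.0      # 1971-2000 mean away from the parking lot
biased_mean_f = 90.0    # mean recorded by the badly sited instrument

true_today_f = 93.0     # actual ambient temperature today
biased_today_f = 105.0  # instrument reading (radiant heat from a car)

print(true_today_f - true_mean_f)      # 8.0  -- the actual anomaly
print(biased_today_f - biased_mean_f)  # 15.0 -- the instrument's anomaly
```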

Peter Dunford
January 28, 2010 9:45 am

Yet another great piece. Well done.
NCDC were obviously monitoring this blog; they noticed that the updates stopped at surfacestations.org, assumed the number crunching was going on at that point, and decided to get in first. (We have seen efforts before to hold awkward papers back so the retaliation can get in first; I wonder if that happened here?)
I think it’s good NCDC did this, it exposes their methods and the inherent weaknesses to public scrutiny, and their results to ridicule. Well done Menne and Karl. Since your side has all almost the funding, our side needs all the ammunition you can give us.

RSG
January 28, 2010 9:49 am

Anthony,
Love your website and appreciate your excellent work on the USHCN. Tragic that Menne and company have stooped to such antics. It only proves the desperate state of their position. I hope their sleazy move gets wide exposure… and sooner vs. later.
In discussing MMTS vs. Stevenson Screens/liquid-in-bulb I am reminded of your experiment comparing Stevenson Screens with bare wood, whitewash, and latex paint. Was not your actual air temperature taken with an MMTS? If so, did it not show that all 3 Stevenson Screens registered higher daytime temps than the “actual” air temp (MMTS)…on the sunny day you sampled (A Typical Day in the Stevenson Screen Paint Test)?
Is there a paper or study out there formally comparing the 2 methods (MMTS vs. Stevenson Screen)?

REPLY: Working on one, but search for Hubbard et al., who I think did one.

George E. Smith
January 28, 2010 9:50 am

I take it from all the comments that the unnamed author of this essay is actually Anthony?
I always wonder who wrote what, when I don’t see an Author’s name at top or bottom.
Anyway, I love that picture of the Stevenson Screen growing hair from the recycling remains of the dearly departed.
What I really found fascinating was the process of using the data of surrounding “measuring” stations to fudge the data at the “surrounded” station.
I always thought the concept of gathering data was to actually measure something, somewhere and report what you measured.
Perhaps when taking treasured photographs of the Grand Canyon, one should also take shots of some surrounding “ho-hum” locations, that aren’t as interesting as the GC, and then Photo-Shop those in with the GC to more fairly represent the beauty of that big hole in the ground.
After all, the surrounding countryside may get its self-esteem deflated by showing off the grandeur of that wonder of nature. I wonder what Gaia thinks?
REPLY: Good point, I’ve added my name. Our WP engine doesn’t automatically add authors, added now -A

Peter Dunford
January 28, 2010 9:53 am

I meant “almost all”, not “all almost” in my last post.
Phil M (22:27)
That’s ridiculous. Nobody on the luke-warm-to-skeptic side has complained that data has not been made public as it is gathered, let alone prior to publishing a paper. The complaints arise when it takes up to a decade to get data out in the open, long after expensive policies are being put in place relying on the studies that used the data.

Nigel S
January 28, 2010 9:54 am

Phil M (06:14:06)
Gosh you are busy.
I don’t think anyone is proposing to destroy the global economy on the strength of their dashboard thermometer readings but the thermometer on Hansen’s desk is a lot more dangerous.

Crust
January 28, 2010 10:01 am

Putting aside decorum issues, the main problem as I understand it is the data set used in this study. It’s a 43% subsample, which at first blush doesn’t sound so bad, but 1) it’s not quality controlled and 2) it’s far from a random subsample; in particular rural areas are underrepresented. Looking at the charts in the paper, it’s remarkable how well the trend series constructed from the good sites and the poor sites track each other. So I don’t think it’s plausible there’s a problem with too small a sample size per se (if there were, either or both series would be noisier and track poorly even if the trend overall was similar). But that doesn’t rule out a bias issue. There could be some bias that for some reason is more or less confined to rural stations. Or there could be some bias with the data collection that is corrected by quality control.

rw
January 28, 2010 10:03 am

One of the neatest things about the surface station project is that to counter these results, one has to contradict the most important principles of experimental science. In particular, that of founding one’s inferences on the best measurements possible. Now we have people in the AGW camp essentially saying, “Oh, quality of measurement doesn’t matter in this case.”
So it looks like WUWT has these guys rapidly painting themselves into a corner.
(Unless climatologists have created a new scientific paradigm.)

January 28, 2010 10:04 am

tfp
You wrote:
“One rule for NASA one for Watts”
Wrong – one rule for both:
“When NASA release data without the processing code all hell breaks loose in the blogosphere.
But it’s fine if Watts publishes the raw photos without the processing to give rating.”
Here is where you don’t get it.
The PROCESSING, the METHOD, is not a piece of code. It is a set of RULES.
Let me make it easy.
1. I post 1000 photos.
2. I give you a rule for rating those photos: Score 1 point if there is an Apple in the picture; score 0 if there is no apple.
Count the apples.
I have posted my data and my method.
Now, 40% of the way through counting the apples, I post the results:
I’ve looked at 400 pictures and I’ve counted 100 apples.
You take my RESULTS and go off to write a paper.
I point out to you that you had better double check the counting I did.
I point out to you that I’ve got more pictures to go through.
I point out to you that we could work together.
I point out to you that the data (photos) are there and the method
is there, YOU could count them all by yourself if you like.
Did you think that people would not notice how you ignored my explanation that the method (the processing steps) has been posted?
Photos = Data
LeRoy rating rules = Method
Final rating = Result.
GISS temperature;
temperatures = DATA
Source code = Method.
Pretty chart = Result.
In the case of GISS they post all three. In the beginning they only posted DATA and RESULT.
In the case of Anthony he posted his data (photos)
posted his method ( rating system)
and will publish his result, journals willing.
Then that result will be used to feed another method.
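REPLY: For fun, here’s that apple-counting rule rendered as literal code; the photo list below is invented for illustration. – Anthony

```python
# The apple-counting rule rendered literally; the photo list is invented.
photos = ["apple", "pear", "apple", "dog", "apple"]  # DATA (stand-ins for photos)

def rate(photo):
    """METHOD: the published rule -- score 1 for an apple, 0 otherwise."""
    return 1 if photo == "apple" else 0

result = sum(rate(p) for p in photos)  # RESULT
print(result)  # 3
```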

George E. Smith
January 28, 2010 10:05 am

“”” captdallas2 (04:22:39) :
[Response: Much more relevant is that Watts still, after years of being told otherwise, thinks that the global temperature analyses are made by averaging absolute temperatures. – gavin] “””
Well a lot of us talk in a similar vein, as if we think that scientists are actually reading temperatures that somehow relate to the SI-recognised Kelvin scale. Yes Gavin, we know they don’t do that.
Has it ever occurred to you, Gavin, that maybe that is where the problem lies?
If you make observations of the size of elephants, and report your data in Pyramid Inches; please don’t be surprised when people question your results.
The whole idea of having internationally recognised units and scales for Physical properties, is just so people can communicate useful information to each other.
“Climatologists” or “climate scientists”, whichever they prefer to call themselves, have nobody else to blame but themselves when they insulate themselves from the scientific community with their own custom “ancient astrology” ways of discussing what they imagine is actually real science.

Bridget H-S
January 28, 2010 10:07 am

“Veronica (01:56:48) :
Anthony, where can I find a really good summary of how surface stations ought to be set up in order to get decent readings? And who is mapping the UK ones for you?”
I have asked this question several times – Hern/Hurn (?) is the one I would love to check out – now known as Bournemouth International Airport. It reminds me of Stansted Airport – when I was a child we could drive to the end of the runway and wait for hours for a plane to land – it was so exciting and such a treat. Sometimes a disappointment when nothing came in. Imagine the effect of all those extra aircraft and all those extra buildings.
Perhaps Veronica and I should start the process – but what do we do with the pictures? I am no scientist/engineer/statistician – there must be someone in the UK who can take this on. Show yourself!
REPLY: An attempt was started at a UK surfacestations project in 2008, but because the majority of stations were at airports, and because of airport security concerns, they were forced to scuttle the project. I don’t recommend a retry. Taking photos of airport infrastructure in Britain is a fast track to trouble these days. – Anthony

Leo G
January 28, 2010 10:16 am

Richard M – {The first thing that popped into my head was the Menne paper is actually another blow to the surface stations credibility. It is completely accepted that UHI is real in the science community. If these poorly sited stations show no warming bias that means there MUST be a cooling bias somewhere in the process.}
Eggsactly my first thoughts also, but then I realized that what they are trying to show is that siting has no overall bearing on long term trends. Suppose we had one box in a perfect spot, another under a tree, and a third on a runway tarmac. Number one reads “true” temps, #2 reads shade temps and sun temps, and #3 reads sun and reflected temps. The “anomaly” from each site, according to what I understand from this paper, should be consistent no matter the siting. So if #1, over time, read an average of 15C and has an anomaly of plus 0.4C over the last 20 years, then we would expect to see about the same anomaly from the tarmac siting, though the long term average of the tarmac may be 3-5C higher than site one. It’s not temperature, but anomalous temp shifts from an average.
At least that is how I understand the paper.

Crust
January 28, 2010 10:18 am

PS Another issue you raise is homogenization. But the paper includes comparative charts of the unadjusted temperature anomaly, which I would have thought would exclude homogenization (I could be wrong). But even if they are applying some statistical procedures, I don’t see that philosophically as a problem so long as the series for good and bad sites are computed only with data for that kind of site (i.e. the homogenization of the bad sites doesn’t depend on data from the good sites in any way, or vice versa). (Of course, there may be some problem with the methodology. If so, that’s of course fair game. It’s just a different kind of objection than the sites issue.)

J.Peden
January 28, 2010 10:24 am

This was a shock to me since I was told it was normal procedure for the person who gathered the primary data the paper was based on to have some input in the review process by the journal.
Anthony, you’ll just have to accept that your review also could not possibly measure up to the exacting standards employed by the WWF ‘reviewers’. Then there’s always “Nature’s” reviewers, too, who are perhaps even a little better?
This situation also kind of reminds me of when Matt Drudge hit the scene and the entrenched Evolutionary dead end Media Journalists started complaining about how affronted they were that Drudge could get away with his teeny tiny widdle ~32 year old self only working from his own sources and without those “layers and layers” of Editors to review his work. Drudge simply blasted them on the basis of results vs mistakes, and that was the end of that.

Earle Williams
January 28, 2010 10:26 am

Props to Dr. Menne, Mr. Karl, et al. They managed to get the word ‘robust’ into the article not once, but twice.
Their article fails, though, to answer the question raised by Anthony in his Heartland paper. That question is, “Is the surface temperature record reliable?”
Their answer seems to be “It doesn’t matter!” Well, they do acknowledge that it is not reliable. Then they admit that their method does not require anything resembling a reliable network. Their result is robust!
Indulge an analogy if you will. Is there a difference between Kobe beef and roadkill? The NCDC answer to that question is it doesn’t matter because regardless of meat input the USHCN sausage tastes the same.
The progress of science marches on…

hswiseman
January 28, 2010 10:27 am

This is Steig all over again. Find a hot spot and use a statistical trick to spread it all over the place.
If we accept that Anthony’s high quality sites are in fact high quality, how much adjusting do you really need to do? Maybe TOB, station moves, transcription errors, observation gaps? After that, the data stands on its own merits; the trends of this data set are true trends of the data set. It’s too bad that it might not be spatially representative in all respects, but infilling the grid is an open invitation to mischief. A better comparison of the pristine stations to the ‘record’ would calculate an anomaly value from a collation of adjusted observations of actual USHCN stations that accurately recreates the distribution of observed station quality. Such selected stations should also be subject to a degree of spatial inadequacy statistically equal to the spatial shortcomings in Anthony’s sample.
You can’t make too many claims about the correct average national temperature with this method, but you can make a meaningful comparison of the observed trend shown in each dataset. If Menne is correct, the two trends should be pretty darn close. Somehow I doubt it.
The shabby treatment dispensed to Anthony by these authors makes me think they swiped their high school science fair idea from the kid next to them in home room.

wondering Aloud
January 28, 2010 11:04 am

Richard M (06:47:00) :
“I did read over a few comments at skeptical science and, instead of seeing anything skeptical about the Menne paper, it was mostly pure worship. The number of outright ridiculous statements was mind boggling.”
Yes, very sad really, objectivity has clearly left the building over at the now very poorly named Skeptical Science. All they do now is bow to the party line at Real Climate. You can count on them parroting anything Gavin says no matter how obviously bogus.

George
January 28, 2010 11:38 am

I was thinking about the rush to publish this and I think that part of the motivation was to get something in print on the subject before Anthony was able to publish his findings so Menne could be one of Anthony’s peer reviewers.
