With 33% of the USHCN weather station network now surveyed, the site quality rating is now applied; see the USHCN Station Master List file in HTML and XLS format.
The rating system for site quality was borrowed verbatim from the new Climate Reference Network being put into operation by NCDC and NOAA to ensure quality data. Their siting criteria can be found here.
I welcome input on this work in progress. The site rating will now be a running total in the spreadsheet and always available online as new stations are added to the survey. What is important to note is that the majority of stations that have a rating of 4 are MMTS/Nimbus equipped stations, which according to NCDC’s MMS equipment lists, make up 71% of the USHCN network. It appears that cable issues with the electronic sensors have forced them closer to buildings, roads, etc because NOAA COOP managers don’t often have the budget, time, or tools to trench under roads, sidewalks etc to reach the site where Stevenson Screens once stood. While this isn’t always the case, a pattern is emerging.

For background, see this first: Conference presentation given at CIRES/UCAR on 8/29/07 describing this project and the methods used to assign station site quality ratings, along with examples of many site issues seen thus far.
Click to view the slideshow I presented at UCAR
Immediately after the conference, a senior official at NCDC requested a copy of the above slide show, which I provided to him on CDROM. After receiving it, in a follow up email he inquired as to distribution rights which I granted within NCDC and NOAA for the purpose of review. That was last week. Thus far no issues have been raised with the presentation content. Since no issues were raised at the conference or in the two weeks afterwards (two weeks as of today) I have decided to release it publicly.
Note that of the 33% surveyed, only 13% meet the CRN site criteria (Rating of 1 and 2) for an acceptable location to accurately measure long term climate change free of localized influences.
The CRN site rating system is described here:
Climate Reference Network Rating Guide – Class 1 and 2 are considered best, 5 is the worst.
Class 1 – Flat and horizontal ground surrounded by a clear surface with a slope below 1/3 (<19deg). Grass/low vegetation ground cover 3 degrees.
Class 2 – Same as Class 1 with the following differences. Surrounding Vegetation 5deg.
Class 3 (error 1C) – Same as Class 2, except no artificial heating sources within 10 meters.
Class 4 (error >= 2C) – Artificial heating sources <10 meters.
Class 5 (error >= 5C) – Temperature sensor located next to/above an artificial heating source, such as a building, roof top, parking lot, or concrete surface.
Although Woodstock VA gets a rating of 2, it has only been in its present location for a little over one year. Perhaps it would be helpful to add another column after the rating to list number of years the site has been in exactly the same location.
Anthony-
I inferred this from your text, but could you be explicit in your post as to whether 5 is the worst or the best?
Anthony, I looked at your presentation. It is really an excellent bit of work – congratulations. I presume the next step is to sort the temperature data from these sensors into these five categories to see what correlations might exist between the category and the mean and standard deviations of the measured historic temperature changes.
Wow, I didn’t expect this. Only 4% meet all specifications for a good station.
GTTofAK, actually it’s a total of 13% as ratings of 1 and 2 would be considered acceptable as I understand it.
Anthony:
Great presentation. You let the pictures speak for themselves. I think the success of this effort can be assured by continuing to do just that. Present the information, photographs, etc., while doing the associated research off-line and not making any controversial unsubstantiated claims. Save the conclusions for when the majority of the data is in, and then do not extrapolate beyond the obvious. I suspect your audience was stunned with what they saw, and are even now making lots of phone calls to the pictured sites asking “What the….?” The downside of this is they have now been warned of what is coming, and the spin doctors are revving up in preparation for the end game. The wheels are beginning to fall off the AGW bandwagon, and given the amounts of money involved, it is not going to be a pretty sight.
Side note to Coyote: 1 is best, 5 is worst.
HOly CaNOle, Rev!
That “1” is best and “5” worst is clear from the site criteria. Where does it say that anything over a “2” is not acceptable?
At some point we need to see the average temperature increase for stations in each class, eg the average temp increase for 1, for 1 +2, for 1 +2 +3, etc.
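One way that cumulative comparison could be set up is sketched below. This is only an illustration: the function names are made up, the trend is a plain least-squares slope rather than any NCDC or GISS method, and the three stations are invented placeholder numbers, not survey data.

```python
from statistics import mean

def linear_trend(years, temps):
    """Ordinary least-squares slope, in degrees per year."""
    y_bar, t_bar = mean(years), mean(temps)
    num = sum((y - y_bar) * (t - t_bar) for y, t in zip(years, temps))
    den = sum((y - y_bar) ** 2 for y in years)
    return num / den

def cumulative_class_trends(stations):
    """stations: list of (rating, years, temps) tuples.

    Returns {max_rating: mean slope of all stations rated <= max_rating},
    i.e. the trend for class 1, then 1+2, then 1+2+3, and so on.
    """
    result = {}
    for max_rating in range(1, 6):
        slopes = [linear_trend(y, t) for r, y, t in stations if r <= max_rating]
        if slopes:  # skip groupings that contain no stations
            result[max_rating] = mean(slopes)
    return result

# Placeholder values for illustration only (NOT real station records).
years = [2000, 2001, 2002, 2003]
stations = [
    (1, years, [10.0, 10.0, 10.0, 10.0]),
    (2, years, [10.0, 10.1, 10.2, 10.3]),
    (4, years, [10.0, 10.3, 10.6, 10.9]),
]
trends = cumulative_class_trends(stations)
for max_rating, slope in trends.items():
    print(f"classes 1..{max_rating}: {slope:.3f} deg/yr")
```

Averaging per-station slopes is just one possible choice; averaging the temperatures first and then fitting a single line would be another.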
BCL, That is a valid question, but perhaps you aren’t familiar with the concept of “signal to noise ratio”.
In any measurement environment, there will always be a certain amount of noise in the measurement, whether you are measuring temperature, voltage, sound, or water flow.
The object of any properly designed measurement system is to have a noise level that is far lower than the least significant digit or resolution of measurement. In this case with temperature, while the readings done by observers are at 0.1°C resolution when the observer makes the reading, they are rounded by the observer in recording it into the data logbook which is then sent to NCDC. Thus the resolution of the USHCN data is 1°C with no tenths.
Therefore, as any scientist or engineer knows (or should know), you want the noise component (error) to be LOWER than the least significant digit of the measurement, the resolution of 1°C in this case.
Since Class 1 and 2 have less than 1°C errors by this scheme, they would be considered acceptable. Class 3, 4, and 5 have errors (noise) that are equal to or greater than the 1°C resolution of the temperature measurement, and thus may mask, or bias upwards or downwards, the signal we are trying to extract (the trend over time). So far the trend value that has been put forth for the worldwide surface temperature record is a positive 0.6°C, but given that we have a measurement system that may have many stations with errors greater than that value, it calls the accuracy into question.
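The arithmetic behind that comparison can be laid out in a few lines. This is a rough sketch, assuming the nominal class errors from the rating guide can be treated as a simple noise figure and the 0.6°C trend as the signal; the variable and function names are made up for illustration, and Classes 1 and 2 are left out because the guide quoted above gives no error figure for them.

```python
RESOLUTION_C = 1.0  # whole-degree resolution described above
TREND_C = 0.6       # trend value cited above

# Nominal lower-bound errors from the rating guide, Classes 3 to 5 only.
CLASS_ERROR_C = {3: 1.0, 4: 2.0, 5: 5.0}

def signal_to_noise(signal, noise):
    """Plain ratio of signal size to noise size (assumed, simplified metric)."""
    return signal / noise

for rating, err in CLASS_ERROR_C.items():
    snr = signal_to_noise(TREND_C, err)
    flag = "error >= resolution" if err >= RESOLUTION_C else "error < resolution"
    print(f"Class {rating}: error {err} C, trend/error = {snr:.2f}, {flag}")
```

In every one of these three classes the ratio comes out below 1, which is the situation described above: the nominal error is at least as large as the trend being sought.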
If you go look at the new NCDC approved CRN climate monitoring sites, with photos of them visible in this document:
http://www1.ncdc.noaa.gov/pub/data/uscrn/documentation/site/photos/StationsByState.pdf
…you’ll see that they are Class 1 and 2 stations. The CRN was set up correctly, and I would trust the data it produces. Unfortunately, there is hardly any data yet and the CRN network is scheduled for completion in 2008.
Is Waldo hiding inside the Class 3 through 5 error bars? (and subsequent fitting done on such data? Is Waldo a sort of curve fitting Mahout – e.g. able to fit an elephant?)
Anthony,
Where in the official siting criteria does it talk of “signal to noise” ratio?
In other words, is the criteria of acceptability that of the NOAA itself, or YOUR criteria (perhaps backed by “Any Scientist or Engineer”) applied to their stations?
(Put aside what the criteria SHOULD be. Who knows? Maybe you’re right about that.)
By the way, you WERE right about the deleting of site information. I had conflated “location” with “Address”. I had to take that post down.
BCL thank you for removing the erroneous post on your website on NCDC’s pulling locations from MMS. I had noted its absence.
The “signal to noise” measurement issue of getting the lowest noise figure possible in any measurement system is pretty widely accepted, much like the value of Pi, so I can see why NCDC didn’t explicitly spell it out. Even so, it’s not something I came up with. Measurement systems design is chock full of literature on signal to noise criteria. A simple way to think of this would be getting a sound level reading. If you try to measure the decibel level of a sound in a quiet room, it’s easy and accurate. But take the sound meter down to the street and measure the same sound, and the street noise will interfere and become a component of the measurement. Filtering it out later to get the true measurement isn’t very easy, and in some cases impossible. The same applies to temperature. If you are trying to track a trend that is less than the minimum resolution of the meter in the middle of other temperature “noises” such as localized biases, it becomes difficult. It becomes nearly impossible to do accurately when the biases are larger than the trend signal itself.
Noise reduction techniques won’t work in a case like surface temperature because there is no reference signal to compare to. The new CRN may assist in that though.
Nonetheless, I’ll see if I can locate Michel Leroy’s 1998 original paper where he discusses the criteria he designed. It’s at the WMO, so may not be available right away. If anyone has it, feel free to post.
That aside, the fact that you won’t find any Class 3, 4, or 5 sites in the new CRN and that on the front page of the CRN siting manual I referenced using this site classification scheme it has signatures of NCDC’s Dr. Thomas Karl and others is evidence of the acceptance and value of such a system.
Noise? Noise?
Yes, let’s make some!
It appears that you have a large enough sample size now to do some meta analysis. I.e., what is the trend of just the class 1 and 2 sites? I think that is the $64,000 question.
Yes.
Rate of change in 1s and 2s vs. that of 4s and 5s.
That is the gold speck. Not the levels. The comparative rates of change.
Will the 4/5 results jibe with the rate of exurban creep which has been devouring the rural stations for a century? Will they match the adjustment numbers of the CRN? One would expect so.
Will this swamp the alleged 0.6+ C increase? It’s a distinct possibility. If so, that cuts the legs out from under the entire premise.
And the code. Let’s not forget that. From what I figure, it seems that the GISS adjustments are pretty much ass-backwards. Talk about your double-biases!
Not to mention that little Titanium Oxide V. Calcium Carbonate deal–that looks like the equivalent of a Level 4 violation on the remaining Stevenson stations, right off the top. And I’ll hazard a guess that the above chart doesn’t even take that into account.
Brrr. I think I need a vicuna coat!
So now you’re saying that the new CRN acceptability guidelines are also implicit? They don’t actually STATE that only class 1 and 2 stations are “acceptable”?
It makes a big difference for you to claim that the stations fail to meet THEIR OWN standards, and to claim that they fail to meet YOUR standards (and this whether or not YOUR standard makes sense). I have trouble with you reading a standard of “acceptability” into documents where such a standard is not explicitly set forth.
BCL I think you are missing the basic premise. It doesn’t matter what I think about the site classification criteria used by NCDC, what matters is the level of noise versus the level of signal at the measurement point. The “signal-to-noise ratio” refers to the ratio of useful information to false or irrelevant data. You are trying once again to make a clear cut matter of numbers and measurement look like a personal opinion. The fact is, a noise level greater than the signal will mask the signal you are trying to measure, making it more difficult, and in some cases impossible to extract it from the noise.
There’s also a common misconception that you can “pull the signal out of the noise” with techniques like oversampling. But that doesn’t work in this case because you cannot go back and repeat the measurements.
Noise is often thought of as a high frequency component compared to a signal. In the case of in situ temperature measurements that’s not the case because bias elements are driven by the same forcing (the sun) and have response times similar to the thermometer, making the signal and the noise nearly identical in frequency (our 24 hour earth solar cycle).
Ask anybody who does scientific measurements if they would accept a measurement location for any data, not just temperature, where the noise level is greater than or equal to the signal they were trying to measure. Just a caveat: measurement environments where an external reference signal is available don’t apply here.
That’s why a weather station location with an error of +/- 1, 2 or 5 degrees C isn’t a good place to look for a signal that changes by tenths of degrees.
If you find a scientist or engineer who will go on record to say that is an acceptable measurement environment, by all means have them get in touch with me.
In the meantime, I have an email in to Michel Leroy requesting a copy of his original paper on the siting classification which should put the issue to rest.
If any other readers want to chime in on this to help Big City Lib understand this concept, feel free.
I’d also point out that in a room of 50+ climate scientists, including one from NCDC, this exact same information was presented, with the implication that a class 3 4 or 5 site would not be an acceptable measurement environment.
Nobody challenged that in questions afterwards, and nobody (except you) has made an issue of it in the 2 weeks since the presentation.
But let’s check just in case – anybody out there who thinks Class 3, 4, and 5 sites would be acceptable measurement environments?
BCL, think about it this way.
According to the CRN, a class 5 site has a potential error of 5 degrees. Imagine reprogramming the MMTS device to only read out whole temperatures that were divisible by 5. (i.e. when it was 60, 61, or 62.4, it would read out “60.”) If someone told you that using such a device they measured a 0.6 degree rise in average temperature, would you believe it?
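That thought experiment is easy to try out. Here is a minimal sketch of the imagined readout; the function name is hypothetical and it assumes the device rounds to the nearest multiple of 5, which is one reading of the example above.

```python
def coarse_readout(temp, step=5):
    """Report temp rounded to the nearest multiple of step."""
    return step * round(temp / step)

# The three readings from the example above, plus a 0.6 degree rise from 60.
for t in (60, 61, 62.4, 60.6):
    print(t, "->", coarse_readout(t))
```

A reading of 60 and a reading 0.6 degrees higher come out the same on such a device. The sketch only looks at single readings; whether averaging many readings would change the picture is a separate question it does not address.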
Congrats Anthony,
With 33% of the data in, your audit has demonstrated that most of the sites do not meet the CRN standards, and therefore are not suitable to be included in Climate Change analyses. I think that if that was indeed the goal, you’ve succeeded with only 33% of the site surveyed.
So my question now is, what is the purpose of continuing the site surveys? I don’t expect the further sites to change the distribution significantly.
May I humbly offer, in a proactive way, that the raison d’etre to complete the survey is to help the CRN identify those sites which are candidates as “gold standard” for climate change analysis? Such sites would be in your categories 1 & 2, but with additional criteria that have come out of the experience of performing site surveys.
Using Walhalla as an example: sites that are rural, have historical imagery and/or other data to verify land use hasn’t changed significantly over the last 50-60 years, have a good record of any instrument change or relocation, etc… I think from your experience you can come up with a good list of additional criteria to be imposed on the “gold standard” sites.
Also, proactively, you could suggest to the NOAA, additional things that need to be done to maintain the “gold standard” site list such as performing annual site checkups to verify compliance.
All in all you’ve done a wonder!
Leon
ps, did you get the Walhalla aerial images?
Bigcitylib: I wholly concur with Anthony’s comments. You cannot get meaningful data from ANY scientific measurement if the signal to noise ratio is poor. It really is impossible.
There is a lot more involved with obtaining good data such as counting statistics, measurement techniques and sample size. However, what Anthony is doing is just looking at the measuring instrument. This is usually one of the first things a decent scientist should look at to determine if he/she is going to have a successful data collection.
There are two types of signal to noise ratio problems: intrinsic and extrinsic.
Scientists will explicitly tell what device they used for the measurement because an engineered/manufactured device will have its signal to noise ratio in the device’s specifications, which can be referenced by other scientists. This is an example of an intrinsic factor that must be considered in designing the data collection. Intrinsic factors are generally the easiest to control.
However as Anthony shows in his pictures, there are also extrinsic factors that can influence the data collection, e.g. the thermometer sits next to an air conditioner. Extrinsic factors can be difficult to control. It is one of the reasons so many “correlation” studies are worthless: not all the extrinsic factors have been accounted for.
If I allowed the error factors within my work, as seen in Class 3-5 sites, I would get creamed by the regulators, my boss and anyone who reviewed my work.
Let me pose the question back to you – would you trust the data from a class 4/5 site? How about looking at it another way – would you want to measure your child’s temperature with a thermometer that may read 5 degrees high or low? And then make grave decisions from the (faulty) data? Oiks!
One would want to test it. The data has been terribly manhandled, and it all needs a good going over.
Naturally, a rating-by-rating evaluation of the raw data may prove helpful as a guideline.
Unless we get the data right and adjusted as well as we openly can, we won’t be getting anywhere on this issue.
P.S., I’ve looked in on your site and I note: a.) You identify yourself, and b.) You seem to tolerate adverse opinion pretty well; yes, you can dish it out but you can take it.
I’ll add my congratulations to Anthony here and to Steve at Climate Audit for the valuable work you’re both doing. As someone who has seen BCL at Canadian political blogs, BCL always misses the basic premise. (sigh) I’ll leave it at that before I become too impolite.
RE: Posted by: Anthony Watts | September 13, 2007 10:48 AM
I am a scientifically degreed, high technology professional with over 25 years of experience. Anthony’s description of SNR is one of the best I’ve seen in quite a while. I wholeheartedly concur.
Consider what’s going on here.
The gummint is replacing possibly well sited old Steve stations with the Type-4 sited Nimbus variety.
In goes the bad air out goes the good.
The big deal here is that so many of these are RECENT changes. The implications for the recent historical “trend” are obvious.
I think the thing about this that really throws me for a loop here is the lack of decimal points.
Not POINT ONE degree to the warm, ONE degree. Not POINT TWO degrees, TWO degrees. Not POINT FIVE degrees, FIVE degrees!
And not an up-down variance, but primarily a bias to the warm! (If the “noise” is all one tone, in one direction it may be easier to filter it out, right?)
And the “significant difference” we’re supposed to be shutting down the engine of the world on account of is, what, a measly POINT 6 or 7 degrees!
Say, WHAT?
THAT margin of error doesn’t even feed the bulldog for one of my half-assed historical models!
“It is an outrage. I shall tell everybody.”