Surface temperature uncertainty, quantified

A new paper investigates an aspect of the surface temperature record that, as far as its author knows, has not previously been dealt with well: sensor measurement uncertainty. The author has defined a lower limit to the uncertainty in the instrumental surface temperature record.

 

Figure 3. (•), the global surface air temperature anomaly series through 2009, as updated on 18 February 2010, (http://data.giss.nasa.gov/gistemp/graphs/). The grey error bars show the annual anomaly lower-limit uncertainty of ±0.46 C.

 

UNCERTAINTY IN THE GLOBAL AVERAGE SURFACE AIR TEMPERATURE INDEX: A REPRESENTATIVE LOWER LIMIT

Patrick Frank, Palo Alto, CA 94301-2436, USA, Energy and Environment, Volume 21, Number 8 / December 2010 DOI: 10.1260/0958-305X.21.8.969

Abstract

Sensor measurement uncertainty has never been fully considered in prior appraisals of global average surface air temperature. The estimated average ±0.2 C station error has been incorrectly assessed as random, and the systematic error from uncontrolled variables has been invariably neglected. The systematic errors in measurements from three ideally sited and maintained temperature sensors are calculated herein. Combined with the ±0.2 C average station error, a representative lower-limit uncertainty of ±0.46 C was found for any global annual surface air temperature anomaly. This ±0.46 C reveals that the global surface air temperature anomaly trend from 1880 through 2000 is statistically indistinguishable from 0 C, and represents a lower limit of calibration uncertainty for climate models and for any prospective physically justifiable proxy reconstruction of paleo-temperature. The rate and magnitude of 20th century warming are thus unknowable, and suggestions of an unprecedented trend in 20th century global air temperature are unsustainable.

INTRODUCTION

The rate and magnitude of climate warming over the last century are of intense and continuing international concern and research [1, 2]. Published assessments of the sources of uncertainty in the global surface air temperature record have focused on station moves, spatial inhomogeneity of surface stations, instrumental changes, and land-use changes including urban growth.

However, reviews of surface station data quality and time series adjustments, used to support an estimated uncertainty of about ±0.2 C in a centennial global average surface air temperature anomaly of about +0.7 C, have not properly addressed measurement noise and have never addressed the uncontrolled environmental variables that impact sensor field resolution [3-11]. Field resolution refers to the ability of a sensor to discriminate among similar temperatures, given environmental exposure and the various sources of instrumental error.

In their recent estimate of global average surface air temperature and its uncertainties, Brohan, et al. [11], hereinafter B06, evaluated measurement noise as discountable, writing, “The random error in a single thermometer reading is about 0.2 C (1σ) [Folland, et al., 2001] ([12]); the monthly average will be based on at least two readings a day throughout the month, giving 60 or more values contributing to the mean. So the error in the monthly average will be at most 0.2/√60 = 0.03 C and this will be uncorrelated with the value for any other station or the value for any other month.”

Paragraph [29] of B06 rationalizes this statistical approach by describing monthly surface station temperature records as consisting of a constant mean plus weather noise, thus, “The station temperature in each month during the normal period can be considered as the sum of two components: a constant station normal value (C) and a random weather value (w, with standard deviation σi).” This description plus the use of a 1/√60 reduction in measurement noise together indicate a signal averaging statistical approach to monthly temperature.
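The arithmetic B06 relies on can be checked with a small simulation. The sketch below is illustrative only: the function name `simulate_monthly_mean_error`, the trial count, and the 0.3 C constant offset are assumptions chosen for the example, not values from B06 or from the paper. It suggests that the 1/√N reduction applies to the zero-mean random part of the error, while an offset shared by every reading passes through the monthly mean unchanged.

```python
import math
import random


def simulate_monthly_mean_error(n_readings, random_sd, offset, trials=2000, seed=0):
    """Return (centre, spread) of the error in a simulated monthly mean.

    Each reading error is modelled as a constant offset plus zero-mean
    Gaussian noise. The model and the default values are assumptions
    made for illustration only.
    """
    rng = random.Random(seed)
    mean_errors = []
    for _ in range(trials):
        errors = [offset + rng.gauss(0.0, random_sd) for _ in range(n_readings)]
        mean_errors.append(sum(errors) / n_readings)
    centre = sum(mean_errors) / trials
    spread = math.sqrt(sum((e - centre) ** 2 for e in mean_errors) / (trials - 1))
    return centre, spread


# Analytic value behind the B06 quote: 0.2 / sqrt(60), stated there as "at most 0.03 C".
analytic_sd = 0.2 / math.sqrt(60)

# Purely random error: the monthly-mean error is centred on zero.
random_only = simulate_monthly_mean_error(60, 0.2, offset=0.0)

# Same noise plus a hypothetical 0.3 C constant offset: averaging does not remove it.
with_offset = simulate_monthly_mean_error(60, 0.2, offset=0.3)

print(f"analytic sd of monthly mean: {analytic_sd:.3f} C")
print(f"random only: centre={random_only[0]:+.3f} C, spread={random_only[1]:.3f} C")
print(f"with offset: centre={with_offset[0]:+.3f} C, spread={with_offset[1]:.3f} C")
```

A constant offset is the simplest possible case. The paper's concern is systematic error driven by uncontrolled environmental variables, which this sketch does not attempt to model.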

The volunteers and I get a mention:

The quality of individual surface stations is perhaps best surveyed in the US by way of the commendably excellent independent evaluations carried out by Anthony Watts and his corps of volunteers, publicly archived at http://www.surfacestations.org/ and approaching in extent the entire USHCN surface station network. As of this writing, 69% of the USHCN stations were reported to merit a site rating of poor, and a further 20% only fair [26]. These and more limited published surveys of station deficits [24, 27-30] have indicated far from ideal conditions governing surface station measurements in the US. In Europe, a recent wide-area analysis of station series quality under the European Climate Assessment [31], did not cite any survey of individual sensor variance stationarity, and observed that, “it cannot yet be guaranteed that every temperature and precipitation series in the December 2001 version will be sufficiently homogeneous in terms of daily mean and variance for every application.”

Thus, there apparently has never been a survey of temperature sensor noise variance or stationarity for the stations entering measurements into a global instrumental average, and stations that have been independently surveyed have exhibited predominantly poor site quality. Finally, Lin and Hubbard have shown [35] that variable field conditions impose non-linear systematic effects on the response of sensor electronics, suggestive of likely non-stationary noise variances within the temperature time series of individual surface stations.

The ±0.46 C lower limit of uncertainty shows that between 1880 and 2000, the trend in averaged global surface air temperature anomalies is statistically indistinguishable from 0 C at the 1σ level. One cannot, therefore, avoid the conclusion that it is presently impossible to quantify the warming trend in global climate since 1880.

The journal paper is available from Multi-Science publishing here

I ask anyone who values this work and wants to know more to support this publisher by purchasing a copy of the article at the link above.

Congratulations to Mr. Frank for his hard work and successful publication. I know his work will most certainly be cited.

Jeff Id at the Air Vent has a technical discussion going on about this as well, and it is worth a visit.

What Evidence for “Unprecedented Warming”?

carbon-based life form
January 20, 2011 2:54 pm

Here is a link to a letter to the APS editorial page from this author that may be the genesis of his published study. http://www.aps.org/units/nes/newsletters/fall09.cfm

etudiant
January 20, 2011 2:54 pm

Bravo!
At last some glimmer of reality. Now stand by for heavy flak.

Al Gored
January 20, 2011 2:59 pm

Wow. Finally. Uncertainty officially acknowledged at this level. Combined with all the station problems you documented Anthony, and related adjustments, this should pretty much put all the quibbling about one-tenth degree changes, etc. to an end… but it won’t of course.

richard verney
January 20, 2011 3:00 pm

Well it is good to see that this has been published. This paper is bound to get quite some reaction. I know many who have long considered that the claimed warming is indeterminable from the noise. Given all the uncertainties, it is ridiculous to claim measurements to 1/100ths of a degree.
Now if they could only clean up the UHI problem and bias through station drop out, we may be better able to consider the significance of any observed temperature change and to put it in its proper context.

SomeJerk
January 20, 2011 3:06 pm

It won’t stop the alarmists because it isn’t about AGW. It never was. It has always been about crisis, any crisis, that can be used to motivate the people into whatever action the political higher-ups want.

Duster
January 20, 2011 3:16 pm

More significantly, the study concludes that the trend in surface temperatures since 1880 is statistically indistinguishable from a 0-degree C trend. That is, although it has often been widely acknowledged, even among skeptics of AGW, that the globe has warmed over the last century, there is no statistically significant support at even a one-sigma level in the data for such warming. There may in short have been no warming pattern at all, simply shorter term noise. The fault may not be in our star but in our numbers.

ThomasU
January 20, 2011 3:16 pm

Wow, this “settled science” is always good for a big surprise. There has never ever been a thorough scientific assessment of station errors?! Extraordinary! I have learned a lot since I first became aware of “WUWT”, and I knew very well that “climate science” is full of holes. But to read now that nobody ever even bothered to assess the quality of the temperature measurements properly is still somewhat shocking. The whole “climatism” (or “alarmism” or “warmism”) scam has to be brought to an end, the sooner the better!
I can only thank Anthony Watts for bringing all this information to the wider public. Here in Germany we still have a long way to go before the scam ends. There seems to be an “unholy alliance” of MSM, politics, NGOs, “climate scientists” and “green energy” profiteers which all work hard to promote ever more regulations, taxes, fees and research grants. Until now the people unwillingly bear the burden…

mike g
January 20, 2011 3:17 pm

Just curious, now that the Hudson Bay has frozen over, have BBQ conditions set in for the UK?

kadaka (KD Knoebel)
January 20, 2011 3:17 pm

Has Mount Romm erupted?
It’d be a big volcanic eruption , lots of sulfate aerosols dispersed up high, lots of dust from the disintegrating gray dense stony matter. Mount Romm will be venting steam for weeks, spewing spurts of fire and brimstone for months.
I feel a chill in the air, like the world has gotten colder…

coaldust
January 20, 2011 3:19 pm

I don’t believe man is responsible for the temperature rise, but we do see the following evidence that it has risen:
Less ice in the Arctic
Similar temperature measurements from satellites
Deviations with good explanations (Pinatubo, El Niño, La Niña)
probably more…
IMO the argument that we don’t know that the temperature has risen is not a good one. The evidence just doesn’t point that way.
These blow holes in CAGW though:
MWP
Cloud albedo feedback uncertainty
The way the “science” was hijacked
probably more…

Mark Cooper
January 20, 2011 3:20 pm

Here is some relevant info about Metrology. I sent this to Mr McIntyre in 2008 but he never replied, and eventually I got around to sticking it on a blog for posterity…
http://pugshoes.blogspot.com/2010/10/metrology.html

rsteneck
January 20, 2011 3:21 pm

[snip – sorry, your topic is way off topic; this sort of thing is better published here. I’m sure they’ll take it. -Anthony ]

January 20, 2011 3:25 pm

It’s as bad as we thought.

richard verney
January 20, 2011 3:27 pm

What’s the betting that in about 2 weeks’ time, Dr Spencer will be reporting a global temperature anomaly of about zero against the 30 year trend?

François
January 20, 2011 3:36 pm

Anything about large lakes surface temperature uncertainty? NASA’s JPL gives figures way above Frank and Alto’s noise variance.

Keith Minto
January 20, 2011 3:42 pm

Good thoughtful article, however….

The ±0.46 C lower limit of uncertainty shows that between 1880 and 2000, the
trend in averaged global surface air temperature anomalies is statistically
indistinguishable from 0 C at the 1σ level. One cannot, therefore, avoid the conclusion
that it is presently impossible to quantify the warming trend in global climate since
1880.

…….may be hard to sell to journalists and the general public. In Fig. 3 above, it certainly looks as if the temperature has risen since 1960. The general public may think that the strongest signal is the mean, with the error bars indicating a gradient of uncertainty.

David L. Hagen
January 20, 2011 3:45 pm

The full uncertainty (Type A + Type B) of ±0.46 C vs Type A of ±0.2 C is typical of the difference between a full uncertainty analysis versus the commonly reported “accuracy”.
NIST provides an introduction:
Essentials of expressing measurement uncertainty
For details see: B.N. Taylor and C.E. Kuyatt, Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, NIST TN 1297
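As a rough illustration of how independent uncertainty components are usually combined, the sketch below applies the root-sum-square rule described in NIST TN 1297. It assumes that the ±0.2 C station error and a single systematic component combine in quadrature to give ±0.46 C. The excerpt above does not state how the paper combined them, so the implied systematic value is a back-calculation under that assumption, not a figure taken from the paper. The function names are hypothetical.

```python
import math


def combine_in_quadrature(*components):
    """Root-sum-square combination of independent standard uncertainties."""
    return math.sqrt(sum(c ** 2 for c in components))


def implied_component(total, known):
    """Back out the one unknown component, assuming quadrature combination."""
    return math.sqrt(total ** 2 - known ** 2)


# Assumption: 0.46 C is the quadrature sum of 0.2 C and one other component.
implied_systematic = implied_component(0.46, 0.2)

# Recombining the two parts recovers the stated total.
recombined = combine_in_quadrature(0.2, implied_systematic)

print(f"implied systematic component: {implied_systematic:.3f} C")
print(f"recombined total: {recombined:.2f} C")
```

Under this assumption the non-random part would be roughly twice the size of the ±0.2 C term, which is why leaving it out changes the total so much.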

Scott
January 20, 2011 3:50 pm

It’ll be interesting to see what Tamino’s response to this is (if any). Isn’t this supposed to be right up his alley?
-Scott

latitude
January 20, 2011 3:59 pm

A 130 years of data, and a straight line is still within the noise………….
==============================================
coaldust says:
January 20, 2011 at 3:19 pm
I don’t believe man is reponsible for the temperature rise
=================================================
Man is not responsible at all coal…..
…..if man had anything to do with it, it would be a lot warmer

Curiousgeorge
January 20, 2011 4:35 pm

My main issue with “Average” ( or mean), in this or many other contexts, is that most people who are not statistically literate equate that term with “Majority”. Not true of course, but even fewer people have a clue what mode or median is, much less how those and other terms (skew, kurtosis, etc. ) are used to describe a distribution. Which would be far more meaningful than a simple point estimate.

January 20, 2011 4:43 pm

Thanks Anthony, and of course Patrick Frank,
I am quoting from this article at http://www.oarval.org/ClimateChange.htm and http://www.oarval.org/CambioClima.htm (Spanish).

Dave
January 20, 2011 4:50 pm

To Anthony and the Surface Stations team… thank you, and congratulations.
By the way, I hear that Vegas bookies have your team leading the hockey team by (whatever is statistically more significant than) 6 Sigma!!!

JRR Canada
January 20, 2011 4:51 pm

So is there any science left in the CAWG concept? Its incontravertable eh.?
However you spell absolutely for sure the only answer.

Dr A Burns
January 20, 2011 4:52 pm

Temperatures are now measured electronically to 0.1 degrees but have been recorded to only +/- 0.5 degrees
http://www.srh.noaa.gov/ohx/dad/coop/EQUIPMENT.pdf … page 11
This doesn’t seem to have been considered.
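The effect of recording resolution can be estimated with the standard rectangular-distribution rule: if a value is rounded to the nearest multiple of a step, the rounding error is treated as uniform over half a step either side, which gives a standard uncertainty of step/√12. The sketch below assumes a recording step of 1 degree (that is, ±0.5) as described in this comment. The function name is hypothetical.

```python
import math


def rounding_standard_uncertainty(step):
    """Standard uncertainty from rounding to the nearest multiple of `step`.

    Assumes the rounding error is uniformly distributed on [-step/2, +step/2].
    """
    return step / math.sqrt(12)


# Recorded to the nearest whole degree (+/- 0.5).
u_recorded = rounding_standard_uncertainty(1.0)

# Sensor display resolution of 0.1 degree.
u_sensor = rounding_standard_uncertainty(0.1)

print(f"recorded to 1 degree:  u = {u_recorded:.3f}")
print(f"displayed to 0.1 degree: u = {u_sensor:.3f}")
```

On these assumptions the recording step alone contributes about ten times more uncertainty than the sensor's display resolution.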

George E. Smith
January 20, 2011 5:00 pm

Well the paper is about sensor errors. How about the errors that occur for about 70% of the surface area; which happens to be water. I don’t see any attempt to correct the erroneous data that was gathered prior to 1980, which in 2001 (Jan) was found to not correlate with the Temperatures that are measured on land ie from the atmosphere rather than from the water temperatures, as were the older oceanic data readings.
These global Temperature anomaly proxies have more changes than any carillon on earth.

terrybixler
January 20, 2011 5:01 pm

Draw a line at zero and it barely escapes the error bars in 2010 (the hottest adjustment ever) , interesting.

Mike Edwards
January 20, 2011 5:02 pm

mike g says:
January 20, 2011 at 3:17 pm
Just curious, now that the Hudson Bay has frozen over, have BBQ conditions set in for the UK?

Nice of you to ask, Mike G. January has seen the UK get a more customary spell of wet & windy weather, with temperatures a more tolerable 8 to 10 C (46-50F), with a short spell up to 13C (almost springlike). Things have returned to a chilly dry theme now, which looks like it will last for a week or so, with night frosts and daytime temps in the 2 – 6 C range.
So no BBQ, just more like a regular winter than the icy blast we received before Christmas…

xyzlatin
January 20, 2011 5:08 pm

Mark Cooper says:
January 20, 2011 at 3:20 pm
Here is some relevant info about Metrology. I sent this to Mr McIntyre in 2008 but he never replied, and eventually I got around to sticking it on a blog for posterity…
Mark, I found your blog awhile ago but lost it again, thanks for posting the link. Your article was really good and I will print it out and show it to everyone who thinks that it is possible to show from thermometer records that the temperature has risen at all. I have known for many years of the errors in thermometers, and I have just been unable to fathom why everyone is arguing about a supposed rise of .7 deg and so on.
I have heard many skeptics agreeing that the temperature has risen, I think that they don’t want to be put in the category of “flat earther” which is worse than “denier”, yet the basis of argument has to be a complete analysis of the instruments measuring the data, and then the people taking that measurement.
Mark, you excellently sum up the problems in your blog article.
1.Human Errors in accuracy and resolution of historical data are ignored
2.Mechanical thermometer resolution is ignored
3.Electronic gauge calibration is ignored
4.Mechanical and Electronic temperature gauge accuracy is ignored
5.Hysteresis in modern data acquisition is ignored
6.Conversion from Degrees F to Degrees C introduces false resolution into data.
You are obviously someone who has knowledge about this subject, how about posting more on WUWT?
ps. Mark, it is impossible to post a comment on your blog unless one already has a blog. It might be the reason there are no comments (I know I found it impossible to put a comment).
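Item 6 in the list above can be illustrated with a few lines of arithmetic. The sketch assumes readings recorded to the nearest whole degree Fahrenheit and then reported in Celsius to one decimal place; the sample values 60 to 65 F and the function name are made up for the example.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Hypothetical readings recorded to the nearest whole degree F.
whole_f = list(range(60, 66))

# Reported in Celsius to one decimal place.
converted = [round(fahrenheit_to_celsius(f), 1) for f in whole_f]

# Spacing between neighbouring converted values.
steps = [round(b - a, 1) for a, b in zip(converted, converted[1:])]

for f, c in zip(whole_f, converted):
    print(f"{f} F -> {c} C")
print("spacing between neighbours:", steps)
```

The converted values carry a tenths digit, yet neighbouring values are never closer than about 0.5 C, because one Fahrenheit degree is 5/9 of a Celsius degree. The extra decimal place reflects the conversion, not the measurement.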

It's always Marcia, Marcia
January 20, 2011 5:48 pm

No statistically significant warming in 130 years.

Beesaman
January 20, 2011 5:49 pm

As someone who spent twenty years as an instrument engineer I’d love to see the field instruments that are doing better than ±0.5 Deg C. I’d also like to see the calibration records of the standards used and the calibration reports and methodology.
I noticed NOAA give a tolerance of ± one degree for the MAX/MIN temperatures and for the Palmer Soil thermometer even more, to quote from,
http://www.srh.noaa.gov/ohx/dad/coop/EQUIPMENT.pdf
V. CALIBRATION – CONTINUED:
When comparing Palmer Model 35B readings with those of a check
thermometer, remember two things:
A) tolerance of the Palmer (approx. 2 degrees F.) and the
check thermometer (generally 1% of scale) may be additive.
B) a seemingly slight difference in exposure between the two
may contribute to a variation in the readings. A spread of
up to 4 degrees F. between the two readings can be
considered satisfactory. If using method 3, a reading
between 29 degrees F. and 35 degrees F. can be considered
sufficiently accurate at the ice-point.
Ho hum…

Ira Glickstein, PhD
January 20, 2011 6:16 pm

I don’t want to bring a skunk to this picnic, but look again at the figure. It says it is equally likely that global temperatures, since 1880, have cooled by as much as 0.2ºC (the difference between the high limit for 1880 and the low limit for 2009) -OR- warmed by as much as 1.8ºC (the difference between the low limit for 1880 and the high limit for 2009).
Since the error bars are at least approximately normally distributed, the best bet would be that the actual warming is somewhere in the middle, nearer to 0.8ºC than either of the extremes. Yes, it COULD be that it has cooled 0.2ºC since 1880, or, conversely, that it has warmed an astounding 1.8ºC, but the best bet is near the middle.
Now, I think the most likely net warming since 1880 is 0.5ºC, so this paper seems to be 0.3ºC warmer than me.
The message of this paper, for skeptics like me, is that the uncertainty is about as large as the supposed 0.8ºC official Team warming estimate. That supports the view that we should not adopt any drastic, economy-wrecking public policy changes in the “environmental” domain based on the error-prone temperature record we have in hand.
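The arithmetic in this comment can be made explicit. The sketch below takes the comment's mid-range warming of 0.8 C and the paper's ±0.46 C per annual anomaly as inputs. It compares the worst-case range (both endpoint errors at their limits, in opposite directions) with the root-sum-square range that would apply if the two endpoint errors were independent and random, an assumption the paper itself disputes for systematic error. Function and variable names are hypothetical.

```python
import math


def difference_ranges(delta, u):
    """Return (worst_case_range, rss_range) for a difference of two values.

    delta: difference between the two annual anomalies
    u:     uncertainty attached to each anomaly
    """
    worst = 2 * u              # both errors at their limits, opposite signs
    rss = math.sqrt(2) * u     # independent random errors combined in quadrature
    return (delta - worst, delta + worst), (delta - rss, delta + rss)


worst_case, rss_case = difference_ranges(0.8, 0.46)

print(f"worst case:      {worst_case[0]:+.2f} C to {worst_case[1]:+.2f} C")
print(f"root-sum-square: {rss_case[0]:+.2f} C to {rss_case[1]:+.2f} C")
```

The worst-case range comes out close to the -0.2 C to 1.8 C read off the figure in the comment. Which of the two ranges is appropriate depends on whether the errors at the two endpoints are correlated, and that question is the paper's central point.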

GregO
January 20, 2011 6:26 pm

xyzlatin Thanks for giving Mark the H/T and Mark – right on.
After Climategate I got interested in so-called man-made global warming and step one for me was to check out the thermometers. Took me awhile to find this site and when I found out about Anthony’s surface station project it just blew me away. Sure there are other proxies like sea-level – guess what – that’s tough to measure too!
There just isn’t much delta T to work with; if there really is run-away warming we’d all be debating how much, how fast, and what to do right now. Problem remains, the delta T is just tiny compared to our measuring technique/technology. There just isn’t much to get excited about that can be directly measured.
So now the argument is that there is this mysterious “heat in the pipeline” ala Trenberth that is somehow lurking, just waiting to destroy mankind. Really? Just show me the data. I’ll make up my own mind about how bad it is.

geo
January 20, 2011 6:37 pm

I feel a small touch of pride today. 🙂
Anthony, were you consulted in advance? Your blessing secured? I ask, because I well remember your attitude to how you were treated by NOAA for use of the data that you were the driving, organizing, and evaluating force behind.

vigilantfish
January 20, 2011 7:25 pm

Wow Anthony, congratulations. Nice to see your real science is getting some official recognition! This is nice news to counterbalance the claptrap coming from the UN weather agency, the World Meteorological Organization, that claims 2010 is now tied as the hottest year in history.
http://www.forexyard.com/en/news/2010-ties-for-hottest-year-UN-2011-01-20T115020Z-FACTBOX

Curiousgeorge
January 20, 2011 7:27 pm

@ xyzlatin says:
January 20, 2011 at 5:08 pm
Mark Cooper says:
January 20, 2011 at 3:20 pm
Here is some relevant info about Metrology. I sent this to Mr McIntyre in 2008 but he never replied, and eventually I got around to sticking it on a blog for posterity…
Mark, I found your blog awhile ago but lost it again, thanks for posting the link. Your article was really good and I will print it out and show it to everyone who thinks that it is possible to show from thermometer records that the temperature has risen at all. I have known for many years of the errors in thermometers, and I have just been unable to fathom why everyone is arguing about a supposed rise of .7 deg and so on.

I agree. Mark’s article was exactly on the money. Having had some experience with metrology myself in an industrial environment, though, I can testify that the users of measurement instruments absolutely hate it when anybody raises the issue, and immediately point the finger at the calibration laboratory. The reason is that on the floor (or in the field) they have no capability (and usually no training or tools) to establish or compensate for that instrument/human error. They have to believe what their eyes tell them whether it’s a tape measure, caliper, torque wrench, or thermometer. So when the data gets to the analyst, he has no way of determining the inst/human error either. Questioning the readings just makes everyone look bad to the boss, who has other management concerns besides some piddly problem with instrument error, and no one will admit to human error. In some industries, admission of this uncertainty carries legal and contractual risk as well. It’s a real can of squirmy worms.
There are statistical techniques for dealing with this; notably a Gage Repeatability and Reproducibility (GRR) study, which is a form of ANOVA. But again that requires a level of training and so on, which may not be available, and the results can be hazardous to your business health.

Wally
January 20, 2011 7:34 pm

Not only is the temperature trend uncertain, any modeling using these temperatures is even more uncertain.

magellan
January 20, 2011 7:36 pm

Mark Cooper says:
January 20, 2011 at 3:20 pm
Here is some relevant info about Metrology. I sent this to Mr McIntyre in 2008 but he never replied, and eventually I got around to sticking it on a blog for posterity…
http://pugshoes.blogspot.com/2010/10/metrology.html

Mark Cooper: AMEN! I’ve been howling about this very same thing for some time. Metrology takes a back seat in climate science when it should be the first and foremost concern before any measurements are taken or reports written, no matter what field of work. How any scientist or organization can be qualified to use data without studying, understanding and applying fundamental metrology is mind boggling IMO. If industry ran like climate science, most would have been put out of business 20 years ago.
I chuckle when reading statements like “data was put through quality control” or such. That they can even with a straight face publish resolutions to .001 or even .1 for that matter is ludicrous, not to mention the error bars, which have bothered me for some time. Then GISS extrapolates/interpolates and otherwise morphs what used to resemble “data” into nothing more than near meaningless values. Met O does the same thing with HadCRUT results by modeling into their own image. These should add to the uncertainty of the measurements on their own, but we’re told don’t worry, it’s all been accounted for. Yeah right.
Where are the calibration procedures and maintenance records for the surface stations? Where are the GR&R studies? Where are the system audits?
I’ll be purchasing this paper. Thanks Pat Frank for bringing this to light. It’s long overdue.

Jeff Alberts
January 20, 2011 7:44 pm

Ira Glickstein, PhD says:
January 20, 2011 at 6:16 pm
I don’t want to bring a skunk to this picnic, but look again at the figure. It says it is equally likely that global temperatures, since 1880, have cooled by as much as 0.2ºC (the difference between the high limit for 1880 and the low limit for 2009) -OR- warmed by as much as 1.8ºC (the difference between the low limit for 1880 and the high limit for 2009).
Since the error bars are at least approximately normally distributed, the best bet would be that the actual warming is somewhere in the middle, nearer to 0.8ºC than either of the extremes. Yes, it COULD be that it has cooled 0.2ºC since 1880, or, conversely, that it has warmed an astounding 1.8ºC, but the best bet is near the middle.

The real problem, Ira, is that there is no “it”. Some locations have warmed, some have cooled, and some have remained relatively static over that time span. But, because most of the sensors are now in built-up areas, the average gets boosted.
Nothing “Global” has occurred, lots of regional though, in all directions.

GaryW
January 20, 2011 8:01 pm

I’m not sure folks are understanding the full implications of this paper. Remember, the climate models are adjusted to match the digital version of a temperature record. They are being tweaked to match noise. The trend projected from this record is only about as much as the noise level. In instrumentation terms, that is no trend. That does not give us much confidence in climate model output.
Besides, +/- 0.46 degrees accuracy is awfully optimistic. That is lab instrument accuracy.

magellan
January 20, 2011 8:06 pm

Beesaman,
Are you familiar with the HO83 issue?
http://climateaudit.org/2007/08/22/the-ho-83-hygrothermometer/

stephan
January 20, 2011 8:27 pm

Once temperatures are seen to be not rising due to adjustments etc… plus a bit of truth probably due to natural causes, the WMO will have to be closed down and reassessed in its totality after their incredibly stupid press release that 2010 has been the hottest year in history LOL

SABR Matt
January 20, 2011 8:35 pm

Remember the IPCC report that the uncertainty in global temperature was +/- 0.05 (!) C?
I am far more inclined to believe THIS scale…though I actually think the uncertainty may be higher.

eadler
January 20, 2011 8:36 pm

The conclusions of Patrick Frank are counter to the generally accepted consensus. It is common sense to demand that some logic be presented that shows why the generally accepted theory is wrong. This was not done in the fragment of the article posted above. We don’t have the author’s arguments, which show that his estimate of uncertainty is correct, and the common estimate of random error is wrong. We only have his conclusions.
People who call themselves skeptics should be skeptical of this, until they see the entire argument that demonstrates the author’s conclusions are correct. If copyright forbids direct reproduction of the key argument, it would have been good to see the argument paraphrased by someone.
It will be interesting to see some critical analysis of the author’s thesis by experts. I am not going to purchase the article to read it for myself.
It seems to me that if there is a systematic error in a thermometer, as long as it does not vary systematically with time, it should not contribute to a significant error in a trend, which is what is used to calculate the global temperature anomaly.
REPLY: Well bucko, buck up and buy the paper or quit your whining. – Anthony

January 20, 2011 9:30 pm

Beesaman says:
January 20, 2011 at 5:49 pm
I noticed NOAA give a tolerance of ± one degree for the MAX/MIN temperatures and for the Palmer Soil thermometer
Unbelievable! And here we are sitting on the edge of our seats over what’s going on month to month. Richard Lindzen said it nicely:
“…..each of these points has a fairly large error bar….. we’re turning normal variations into the ancient notion of an omen. We’re scanning this small residue for small changes and speaking of them as though they were ominous signs of something or other.”

James Smyth
January 20, 2011 9:33 pm

I was just reading that Metrology article and had a thought. In the mid 80’s (I think it was) there was a movement to change the rounding convention of .5 to the closest even number, rather than up to the next number. The idea was that rounding up was biasing numbers up, but parity doesn’t really matter. I wonder if a) someone’s reading of a thermometer is influenced by the current rounding convention(***), and b) whether anyone has ever studied the effect of a convention shift on anything important.
(*** the Metrology article seemed to indicate that recordings of even numbers are more common than odd ones, but it’s not clear whether that was from rounding convention, thermometer design, or some other innate preference for even numbers)
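The bias the two rounding conventions produce can be shown with Python's standard `decimal` module. This is a minimal sketch using made-up readings that all end in .5; it says nothing about how observers actually rounded historical thermometer readings. The helper name `round_reading` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_reading(value, mode):
    """Round a decimal string to a whole number using the given rounding mode."""
    return Decimal(value).quantize(Decimal("1"), rounding=mode)


# Hypothetical readings that all sit exactly on a half: 0.5, 1.5, ..., 9.5
half_readings = [f"{n}.5" for n in range(10)]

true_total = sum(Decimal(v) for v in half_readings)
half_up_total = sum(round_reading(v, ROUND_HALF_UP) for v in half_readings)
half_even_total = sum(round_reading(v, ROUND_HALF_EVEN) for v in half_readings)

print("true total:      ", true_total)
print("round half up:   ", half_up_total)
print("round half even: ", half_even_total)
```

For these ten values, rounding half up adds 0.5 to every reading, while rounding half to even adds and subtracts 0.5 alternately so the total is unchanged. Real readings land exactly on a half only some of the time, so any bias in an actual record would be smaller.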

January 20, 2011 9:49 pm

Thanks, everyone, for your comments and interest.
Ira, the meaning of the error bars is that we don’t know where the true temperature trend is, within the (+/-)0.46 C envelope. The physically real trend just has a 68% chance of being within it.
What’s going on is that the individual thermometer readings are bouncing around, and deviating from the “true” air temperature mostly due to systematic effects. The systematic effects are mostly uncontrolled environmental variables. Depending on how the thermometers bounce, one could get this trend, or that one, or another, all of which would deviate from the “true” trend and all of which would be uncertain by (+/-)0.46 C. But, readings were taken, so we’ve gotten one of all the possible trends. It has some shape. But one should resist the temptation to read too much into it.
We know the climate has warmed from other observations — in the article I mention the northern migration of the arctic tree line as indicating warming. But putting a number on that warming is evidently impossible.
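The 68% figure mentioned above is the coverage of a ±1σ interval when the error is normally distributed. The short sketch below computes that coverage from the error function. It assumes a normal distribution, which the comment does not state explicitly, and the function name is hypothetical.

```python
import math


def normal_coverage(k):
    """Probability that a normally distributed error lies within +/- k sigma."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"+/-{k} sigma covers {normal_coverage(k) * 100:.1f}% of outcomes")
```

On that assumption, stretching the ±0.46 C envelope to 2σ (±0.92 C) would cover about 95% of outcomes.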

Ira Glickstein, PhD
January 20, 2011 9:56 pm

Jeff Alberts says:
January 20, 2011 at 7:44 pm
… The real problem, Ira, is that there is no “it”. Some locations have warmed, some have cooled, and some have remained relatively static over that time span. But, because most of the sensors are now in built-up areas, the average gets boosted.
Nothing “Global” has occurred, lots of regional though, in all directions.

Jeff, you are certainly correct that the average measured temperatures have been boosted by urban heat island effects in developed areas. However, I do not see how the paper that is the subject of this thread supports your statement that “Nothing ‘Global’ has occurred”.
All the paper seems to do (and I have only read the abstract and looked at the figure) is to show that the error sources are about as large as the thing the climate Team claims to have measured. Assuming the Abstract faithfully represents the paper, and further that the paper is valid, which I accept for argument’s sake, there are three conclusions, none of which seem to help our skeptic position:
(1) One possible conclusion supported by this paper, is that there has been no net average warming since 1880, and possibly even, as I showed, 0.2ºC of cooling! Most commenters grabbed onto that as a refutation of CAGW. (I personally reject CAGW due to other evidence, but I cannot see how this paper, per se, sheds any light on the argument against CAGW.)
(2) A second, equally possible conclusion, supported by this paper, is that there has been an astounding amount of average warming since 1880, of as much as 1.8ºC. That could be used to support CAGW. My fellow skeptics need to know that before they cite this paper as a refutation of CAGW. Do they not?
(3) Both of these extreme conclusions from this paper are unlikely. Furthermore, the best bet based on the paper, namely the average of the two extremes, about 0.8ºC, is 0.3ºC more average warming than I think occurred since 1880, for the very reasons you state: the location of many measurement stations in areas encroached on by urban development, and also the “adjustments” made by the official climate Team that appear to be purposefully cooling datapoints prior to 1940 or 1950 and warming the points afterwards, with no good justification.
So, given the above, what value is this paper to the skeptic position? The only one I can think of is, as I wrote: “… the uncertainty is about as large as the supposed 0.8ºC official Team warming estimate. That supports the view that we should not adopt any drastic, economy-wrecking public policy changes in the ‘environmental’ domain based on the error-prone temperature record we have in hand.”
Do you disagree with that?

January 20, 2011 10:17 pm

Anyone care to comment on the uncertainty bars on the satellite temperatures and give us a number on them?

Lonnie Schubert
January 20, 2011 11:18 pm

I trust Mosh will weigh in. My experience tells me that temperature is almost never known at better than ±½°F. ±½°C for weather data seems sound and conservative to me. I suspect ±2°C is more typical, particularly when working with averages, even when working with anomalies.

ge0050
January 20, 2011 11:30 pm

http://pugshoes.blogspot.com/2010/10/metrology.html
“Temperature cycles in the glass bulb of a thermometer harden the glass and shrink over time, a 10 yr old -20 to +50c thermometer will give a false high reading of around 0.7c”
Isn’t 0.7C about what is claimed for global warming over the past 100 years?

DirkH
January 20, 2011 11:48 pm

Hansen will have to double his adjustment efforts.

Robert L
January 21, 2011 12:15 am

Are you all aware that the commonly used airport METAR temperature measurements are rounded up to the nearest degree, introducing an artificial warming of 0.5°?
While anomaly analyses could deal with this, I can also imagine that there may be circumstances where it is not properly accounted for (e.g. if a weather bureau switched from using its own recording at an airport to using METAR data without realising the rounding difference).
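The size of the bias described above can be checked with a small Python sketch. It assumes, purely for illustration, that true temperatures are spread uniformly between 0 and 30 °C, and it takes the comment's premise that readings are always rounded up at face value; whether METAR reports are actually produced that way is not verified here. The names `mean_bias`, `bias_up` and `bias_nearest` are made up for this example.

```python
import math
import random

def mean_bias(rounding, temps):
    """Average of (rounded reading - true temperature) over all samples."""
    return sum(rounding(t) - t for t in temps) / len(temps)

random.seed(0)
# Hypothetical "true" temperatures, spread uniformly over 0..30 C.
temps = [random.uniform(0.0, 30.0) for _ in range(100_000)]

# Premise of the comment: every reading is rounded UP to a whole degree.
bias_up = mean_bias(math.ceil, temps)
# Alternative for comparison: round to the nearest whole degree.
bias_nearest = mean_bias(lambda t: math.floor(t + 0.5), temps)

print(f"always round up:  {bias_up:+.3f} C")
print(f"round to nearest: {bias_nearest:+.3f} C")
```

Under those assumptions, always rounding up shifts the average by about half a degree, while rounding to nearest shifts it by roughly zero. A constant shift drops out of an anomaly; it only matters if the rounding rule changes partway through a record, which is the case the comment worries about.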

Peter Plail
January 21, 2011 12:51 am

I am sure it has been covered before, but can anyone tell us how independent satellite measurements are of surface station temperatures?
As I understand it satellites don’t measure temperatures directly, but measure other factors from which temperatures can be derived. Is this an absolute calculation or does it depend on calibrating against “known” measured surface temperatures?
To put it more directly, does this mean that satellite temperatures have their own errors on top of the surface temperature errors, so are even more inaccurate?

January 21, 2011 1:23 am

. . . the monthly average will be based on at least two readings a day throughout the month, giving 60 or more values contributing to the mean. So the error in the monthly average will be at most 0.2/sqrt(60) = 0.03 C . . .

Allow me to quibble here.
I may be wrong, but using many measurements to increase accuracy only applies to measurements of the same thing. They claim accuracy of the monthly average at 0.03 °C, but not a single one of those 60 measurements is of the monthly average. Each reading is of a temperature at a different time, each with a 0.2 °C error. That does nothing to reduce the monthly error, which should be just the average error of the individual readings, or 0.2 °C.
The monthly error is understated by an order of magnitude.
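Whether the 0.2/sqrt(60) figure or the 0.2 figure applies depends on what kind of error the 0.2 °C is. The Python sketch below is a toy simulation, not anything taken from the paper: it compares the error of a 60-reading mean when every reading has its own independent error against the case where one shared offset affects all readings alike. The trial count and the function name `monthly_mean_error` are arbitrary choices for the illustration.

```python
import math
import random
import statistics

SIGMA = 0.2      # per-reading error in C, from the quoted comment
N = 60           # readings per month, from the quoted comment
TRIALS = 5_000   # number of simulated months (arbitrary)

random.seed(1)

def monthly_mean_error(shared):
    """Error of the monthly mean for one simulated month.

    shared=False: each reading gets its own independent error.
    shared=True:  one error is drawn and applied to all N readings,
                  mimicking a fully correlated (systematic) offset.
    """
    if shared:
        offset = random.gauss(0.0, SIGMA)
        errors = [offset] * N
    else:
        errors = [random.gauss(0.0, SIGMA) for _ in range(N)]
    return sum(errors) / N

independent = statistics.pstdev([monthly_mean_error(False) for _ in range(TRIALS)])
correlated = statistics.pstdev([monthly_mean_error(True) for _ in range(TRIALS)])

print(f"independent errors: {independent:.3f} C (0.2/sqrt(60) = {SIGMA / math.sqrt(N):.3f})")
print(f"shared offset:      {correlated:.3f} C")
```

In this toy setup the independent case lands near 0.2/sqrt(60) and the shared-offset case stays near 0.2 °C, so the two positions in this exchange correspond to two different assumptions about how the reading errors are related.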

January 21, 2011 1:26 am

Calibration of meteorological thermometers.
Can somebody tell me how the calibration of electronic meteorological thermometers is performed?
An electronic thermometer basically consists of two parts: a sensor (sometimes a platinum resistor, Pt100, 100 ohms at 0°C) and an electronic unit that converts the resistance variation to voltage and displays it as temperature.
My main concern is that often calibration is performed by putting the sensor in liquid with some well known temperature, and the electronic unit is adjusted until the reading is correct at two or three different temperatures. The electronic unit is however kept at a more or less constant temperature during calibration.
The problem is that the electronic unit also acts as a thermometer, because all have some temperature drift. This drift can be considerable if the electronic unit is installed outside where temperature variations are great.
Is this the case with electronic meteorological thermometers?
(Besides this, the sensor is never totally linear. Even a platinum sensor (Pt100) is somewhat nonlinear and the electronic unit must have a built-in linearizer).
Agust
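The nonlinearity mentioned in the last remark can be put in numbers. The sketch below uses the standard Callendar-Van Dusen relation with the IEC 60751 coefficients for a Pt100, and compares it with a naive straight-line conversion using the nominal 0.00385 per °C slope. It says nothing about the temperature drift of the electronic unit, which depends on the specific instrument. The function names are invented for this example.

```python
R0 = 100.0        # Pt100 resistance at 0 C, in ohms
A = 3.9083e-3     # IEC 60751 coefficients
B = -5.775e-7
C = -4.183e-12    # only used below 0 C
ALPHA = 0.00385   # nominal mean slope between 0 and 100 C, per C

def pt100_resistance(temp_c):
    """Resistance in ohms from the Callendar-Van Dusen equation."""
    ratio = 1.0 + A * temp_c + B * temp_c ** 2
    if temp_c < 0.0:
        ratio += C * (temp_c - 100.0) * temp_c ** 3
    return R0 * ratio

def linear_reading(resistance):
    """Temperature a converter would show if it assumed a straight line."""
    return (resistance / R0 - 1.0) / ALPHA

for true_t in (-20.0, 0.0, 25.0, 50.0):
    shown = linear_reading(pt100_resistance(true_t))
    print(f"true {true_t:6.1f} C  linear reading {shown:7.3f} C  error {shown - true_t:+.3f} C")
```

Without a linearizer the straight-line conversion is off by a few tenths of a degree over a -20 to +50 °C span, which is why the electronic unit needs one.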

January 21, 2011 1:49 am

Slightly off subject but, I always love the way the y-axis on warmist graphs has to be stretched beyond what is measurable to show anything discernible.

EFS_Junior
January 21, 2011 1:54 am

Pat Frank says:
January 20, 2011 at 9:49 pm
Thanks, everyone, for your comments and interest.
Ira, the meaning of the error bars is that we don’t know where the true temperature trend is, within the (+/-)0.46 C envelope. The physically real trend just has a 68% chance of being within it.
What’s going on is that the individual thermometer readings are bouncing around, and deviating from the “true” air temperature mostly due to systematic effects. The systematic effects are mostly uncontrolled environmental variables. Depending on how the thermometers bounce, one could get this trend, or that one, or another, all of which would deviate from the “true” trend and all of which would be uncertain by (+/-)0.46 C. But, readings were taken, so we’ve gotten one of all the possible trends. It has some shape. But one should resist the temptation to read too much into it.
We know the climate has warmed from other observations — in the article I mention the northern migration of the arctic tree line as indicating warming. But putting a number on that warming is evidently impossible.
_____________________________________________________________
Hello Pat (if you are still reading this forum),
Is there any possibility of obtaining a copy of your paper directly from you?
I’m very curious about the derivation of the temperature noise relationship shown in your
4. Summary and Conclusions
http://wattsupwiththat.files.wordpress.com/2011/01/frank2010_summary.png?w=640&h=151
The sigma/square root of N versus sigma derivation, is this more fully explained elsewhere in your paper?
Anyway, if you can, could you post your paper here or provide an email address where those who are interested in the details might request a copy?
Thanks.

January 21, 2011 2:41 am

Anthony, congrats on getting your real science in Surfacestations noted and complimented. Mark Cooper, your info on metrology is excellent and worthy of wide promulgation. I would have liked to leave a complimentary comment, but as I do not have a blog this was not possible.
I have said this before but it’s worth repeating; I found WUWT after being very rudely accused of being a troll on the Guardian CiF blog by a feral Greenpeace activist when I asked a sincere and innocent question – is there a common standard for siting and housing environmental thermometers? This came from my long involvement with agriculture and other industries where measurement of various phenomena, such as the moisture content in grain to be harvested, are vital to economic survival. Those measurements, too, are subject to minor error for all sorts of reasons.
As an illustration of the problems inherent in any form of measurement, I once assisted in the construction of the steel work of a free-standing internal staircase for a public building. The treads and risers, crafted in very expensive native timbers, were built by a separate team of expert woodworkers. We installed the steelwork, then the woodworkers arrived with the beautiful oiled woodwork; nothing fitted! The two teams had agreed to purchase new steel tape measures especially for the contract, but, sadly, the teams bought different brands, both with excellent reputations, one made in England and one made in America. The tapes had a difference of 5mm over 4 metres! Resolution of the problem was by the woodworkers carting the stairs back to their factory, planing the outer rails down to reduce the width (using one of the steel worker’s measuring tapes to check) then re-oiling. The second attempt at installation went without a hitch.
I am still amazed that such a simple concept as that of errors in readings from many kinds of measurement devices has been consistently ignored by most of the clisci community.

Curiousgeorge
January 21, 2011 4:29 am

@ Agust Bjarnason says:
January 21, 2011 at 1:26 am
Calibration of meteorological thermometers.
Can somebody tell me how the calibration of electronic meteorological thermometers is performed?

Your best bet is NIST – http://www.nist.gov/index.html. You will find much information at their website; simply search on “thermometer calibration”, or contact them via phone or email. They are the US authority for all instrument calibration.

C P
January 21, 2011 6:11 am

The thing that amazes me is that I attended a physics colloquium on Global Warming at the University of Colorado in about 1990 that showed global temperature data with uncertainty bars. The data range was the same as shown here and the uncertainty bars were very similar except that the uncertainty decreased by about 50 % from the oldest to the newest data. That presentation of the data (just like this one) made it clear that there was no provable trend in the data.
I wonder when and why climate scientists decided to stop showing the uncertainty bars?

JJ
January 21, 2011 6:42 am

eadler,
“It seems to me that if there is a systematic error in a thermometer, as long as it does not vary systematically with time, it should not contribute to a significant error in a trend, which is what is used to calculate the global temperature anomaly.”
That statement assumes that the instrumentation of the system doesn’t change over time. That isn’t the case. The number of instruments and the mix of instrumentation has changed over time. If thermometer type B has a consistent warm bias vs type A, and if the mix of thermometers shifts from predominantly type A to type B, the trend will bias warm. That is one way that consistent individual error can still affect the trend in the aggregate.
That said, what Pat has thus far failed to do is demonstrate how his assertion about the uncertainty of the individual measurements operates in the aggregate to produce the observed trend. Unless the individual errors are correlated across all stations over time (thus far not shown) the uncertainty about the trend will be much smaller than the uncertainty of the individual measurements. I doubt this paper has much longevity.
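The instrument-mix mechanism in this comment is easy to reproduce numerically. The sketch below holds the true temperature constant, gives hypothetical type B sensors a fixed +0.3 °C bias relative to type A, and lets the network change over linearly from all A to all B across a century. Every number here is invented for illustration.

```python
def network_mean(true_temp, frac_b, bias_a=0.0, bias_b=0.3):
    """Mean reading of a network in which a fraction frac_b uses type B sensors."""
    return true_temp + (1.0 - frac_b) * bias_a + frac_b * bias_b

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

years = list(range(1900, 2001))
true_temp = 15.0  # held constant: there is no real trend at all
# Hypothetical linear changeover from all type A (1900) to all type B (2000).
frac_b = [(y - 1900) / 100 for y in years]
readings = [network_mean(true_temp, f) for f in frac_b]

trend_per_century = slope(years, readings) * 100
print(f"apparent trend: {trend_per_century:+.2f} C per century")
```

Each individual sensor's bias never changes, yet the network average picks up an apparent trend equal to the bias difference between the two sensor types.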

beng
January 21, 2011 7:32 am

This is OK statistically — it deals w/uncertainty in the GISS numbers themselves (right or wrong). But GISS, as we know, doesn’t realistically account for UHI & changes in local site conditions. Bottom line, GISS in its present form can’t tell us much of anything.

Bernd Felsche
January 21, 2011 8:09 am

Keith Minto raises a salient point:

may be hard to sell to journalists and the general public. In Fig. 3 above, it certainly looks as if the temperature has risen since 1960.

So let’s re-plot the data in terms of e.g. the annual temperature range experienced by people in temperate climates, which’d be a scale of ±25°C and a seasonal temperature fluctuation shown as a background band. Then show the “anomaly” (what an intimidating, horrible word to the untrained) curve.
There: You get a sense of proportion. Something that is absent from the palaver and fear-mongering of the alarmists.

Steve Keohane
January 21, 2011 8:28 am

This is just what I suspected from looking at the surface station project, i.e. we don’t have the data to tell what is going on relative to any other time because of the error levels of the measurement system.

eadler
January 21, 2011 9:05 am

JJ says:
January 21, 2011 at 6:42 am
“eadler,
“It seems to me that if there is a systematic error in a thermometer, as long as it does not vary systematically with time, it should not contribute to a significant error in a trend, which is what is used to calculate the global temperature anomaly.”
That statement assumes that the instrumentation of the system doesn’t change over time. That isn’t the case. The number of instruments and the mix of instrumentation has changed over time. If thermometer type B has a consistent warm bias vs type A, and if the mix of thermometers shifts from predominantly type A to type B, the trend will bias warm. That is one way that consistent individual error can still affect the trend in the aggregate.
That said, what Pat has thus far failed to do is demonstrate how his assertion about the uncertainty of the individual measurements operates in the aggregate to produce the observed trend. Unless the individual errors are correlated across all stations over time (thus far not shown) the uncertainty about the trend will be much smaller than the uncertainty of the individual measurements. I doubt this paper has much longevity.”
The various agencies which chart the global trends have studied the change in instrumentation over time, and have adjusted the temperature trends at the various locations used for the documented equipment changes at each location.

Domenic
January 21, 2011 9:53 am

I have 20+ years in professional temperature measurement management, including all kinds of temp sensors, from mercury thermometer to thermocouples to RTDs to resistance platinum to infrared detectors.
Another source of error not yet mentioned here is the ‘time constant’ error inherent in any temp sensor’s ability to measure. For example, the time constant of a glass mercury thermometer, or a bimetallic (common many years ago) is quite long, a minute or more, depending on circumstances (just from memory). A platinum RTD, with very thin sheathing will change in just seconds.
This is significant because it has a direct effect on ‘record low’ and ‘record high’ data sets. I suspect that a lot of ‘record high’ data comes from this error source.
This error source can make it ‘appear’ as if the temps are swinging more wildly now than they ever have. But that is not necessarily true. The wild swings in temp were always there, but the measuring systems could not capture them. Now they can.
For example, when the sun moves in and out of the clouds, outdoor air temperatures can change quite quickly. The amount of that change recorded by the sensor is completely dependent on its time constant. So, ‘record high’ daily data is very susceptible to this. Record lows would not be as sensitive to time constant error because the sun is not dominant at night…so temps simply do not jump around as much.
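The time-constant effect can be sketched with a first-order lag model, in which the sensor reading relaxes toward the air temperature at a rate set by its time constant. In the Python sketch below the air warms by 2 °C for one minute and then returns to its baseline. The 5 s and 90 s time constants are assumptions chosen to stand in for a thin-sheathed RTD and a liquid-in-glass thermometer; they are not measured values for any real instrument.

```python
import math

def peak_reading(tau_s, pulse_c=2.0, pulse_len_s=60, base_c=20.0):
    """Highest value shown by a first-order sensor during a brief warm pulse.

    The sensor follows d(sensor)/dt = (air - sensor) / tau_s. The update is
    exact for air temperature held constant over each one-second step.
    """
    gain = 1.0 - math.exp(-1.0 / tau_s)
    sensor = base_c
    peak = sensor
    for t in range(300):  # five minutes in one-second steps
        in_pulse = 100 <= t < 100 + pulse_len_s
        air = base_c + pulse_c if in_pulse else base_c
        sensor += (air - sensor) * gain
        peak = max(peak, sensor)
    return peak

peak_fast = peak_reading(tau_s=5.0)    # assumed fast electronic sensor
peak_slow = peak_reading(tau_s=90.0)   # assumed slow liquid-in-glass sensor

print(f"fast sensor records a maximum of {peak_fast:.2f} C")
print(f"slow sensor records a maximum of {peak_slow:.2f} C")
```

With these made-up numbers the fast sensor registers almost the whole 2 °C excursion while the slow one registers about half of it, so the same weather would yield different daily maxima from the two instruments.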
As I don’t know all the details of what specific temp sensors AND THE RECORDING SYSTEMS have been used historically in all the reporting stations, and whether they are using fast electronic ‘peak picker’ temp sensor systems now to get daily highs or lows, I cannot really know with any certainty how much the error is.
My guess, though, is that they are now using fast electronic ‘peak picker’ systems to measure daily highs and lows, as it is simply so easy to do now. But keep in mind, what is easy to do now was very difficult to do many years ago.
And there are additional possible errors in the entire measurement systems between today and in the past.
As a rough guess, in my professional opinion, the error bars for global temp data are much, much greater than most can imagine, more likely to be around +/-2 deg C.
And this does not include the UHI effect, which I have been pleased to see carefully examined here. That is indeed a VERY large effect. I have observed it many times first hand.

BillT
January 21, 2011 10:18 am

years ago i posted on another forum that the margin of error is far greater than any claimed “warming”.
this work confirms that but i would go further in saying that even the plus or minus .46 is too LOW.

JJ
January 21, 2011 10:29 am

Eadler,
“The various agencies which chart the global trends have studied the change in instrumentation over time, and have adjusted the temperature trends at the various locations used for the documented equipment changes at each location.”
In the broad view, that is not accurate. To the contrary, station metadata globally are so incomplete and unreliable that the latest version of GHCN abandons metadata based adjustments in favor of various model-based corrections. Those corrections are the source of some controversy.
That said, my point was that there are ways that a fixed bias in single instruments can translate into a trending bias when those individual instruments are aggregated spatially and temporally. Instrument mix was just an easy to understand example.
Note too that I was agreeing with your larger point. Pat has not demonstrated how he believes the uncertainty of individual measurements translates to uncertainty in the trend across time of the spatially weighted mean of those individual measurements. He asserts that it is possible to fit many different trend lines through the uncertainty band about the individual measurements, which is true. It is also true that statistics is not about the possible, but the probable. Missing is the justification for the assertion that all of those possible trend lines are equally probable. Again, from what is presented thus far, this paper does not appear to have legs. WUWT readers should be cautious about investing too much in its conclusions until they have been further vetted.
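The "possible versus probable" point can be illustrated with a Monte Carlo sketch. It makes one strong assumption that the paper itself disputes: that each year's ±0.46 °C error is independent of every other year's. Under that assumption it fits a straight line to 121 years of pure noise around a flat true series, many times over, and looks at the spread of the fitted 1880 to 2000 change. The trial count and helper names are arbitrary.

```python
import math
import random
import statistics

SIGMA = 0.46                      # annual uncertainty quoted from the paper, in C
YEARS = list(range(1880, 2001))   # 121 annual values
SPAN = YEARS[-1] - YEARS[0]       # 120 years
TRIALS = 2_000                    # arbitrary number of simulated records

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

random.seed(2)
changes = []
for _ in range(TRIALS):
    # True anomaly is zero every year; only independent errors are added.
    noisy = [random.gauss(0.0, SIGMA) for _ in YEARS]
    changes.append(ols_slope(YEARS, noisy) * SPAN)

spread = statistics.pstdev(changes)

# Textbook standard error of an OLS slope for independent errors.
mean_year = sum(YEARS) / len(YEARS)
sxx = sum((y - mean_year) ** 2 for y in YEARS)
expected = SIGMA / math.sqrt(sxx) * SPAN

print(f"simulated spread of fitted change: {spread:.3f} C")
print(f"textbook value:                    {expected:.3f} C")
```

If the yearly errors really were independent, the fitted century-scale change would be pinned down far more tightly than ±0.46 °C. The disagreement in this thread is therefore about whether the errors are independent or systematic, which this sketch cannot settle.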

January 21, 2011 11:46 am

Thank you so much for this.
O/T: I wish I had one thin dime for every time Arctic conditions are cited as evidence of AGW while Antarctic conditions are omitted. Bias much?

January 21, 2011 12:39 pm

[snip – if you have something substantial to say, say it]

Günther Kirschbaum
January 21, 2011 1:00 pm

Galileo Galilei, Albert Einstein, Patrick Frank…
All men who saw things nobody before them had seen.

Ron C.
January 21, 2011 1:44 pm

The more I read about these global temperature trends and annual comparisons, the more they resemble sausages. Like the butcher says, “You really don’t want to know what is in there.”

Doug Proctor
January 21, 2011 1:57 pm

Uncertainty in the temperature records and trends is the weak point in all the temperature-based climate change models. Uncertainty is based on two parts, precision and accuracy. The first involves how finely we can measure something, and the other how well what we measure is “true”. This report says that what we are measuring as a temperature change is not greatly different from the uncertainty of that measurement, or the error bar. Yet the CAGW position is not about the size of the error bar, whether it is 0.46K or even more. The huge error bar of the Hockey Stick was irrelevant in the political and public belief that the global temperature was rising to an alarming level. What everyone saw was the change of the mid-point of the error bar. Unless there is more reason to pick the top of the error bar in the past, and the bottom of the error bar in the present, the same result will occur even if reports like this double the width of the error bar: the global temperatures will be seen to have gone up by what Hansen and Gore say, about +0.8K.
However, as many have described on WUWT, this rise of about 0.8K is made up of about 0.4K of raw data and 0.4K of adjustments. The adjustments are an attempt to counter the probable error in the raw data described in this paper. This report speaks more to the raw data than it does to the adjustments; the focus needs to move now to the adjustments.
Below I have discussed the five areas in which I see uncertainty affects the data record and the trends deduced from it: instrumental precision, instrumental accuracy, observer precision, observer accuracy and observer consistency. Each one requires a response, whether that is to do nothing or something, and introduces a place for Hansen et al to have adjusted the record. The nut to be cracked lies within these areas.
1. Instrumental precision, i.e. the sensor’s ability to register fine points of difference, is immaterial when the trends, i.e. the anomalies, are reviewed, as long as the level stays the same over the temperature range and is the same across the network. With enough data, as I understand error analysis, a square root function can drop the +/- bar of the group far below that of the individual station. The magnitude of the “imprecision” has decreased with time if you believe that instrumentation is more sensitive now than in the past. Bulked up, improved precision through better instruments will not significantly affect temperatures or trends. There should be no adjustment for this.
2. Instrumental accuracy is more significant to the records than precision due to technological advances. Even assuming that the levels of accuracy are randomly distributed within the network, accuracy uncertainty in 1880 would be worse than that in 2010. Both raw temperatures and trends would be affected. The changes, moreover, would be non-random, progressive over time with either a warming or cooling pattern. This is because we “improved” sensors over time rather than reinvent them, retaining their basic way of operating. The modern digital/computer system is a reinvention and could have introduced a step change but at the time of introduction calibrations were made to get its readings to conform to “known” reality. Future functioning may, however, have still revealed a step-change due to new styles of measurement, with the step-change either a warming or a cooling event. Justification for adjustments for changes in instrumental accuracy would come through a comparison of historical and modern equipment. They would not be progressive but step-changes, and once done, should not have to be done again, unless the reference equipment was changed. There would have to be studies for this adjustment.
Note that the UHIE introduces a type of instrumental inaccuracy, in that local heating distorts the “natural” world we are trying to measure. The instrument is reading properly, but what it is reading is not an accurate reflection of what you are trying to measure. It is a proxy for the untainted world, requiring adjustment. Both data and trends would be affected, and progressively, with a warming bias, as urbanization has increased through time. The effect should be to decrease modern temperatures while leaving older temperatures alone.
3 & 4. Observer precision is today about the same as yesterday, at about 0.5K (probably because that is about 1°F for old thermometers). Observer accuracy may easily be +/- 5C AT TIMES, since in reading and then writing a number down it is easy to put an 18 instead of a 13. These types of error, though, are fairly easy to recognize as outliers and discount. Further, over time they should be random and average out with enough data. Observer precision and accuracy should not affect either the temperature records or trends over time. No adjustment should be necessary.
5. Observer consistency is a different matter. I understand that through time readings are taken earlier today than before, such that a reading of 14.0C in 1880 would have been 13.9C under the 2010 protocol. Historical records, if this is true, would push older temperatures down relative to what the raw data shows. The adjustment here would be a one-time, step-wise function using local data correlating day-of-the-year variation in daily temperatures. A broad-brush might be easier to do, but if you really want to reduce data error, a very short time comparison would be necessary.
The 0.4K of adjustments by Hansen/GISS has been to decrease older temperatures relative to the present day. This is reasonable only if you believe 1) that older sensors were less accurate than those of today, 2) that the older sensors had a warm bias, and 3) that they were read at a later time in the day than presently. The magnitude of such necessary adjustments is empirically determinable. These problems cannot, however, in good faith be given as much significance for the modern era as for the past. The changes from 1880 to 1980 have to be far more significant than from 1980 to 2010.
The recent changes are also different from the older ones. From 1980 to 2010 the profound changes are in station numbers and locations, and the magnitude of the UHIE. A more urban record and a larger urban presence introduce higher temperatures. To correct for these, the modern temperatures (and anomalies) will have to be pushed down, i.e. cooling recent data. Older temperatures should be unaffected. But, strangely enough, what we see is the reverse, right up to the adjustments revealed in 2010 to the 2007 data and trends.
Recent records have been made (even if in a minor way) warmer, and older ones cooler still. Since 2000 additional adjustments have pushed pre-1980 data and trends further down and post-1980 temperatures down and modern ones up. This can only be justified if you claim that the prior, 1980 reference equipment was warm biased relative to that of 2010. It can be CREATED, however, by changing the basic data involved, i.e. the current database has lost warmer pre-1980 data stations (or introduced more pre-1980, cooler stations) than previously used. This would be a nefarious way to improve your results, but not semi-unjustifiable. A neat “trick”, to be sure, but the group got away with tricks like this before.
Suppose, as suggested, that the raw data increase of about 0.4K has an uncertainty due to technical issues of +/-0.46K. By itself this changes nothing in the warmist argument. A 0.8K +/- 0.46K temperature increase since 1880 is enough to get them in a CO2 frenzy as it matches their models “well enough”. So, the focus has to be on the rationale for the 0.4K of adjustments. I suggest this is done by checking the justification for:
1) older, less accurate instrumentation,
2) older, warm-biased instrumentation,
3) older, warmer time periods when temperatures were recorded, and
4) a (probably progressive) change in the raw data being used.
Uncertainty in those four areas will then be attributable to the other 0.4K of temperature rise. As these are not multiple repeats of the same data type, the errors add.
It could be that instrumentation plus adjustment uncertainties are nearly equal to the 0.8K claimed temperature rise. This situation has a possible answer: that the current world is changing mainly by swapping heat around, rather than increasing or decreasing its total heat content. The temperature anomalies are nothing but a measurement of the temperature “noise” of the world. It is where the “noise” is located that determines whether the region (like the Arctic) is warming or (like the Antarctic) cooling.

George Steiner
January 21, 2011 2:44 pm

It is worse than you thought. It is a climatological fiction that there is such a measure as average global temperature. It is a mathematical artifact only. You can only say what the temperature is at the measurement point. Simpleminded engineering types such as I would not dream of implying what the temperature is at the distance of 10 feet, 1000 feet, 1000 miles from the measurement point. Climatologists have for twenty years milked this one number and built a house of cards on it.
But as a house it stood up remarkably well. Anthony Watts thinks that there is such a number, Lindzen thinks there is such a number. All everybody is arguing about is how big it is. Or rather how much it deviates from an arbitrary reference.
Let me predict the near future. The house of global average temperature will collapse.

Scott Covert
January 21, 2011 3:03 pm

Some stations are better than others. I think this article is judging “best case”.
Military readings are often done by 19 or 20 something young people. The readings are probably more accurate during fair weather and poorly measured or outright “fudged” when it is unpleasant to go out and check the Stevenson screen. Military sensors are used mostly to determine short term weather conditions and were never intended for use as climate data.
Small stations at lighthouses, observatories, schools, etc. are usually read by students, interns, and “your cousin Bob”, who have no idea how the data will be used and have no idea what accurate readings are.
The Surface Station Project showed how poorly and unscientifically the stations are maintained.
I think 0.46C is REALLY generous.
I think the readings we are getting today are much better due to automation, but there are still UHI and siting issues that blow 0.46C out of the water.

sky
January 21, 2011 4:55 pm

As others have already noted, this paper addresses the measurement ACCURACY at a SINGLE station. But what is of real interest on a global climatic basis is the RESOLUTION of changes in the AGGREGATE of all stations. The year-to-year VARIABILITY of the aggregate mean for relatively small samples of stations repeatedly shows agreement well within ~.25K at the 2-sigma level. That provides a more appropriate bound for the sampling and measurement uncertainty associated with climate indices. This, of course, does not apply to the effects of any time-dependent BIAS in the data, which is the ultimate source of uncertainty in estimating secular trend.

Domenic
January 21, 2011 5:36 pm

I also have some professional CO2 measurement experience.
The amount of errors in temperature measuring systems is actually quite minor compared to the amount of possible errors in CO2 measurement systems.
Take any CO2 data presented with a large dose of skepticism, unless the complete parameters regarding the measurement system are openly disclosed.

Dave Springer
January 22, 2011 2:02 am

Frank
So the actual global average temperature change 1880-2000 could conceivably be any value of 0C to +1.75C. Unless one can show why there’s a systematic error favoring the lower end of the range or the higher end of the range pretty much everyone is going to presume the middle of that range is accurate. If it were just a single instrument the presumption wouldn’t be made but we’re talking thousands of instruments with a good deal of corroboration & agreement with the commonly employed observation methods from different types of instruments ranging from high precision/accuracy laboratory measurements to satellites, to say nothing of other means of corroboration like sea level rise, glacier retreat, sea ice changes, and so forth. It also happens to agree with what the physics of GHGs predict we should see if the physics are correct.
This attempt to indict the presumed temperature rise isn’t going to fly. Not one tiny bit. A large number of skeptics might buy it but they’ll pretty much buy anything that is agreeable with their skeptical opinions just like AGW faithful will buy anything that agrees with theirs. Nice try but no cigar.

January 22, 2011 3:19 am

Hi Pat,
I just read Anthony’s article about your paper, “UNCERTAINTY IN THE GLOBAL AVERAGE SURFACE AIR TEMPERATURE INDEX: A REPRESENTATIVE LOWER LIMIT”. Congratulations on that.
I have just purchased it but haven’t had time to read it carefully. From the abstract, though, I see that you came to a result very similar to mine. In March 2010 I submitted my dissertation at Leipzig University, Germany, on the data quality of recent historical temperature and sea level measurements for the time until 1880. It is written in German and has not yet been reviewed by external peers, so I cannot publish it until the peer review (required for the PhD) is finished.
Unlike you, I didn’t feel able to calculate any boundary or lower/upper limit for the uncertainty, but I found that the remaining systematic and gross errors within the data will at least exceed the total increase of the last century.
I found three major components:
1. The urban heat island effect and associated errors, which I estimate to be in the range of +0.4 to +0.6 K, based in part on Anthony’s surface stations project and the papers of many others.
2. A systematic error of about the same size may come from the use of (up to) 100 (!), but mainly 3, different algorithms to calculate the daily mean temperature. This has never been corrected and might again amount to 0.4 to 0.6 K. Because we cannot find out whether this error is only positive or not, one might estimate it as ±0.2 to ±0.3 K. For example, the German Met Office changed in 2002 from the “Mannheimer Stunden” algorithm to an hourly (24 x) mean value calculation. The result shows an increase of +0.1 K between the two algorithms since that time.
3. The rest consists of a collection of systematic errors (biases), namely:
Painting error (Anthony’s work); variations in sensors (thermometers replaced by more sensitive electronic ones); the measurement height of the weather shed (how high the thermometer is installed above ground, which has varied historically from 1.2 m to 3.2 m); changes in the ground at places where weather sheds were installed; differences between SST and MAT (completely unknown, especially in higher latitudes with air temperatures below 0 °C); and finally coverage error due to the sparse number of stations, each covering a small area, on land as well as at sea.
All uncertainties may sum to 1 K or more, and “very likely” not less.
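The daily-mean issue in point 2 can be illustrated with a minimal Python sketch. It assumes the usual statement of the Mannheim hours rule, (T07 + T14 + 2 x T21)/4, and a purely hypothetical sinusoidal diurnal cycle (`diurnal_temperature`, with an invented mean, amplitude, and peak hour). None of the numbers come from the dissertation or from any Met Office data, and the size and sign of the offset depend entirely on the assumed cycle shape.

```python
import math

def diurnal_temperature(hour, mean=10.0, amplitude=5.0, peak_hour=15.0):
    """Hypothetical diurnal cycle: one sinusoid peaking at peak_hour (values invented)."""
    return mean + amplitude * math.cos(2.0 * math.pi * (hour - peak_hour) / 24.0)

def mannheim_mean(temp_at):
    """Mannheim hours: readings at 07, 14 and 21 h, with the 21 h reading counted twice."""
    return (temp_at(7) + temp_at(14) + 2.0 * temp_at(21)) / 4.0

def hourly_mean(temp_at):
    """Plain average of the 24 hourly readings."""
    return sum(temp_at(hour) for hour in range(24)) / 24.0

mannheim = mannheim_mean(diurnal_temperature)
hourly = hourly_mean(diurnal_temperature)

print(f"Mannheim-hours mean: {mannheim:.2f} C")
print(f"24-hour mean:        {hourly:.2f} C")
print(f"Difference:          {mannheim - hourly:+.2f} C")
```

With real station data the offset would have to be computed from the actual hourly record. The sketch only shows that two definitions of "daily mean" need not agree for the same day.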
As a side result, my work shows that the mean global temperature (if you insist on calculating this nonsense figure) should be at least 1 to 1.5 K lower than Phil Jones’s (guessed) number of 14 °C. This is because the temperature inside a weather shed is nearly always higher than the true temperature outside, which is neither measured nor estimated. The WMO itself defines the temperature as that within the shed.
As soon as I have read your paper I will tell you more. Please feel free to ask for more details.
best regards
Michael Limburg
Vizepräsident EIKE (Europäisches Institut für Klima und Energie)
Tel: +49-(0)33201-31132
http://www.eike-klima-energie.eu/

EFS_Junior
January 22, 2011 8:54 am

Michael Limburg says:
January 22, 2011 at 3:19 am
Hi Pat,
I just read Anthony’s article about your paper, “UNCERTAINTY IN THE GLOBAL AVERAGE SURFACE AIR TEMPERATURE INDEX: A REPRESENTATIVE LOWER LIMIT”. Congratulations on that.
I have just purchased it but haven’t had time to read it carefully.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Or, if you read the Air Vent;
http://noconsensus.wordpress.com/2011/01/20/what-evidence-for-unprecedented-warming/
“This analysis is now out in Energy and Environment [6], and anyone who’d like a reprint can contact me at pfrank830 AT earthlink DOT net.”
And get a copy for free, as I did, and have already read the whole thing yesterday, as I did. 🙁

DCC
January 22, 2011 12:18 pm

L. Hagen:
“NIST provides an introduction: Essentials of expressing measurement uncertainty”
Bad link to NIST.

January 22, 2011 7:20 pm

Thanks to everyone for your continued interest. There are lots of cogent comments here, but I hope you don’t mind that I pick out only a few for reply.
eadler, you’re right to be skeptical. In the article, I point out that the problem with the common estimate of random station error, is that it’s a guesstimate and is in fact not known to be random. That changes its statistics. To answer your other question, the systematic error due to uncontrolled environmental variables does vary with time — virtually minute by minute as Domenic pointed out. Domenic also mentions the very interesting problem produced by a change in instrumental time constant. The LIG thermometers were phased out after 1990. As he notes, one wonders, now, how much of the recent higher temperatures can be assigned to the much smaller time constant of modern electronic sensors.
Ira, the uncertainty bars don’t say anything about where the temperature trend may be. They say that we don’t know where the trend is within their limits.
There is a huge number of possible trends within those limits. So, supposing that the limits allow warming of 1.8 C or a cooling of 0.2 C picks out two possibilities among very many. But those two, individually, are very improbable given the huge number of possible trends. So, no one can make any hay from them, and we needn’t worry about them.
Agust, Kenneth Hubbard and Xiaomao Lin at University of Nebraska have published on the accuracy of electronic sensors in the field. Anthony has blogged about their work, and his article is worth a careful look.
JJ, the problem is two-fold. Guesstimated errors are not known to be random and propagate as 1/sqrt[N/(N-1)], which approaches the magnitude of the original guesstimate as N becomes large.
Likewise, systematic error is unlikely to be correlated globally, but it’s not random either. Nor are systematic error variances known to be normally distributed. So, that error propagates as 1/sqrt[N/(N-1)] as well. The problem with the global temperature record is that it’s been promoted to enormous scientific and political importance without ever having been globally validated. We’ll see about longevity.
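To make the two propagation factors in the reply to JJ concrete, here is a small Python sketch that simply evaluates the factor as written, 1/sqrt[N/(N-1)], next to the familiar 1/sqrt(N) factor for independent random error. The ±0.2 C value is the average station error from the abstract; the function names and the choice of N values are illustrative assumptions, and the sketch takes no position on which factor applies to real station data.

```python
import math

def factor_as_written(n):
    """The factor 1/sqrt[N/(N-1)], exactly as written in the comment."""
    return 1.0 / math.sqrt(n / (n - 1))

def random_error_factor(n):
    """Textbook 1/sqrt(N) reduction for independent, zero-mean random error."""
    return 1.0 / math.sqrt(n)

STATION_ERROR = 0.2  # C, average station error quoted in the abstract

for n in (2, 10, 100, 10000):
    not_random = STATION_ERROR * factor_as_written(n)
    if_random = STATION_ERROR * random_error_factor(n)
    print(f"N = {n:>5}: as written {not_random:.4f} C, if random {if_random:.4f} C")
```

As N grows, the first column tends toward the original 0.2 C while the second tends toward zero, which is the whole disagreement in numerical form.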
Guenther, let’s not go too far. It’s just basic error analysis. On the other hand, Jim Hansen is on record saying that, “If anybody could show that the global warming curve was wrong they would become famous, maybe win a Nobel Prize.” If Jim Hansen wants to recommend Anthony, Warwick Hughes, myself, and Joe D’Aleo to Stockholm, I wouldn’t mind splitting the windfall. Maybe then I could finally buy a house and move out of my apartment. 🙂
Doug, I think you’ve made a very good analysis. I’ve been looking at Sea Surface temperatures, and especially in the earlier record, the problems you describe show up in serious spades there.
Scott, you’re right, the (+/-)0.46 C lower limit is a “best case.” Anthony’s work shows the true error is certainly much larger.

Scott
January 22, 2011 7:25 pm

EFS_Junior says:
January 22, 2011 at 8:54 am

And get a copy for free, as I did, and have already read the whole thing yesterday, as I did. 🙁

Junior, glad to hear you read the whole paper. I’ll try to get to it when I get more time (I’ve been hammered with work the last few months and will through at least the first week of Feb). I’d be curious to hear your thoughts on it.
Quickly glancing through the summary and some comments here, the results look interesting, but I don’t know if there’s much impact to it with the exception that uncertainties may now be better quantified than previously. The slope in temps is obviously unchanged, and presumably the uncertainty in that slope has increased a bit and that’s all. Is that a valid interpretation?
Thanks,
-Scott

January 22, 2011 7:51 pm

Sky wrote, “As others have already noted, this paper addresses the measurement ACCURACY at a SINGLE station.”
This isn’t strictly correct. The paper addresses the measurement accuracy found during a very careful and extended field calibration experiment that extensively tested multiple sensors and shields. That isn’t a single station. It’s an experiment that has relevance to the accuracy of surface temperature measurements at all stations. To deny that relevancy is to deny the principle of replication that is the very basis of empirical science.
But Sky’s incorrect point opens the door to some history about the paper. Before submitting to E&E, I submitted a longer paper to the AMS Journal of Applied Meteorology and Climatology. That paper was later divided into two shorter papers for E&E. The second will be published in a later issue.
But at JAMC, the original manuscript went through three rounds of review, which took a full year, before being rejected on Sky’s grounds. All that took 38 pages of reviews and responses, which is far from the recent 88-page standard of the OLMC Steig critique, but it was a personal best. 🙂
The story is classic climate science. In the first round, the paper was conditionally rejected, but reviewers “A” and “C” gave constructive comments; especially reviewer “A.” Reviewer “B” was critical to the point of hostility, essentially calling it the worst paper he’d ever read, that it contained “scientific horrors,” and that I was guilty of a “bait-and-switch” tactic; guilty of dishonesty, in other words.
The re-submission was heavily revised, especially in light of the helpful comments of reviewer “A.” In reply after the second review, the editor said that previous reviewer “C” had declined a re-review. This usually means that a reviewer has read the revised ms, and is now in accordance with publication. But the editor substituted a new third reviewer, who now was reviewer “B”. The hostile reviewer, previously “B,” became reviewer “C.”
In this case, both previous reviewer “A” and the new reviewer “B” recommended publication. Reviewer “A” also wrote that, “The material seems appropriate for the issues at hand. The author’s revision is clear, well-organized, and to the point.”
Reviewer “B” wrote, “The study by author in this paper is a good attempt to explore the possibilities of uncertainty existing in the global surface temperature data sets. Author also did an excellent literature review regarding the data quality and averaging uncertainty analysis. Author’s presentation is clear and well-organized.”
Reviewer “C” wrote that, “the [author’s] uncertainty bars assigned to the global temperature curve (Fig. 6) are unjustified because they are derived from the uncertainty estimates from a single station. The author therefore makes no accounting for the reduction in error that results from the overwhelming redundancy in the global observing system … For this reason, I must again recommend rejection of the paper.”
It’s clear that for Reviewer “C,” the random nature of measurement error is an assumption, when good empirical science requires that it must be demonstrated. That point is, and was, addressed in detail in the submitted paper and in my response to that reviewer. But somehow reviewer “C” either didn’t read that part of the study, or else decided to ignore it. He never actually addressed the argument presented in the paper. He also clearly did not understand the concept of a widely applicable lower limit of error obtained by testing an instrument under ideal operating conditions; something no other reviewer had any apparent trouble understanding.
The third round solely concerned the comments of reviewer “C.” In this round, he repeated the objections he’d made earlier. For example, he wrote, “Unfortunately, in spite of the equations and assumptions that the author presents, his conclusion can only be reached by dismissing the huge redundancy in the observational networks as well as the close agreement of temperature trends from sensors with different design and error characteristics.”
In reply, I wrote, “Once again: when noise is not known to be stationary, the degree of measurement redundancy is virtually irrelevant to the magnitude of propagated uncertainty. This straightforward rule of measurement statistics has consistently escaped the Reviewer’s grasp. The analysis within [the manuscript] makes no assumptions about noise error. Rather, there is an assumption of stationary noise within Brohan, et al., 2006, which is shown to be unjustifiable in light of the published record. The Reviewer here has supposed an assumption not in evidence, and has ignored a falsified assumption in plain view.”
At the end of that, things got a little irregular. Three of four reviewers were now evidently favorably disposed, and one was opposed. Instead of making a critical judgment, the editor solicited the further views of two Associate Editors.
In his final letter, the manuscript editor just paraphrased the views of the two AEs. He said one AE thought the study should be published elsewhere (i.e., no mention of any scientific problem) and the other AE agreed with the “single station” objection and thought it was fatal. The manuscript editor agreed with the second AE.
When I asked for a copy of their actual comments, the editor declined to reveal them. So, I had no opportunity to evaluate the scope of their actual criticisms, and compose a proper reply to them. In all my prior experience publishing in scientific journals, I’ve never been disallowed from seeing the actual reviewer comments before, and consider that to be a very irregular practice.
So, in short, a majority of the reviewers and one AE passed the study. Apparently they understood the criticism of an unwarranted assumption about random error, and understood the concept of an instrumental lower limit obtained under ideal conditions.
One reviewer apparently understood neither concept, along with one AE and apparently the editor. At the end, the editor went with the minority view and rejected the manuscript.
At that point, I thought of perhaps submitting to an AGU journal, but on considering that organization had taken a public and partisan stand on AGW, I couldn’t see risking another year of difficult reviews.
So, I went to E&E expecting the usual scientific standard of a relatively dispassionate critical review, and that’s what I got from their two reviewers. Neither of them raised any objection to the idea of an instrumental lower limit. And, in fact, one of E&E’s reviewers found an error in one of the equations that had gotten by three rounds at JAMC.

January 22, 2011 8:07 pm

sky also wrote that, “The year-to-year VARIABILITY of the aggregate mean for relatively small samples of stations repeatedly shows agreement well within ~.25K at the 2-sigma level. That provides a more appropriate bound for the sampling and measurement uncertainty associated with climate indices.”
This is the usual approach in surface climate science, to examine the numbers themselves and look at their aggregate variance as a guide to their validity without paying any attention to the instruments. This reveals nothing about any systematic error that may be in the data.
Systematic error does not cancel in an average. It just averages as the data. There hasn’t been any survey for how instrumental systematic error may correlate across small regions. So, to suppose that correlated trends imply low-error data is to assume what should be demonstrated.
Whether trends approximately reproduce locally doesn’t reveal anything about instrumental accuracy in the subordinate data. The only way to assess the magnitude of systematic error is to test an instrumental measurement against a known standard.
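The statement that systematic error does not cancel in an average can be sketched with a toy Monte Carlo in Python. It assumes the simplest possible case, one constant offset shared by every reading plus independent Gaussian noise; the 0.3 C offset, the 0.2 C noise level, and the function name are hypothetical, and a time-varying or station-dependent systematic error would need a richer model than this.

```python
import random

def error_of_average(n_readings, random_sd, shared_offset, rng):
    """Error of the mean of n_readings.

    Each reading carries the same constant offset plus independent,
    zero-mean Gaussian noise with standard deviation random_sd.
    """
    errors = [shared_offset + rng.gauss(0.0, random_sd) for _ in range(n_readings)]
    return sum(errors) / n_readings

RANDOM_SD = 0.2      # C, illustrative noise level
SHARED_OFFSET = 0.3  # C, hypothetical systematic offset

rng = random.Random(42)  # fixed seed so the run is repeatable
sample_sizes = (10, 1000, 100000)
noise_only = {n: error_of_average(n, RANDOM_SD, 0.0, rng) for n in sample_sizes}
with_offset = {n: error_of_average(n, RANDOM_SD, SHARED_OFFSET, rng) for n in sample_sizes}

for n in sample_sizes:
    print(f"N = {n:>6}: noise only {noise_only[n]:+.4f} C, with offset {with_offset[n]:+.4f} C")
```

In this toy case the noise-only average shrinks toward zero as N grows, while the average with the shared offset settles at the offset itself. Nothing in the averaged numbers alone reveals that the offset is there.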

January 22, 2011 8:26 pm

Dave, why would any scientist assume that the middle of an uncertainty range is accurate? It’s no more accurate than the fringes. If not the scientists, who is “everyone,” and why should their uninformed opinion matter?
“we’re talking thousands of instruments with a good deal of corroboration & agreement…” Are we? We’re talking about instruments that have been unassessed with respect to systematic error, whose noise variances are not known to be random, and that have been shown in the US to be predominantly poorly sited. Elsewhere, apart from Europe, is likely to be worse. What is the meaning of agreement among such instruments? Asserting conclusions based on correlations among poorly functioning instruments is hardly science.
We all know that invoking melting glaciers, etc., is a specious argument, so don’t even try. No one here is denying that the climate has warmed. The question concerns metrics.
We also know that “the physics of GHGs predict” nothing about small changes in climate, so please don’t try that around here, either.
As to the rest, we’ll see.

Scott
January 22, 2011 9:56 pm

Pat Frank says:
January 22, 2011 at 7:51 pm
Hi Pat,
Thanks for all the gory details on your peer review experience. I’ve published 7 peer-reviewed publications as first author, and I’ve never had an experience like you describe. It really goes to show how political the process can be (most of my work had little or no political relevance on a larger scale, so I avoided those problems).
Heck, several times I had two reviews, one favorable and the other not so much and the editor chose to publish anyway. It seems that when politics aren’t involved the general approach is to publish and if the work has some issues it will become known by the readers easily enough. Basically a “if in doubt, put it out there anyway” kind of philosophy. I think this philosophy is the way to go but clearly didn’t happen in your case.
Thanks again for your details,
-Scott

EFS_Junior
January 23, 2011 12:04 pm

Scott says:
January 22, 2011 at 7:25 pm
EFS_Junior says:
January 22, 2011 at 8:54 am
And get a copy for free, as I did, and have already read the whole thing yesterday, as I did. 🙁
Junior, glad to hear you read the whole paper. I’ll try to get to it when I get more time (I’ve been hammered with work the last few months and will through at least the first week of Feb). I’d be curious to hear your thoughts on it.
Quickly glancing through the summary and some comments here, the results look interesting, but I don’t know if there’s much impact to it with the exception that uncertainties may now be better quantified than previously. The slope in temps is obviously unchanged, and presumably the uncertainty in that slope has increased a bit and that’s all. Is that a valid interpretation?
Thanks,
-Scott
_____________________________________________________________
I’ve now read through this “paper” twice, from beginning to end and everywhere in between. I also have most of the cited references, particularly reference #17, from which the gross assumptions are made in this so called “derivation.”
Short answer? Typical E&E stuff.
Long answer? Separation of the total temperature error term (only considers the +/- term, the “paper” never discusses bias offset errors, and assumes all errors are symmetric about a zero mean) into two mutually exclusive, independent, and unrelated terms is bogus, unnecessary, and without merit.
The total error does NOT propagate through the temperature array system as sigma, and it does not propagate through the system as sigma/SQRT(N) either; as usual, the truth lies somewhere in between these two limits. I could post the right answer once I’ve processed more data for different N (I have N = 1, N = 7 (needs a little more work), and N = 23 so far; I need larger N before I can make a definitive statement). For N = 23 the total error is reduced by a factor of ~2. Autocorrelation is your friend, but it also complicates things, as I first have to remove all artifacts of autocorrelation. I have done so successively for the 23 HAWS: the correlations of anomalies between every pair of stations have been computed, and the resulting plots all look Gaussian with zero mean (I need to do some binning to be sure).
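The "somewhere in between" claim can be sketched with the textbook formula for the mean of N errors that all share the same pairwise correlation rho: sd_mean = sigma * sqrt[(1 + (N-1)*rho)/N]. This is an assumed, simplified model (equal correlation between every pair of stations, no autocorrelation in time), not the procedure described in the comment, and the function names are illustrative.

```python
import math

def sd_of_mean(sigma, n, rho):
    """Std. dev. of the mean of n errors with equal pairwise correlation rho."""
    return sigma * math.sqrt((1.0 + (n - 1) * rho) / n)

def implied_rho(n, reduction_factor):
    """Correlation that would give sigma / sd_of_mean == reduction_factor."""
    return (n / reduction_factor ** 2 - 1.0) / (n - 1)

N_STATIONS = 23  # number of stations mentioned in the comment
SIGMA = 1.0      # work in units of the single-station error

print(f"rho = 0 (independent):      {sd_of_mean(SIGMA, N_STATIONS, 0.0):.3f}")  # sigma/sqrt(N)
print(f"rho = 1 (fully correlated): {sd_of_mean(SIGMA, N_STATIONS, 1.0):.3f}")  # sigma
rho_for_factor_2 = implied_rho(N_STATIONS, 2.0)
print(f"rho giving a factor-of-2 reduction at N = 23: {rho_for_factor_2:.3f}")
```

Under this toy model, a factor-of-2 reduction at N = 23 corresponds to a pairwise correlation of roughly 0.2, which sits between the two limits named above.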
Note that the 23 HAWS include the original DAILY raw Canadian data sets;
http://www.climate.weatheroffice.gc.ca/climateData/canada_e.html
The anomaly period I used was 1951-2010 (N = 60) (had to wait a few weeks from my previous anomaly base line of 1951-2009), the data sets are all of (IMHO) very high quality with all temperatures reported at 0.1C resolution (for all years I downloaded to date, 1982-2010 inclusive). My analyses have been through several iterations, to arrive at a complete and entirely internally consistent procedure for handling all individual data sets prior to extracting the low frequency temperature signature.
I have used the 23 HAWS in relationship to Arctic sea ice extent/area/volume, these stations all represent the Canadian Archipelago and Hudson Bay. There is an overall method to my madness! 🙂
The seven highest in latitude (all at or above 72N) HAWS (Alert, Eureka, Isachsen, Resolute, Mould Bay, Sachs Harbor, and Pond inlet) are there for the purpose of the core MYI Arctic sea ice field which lies predominantly directly above these stations.
The results from the seven HAWS and 23 HAWS would shock you, particularly the period from the ~early 70’s to present, currently these areas are warming at ~0.25C/year (+/- 0.15 C/year). I won’t extrapolate these numbers into the future though, because, if true, we are in for some very serious problems with Arctic sea ice extents/areas/volumes.
My sincere hope is that these current temperature rates slow down, seriously.
See this post where I briefly describe my own analysis of 23 HAWS stations;
http://wattsupwiththat.com/2011/01/22/the-metrology-of-thermometers/#comment-580961
Bottom line? If the total error was as claimed, I would never be able to extract a very real low frequency temperature trend line signature with a resulting R^2= 0.99.
I’ve been doing this type of stuff, on and off, for ~35 years at the USACE ERDC CHL (US Army lab of the year 3 of the last 4 years), on moored ship motion in LA/LB harbours (BTW, the largest combined harbour complex in the world; I do prefer the Canadian spelling), where the ships are excited by the low frequency component of the total ocean wave signature, and believe me when I say that the ratios of the low frequency wave signatures are very much smaller (by several orders of magnitude) than the ratios of the low frequency temperature signatures published to date.
There’s no getting around the basic fact that the low frequency temperature signatures are indeed very real and very accurate, regardless of how one goes about describing the accuracy of the total temperature signatures.
In closing, my analysis of Arctic sea ice extents/areas/volumes and HAWS temperatures is definitely journal publication worthy, and whatever journal they do make it into, it won’t be E&E.
[Edited. Robt]

EFS_Junior
January 23, 2011 12:35 pm

Scott says:
January 22, 2011 at 9:56 pm
Pat Frank says:
January 22, 2011 at 7:51 pm
Hi Pat,
Thanks for all the gory details on your peer review experience. I’ve published 7 peer-reviewed publications as first author, and I’ve never had an experience like you describe. It really goes to show how political the process can be (most of my work had little or no political relevance on a larger scale, so I avoided those problems).
Heck, several times I had two reviews, one favorable and the other not so much and the editor chose to publish anyway. It seems that when politics aren’t involved the general approach is to publish and if the work has some issues it will become known by the readers easily enough. Basically a “if in doubt, put it out there anyway” kind of philosophy. I think this philosophy is the way to go but clearly didn’t happen in your case.
Thanks again for your details,
-Scott
_____________________________________________________________
Actually, Scott, I agree with you 100% that this “paper” should have been given a much higher profile in a well-respected peer-reviewed climate science journal. Or any other journal for that matter.
It does spur debate, and follow-up publications, either supporting or rejecting the basics presented in any paper.
Getting it published in E&E, it sort of lies dormant relative to the broader climate science community majority.
The AGW skeptics eat it up though, as is quite evident in the metrology thread.
Where is true agnostic skepticism when you need it?
I really do wish that others in the AGW community (or the climate science community in total (and yes that includes all the skeptics too)) would take the time to read this “paper” and go through the “theory” in greatest detail.
As it sits now, apparently no “respectable” climate scientist (and that includes all climate scientists, skeptical or not) is likely to take serious notice.
Sad but true IMHO.

January 23, 2011 2:32 pm

I’d like to call attention to Michael Limburg’s post above about his Ph.D. work. It looks quite comprehensive, and if so, publication of the results will finally disabuse everyone of the notion of climatologically useful accuracy in the 20th century global surface air temperature record.
Scott, thanks. Steve McIntyre has published that kind of story about his review processes. Ross McKitrick has several similar stories — I’d guess one for each submission that challenged the prevailing climate wisdom. So does Richard Lindzen, and we’ve all read the story of the OLMC 88 pages. For my Skeptic article, one reviewer gratuitously accused me of scientific dishonesty. If, after this is all over, some social scientist were to interview climatologists, I’d suspect a large number of presently invisible but discouraged scientists would be found, who had been similarly oppressed. The evidence seems to be that unfair reviews are at least a common minority practice in climatology, but one which does not bring a reprimand or disqualification from journal editors. It should do. After all, climate science is about physics, not a branch of philosophy where peer review is apparently often about who one supports. I certainly have never, ever experienced such attacks in Chemistry, and have heard of none.
EFS, thanks for your sympathy, despite the quotation marks.

January 23, 2011 2:59 pm

By the way, here’s the title and abstract of the second paper, to also be published in E&E and then ignored. This work was the other part of the original longer paper rejected by JAMC.
Title: “Imposed And Neglected Uncertainty In The Global Average Surface Air Temperature Index”
Abstract: “The statistical error model commonly applied to monthly surface station temperatures assumes a physically incomplete climatology that forces deterministic temperature trends to be interpreted as measurement errors. Large artefactual uncertainties are thereby imposed onto the global average surface air temperature record. To illustrate this problem, representative monthly and annual uncertainties were calculated using air temperature data sets from globally distributed surface climate stations, yielding (+/-)2.7 C and (+/-)6.3 C, respectively. Further, the magnitude uncertainty in the 1961-1990 global air temperature annual anomaly normal, entirely neglected until now, is found to be (+/-)0.17 C. After combining magnitude uncertainty with the previously reported (+/-)0.46 C lower limit of measurement error, the 1856-2004 global surface air temperature anomaly with its 95% confidence interval is 0.8(+/-)0.98 C. Thus, the global average surface air temperature trend is statistically indistinguishable from 0 C. Regulatory policies aimed at influencing global surface air temperature are not empirically justifiable.”
The (+/-)0.17 C “magnitude uncertainty” in the normal mean is the standard deviation of the 1961-1990 yearly anomalies around the CRU mean anomaly normal of the same period. This gives a measure of the climate ‘jitter’ during the normal period, and amounts to an uncertainty in the magnitude of the normal mean. As such, it must be propagated into the annual anomalies that are calculated by subtraction from the normal mean. This statistical practice, after all, is merely the standard for calculating empirical uncertainties in physical science.
Anyway, it was also interesting to discover that the ‘jitter’ standard deviation of annual anomalies over the total of years from 1856 – 2004, relative to the normal mean, was (+/-)0.28 C. The paper points out that if the governing climate regime was unchanging for the duration, then (+/-)0.28 C is a 1-sigma measure of the “natural variability” of climate during this entire period.
The 95% confidence interval of recent natural variability for 1856-2004 is then (+/-)0.56 C, which all by itself discounts causality in 70% of the total warming since 1856. That is, provisionally crediting the mean temperature trend, it’s hardly different from a random process that exhibits the pseudo-trends expected from a stochastic process plus persistence.
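The arithmetic behind the figures quoted above can be checked in a few lines of Python. This is only one reading that happens to reproduce the quoted numbers: it assumes that the (+/-)0.46 C and (+/-)0.17 C terms combine in quadrature, that the 95% interval is taken as 2 sigma (as the step from (+/-)0.28 C to (+/-)0.56 C suggests), and that "total warming" means the 0.8 C anomaly in the abstract. The variable names are illustrative.

```python
import math

MEASUREMENT_LOWER_LIMIT = 0.46  # C, 1-sigma, from the first paper
NORMAL_MAGNITUDE_UNCERT = 0.17  # C, 1-sigma, 1961-1990 normal
NATURAL_VARIABILITY_1S = 0.28   # C, 1-sigma, 1856-2004 annual anomalies
TOTAL_WARMING = 0.8             # C, 1856-2004 anomaly from the abstract
TWO_SIGMA = 2.0                 # assumed multiplier for a 95% interval

# Assumed quadrature (root-sum-square) combination of the two 1-sigma terms.
combined_1s = math.hypot(MEASUREMENT_LOWER_LIMIT, NORMAL_MAGNITUDE_UNCERT)
combined_95 = TWO_SIGMA * combined_1s

natural_95 = TWO_SIGMA * NATURAL_VARIABILITY_1S
fraction_within_natural = natural_95 / TOTAL_WARMING

print(f"combined 1-sigma uncertainty: {combined_1s:.2f} C")
print(f"combined 95% interval:        {combined_95:.2f} C")
print(f"natural variability, 95%:     {natural_95:.2f} C")
print(f"share of total warming:       {fraction_within_natural:.0%}")
```

Under these assumptions the sketch gives 0.98 C, 0.56 C, and 70%, matching the values in the abstract and in the paragraph above.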

EFS_Junior
January 23, 2011 4:57 pm

Pat Frank says:
January 23, 2011 at 2:32 pm
I’d like to call attention to Michael Limburg’s post above about his Ph.D. work. It looks quite comprehensive, and if so, publication of the results will finally disabuse everyone of the notion of climatologically useful accuracy in the 20th century global surface air temperature record.
_____________________________________________________________
Are you sure about that? I seriously doubt that the global mean temperature curve will just go away because of one additional Ph.D. thesis.
_____________________________________________________________
Scott, thanks. Steve McIntyre has published that kind of story about his review processes. Ross McKitrick has several similar stories — I’d guess one for each submission that challenged the prevailing climate wisdom. So does Richard Lindzen, and we’ve all read the story of the OLMC 88 pages. For my Skeptic article, one reviewer gratuitously accused me of scientific dishonesty. If, after this is all over, some social scientist were to interview climatologists, I’d suspect a large number of presently invisible but discouraged scientists would be found, who had been similarly oppressed. The evidence seems to be that unfair reviews are at least a common minority practice in climatology, but one which does not bring a reprimand or disqualification from journal editors. It should do. After all, climate science is about physics, not a branch of philosophy where peer review is apparently often about who one supports. I certainly have never, ever experienced such attacks in Chemistry, and have heard of none.
_____________________________________________________________
You should try to get SM’s attention over at CA. Perhaps Anthony can help you out over there. One thing’s for certain with CA, spirited debate.
_____________________________________________________________
EFS, thanks for your sympathy, despite the quotation marks.
_____________________________________________________________
Go ahead and take out the quotation marks then, I do need to turn down my snark meter below 11teen!

January 23, 2011 10:34 pm

This will be a point-by-point rebuttal of EFS_Junior’s criticism.
First, as we go through this, keep in mind that my paper discusses unaddressed uncertainty in the standard calculation of global average surface temperature, not in recent trends in regional temperatures that form the base of EFS_Junior’s critique.
However, to EFS_Junior’s answer about why my analysis is wrong:
Point 1: “Long answer? Separation of the total temperature error term … into two mutually exclusive, independent, and unrelated terms is bogus, unnecessary, and without merit.”
Reply 1: Let’s look at what’s done in Brohan, 2006: Under Section 2.3.1 “Station Errors,” they write T_actual = T_ob + e_ob + C_H + e_H + e_RC, where T_ob, e_ob are observed temperature and its error, C_H, e_H are homogenization and its error, and e_RC is miscalculation or misreporting error.
So, Brohan, et al. separated their temperature errors into three (not two) mutually exclusive, independent, and unrelated terms, and presumably that is bogus, unnecessary, and without merit.
EFS_Junior mentioned in his preamble that he has a copy of my article reference 17, which is Bevington and Robinson, “Data Reduction and Error Analysis for the Physical Sciences.” Pages 1&2 in B&R say this: “Our interest is in uncertainties introduced by random fluctuations in our measurements, and systematic errors that limit the precision and accuracy of our results…” (italics in original). They go on to distinguish between precision and accuracy, and on page 3 between systematic errors and random errors.
On page 3, systematic errors are described as due to observer mistakes, equipment failures, and experimental conditions. Random errors are described as fluctuational. Bevington and Robinson thus distinguish these error types as mutually exclusive, independent, and unrelated as to both source and behavior.
But, according to EFS_Junior, separation of error terms into mutually exclusive, independent, and unrelated terms is bogus, unnecessary, and without merit.
In the JCGM 100:2008 “Evaluation of measurement data — Guide to the expression of uncertainty in measurement,” put out by Working Group 1 of the Joint Committee for Guides in Metrology (JCGM/WG 1), random error is orthogonally distinguished from systematic error.
Random error is “stochastic temporal and spatial variations of influence quantities [and its] expectation value is zero.” In contrast, systematic error is defined as the “effect of an influence quantity on a measurement result,” that produces a mean value different from the true value and further that, “systematic error and its causes cannot be completely known.”
Once again, error types are separated into mutually exclusive, independent, and unrelated terms, which we now know is bogus, unnecessary, and without merit.
In both Bevington & Robinson (p. 11) and in the JCGM manual (Section C.2.20), the variance about a mean goes as s^2 = [1/(N-1)]*[sum-over-N(x-x_bar)^2]. As, by definition, systematic error yields an error mean different from zero, the dispersion of systematic error about its mean goes as s^2 and does not diminish as 1/(sqrtN). This is the approach taken in my paper, and it is the documented correct approach.
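As a minimal sketch of the arithmetic only, the quoted variance formula can be written out in Python. The error values below are made up for illustration, and the names `sample_variance` and `errors` are hypothetical. The sketch prints the mean offset, the dispersion s, and s/sqrt(N) side by side; which of the last two is the appropriate uncertainty for a given kind of error is the point under dispute in this thread, and the code does not settle it.

```python
import math
import statistics

def sample_variance(xs):
    """s^2 = [1/(N-1)] * sum over N of (x - x_bar)^2, as quoted above."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

# Made-up sensor errors in deg C (illustrative only, not from any data set).
errors = [0.9, -0.2, 0.6, 0.1, 0.8, -0.1, 0.3, 0.4]

mean_offset = statistics.fmean(errors)   # non-zero mean, i.e. a bias offset
s2 = sample_variance(errors)
s = math.sqrt(s2)                        # dispersion about the mean
s_over_root_n = s / math.sqrt(len(errors))

print("mean offset:", round(mean_offset, 3))
print("s:", round(s, 3))
print("s/sqrt(N):", round(s_over_root_n, 3))
```

The hand-written function agrees with the standard library's `statistics.variance`, which uses the same N-1 denominator.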
EFS_Junior’s critique against separating total error into its component types is wrong.
Further, evaluating separate sources of error is standard practice in experimental science. Doing so is really the only way to determine the contribution each sort of error makes to a measurement. EFS_Junior, in his critique, is dismissing perhaps the most critical method of assessing the meaning and significance of an experimental result.
In Brohan 2006, e_ob is represented as the guesstimated (+/-)0.2 C average of Folland, et al, 2001, which is discussed in detail in my paper.
The contribution of systematic error to e_ob, due to uncontrolled environmental variables, is not discussed at all in Brohan, 2006. This analytical lacuna is what sparked my inquiry.
Point 2: EFS_Junior also wrote that, “[he] only considers the +/- term, the “paper” never discusses bias offset errors, and assumes all errors are symmetric about a zero mean.”
Reply 2: The first part of my paper describes basic sorts of random error, and with that as context goes to show that the average (+/-)0.2 C error taken by Brohan, 2006, and Folland, 2001 isn’t random at all.
After that, Section 3.2.2 discusses systematic error and its sources in surface air temperature measurements. Bias mean offsets and standard deviations for three sensor systems are calculated and presented in Table 1. The top of page 978 has this sentence: “All these systematic errors, including the microclimatic effects, vary erratically in time and space [40-45], and can impose nonstationary and unpredictable biases and errors in sensor temperature measurements and data sets.”
Figures 1 and 2 show fits to three data sets of systematic error in temperature measurements. For clarity of graphical presentation, the bias means were subtracted to produce an artificial mean of zero. Maybe that misled EFS_Junior, although it’s explicitly mentioned in both figure legends.
That is, I directly and obviously discussed bias error offsets and nonstationary error, which by definition does not have a mean of zero. So, I did not assume all errors are symmetrical about a zero mean, and it’s obvious that I did not.
Somehow, EFS_Junior apparently missed all of this, even after two readings.
Let’s also add that if I had believed all errors are symmetrical about zero, I’d have had to also believe that all these errors would diminish as 1/sqrtN, and would go to zero at large N. The lower limit of error would then be uniformly zero, and my paper would have no point. So, EFS_Junior is claiming that my paper reflects an assumption about error that completely contradicts the actual and explicit analysis of error present in the paper. That, after two readings.
More about bias errors: under “Systematic Errors” page 3 of Bevington and Robinson gives an example where subtracting a bias offset error actually increased the inaccuracy of a result, because the bias correction itself was an estimate. Climate scientist colleagues of EFS_Junior regularly “correct” their data sets by subtracting estimated bias offsets, and then go on to assume improved accuracy.
Bevington and Robinson, in contrast, go on to advise that one must explicitly take account of new uncertainties that may be introduced by bias corrections. How is that done in bias-corrected temperature data sets, when the uncontrolled systematic effects on the initial measurements are unknown?
Point 3: EFS_Junior wrote, “The total error does NOT propagate through the temperature array system as sigma, it also does not propagate through the system as sigma/SQRT(N), as usual, the truth lies somewhere in between these two limits.”
Reply 3: I didn’t calculate “total error.” I calculated a lower limit of sensor measurement error, principally due to systematic effects right at the instrument.
Total error also includes discontinuities due to instrumental changes, extrapolations due to sparse data, missing data, dropped stations, moved stations, station spatial inhomogeneity, observer read bias, time-of-observation bias, albedo changes, the hotly-debated UHI, and what else? So what if these things propagate differently than 1/sqrtN, or differently from sigma itself?
My paper isn’t concerned with them. That’s why “lower limit” is included in the title and in the text.
Systematic instrumental error does propagate as 1/[sqrt(N/(N-1))], and at large N the adjudged guesstimated (+/-)0.2 C station error of Brohan, 2006/Folland, 2001 also does propagate as sigma.
All the rest of the errors must be calculated in their own particular way, and summed with the basic instrumental error to get the “total error.” All the other sources of error would merely add to instrumental error. EFS_Junior’s point 3 is irrelevant.
Finally, the point about the HAWS data.
Point 4: EFS_Junior wrote, “for the 23 HAWS (all station correlations of anomalies between any two stations has been done, the resulting plots all look Gaussian (need to do some binning to be sure) with zero mean).” and “Bottom line? If the total error was as claimed, I would never be able to extract a very real low frequency temperature trend line signature with a resulting R^2= 0.99.”
Reply 4: This claim deserves some comment. EFS_Junior wrote that, “the data sets are all of (IMHO) very high quality with all temperatures reported at 0.1C resolution (for all years I downloaded to date, 1982-2010 inclusive).”
The newest sensors used in Canada were reported in E. Milewska and W. D. Hogg (2002) Atmos-Ocean 40, 333–359, to be “Yellow Springs International (YSI) Model 44212 thermistors in a Stevenson screen.” The Campbell Science manual (pdf download) reports the manufacturer specification average accuracy for these probes to be (+/-)0.1 C. Well and good.
K. G. Hubbard and X. Lin (2002) GRL 29, 1425 (doi:10.1029/2001GL013191) reported on the field resolution of the PRT HMP45C probe (pdf download), also inside a Stevenson screen. The HMP45C probe has a closely comparable manufacturer’s specification of (+/-)0.2 C accuracy at 20 C.
Systematic error reduced the field resolution of the HMP45C in a Stevenson screen to an average bias offset of 0.34(+/-)0.53 C (article Table 1). Subtracting 0.34 C from all the recorded temperatures would not remove the dispersion uncertainty of (+/-)0.53 C. The (+/-)0.53 C propagates as 1/sqrt[N/(N-1)] into any average of that data.
Maybe in Canada the resolution of thermistor probes is not degraded by systematic effects in the field while housed inside Stevenson screens, but unless that has been demonstrated, EFS_Junior’s claim of (+/-)0.1 C accuracy is no more than special pleading.
It’s worth quoting Milewska and Hogg a little more extensively: “
Climatological records of high temporal resolution have been generating interest recently, because of their direct applicability to climate change impact studies. Finding adjustment factors for these daily and sub-diurnal observations – synoptic, hourly – is a challenging task. A single bias adjustment value will not work well on any day or time of the day, it might even introduce additional uncertainty by over or under correcting on a given day. The magnitude of the adjustment factor depends on meteorological conditions and thus should vary from one day to another. For example, wind speed and cloudiness are the two main controlling weather elements that determine the value of the adjustment in the case of temperature. Larger factors may be required for calm clear nights, when siting biases are magnified due to the increased response of the ground surface to radiative cooling… (bolding added)”
Milewska and Hogg seem to know about B&R’s caution concerning bias offset subtraction. One is led to wonder whether EFS_Junior worried about whether any bias removal might degrade his data further when he compiled his temperature series. Did he know even the average magnitude of any systematic biases in his data? Well, but of course the reported accuracy was (+/-)0.1 C.
Milewska and Hogg don’t report any attempt to quantify the field resolutions of their YSI sensors in Stevenson screens by comparison with a high-precision sensor like the R. M. Young aspirated probe, but do mention the manufacturer’s estimated field resolution. Thus, Milewska and Hogg: “The sensors are generally reliable with manufacturer specified accuracy, which is the closeness of the agreement between the result of a measurement and the “true” value, of (+/-)0.3°C.”
So, Canadian AWOS and RTD sensors in Stevenson screens are admitted to have a field accuracy of (+/-)0.3 C. If we believe Hubbard and Lin’s actual field test, they’re likely to have an accuracy of no better than (+/-)0.5 C. But, they were reported to (+/-)0.1 C, and we’re told that was apparently good enough to trust uncritically.
EFS_Junior reports that arctic “areas are warming at ~0.25C/year…“, which is 0.05 C inside the estimated inaccuracy envelope reported by Milewska and Hogg, and half the 1-sigma uncertainty reported by Hubbard and Lin. So, EFS_Junior’s “(+/- 0.15 C/year)” is clearly over-optimistic.
Note that the (+/-)0.3 C is an accuracy measure, not a precision measure. There is no reason to think that in the field the true inaccuracy is symmetric about a mean of zero. An assumption that the field inaccuracy diminishes as 1/(sqrtN) is empirically unjustified.
Finally, point 5: EFS_Junior wrote, “Bottom line? If the total error was as claimed, I would never be able to extract a very real low frequency temperature trend line signature with a resulting R^2= 0.99.”
Reply 5: EFS_Junior emailed me about this earlier. In reply, I pointed out that if sensor measurement error = e_tot = e_sys + e_ran, then averaging daily temperatures would make e_ran diminish as 1/(sqrtN). However, e_sys remains undiminished, and would just factor into the mean. The systematic error would be completely invisible, and hide its contribution within the data. That is, without prior methodological testing, data + systematic error looks just like data + no error. Apparently EFS_Junior found this very standard caution unconvincing.
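The decomposition e_tot = e_sys + e_ran can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the true temperature is held constant, e_sys is modelled as a fixed offset (the simplest possible case, not the time-varying error described elsewhere in this thread), and the values of `TRUE_TEMP`, `E_SYS`, and `SIGMA_RAN` are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_TEMP = 15.0   # hypothetical "true" air temperature, deg C
E_SYS = 0.34       # hypothetical constant systematic offset, deg C
SIGMA_RAN = 0.2    # hypothetical random-error standard deviation, deg C

def mean_error(n):
    """Error of the n-reading mean relative to the true temperature."""
    readings = [TRUE_TEMP + E_SYS + random.gauss(0.0, SIGMA_RAN)
                for _ in range(n)]
    return statistics.fmean(readings) - TRUE_TEMP

for n in (10, 1000, 100000):
    # The random part shrinks with n; the offset stays in the mean.
    print(n, round(mean_error(n), 3))
```

In this toy case the error of the mean settles near the offset rather than near zero as the number of readings grows, and nothing in the averaged readings themselves reveals that the offset is there.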
In the case of data distorted by resident e_sys, any trend that emerges would be contaminated by the uncompensated systematic error. This error may even reflect the degradation of the sensor over time. Any “true” trend may be greater or smaller than the observed trend, but whatever the true trend is, it’s unknown when systematic error is uncompensated.
Even comparing trends from adjoining stations, to show “homogeneity” of data, gives no reason to dismiss the impact of systematic error. Uncontrolled environmental variables can be regional as much as local. Regionally extensive environmental variables can impact regional sensors in analogous systematic ways, to impose analogous systematic errors. In data averaged over longer times, one might expect the systematic effects of regional environmental variables to emerge most strongly. It’s conceivable that the systematic effects that follow meteorological trends (e.g., of trends in insolation or wind) could impose analogous but spurious low-frequency trends on independent data sets from stations scattered across a contiguous region.
This possibility has never been tested — at least to my knowledge after searching the literature. So, EFS_Junior’s claim of “never be able to extract…” is without empirical merit.
On the other hand, I do know of one sensor test of a different sort that does indicate the possibility of regionally extensive environmental systematic effects on temperature sensors. At some point I intend to write this up, along with some related analysis.
The rest of EFS_Junior’s critique was polemical, with an appeal to authority and considerable gratuitous disrespect directed toward E&E. There’s no need to reply to that.

January 23, 2011 10:43 pm

EFS_Junior: “I seriously doubt that the global mean temperature curve will just go away because of one additional PhD. thesis.”
Newtonian physics was overturned by one additional patent clerk. Other examples abound.

January 24, 2011 9:38 am

I’m reposting this from tAV. Anthony expressed a similar sentiment at the head of this post:
I’d like to add that though I’ve offered to send reprints on request, I’d ask those of you who have access to academic accounts, or who have fine personal incomes, to please purchase the article from Energy and Environment, here.
Energy and Environment has proved to be one of the few remaining journals in climate science where one can anticipate a uniformly dispassionate and even constructive critical review. In the principled stand of its editor, Dr. Sonja Boehmer-Christiansen and its Multi-Science publisher, Dr. William Hughes, E&E has been thoroughly in support of open and transparent science when so many have abridged it.
The journal merits support, and deserves to recover its fair profit for publishing my article.
Thanks very much,
Pat

EFS_Junior
January 24, 2011 2:38 pm

Pat Frank says:
January 23, 2011 at 10:34 pm
Reply 5: EFS_Junior emailed me about this earlier. In reply, I pointed out that if sensor measurement error = e_tot = e_sys + e_ran, then averaging daily temperatures would make e_ran diminish as 1/(sqrtN). However, e_sys remains undiminished, and would just factor into the mean. The systematic error would be completely invisible, and hide its contribution within the data. That is, without prior methodological testing, data + systematic error looks just like data + no error. Apparently EFS_Junior found this very standard caution unconvincing.
In the case of data distorted by resident e_sys, any trend that emerges would be contaminated by the uncompensated systematic error. This error may even reflect the degradation of the sensor over time. Any “true” trend may be greater or smaller than the observed trend, but whatever the true trend is, it’s unknown when systematic error is uncompensated.
Even comparing trends from adjoining stations, to show “homogeneity” of data, gives no reason to dismiss the impact of systematic error. Uncontrolled environmental variables can be regional as much as local. Regionally extensive environmental variables can impact regional sensors in analogous systematic ways, to impose analogous systematic errors. In data averaged over longer times, one might expect the systematic effects of regional environmental variables to emerge most strongly. It’s conceivable that the systematic effects that follow meteorological trends (e.g., of trends in insolation or wind) could impose analogous but spurious low-frequency trends on independent data sets from stations scattered across a contiguous region.
This possibility has never been tested — at least to my knowledge after searching the literature. So, EFS_Junior’s claim of “never be able to extract…” is without empirical merit.
_____________________________________________________________
You need to simplify your reply above. As it stands now I don’t even know what you mean.
The bottom line is that 23 HAWS stations all show (roughly) the same low frequency trend lines. And that’s just a plain cold hard fact. 23 HAWS stations all showing the same systematic errors? I think NOT!
Further, you claim that systematic error “would be completely invisible.” Well then, that works for me! This is the same as stating that systematic error averages out to zero for each station or to the ensemble average of any N stations.
Good to know that, thanks for providing the necessary confirmation.
Your second paragraph makes no sense whatsoever, and is quite obviously circuitous. If this argument were actually true, then no analyses could ever be conducted on any raw data set whatsoever. It’s akin to saying collect the raw data, then do nothing with the raw collected data. Sad, really sad.
Your third paragraph plays with the concept of region, so is the Northern Hemisphere (NH) a region? Because I’m quite sure that I can show some autocorrelation/cross correlation with any two NH regional stations simply due to similar seasonal and diurnal behaviors (note that these would have to be cross correlated first to determine the diurnal lag coefficient). This also plays somewhat into the LLN, as it would be virtually impossible (p ~ 0) to take multiple stations from any region (no matter how small or how large) with the expectation (p ~ 1) that all these stations used the exact same sensors over time, have identical systematic errors over time, et cetera.
Autocorrelations exceeding 0.8, 0.9, 0.99, 0.999 with slopes of 0.98, 0.99, 1.00, 1.01, 1.02 for any two stations are empirical proof that there is an underlying relationship between them, and that whatever errors exist in these measurements are not systematic in nature, to the degree, or in the manner, that the author claims.
It’s akin to saying “you can’t do that because x, y, z, …” but then doing so anyway, and then producing results with R^2 > 0.99 (and no, the “correlation does not imply causation” argument holds no water here, as the comparison is between temperature records from two random stations; one is not causing the other, both are measuring a nearly identical response to TBD external forcings).
In your 4th paragraph/sentence you again make a baseless statement, as empirical evidence abounds as to the sameness of the global temperature trends (land based and satellite eras, a random selection of a small group of land based records, say, for example, 10 < N < 100), that these trends are very similar for the 23 HAWS as well as to larger networks (but with lower trend lines for the global mean temperature trends), that these 23 HAWS trend lines are, in the aggregate, quite similar in shape to the global trend lines.
In short, saying something is "without empirical merit" flies directly in the face of the wealth of empirical evidence that does exist with respect to low frequency temperature trendlines.
So obviously, the author clearly does not understand the issues at hand, thus the author's so called "theory" is bogus, unnecessary, and truly without merit.
In closing, any temperature record has a total error associated with it, whatever that error is, it does not stop one from extracting very real low frequency information with an associated very high degree of statistical confidence.
If this were not true, if this were not the case, then all low frequency temperature trend lines would indeed exhibit truly random behaviors, with R^2 approaching zero in all cases. That we can demonstrate well defined low frequency trend lines with associated high degrees of statistical confidence, and that we can choose large N and still obtain well defined low frequency trend lines with associated high degrees of statistical confidence, suggests unambiguously, based on the LLN, that these empirical results are real with probability approaching one (p ~ 1).

January 26, 2011 9:58 pm

Reply, Part 1.
This is Part 1 of what will be a point-by-point reply to EFS_Junior’s follow-up critique of my article. I’ll begin with what he did discuss, and finish with what he did not. EFS_Junior’s comments will be in quotation marks.
EFS Point 1. “You need to simplify your reply above. As it stands now I don’t even know what you mean.”
Reply 1. I wish you’d been more specific. But assuming it’s the bit about e_tot: the total instrumental error in any single measurement, e_tot, will be the sum of the random error e_ran and the systematic error e_sys. Each kind of error contributes a spurious magnitude to the observation.
In measuring a temperature, for example, the magnitude of the observed temperature “t_i” is a sum of the “true” magnitude, “tau_i” plus the magnitudes of all the errors affecting that particular measurement.
So for any measured temperature, t_i = tau_i + e_ran_i + e_sys_i, where “i” is the measurement index (i = 1,2, …, n).
As usual, e_ran is stationary by definition. When multiple measurements of temperature are averaged, the total of e_ran will diminish as 1/sqrtN in the mean temperature, T_bar. The details of this are discussed in Cases 1-3 in my paper.
However, e_sys is not stationary. It typically arises from uncontrolled variables that are of unknown magnitude and unknown variability. Therefore, e_sys can vary between sequential measurements, doesn’t have a mean of zero, and permanently impacts the magnitude of the measurement.
The e_sys of each observation enters into any mean of multiple observations. In the mean, the total of e_sys goes as sqrt{[1/(N-1)]*[sum-over-N(e_sys_i)^2]}, and never diminishes to zero. At large N, the total of e_sys approaches (e_sys)avg, and the mean temperature is T_bar(+/-)(e_sys)avg.
The only way to know the magnitude of (e_sys)avg in a set of measurements is to have done prior tests of the methodology, using the same instrument to measure precisely known standards under conditions as close as possible to those to be used for the experimental measurements.
This is the message in the statistical sources mentioned in my previous post.
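For readers who prefer the notation typeset, the error decomposition in this reply can be written as follows. This is a restatement, not a new result. The subscript layout is a choice made here, and the last line assumes that the e_sys expression is meant as a square root over the whole averaged sum, which is the reading consistent with the s^2 formula quoted from Bevington and Robinson.

```latex
\begin{align}
t_i &= \tau_i + e_{\mathrm{ran},i} + e_{\mathrm{sys},i},
      \qquad i = 1, 2, \ldots, N \\
\bar{T} &= \frac{1}{N}\sum_{i=1}^{N} t_i
         = \frac{1}{N}\sum_{i=1}^{N} \tau_i
         + \frac{1}{N}\sum_{i=1}^{N} e_{\mathrm{ran},i}
         + \frac{1}{N}\sum_{i=1}^{N} e_{\mathrm{sys},i} \\
\sigma_{\mathrm{sys}} &= \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}
         \left(e_{\mathrm{sys},i}\right)^{2}}
\end{align}
```

If the random errors are independent with zero mean, the second term of the mean shrinks as 1/sqrt(N). No such guarantee applies to the third term, which is the distinction drawn above.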
EFS Point 2. “The bottom line is that 23 HAWS stations all show (roughly) the same low frequency trend lines. And that’s just a plain cold hard fact. 23 HAWS stations all showing the same systematic errors? I think NOT!”
Reply 2. First, I refer you to the classic paper of Hansen and Lebedeff [1]. I’m sure you have a copy. Please look at their Figure 3, where Hansen and Lebedeff show pair-wise correlation coefficients of temperature measurements among many hundreds of surface stations.
Here it is, in their own words: “At middle and high latitudes the correlations approach unity as the station separation becomes small; the correlations fall below 0.5 at a station separation of about 1200 km, on the average. At low latitudes the mean correlation is only 0.5 at small station separation. The distance over which strong correlations are maintained at high latitudes probably reflects the dominance of mixing by large-scale eddies. At low latitudes the most active atmospheric dynamical scales are smaller, but apparently there are also substantial coherent temperature variations on very large scales (for example, due to the quasi-biennial oscillation, Southern Oscillation, and El Nino phenomena), which account for the slight tendency toward positive correlations at large station separations. We examined the dependence of the correlations on the direction of the line connecting the two stations. For the regions for which this check was performed, the United States and Europe, no substantial dependence on direction was found. For example, in these regions the average correlation coefficient for 1000-km separation was found to be within the range 0.5-0.6 for each of the directions defined by 45 [degree] intervals.”
In short, Hansen and Lebedeff show large correlations of recorded temperature compared between pair-wise surface stations. The correlations extend for considerable distances. Temperature correlation becomes more than 0.8 at distances less than 250 km for latitudes greater than 23 degrees. At distances less than 250 km between the equator and 23 degrees, regional correlations are ~0.5.
Next, we turn to the work of Hubbard and Lin [2]. They tested the temperature measurements “of dual temperature systems for ASOS, MMTS, and Gill shield (with HMP45C sensor), as well as one aspirated (ASP-ES) and one non-aspirated (NON-ES) shield from Eastern Scientific Inc. (with HMP45C), and one CRS (with HMP45C)” against the simultaneous measurements from a high-precision R. M. Young probe in an aspirated shield. “CRS” is Cotton Regional Shelter (the Stevenson Screen).
The three-panel Figure 2 in their paper shows the distribution of bias temperatures during day, night, and average for all six of the tested sensors.
Figure 2 legend is: “Statistical distributions of (a) daytime air temperature biases, (b) nighttime air temperature biases, and (c) overall air temperature biases for all air temperature systems in the measurements.” The data include thousands of temperatures (April through August, 2000), recorded at 0.1 Hz for 5 minutes each, representing 30 measurements per recorded temperature. So, random error was diminished by 1/5.5 in each reported temperature.
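The "1/5.5" figure follows from the sampling numbers just quoted. As a quick check of the arithmetic, assuming the 30 readings behind each recorded temperature carry independent, zero-mean random error (variable names are illustrative):

```python
import math

sample_rate_hz = 0.1       # one reading every 10 seconds
window_seconds = 5 * 60    # 5-minute averaging window

n_samples = round(sample_rate_hz * window_seconds)  # readings per recorded temperature
reduction = math.sqrt(n_samples)                    # random error shrinks by 1/sqrt(n)

print(n_samples, round(reduction, 2))
```

The reduction applies only to the random component; it says nothing about any systematic part of the error.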
The average bias envelope for the HMP45C probe in the CRS shield was asymmetric. It showed a large excess of too-warm temperatures. The bias mean(+/-)sigma was 0.34(+/-)0.53 C; however, this isn’t the whole story.
A distribution of temperatures skewed toward warm inserts a variable warm bias into the temperature record during the thousands of measurements. This systematic temperature bias was primarily the impact on the sensor of solar loading and wind speeds.
This warm bias is a systematic error in the instrument that will not be revealed by any test looking for errors due to random artifacts, or any changes external to the instrument.
If the HMP45C/CRS warm bias is not distributed randomly in time, it will induce a spurious trend into the temperature data. A spurious warming trend will show up if the systematic bias is more pronounced late in the data set. Or, counterintuitively, if a warm systematic bias is asymmetrically distributed into the early part of the data set, it can produce a spurious cooling trend.
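A toy example makes the sign of the spurious trend concrete. This is a sketch under stated assumptions, not an analysis of any real record: the true temperature is held perfectly flat, the warm bias is a fixed step confined to one half of the series, and `N`, `TRUE_TEMP`, `WARM_BIAS`, and `ols_slope` are hypothetical names and values.

```python
def ols_slope(ys):
    """Ordinary least-squares slope of ys against the index 0..n-1."""
    n = len(ys)
    x_bar = (n - 1) / 2
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(ys))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den

N = 100            # number of time steps (hypothetical)
TRUE_TEMP = 10.0   # flat "true" temperature, deg C (hypothetical)
WARM_BIAS = 0.5    # warm bias applied to half the record, deg C (hypothetical)

unbiased = [TRUE_TEMP] * N
late_bias = [TRUE_TEMP + (WARM_BIAS if i >= N // 2 else 0.0) for i in range(N)]
early_bias = [TRUE_TEMP + (WARM_BIAS if i < N // 2 else 0.0) for i in range(N)]

print("no bias:   ", ols_slope(unbiased))
print("late bias: ", ols_slope(late_bias))    # positive slope: spurious warming
print("early bias:", ols_slope(early_bias))   # negative slope: spurious cooling
```

The same warm bias gives a fitted slope of opposite sign depending only on where in the record it falls, even though the underlying temperature never changes.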
Figure 2 of Hubbard and Lin shows that every single one of the six sensor systems they examined showed an excess bias asymmetry toward too warm temperatures (also shown in their Table 1).
Now we combine the results of Hansen and Lebedeff with those of Hubbard and Lin:
1. High correlations of temperatures exist among regionally adjacent surface stations, extending over at least 250 km.
2. Regional correlations of temperature require regional correlations of climate. The results of Hansen and Lebedeff mean climate is correlated over distance.
3. Local temperatures are governed by local climate (sun and wind, and even snow albedo [3]).
4. Systematic biases are induced into air temperature measurements by the same climatic effects that produce the air temperature itself.
5. Regionally correlated air temperatures therefore require regionally correlated systematic effects.
6. Temperature sensors of regionally adjacent surface stations systematically biased by correlated climates produce correlated systematic biases.
7. Correlated systematic errors will be present in the air temperature records of regionally adjacent surface stations.
8. The tested sensors were of different configurations and all of them displayed systematic error similarly skewed to excess warm temperatures.
9. Surface stations with different types of sensors will produce correlated systematic errors.
Conclusion 1: Systematic errors are inevitably present in air temperature measurements.
Conclusion 2: These systematic errors will certainly propagate into the recorded air temperature time series of surface stations.
Conclusion 3: Systematic errors in surface air temperature records will be correlated among regional surface stations.
Therefore: Spurious temperature trends within the data of one surface station will correlate with similar spurious trends in the temperature data of regionally adjacent surface stations.
Observation: This strong likelihood of regionally correlated temperature biases, which is predicted by published work, has completely escaped the notice of the climate science community.
Finally: Spurious trends due to systematic error that appear in data, and that correlate among the records of adjacent surface stations, will look exactly like real trends.
Examining correlations within temperature-time series trends from adjacent surface stations will not reveal the contamination of the temperature series by systematic error.
Correlations of temperature-time series among adjacent surface stations do not, and in-and-of-themselves will never, disprove the existence of systematic error in the data.
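The point that inter-station correlation cannot by itself rule out shared systematic error can be sketched with two synthetic stations. Everything here is an assumption made for illustration: the true temperature is constant, both stations receive the same slowly varying bias (a sine wave standing in for some regional driver), and each adds its own independent random noise. It is a toy model, not evidence about how real stations behave.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

N = 500
TRUE_TEMP = 10.0  # constant "true" temperature, deg C (hypothetical)

# Hypothetical regionally shared bias, identical at both stations.
shared_bias = [0.5 * math.sin(2 * math.pi * i / 100) for i in range(N)]

# Each station sees the true temperature, the shared bias, and its own noise.
station_a = [TRUE_TEMP + b + random.gauss(0.0, 0.1) for b in shared_bias]
station_b = [TRUE_TEMP + b + random.gauss(0.0, 0.1) for b in shared_bias]

r = pearson(station_a, station_b)
print("correlation between stations:", round(r, 3))
```

Because the true temperature never varies in this setup, all of the correlation between the two records comes from the shared bias. A high correlation coefficient is what a shared real signal would produce as well, so the statistic alone does not distinguish the two cases.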
It’s particularly relevant to your HAWS data that Hansen and Lebedeff noted the strongest regional temperature correlations in the highest latitudes. That means the systematic error will also be most strongly correlated at the same high latitudes where HAWS data were collected.
Hansen and Lebedeff compared station temperatures, not anomalies. Their temperatures were large in magnitude compared to the (+/-)0.46 C lower limit of error. Therefore their correlations are real. However, anomalies are of the same magnitude as the (+/-)0.46 C lower limit. This is evidenced by the 0.25(+/-)0.15 C HAWS trend you mentioned.
As the HAWS trend is smaller than the 1-sigma lower limit systematic error, there is no reason to believe it is real.
This leads to:
EFS Point 3. “Further, you claim that systematic error “would be completely invisible.” Well then, that works for me! This is the same as stating that systematic error averages out to zero for each station or to the ensemble average of any N stations.”
Reply 3. “Invisible” means invisible, not ‘averages out to zero.’ Invisible means systematic error will not show up in any statistical test posterior to collecting the data.
Data contaminated with systematic error will look like real data, except that it will be false data. In the case of temperature, the magnitudes will be wrong, and the trends may be spurious. How the spurious data will appear depends on the magnitudes, distributions, and changeability of the uncontrolled variables that affected the sensor while the temperatures were being measured.
Overcoming these problems is why Hubbard and Lin have spent so much effort developing real time filtering methods for surface station temperatures.
EFS Point 4: “Your second paragraph makes no sense whatsoever, and is quite obviously circuitous. If this argument were actually true, then no analyses could ever be conducted on any raw data set whatsoever. It’s akin to saying collect the raw data, then do nothing with the raw collected data. Sad, really sad.”
Reply 4. We can agree on the “sad.” By now, you should have realized that surface air temperatures contaminated by systematic error are useless for detailed comparisons. You should also have realized by now that field air temperatures are undoubtedly contaminated by systematic error.
It means that the conclusion of my E&E paper obtains, namely that, “no analyses could ever be conducted on any raw [surface air temperature] data set whatsoever” that are more accurate than about (+/-)0.46 C.
Part 2 will be forthcoming.
References:
1. Hansen, J. and Lebedeff, S., Global Trends of Measured Surface Air Temperature, J. Geophys. Res. (1987) 92 (D11), 13345-13372.
2. Hubbard, K.G. and Lin, X., Realtime data filtering models for air temperature measurements, Geophys. Res. Lett. (2002) 29(10), 1425 1-4; doi: 10.1029/2001GL013191.
3. Lin, X., Hubbard, K.G. and Baker, C.B., Surface Air Temperature Records Biased by Snow-Covered Surface, Int. J. Climatol. (2005) 25, 1223-1236; doi: 10.1002/joc.1184.

Doug Proctor
January 27, 2011 8:29 am

During the course of the year the insolation that heats the planet, averaged out at about 288 W/m2, varies by about +/- 20 W/m2 due to orbital eccentricity and albedo. These changes (+/- 15%) are global but not randomly distributed in space or time. What the actual impact on regions is, is not known (to my knowledge).
We worry about instrumental or measurement accuracy (and precision), yet what of the impact of long-term local variations in albedo and timing of albedo? Do these disproportionately alter temperature readings so as to distort the “global” temperature? We know a 1.5% change in albedo equals the impact of doubled CO2 (by some calculations). Can a 3% change in one area, say, the Arctic, cause a 0.9K apparent increase in a portion of the year’s global temperature?
The impact of non-random variations in the timing of in-out energy looks like it has a far greater possible impact than measurement accuracy and precision. Trenberth looks for 0.85 W/m2 of “missing” heat when the solar insolation value just got knocked down by 1.6 W/m2 (correctly or not). Our ability to ACCURATELY know what is going on is greater than our ability to understand the significance of the variance from the long term “normal”, at least at the level of variance we are seeing.

January 30, 2011 6:19 pm

This is Part II of my reply to the critique of my paper by EFS_Junior. Part I of my reply is here.
EFS Point 5. “Your third paragraph plays with the concept of region, so is the Northern Hmisphere (NH) a region? Because I’m quite sure that I can show some autocorrelation/cross correlation with any two NH regional stations simply due to similar seasonal and diurnal behaviors (note that these would have to be cross correlated first to determine the diurnal lag coefficient).”
Reply 5. “Region” can be empirically defined in terms of the 1987 results of Hansen and Lebedeff [1], who showed air temperature correlations across 1200 km. For convenience, we can define as a region, locales where correlation is, say, 0.68 (1-sigma) or better. For latitudes north or south of 23 degrees, that might be 500 km. For zero to 23 degrees, that might be 100 km. Of course, one can define them as one likes, but an evidence-based qualifier is required.
As pointed out in Reply Part I, instrumental systematic error is produced by the same uncontrolled environmental variables that determine surface air temperature. So, systematic errors in the temperature record will be about as regionally correlated as the temperature measurements themselves.
EFS Point 6: “This also plays somewhat into the LLN, as it would be virtually impossible (p ~ 0) to take multiple stations from any region (no matter how small or how large) with the expectation (p ~ 1) that all these stations used the exact same sensors over time, have identical systematic errors over time, etceteras.”
Reply 6. Again as pointed out in Reply Part I, the field calibrations from Hubbard and Lin [2] show that sensors in six very different shields exhibit the same direction of systematic skew in measured temperatures (see also [3-5]). Therefore, it is an empirically justifiable surmise that regional systematic errors will remain correlated no matter the sort of temperature sensor employed. Your analysis is therefore not appropriate to the question.
EFS Point 7: “Autocorrelations exceeding 0.8, 0.9, 0.99, 0.999 with slopes of 0.98, 0.99, 1.00, 1.01, 1.02 for any two stations, is empirical proof that there is an underlying relationship between them, that whatever errors exist in these measurements, that these errors are not systematic in nature, to the degree, or in the manner, that the author claims.”
Reply 7. Climate stations measure surface air temperatures (among other observables). We know that surface air temperatures are regionally correlated. Clearly, then, any long-term trends in air temperatures will also be regionally correlated. The total average lower limit systematic-plus-station-error of (+/-)0.32 C is much smaller than the magnitude of the recorded air temperatures. Therefore, although the 20th century temperature measurements are contaminated with at least (+/-)0.32 C of error, our knowledge of these air temperatures is not seriously impacted by that lower limit of error.
However, the (+/-)0.46 C lower limit of error in an annual anomaly is not smaller than the air temperature anomalies themselves, which are calculated by subtracting a climate normal from station air temperatures. That means any anomaly trend in temperatures smaller than (+/-)0.46 C, over the time-range of interest, is not physically distinguishable from zero.
As noted above, the regional correlation of systematic error in surface air temperature, guaranteed by the systematic correlation of surface air temperature itself plus the analogous response errors of various air temperature sensors, will also guarantee that spurious anomaly trends smaller than (+/-)0.46 C will be regionally correlated. This spurious regional correlation will mislead any scientist who neglects the effects of systematic error, or anyone who fails to understand that systematic errors follow uncontrolled systematic climate variables.
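The mechanism described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a reproduction of any real measurement: the true anomaly at both hypothetical stations is zero, the shared "driver" (an uncontrolled regional variable that induces sensor error), its 0.4 C drift, the sensor sensitivities, and the noise levels are all invented for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate(years=100, seed=1):
    """Return (correlation with shared systematic error,
               correlation with independent random error).
    The true anomaly is zero everywhere, so each series is pure error."""
    rng = random.Random(seed)
    # Hypothetical error induced by a shared regional variable,
    # drifting slowly from 0 to 0.4 C over the record.
    driver = [0.4 * t / (years - 1) for t in range(years)]
    # Case 1: both sensors respond to the same driver (assumed
    # sensitivities 1.0 and 0.8) plus a little independent noise.
    sys_a = [1.0 * d + rng.gauss(0.0, 0.03) for d in driver]
    sys_b = [0.8 * d + rng.gauss(0.0, 0.03) for d in driver]
    # Case 2: each sensor has only independent random error.
    rnd_a = [rng.gauss(0.0, 0.12) for _ in range(years)]
    rnd_b = [rng.gauss(0.0, 0.12) for _ in range(years)]
    return pearson(sys_a, sys_b), pearson(rnd_a, rnd_b)

corr_systematic, corr_random = simulate()
print(f"shared systematic error: r = {corr_systematic:.2f}")
print(f"independent random error: r = {corr_random:.2f}")
```

In this toy setup the two error-only records correlate strongly when the error follows a shared variable, and hardly at all when the error is independent noise. A high inter-station correlation therefore does not, by itself, show which case applies.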
As the systematic errors entering the measurements of 20th century air temperatures are unknown, it is impossible to correct 20th century air temperatures for these errors. It is only possible to measure contemporary systematic errors by monitoring sensor systems similar to 20th century systems, in order to estimate an average representative uncertainty due to the systematic error present in the 20th century measurements. This experiment would require deploying high-precision temperature sensors at multiple (≥100) locations worldwide to monitor “true” air temperatures, building a precision database against which concurrently recorded standard sensor temperature measurements could be compared. See reference [2] for what such an experiment might look like.
EFS Point 8. “It’s akin to saying “you can’t do that because x, y, z, …” but than doing so anyway…”
Reply 8. This point is answered in Reply 7.
EFS Point 9. “In your 4th paragraph/sentence you again make a baseless statement, as empirical evidence abounds as to the sameness of the global temperaure trends (land based and satellite eras, a random selection of a small group of land based records, say, for example, 10 < N < 100), that these trends are very similar for the 23 HAWS as well as to larger networks (but with lower trend lines for the global mean temperature trends), that these 23 HAWS trend lines are, in the aggrigate, quite similar in shape to the global trend lines.”
Reply 9. As noted in Reply 7, regional trends in temperature do not necessarily indicate the credibility of regional anomaly trends. Climate stations globally all use similar temperature sensors. Over the bulk of the 20th century, these were LIG thermometers in CRS shelters, or worse. There is every reason to think that the systematic error contaminating their temperature measurements would vary as the air temperature. When regional and global annual anomalies are calculated, the trends should usually (but perhaps not always) be preserved, except that any trend smaller than (+/-)0.46 C would be indistinguishable from spurious (zero).
Your empirical correlations would be credible indicators of real annual anomaly trends only if all measurement errors were random.
EFS Point 10. “In closing, any temperature record has a total error associated with it, whatever that error is, it does not stop one from extracting very real low frequency information with an associated very high degree of statistical confidence.”
Reply 10. Only when the low frequency trend is greater than (+/-)0.46 C. And that emergence would only grant 68% confidence. You’d need a 0.92 C trend to attain the usual 95% (p<0.05) statistical confidence.
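The arithmetic behind those two thresholds can be sketched as follows. The sketch assumes the uncertainty behaves like a Gaussian standard deviation (an assumption made only for this illustration) and uses the round factor of 2 for the 95% level, as in the reply above; the function name is hypothetical.

```python
from statistics import NormalDist

SIGMA_ANOMALY = 0.46  # C, the 1-sigma lower limit quoted in the thread

def two_sided_coverage(k):
    """Probability that a Gaussian variable lies within k sigma of its mean."""
    std_normal = NormalDist()
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2):
    threshold = k * SIGMA_ANOMALY
    print(f"{k}-sigma: trend must exceed {threshold:.2f} C "
          f"(coverage {two_sided_coverage(k):.1%})")
```

Under the Gaussian assumption, 1 sigma corresponds to roughly 68% coverage and 2 sigma to roughly 95%, which is where the 0.46 C and 0.92 C figures come from.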
EFS Point 11. “If this were not true, if this were not the case, than all low frequency temperature trand lines would indeed exhibit truly random behaviors, with R^2 approaching zero in all cases.”
Reply 11. Not when the error is systematic and regionally correlated.
EFS Point 12. “That because we can demonstrate well defined low frequency trend lines with associated high degrees of statistical confidence, that we can choose large N, and still obtain well defined low frequency trend lines with associated high degrees of statistical confidence, suggests umabigiously, based on the LLN, that these emperical results are real with probability approaching one (p ~ 1).”
Reply 12. 1/sqrtN reduction of error works only with random error.
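A minimal sketch of that distinction, under simplifying assumptions: the random error is Gaussian with a 0.2 C standard deviation (borrowing the station error figure), and the systematic error is modeled as a constant 0.25 C offset shared by every reading. The offset value and the constancy are hypothetical; real systematic error varies with the uncontrolled variables, but the constant case is enough to show that averaging does not remove it.

```python
import random
import statistics

def rms_error_of_mean(n_readings, random_sd, shared_bias, trials=2000, seed=0):
    """RMS error of the N-reading average, relative to a true value of 0."""
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(trials):
        # Every reading carries the same bias plus its own random error.
        readings = [shared_bias + rng.gauss(0.0, random_sd)
                    for _ in range(n_readings)]
        squared_errors.append(statistics.fmean(readings) ** 2)
    return statistics.fmean(squared_errors) ** 0.5

for n in (1, 10, 100):
    random_only = rms_error_of_mean(n, 0.2, 0.0)
    with_bias = rms_error_of_mean(n, 0.2, 0.25)
    print(f"N={n:3d}  random only: {random_only:.3f} C   "
          f"random + shared bias: {with_bias:.3f} C")
```

With random error alone, the error of the average shrinks roughly as 1/sqrt(N). With the shared offset added, the error of the average levels off near the size of the offset no matter how large N becomes.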
Your entire analysis rests upon the assumption that random error exhausts all the instrumental error present in the surface air temperature record. It doesn’t. Unfortunately, the climate scientists who compile surface air temperature-time series have thoroughly neglected the systematic error suffered by temperature sensors. This neglect has led them to false confidence and to publish empirically unjustifiable conclusions.
The unappreciated error in the surface air temperature has further misled climate modelers to place false confidence in the temperature calibration accuracy of GCMs, and even further has led to false precision when so-called proxy paleo-temperature reconstructions are normalized to the instrumental record.
Final Part III is forthcoming.
References:
1. Hansen, J. and Lebedeff, S., Global Trends of Measured Surface Air Temperature, J. Geophys. Res., 1987, 92 (D11), 13345-13372.
2. Hubbard, K.G. and Lin, X., Realtime data filtering models for air temperature measurements, Geophys. Res. Lett., 2002, 29 (10), 1425 1-4; doi: 10.1029/2001GL013191.
3. Hubbard, K.G., Lin, X., Baker, C.B. and Sun, B., Air Temperature Comparison between the MMTS and the USCRN Temperature Systems, J. Atmos. Ocean. Technol., 2004, 21, 1590-1597.
4. Lin, X. and Hubbard, K.G., Sensor and Electronic Biases/Errors in Air Temperature Measurements in Common Weather Station Networks, J. Atmos. Ocean. Technol., 2004, 21, 1025-1032.
5. Lin, X., Hubbard, K.G. and Meyer, G.E., Airflow Characteristics of Commonly Used Temperature Radiation Shields, J. Atmos. Ocean. Technol., 2001, 18 (3), 329-339.

Brian H
January 31, 2011 5:03 am

Even the 95% confidence level is soft and squishy. Suitable for Psychology and other opinion-contaminated fields. It is “accepted” in Climate Science because it’s the only standard it has a hope of (occasionally) reaching. Real science goes for 5-6 sigma confidence. Never, ever, ever, will Climate Science attempt that.

February 8, 2011 9:37 pm

Part III of my reply.
Part I is here.
Part II is here.
This will be mercifully short, and will only focus on points EFS_Junior neglected in his follow-up.
In his first criticism, here, EFS_Junior wrote that separating error into its independent terms is “bogus, unnecessary, and without merit.”
In Part I, I described three authoritative sources doing exactly that, namely separating instrumental error into independent terms in order to assess their properties and contributions to total error. One of those sources was the surface temperature article by Brohan, et al. (2006), the paper I criticized in my E&E article.
In his replies to my post, EFS_Junior was silent on this point. He did not acknowledge these sources or withdraw his opinion.
EFS_Junior remained silent on his mistaken claim that the instrumental uncertainty sigma does not propagate through temperature time series.
EFS_Junior remained silent on his mistaken claim that my paper did not discuss bias offset errors.
EFS_Junior remained silent on his mistaken claim that my published analysis assumed all errors are stationary (“symmetric about a zero mean”).
He also remained silent on the obvious fact that his own analysis assumed all instrumental error is stationary and therefore averages away.
In his reply, EFS_Junior mistakenly interpreted the invisibility of systematic error to a test for random error, to mean that systematic error averages away.
This not only shows unfamiliarity with systematic error, but a general unfamiliarity with experimental error. Very little of experimental error is random. Non-random error always propagates and rarely autocancels.
EFS_Junior has remained silent on this central point.
EFS_Junior was also very disparaging of E&E, which reviewed and published a paper that he obviously had trouble understanding even after two readings and my explanation. His dismissive attitude toward E&E is therefore unjustified and, so far, unretracted.
More on this point: from the Climategate emails, we know that scientists central to AGW have used reviewer privilege and pressured editors (who apparently often complied) to suppress contradictory science. Climate journals have thereby become systematically hostile to manuscripts with results contrary to AGW. This has forced some researchers to publish in alternative journals. The AGW-central scientists then exploit the use of alternative journals to politically discredit the contradictory papers and their authors, suggesting that if the science were truly sound it would be published in a climate journal (as theirs is). We know that, in science, arguments stand or fall by their internal merit, not by the paper they’re written on. So the tactic of the AGW scientists, specious disparagement, is strictly an abuse of science.
EFS_Junior’s aspersive attitude toward E&E, while surely an honest opinion, seems likely in part due to the dishonest politics of collegial disparagement prevalent in AGW circles.
At the end, the points EFS_Junior failed to address are exactly those that play into the realization that surface air temperatures, especially 20th century surface air temperatures, are systematically inaccurate. Of course, how inaccurate they are will almost certainly remain unknowable because no one ever bothered to independently monitor the uncontrolled environmental variables.
And that lack points up the further truth that 20th century air temperatures were always for no-need-to-be-very-precise local weather use. Never for precision climate studies.
And in that we see that the temperature record has been systematically abused.
Yet one more unappreciated systematic error.