The Metrology of Thermometers

For those who didn’t notice: this is about metrology, not meteorology, though meteorology uses the final product. Metrology is the science of measurement.

Since we recently had the paper from Pat Frank dealing with the inherent uncertainty of temperature measurement, which established a new minimum uncertainty of ±0.46 C for the instrumental surface temperature record, I thought it valuable to review the uncertainty associated with the act of temperature measurement itself.

As many of you know, the Stevenson Screen, aka Cotton Region Shelter (CRS), such as the one below, houses Tmax and Tmin recording mercury and alcohol thermometers.

Hanksville, UT USHCN climate monitoring station with Stevenson Screen - sited over a gravestone. Photo by surfacestations.org volunteer Juan Slayton

They look like this inside the screen:

NOAA standard issue max-min recording thermometers, USHCN station in Orland, CA - Photo: A. Watts

Reading these thermometers would seem to be a simple task. However, that’s not quite the case. Adding to the statistical uncertainty derived by Pat Frank, as we see below in this guest re-post, measurement uncertainty in both the long and short term is also an issue. The following appeared on the blog “Mark’s View”, and I am reprinting it here in full with permission from the author. There are some enlightening things to learn about the simple act of reading a liquid-in-glass (LIG) thermometer that I didn’t know, as well as some long-term issues (like the hardening of the glass) with values about as large as the climate change signal for the last 100 years, ~0.7°C – Anthony

==========================================================

Metrology – A guest re-post by Mark of Mark’s View

This post is actually about the poor quality and processing of historical climatic temperature records rather than metrology.

My main points are that in climatology many important factors that are accounted for in other areas of science and engineering are completely ignored by many scientists:

  1. Human Errors in accuracy and resolution of historical data are ignored
  2. Mechanical thermometer resolution is ignored
  3. Electronic gauge calibration is ignored
  4. Mechanical and Electronic temperature gauge accuracy is ignored
  5. Hysteresis in modern data acquisition is ignored
  6. Conversion from Degrees F to Degrees C introduces false resolution into data.

Metrology is the science of measurement, embracing both experimental and theoretical determinations at any level of uncertainty in any field of science and technology. Believe it or not, the metrology of temperature measurement is complex.

It is actually quite difficult to measure things accurately, yet most people just assume that information they are given is “spot on”. A significant number of scientists and mathematicians also do not seem to realise that the data they are working with is often not very accurate. Over the years, as part of my job, I have read dozens of papers based on pressure and temperature records where no reference is made to the instruments used to acquire the data, or to their calibration history. The result is that many scientists frequently reach incorrect conclusions about their experiments and data because they do not take into account the accuracy and resolution of the data. (It seems this is especially true in the area of climatology.)

Do you have a thermometer stuck to your kitchen window so you can see how warm it is outside?

Let’s say you glance at this thermometer and it indicates about 31 degrees centigrade. If it is a mercury or alcohol thermometer you may have to squint to read the scale. If the scale is marked in 1c steps (which is very common), then you probably cannot interpolate between the scale markers.

This means that this particular thermometer’s resolution is 1c, which is normally stated as plus or minus 0.5c (+/- 0.5c).
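
To see what a 1c scale resolution implies, here is a minimal sketch (the observer model – rounding to the nearest scale marking – is my assumption, not part of the original post): the reading error never exceeds half a division.

```python
# A 1c-graduated thermometer forces the observer to record the nearest
# whole degree, so the quantisation error is bounded by +/-0.5c.
def read_thermometer(true_temp_c):
    """Simulate recording to the nearest 1c scale marking."""
    return round(true_temp_c)

# Sweep true temperatures from 20.00c to 39.99c in 0.01c steps.
errors = [read_thermometer(t / 100) - t / 100 for t in range(2000, 4000)]
worst = max(abs(e) for e in errors)
print(worst)  # worst-case reading error: 0.5
```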

This example of resolution assumes the temperature is observed under perfect conditions and that you have been properly trained to read a thermometer. In reality you might glance at the thermometer, or you might have to use a flash-light to look at it, or it may be covered in a dusting of snow, rain, etc. Mercury forms a pronounced meniscus in a thermometer that can exceed 1c, and many observers incorrectly read the temperature at the base of the meniscus rather than its peak. (This picture shows an alcohol meniscus; a mercury meniscus bulges upward rather than down.)

Another common major error in reading a thermometer is parallax error.

Image courtesy of Surface Meteorological Instruments and Measurement Practices by G.P. Srivastava (with a mercury meniscus!). This is where refraction of light through the glass of the thermometer exaggerates any error caused by the eye not being level with the surface of the fluid in the thermometer.


If you are using data from hundreds of thermometers scattered over a wide area, with data being recorded by hand by dozens of different people, the effective observational resolution should be degraded accordingly. In the oil industry, for example, it is common to accept an error margin of 2-4% when using manually acquired data.

As far as I am aware, historical raw temperature data from multiple weather stations has never been adjusted to account for observer error.

We should also consider the accuracy of the typical mercury and alcohol thermometers that have been in use for the last 120 years. Glass thermometers are calibrated by immersing them in ice/water at 0c and a steam bath at 100c. The scale is then divided equally into 100 divisions between zero and 100. However, a glass thermometer at 100c is longer than a thermometer at 0c. This means that the scale on the thermometer gives a false high reading at low temperatures (between 0 and 25c) and a false low reading at high temperatures (between 70 and 100c). This process is also followed for weather thermometers with a range of -20 to +50c.

25 years ago, very accurate mercury thermometers used in labs (0.01c resolution) had a calibration chart/graph with them to convert observed temperature on the thermometer scale to actual temperature.

Temperature cycles harden the glass of a thermometer’s bulb and shrink it over time; a 10-year-old -20 to +50c thermometer will give a false high reading of around 0.7c.

Over time, repeated high-temperature cycles cause alcohol thermometers to evaporate vapour into the vacuum at the top of the thermometer, creating false low temperature readings of up to 5c. (5.0c, not 0.5c – it’s not a typo…)

Electronic temperature sensors have been used more and more in the last 20 years for measuring environmental temperature. These have their own resolution and accuracy problems. Electronic sensors suffer from drift and hysteresis and must be calibrated annually to remain accurate, yet most weather station temperature sensors are NEVER calibrated after they have been installed.

Drift is where the recorded temperature steadily increases or decreases even when the real temperature is static, so the error gradually gets larger and larger over time. It is a fundamental characteristic of the sensor’s electronic components and cannot be fully compensated for. Typical drift of a -100c to +100c electronic thermometer is about 1c per year, and the sensor must be recalibrated annually to correct this error.

Hysteresis is a common problem as well. This is where increasing temperature has a different mechanical effect on the thermometer than decreasing temperature, so, for example, if the ambient temperature increases by 1.05c the thermometer reads an increase of 1c, but when the ambient temperature drops by 1.05c the same thermometer records a drop of 1.1c. (This is a VERY common problem in metrology.)
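
A quick numerical sketch of how such hysteresis biases a record (the up/down response figures are taken from the example just given; the daily cycling schedule is an arbitrary assumption): a temperature that merely cycles up and down produces a steadily falling record.

```python
# Hysteresis sketch: a rise of 1.05c is read as +1.0c, but an equal
# fall of 1.05c is read as -1.1c (the figures from the example above).
def record_changes(true_changes, up_gain=1.0 / 1.05, down_gain=1.1 / 1.05):
    reading = 0.0
    trace = []
    for dT in true_changes:
        reading += dT * (up_gain if dT > 0 else down_gain)
        trace.append(reading)
    return trace

# One year of daily +1.05c / -1.05c swings: the true net change is zero,
# but each cycle leaves the record 0.1c lower than it should be.
cycle = [1.05, -1.05] * 365
print(round(record_changes(cycle)[-1], 2))  # about -36.5c of spurious drift
```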

Here is typical food temperature sensor behaviour compared against a calibrated thermometer, without even considering sensor drift: depending on the measured temperature, the offset in this high-accuracy gauge runs from -0.8c to +1c.

But on top of these issues, the people who make these thermometers and weather stations state clearly the accuracy of their instruments, yet scientists ignore them! The packaging of a -20c to +50c mercury thermometer will state, for example, that the accuracy of the instrument is +/-0.75c, yet frequently this information is not incorporated into the statistical calculations used in climatology.

Finally we get to the infamous conversion of degrees Fahrenheit to degrees Centigrade. Until the 1960s almost all global temperatures were measured in Fahrenheit. Nowadays all the proper scientists use Centigrade, so all old data is routinely converted: take the original temperature, subtract 32, multiply by 5, and divide by 9.

C= ((F-32) x 5)/9

Example: the original reading from a 1950 data file is 60F. This figure was eyeballed by the local weatherman and written into his tallybook. Fifty years later a scientist takes it and converts it to centigrade:

60-32 =28

28×5=140

140/9= 15.55555556

This is usually (incorrectly) truncated to two decimal places: 15.55c, without any explanation as to why this level of resolution has been selected.

The correct mathematical method of handling this issue of resolution is to look at the original resolution of the recorded data. Typically, old Fahrenheit data was recorded in increments of 2 degrees F, e.g. 60, 62, 64, 66, 68, 70. Very rarely on old data sheets do you see 61, 63, etc. (although 65 is slightly more common).

If the original resolution was 2 degrees F, the resolution used for the same data converted to Centigrade should be 1.1c.

Therefore mathematically :

60F=16C

61F=16C

62F=17C

etc
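
The table above can be reproduced with a short sketch (a minimal illustration of the argument, not any official procedure): convert exactly, then report no finer than the nearest whole degree C, since the converted 2F increment (~1.1c) cannot support decimal places.

```python
def f_to_c_exact(f):
    """Exact conversion: C = (F - 32) * 5 / 9."""
    return (f - 32) * 5 / 9

def f_to_c_at_resolution(f):
    """Convert, then round to the nearest whole degree C, because the
    original 2F recording increment (~1.1c) supports nothing finer."""
    return round(f_to_c_exact(f))

print(f_to_c_exact(60))  # 15.555... – false precision
for f in (60, 62, 64):
    print(f, f_to_c_at_resolution(f))
```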

In conclusion, when interpreting historical environmental temperature records one must account for errors of accuracy built into the thermometer and errors of resolution built into the instrument as well as errors of observation and recording of the temperature.

In a high-quality glass environmental thermometer manufactured in 1960, the accuracy would be +/-1.4F (2% of range).

The resolution of an astute and dedicated observer would be around +/-1F.

Therefore the total error margin of all observed weather station temperatures would be a minimum of +/-2.5F, or about +/-1.4c…
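
The combination in this conclusion can be sketched two ways (the figures are the ones from the text above; whether margins add linearly or in quadrature is a judgment call the post does not address, so both are shown):

```python
import math

accuracy_f = 1.4    # instrument accuracy from the text, +/- F
resolution_f = 1.0  # observer resolution from the text, +/- F

# Worst case: independent margins simply add.
linear_f = accuracy_f + resolution_f
# Root-sum-square: the usual rule for independent random error sources.
rss_f = math.sqrt(accuracy_f**2 + resolution_f**2)

print(round(linear_f, 2), round(linear_f * 5 / 9, 2))  # F and C, worst case
print(round(rss_f, 2), round(rss_f * 5 / 9, 2))        # F and C, RSS
```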

===============================================================

UPDATE: This comment below from Willis Eschenbach, spurred by Steven Mosher, is insightful, so I’ve decided to add it to the main body – Anthony

===============================================================

Willis Eschenbach says:

As Steve Mosher has pointed out, if the errors are random normal, or if they are “offset” errors (e.g. the whole record is warm by 1°), increasing the number of observations helps reduce the size of the error. What matters is anything that causes a “bias”, a trend in the measurements. There are some caveats, however.

First, instrument replacement can certainly introduce a trend, as can site relocation.

Second, some changes have hidden bias. The short maximum length of the wiring connecting the electronic sensors introduced in the late 20th century moved a host of Stevenson Screens much closer to inhabited structures. As Anthony’s study showed, this has had an effect on trends that I think is still not properly accounted for, and certainly wasn’t expected at the time.

Third, in lovely recursiveness, there is a limit on the law of large numbers as it applies to measurements. A hundred thousand people measuring the width of a hair by eye, armed only with a ruler measured in mm, won’t do much better than a few dozen people doing the same thing. So you need to be a little careful about saying problems will be fixed by large amounts of data.

Fourth, if the errors are not random normal, your assumption that everything averages out may (I emphasize may) be in trouble. And unfortunately, in the real world, things are rarely that nice. If you send 50 guys out to do a job, there will be errors. But these errors will NOT tend to cluster around zero. They will tend to cluster around the easiest or most probable mistakes, and thus the errors will not be symmetrical.

Fifth, the law of large numbers (as I understand it) refers to either a large number of measurements made of an unchanging variable (say hair width or the throw of dice) at any time, or it refers to a large number of measurements of a changing variable (say vehicle speed) at the same time. However, when you start applying it to a large number of measurements of different variables (local temperatures), at different times, at different locations, you are stretching the limits …

Sixth, the method usually used for ascribing uncertainty to a linear trend does not include any adjustment for known uncertainties in the data points themselves. I see this as a very large problem affecting all calculation of trends. All that is ever given is the statistical error in the trend, not the real error, which perforce must be larger.

Seventh, there are hidden biases. I have read (but haven’t been able to verify) that under Soviet rule, cities in Siberia received government funds and fuel based on how cold it was. Makes sense, when it’s cold you have to heat more, takes money and fuel. But of course, everyone knew that, so subtracting a few degrees from the winter temperatures became standard practice …

My own bozo cowboy rule of thumb? I hold that in the real world, you can gain maybe an order of magnitude by repeat measurements, but not much beyond that, absent special circumstances. This is because despite global efforts to kill him, Murphy still lives, and so no matter how much we’d like it to work out perfectly,  errors won’t be normal, and biases won’t cancel, and crucial data will be missing, and a thermometer will be broken and the new one reads higher, and …

Finally, I would back Steven Mosher to the hilt when he tells people to generate some pseudo-data, add some random numbers, and see what comes out. I find that actually giving things a try is often far better than profound and erudite discussion, no matter how learned.

w.
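
The experiment Willis endorses above – generate pseudo-data, add some random numbers, and see what comes out – is easy to run. A minimal sketch (the station count, error size and bias value are arbitrary assumptions): independent zero-mean errors average away, a shared bias does not.

```python
import random

random.seed(42)
true_temp = 15.0
n = 10_000

# Case 1: independent zero-mean errors – the mean converges on truth.
readings = [true_temp + random.gauss(0, 0.5) for _ in range(n)]
mean_1 = sum(readings) / n

# Case 2: the same readings plus a shared +1.0c bias – no amount of
# averaging removes it; the mean simply sits ~1c high.
mean_2 = sum(r + 1.0 for r in readings) / n

print(round(mean_1, 2), round(mean_2, 2))
```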

Mark T
January 22, 2011 5:38 pm

Dave Springer says:
January 22, 2011 at 4:58 pm

What you’re taught in school and what you learn in the real world are often two different things. That’s why you don’t step out of school into a senior engineering position.

Hehe. In other words, you don’t understand the theory, do you? I do. I also put it in practice. Make some of these claims to your signal processing buddies. They will laugh at you a bit before explaining that the theory of the LLN works in the real world, too.

I have a pretty good idea of why you don’t know that, Mark.

Really… so what would that be then, Dave? You seem hung up on authority… go check my background at the Air Vent (there is a thread.) Certainly there are enough clues out there to glean that I’m not just some joe blow that spends his life in academia pondering philosophical things. If one was astute enough to pick it up, that is.
Could it be related to the fact that you have continually avoided every question I’ve asked? Afraid of admitting that you really don’t understand the theory that you are applying? You are applying the theory, even if you don’t know how or why.

The thermometers used in the instrument record came from dozens of different manufacturers using different manufacturing methods and different technologies (alcohol vs. mercury for instance) with various sources of error from each due to quality control and whatnot.

So, in other words, they do NOT have identical error distributions? You just proved my point, Dave. Thanks.

Over the course of millions of readings recorded by thousands of people the errors, unless they are systematic in nature, will cancel out.

If they meet the requirements I already laid out above, sure.

The wikipedia article you quoted in effect states just that.

I know exactly how it works, I use it in my job extracting signals from noise. It is a very powerful law when used properly. The article says the errors need to be i.i.d. Did you conveniently avoid that part? You just admitted the errors aren’t i.i.d. above, so are you now changing your tune? Wikipedia is right but no, it’s not right, wait, it’s right.

I haven’t seen anyone come up with a description of systematic error that would produce gradually rising temperatures in this scenario.

Show me where I ever said anything even remotely close to this. Really… I dare you.

No systematic error, random error cancels out, only data of concern is change in temperature over time rather than exact temperature at any one time ergo there is nothing wrong with the raw data.

If you can prove the random errors are a) independent and b) identically distributed, then I will believe you. Until then, sorry, but you are wrong.
Mark
PS: it is easy to test this, btw. Generate a bunch of random data using different means, variances, and distributions then average them together and look at your variance. It will be somewhere between the maximum and minimum and will not decrease asymptotically. If you have MATLAB, play around with rand and randn.
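
A Python version of the test Mark describes (the rand/randn equivalents; the means, variances and mixture are arbitrary assumptions for illustration): averaging across instruments with different error distributions leaves both a residual bias and a spread well above zero – the variance does not collapse the way it would for i.i.d. errors.

```python
import random
import statistics

random.seed(0)
N = 100_000

# Three "instruments" with different error distributions:
err_a = [random.gauss(0.0, 0.2) for _ in range(N)]     # tight, unbiased
err_b = [random.gauss(0.5, 1.0) for _ in range(N)]     # noisy AND biased
err_c = [random.uniform(-0.5, 0.5) for _ in range(N)]  # uniform reading error

avg = [(a + b + c) / 3 for a, b, c in zip(err_a, err_b, err_c)]

print(round(statistics.mean(avg), 2))   # the bias survives averaging
print(round(statistics.stdev(avg), 2))  # spread sits between best and worst
```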

January 22, 2011 5:40 pm

Mark Sonter says:
January 22, 2011 at 5:40 am
“…My memory from running a university metsite 40 yrs ago (University of Papua New Guinea, Port Moresby), is that the ‘almost horizontal’ orientation commented on and queried by a couple of posters, is because the max and min thermometers have little indicator rods inside the bore, which get pushed up to max by the meniscus and stick there (Hg); and pulled down to the min and stick there (alcohol). The weather observer resets them by turning to the vertical (upright for the Hg (max), and the rod slides back down to again touch the meniscus); or upside down, for the alcohol (min), whereupon the rod slides back up, inside the liquid, till it touches the inside of the meniscus. Both are then replaced on their near-horizontal stands…”
So wait – there’s a small movable rod in the bore of the thermometer. If the bore is not perfectly even, they would necessarily have to make the rod just a tiny bit smaller than the bore.
Yet this glass “hardening” would reduce the size of that bore. There’s a possibility of the rod sticking, giving the same reading for a number of days until the observer (or someone) suspects the thermometer is faulty and changes it out.

Mark T
January 22, 2011 5:42 pm

Oliver Ramsay says:
January 22, 2011 at 5:36 pm

I’m beginning to warm to this notion; two wrongs don’t make a right but millions of wrongs do.

I hope you are just being sarcastic. If not… sigh.
Mark

Mark T
January 22, 2011 5:47 pm

David A. Evans says:
January 22, 2011 at 5:01 pm

Correct, I’m referring to energy content. That’s what is really being argued. OEC is a difficult one because the energy density of water requires more accurate temperature measurement but the temperature is at least more linearly related to energy.

Indeed. Pielke Sr. argues this regularly. I’ve always wondered why they use this metric instead of one that actually makes physical sense. Hell, OEC isn’t even an average, it’s a measure of total energy in the oceans. If that is changing then certainly something is happening.
Mark

Dave Springer
January 22, 2011 5:49 pm

u.k.(us) says:
January 22, 2011 at 4:55 pm
Dave Springer says:
January 22, 2011 at 4:10 pm
=============
“I call B.S. on this entire entry, and therefore all your recent comments.
From one pilot, to “another”.”
Nope. True story. I only flew Cessna 172’s. At takeoff and cruise I thought wow, this thing must have a more powerful engine in it because it “accelerated” faster on takeoff and cruise speed was substantially faster than I’d seen in any other C172’s for the throttle setting. Gimme a break. That was like my second cross-country solo and I had maybe 30 hours in the left seat total. Takeoff and cruise speeds usually aren’t that critical as you’re trimmed for takeoff and let the plane lift off of its own accord with full throttle and except for navigation, which in this case was visual by landmark during the day, it doesn’t much matter if cruise speed is 110 or 130 knots. My landmarks came up a minute or two late. No big deal. Being on a student cross country solo I wasn’t practicing stalls or doing any low speed high bank angle turns. Except for landings of course starting with turn onto short final. That’s when I got worried and that’s where I split the difference between what felt like the right speed and what the airspeed indicator was reading. I pretty much figured out there was something amiss with the air speed indicator and how the hell I failed to notice it was MPH instead of knots is both embarrassing and scary to this day 20 years later. If I’d been flying by instruments at night that mistake might have ended up with a crash, die, and burn or worse a crash, burn, and die (as my instructor said the order is critical – you really want to die BEFORE you burn up).
The following link discusses when and where and how often you find an MPH indicator in small Cessnas. I didn’t know this before now but MPH was standard in Cessnas manufactured before 1974 and knots were standard after that. The year this happened to me was 1991. The rental fleet at the small airport I was flying out of must have had just one oddball in the group that was pre-1975. My instructor was only about 25 years old and was still in the Corps. He was moonlighting as an instructor accumulating hours required for a commercial license so it wasn’t like that was his regular job nor had he been doing it for a long time. Corona, CA municipal airport was where this took place. The solo was cross country through the desert landing at about 3 or 4 small uncontrolled runways. Corona is about 25 miles from USMCAS El Toro where he was stationed then and where I was stationed from 1975-1978. I lived in Corona at the time so it was convenient. All the other convenient airports nearby were big ones and you’d waste a lot of your expensive rental time waiting and taxiing instead of flying. At Corona once you started rolling the runway was 50 yards away and seldom anyone in front of you.

Dave Springer
January 22, 2011 5:52 pm
Jeff Alberts
January 22, 2011 5:53 pm

So, would gradual, but constant, encroachment of urbanization upon the site be considered systematic?

davidc
January 22, 2011 6:04 pm

The issue of whether a measurement of +/-0.5 can be used to construct a mean value with a much smaller “error” depends on what you mean by “error”.
In statistics it is usual to interpret this as a 95 or 99% confidence interval. As others have said, this varies with 1/sqrt(N) so with enough observations N you can narrow the confidence interval to well below the error in the original measurement. But why choose (say) a 99% CI rather than a 100% CI? Because for most statistical distributions the 100% CI is +/-infinity. But if you could have a 100% CI you would prefer it over a 99% CI, no?
Well, in the case of a uniform distribution, as with reading a thermometer, you can have a 100% CI and it’s a piece of cake to determine it. For readings R1+/-0.5 and R2+/-0.5 the mean is (R1+R2)/2 +/- (0.5+0.5)/2, i.e. +/-0.5. Proceeding in this way with more observations, it’s easy to see that the 100% confidence interval is the same as the “error” in the original observations.
So take your pick: 99% or 100%.
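
davidc’s point can be checked numerically (a sketch assuming uniform ±0.5 reading errors; the trial count is an arbitrary assumption): the worst-case (100%) error of the mean never shrinks below ±0.5, even as the typical (99th-percentile) error collapses with N.

```python
import random

random.seed(1)

def worst_case_mean_error(n):
    """All n errors at the same extreme: the mean's error is still 0.5."""
    return sum(0.5 for _ in range(n)) / n

def empirical_99th(n, trials=2000):
    """99th percentile of |mean error| for n uniform +/-0.5 errors."""
    errs = sorted(
        abs(sum(random.uniform(-0.5, 0.5) for _ in range(n)) / n)
        for _ in range(trials)
    )
    return errs[int(0.99 * trials) - 1]

for n in (1, 10, 100):
    print(n, worst_case_mean_error(n), round(empirical_99th(n), 3))
```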

Oliver Ramsay
January 22, 2011 6:16 pm

Mark said “I hope you are just being sarcastic. If not… sigh.”
—-
No, I’m the one doing the sighing. I was aiming for irony but somehow hit ambiguity, apparently completely missing wit on the way.
On the bright side; that’s another screw-up so, I’m getting closer to perfection.

u.k.(us)
January 22, 2011 6:23 pm

Dave Springer says:
January 22, 2011 at 5:49 pm
==============
I reiterate that it is B.S. that, as a student pilot, you felt things the test pilots missed.

Mark T
January 22, 2011 6:31 pm

Jeff Alberts: yes and no. That’s not really an error in the sense of the point. The increase is real. It is, however, a bias increasing with time.
Davidc: you are only considering the recording error as a result of the minimum gradations. That is not the whole of the error distribution.
Mark

JRR Canada
January 22, 2011 6:53 pm

Thanks for posting Mark’s work. Mark, take a break from arguing with idiots; the work stands pretty well on its own, and many of the assumptions stipulated before arguing your work negate the argument offered. Essentially you say the error range of instruments is site- and device-specific; having calibrated control system devices, I understand that well enough. I agree the failure to list the device and its error range is an indication of sloppy work. Also, slow down in your responses, as you are abbreviating to the limits of my comprehension. Thanks for the posting, and glad you made it to WUWT.

Dave Springer
January 22, 2011 7:03 pm

u.k.(us) says:
January 22, 2011 at 4:55 pm

Dave Springer says:
January 22, 2011 at 4:10 pm
=============
I call B.S. on this entire entry, and therefore all your recent comments.
From one pilot, to “another”.

I love google. Evidently it’s not a too uncommon experience with Cessna ASIs calibrated in MPH vs. KNOT and that causing scary problems for the student pilot who happens to rent one of the former:
http://answers.yahoo.com/question/index?qid=20070923203647AAqLOZ3

Back in the mid to late ‘70s, Cessna quit installing airspeed indicators calibrated in mph and started using ones with kts in their C-150s and 172s. Occasionally it would confuse the solo student if there was a mixed fleet, but they soon learned to pay attention to detail after they scared themselves sufficiently.

I’d call it unnerving but not particularly scary. About the same as when my instructor sneakily turned the fuel selector to OFF, the engine quit, and instead of figuring out my fuel was cut off I set up for no power landing on a dirt road in one of our regular remote practice areas. He let me get lined up on it and didn’t turn the fuel back on until we were down to about 100 feet off the ground. I thought it was a for-real failure but remained quite calm. I got ripped a new arse for that too. But hey, at least I did good on setting up for the emergency landing. That practice area was also where we did spin training. If you’ve never been in a spin (optional with some instructors) in a small plane it’s quite something. One second you’re angled up into the sky with the stall horn screaming at about 35 knots and the next second you’re pointed straight at the ground spinning and accelerating like a bat out of hell. Like an elevator where the cable just broke only worse.

davidc
January 22, 2011 7:25 pm

Mark:
“Davidc: you are only considering the recording error as a result of the minimum gradations. That is not the whole of the error distribution.”
Yes, but if you took this interpretation the “error” could not be less than the individual measurement error.

Domenic
January 22, 2011 7:26 pm

After looking over the discussions of the types of thermometers used, the techniques, etc.,
there ARE some fairly easy ways to get a much better trend signal out of the error noise, to see if a trend really exists.
The primary cause, by far, of drift in temperature measurement systems is due to the thermal cycling over time of the thermometer system. This includes both mercury thermometers and all electronic thermometers. A mercury thermometer is very stable over time if it is not thermally cycled very much. Same with electronic thermometers, as both the sensor tips and the electronic components, resistors, etc decay with thermal cycling over time. (Basically that is how automotive components are lifetime tested…by lots of thermal cycling to simulate thermal cycling over the lifetime desired of the component.)
So, if you concentrate on historical temp data from land on small islands only, you will have a much better shot at seeing whether a true trend exists, and its magnitude. The reason is that temperatures on islands are dominated by the surrounding seas that greatly reduce the temperature swings over time.
For example, I live near Miami. The normal yearly air temp delta T is only about 25 F. Daily delta T is normally only about 10 F or so. In other words, our environment here is very much dominated by the surrounding waters. Thus, the thermal cycling stress on the temperature systems is extremely low compared to inland systems.
Inland systems are thermally cycled with a yearly delta T more in the 50 F to 60 F range, and a delta T of 20 F or more range on a daily basis.
So, the small island thermometers are much more stable over time.
Because they are more stable, it reduces the need for delving into calibration history errors, etc.
It may also serve as a good proxy for ocean temperature changes, as the air temperatures on small islands are completely dominated by the surrounding waters.

EFS_Junior
January 22, 2011 7:30 pm

http://en.wikipedia.org/wiki/Iid
“In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent.”
AFAIK all distributions of errors in temperature measurements are Gaussian with zero mean. Therefore, all Gaussian distributions have the same probability distribution (e.g. Gaussian).
Each station’s temperature measurements are independent of all other stations’; if they were not, then the two stations would, in fact, be the same station.
In fact, no two instruments will ever have the exact same “identically distributed” distribution (exact to infinite precision) – not even remotely possible; again, only possible if, in fact, the two instruments were one and the same instrument.
But wait, there’s more;
“The abbreviation i.i.d. is particularly common in statistics (often as iid, sometimes written IID), where observations in a sample are often assumed to be (more-or-less) i.i.d. for the purposes of statistical inference. The assumption (or requirement) that observations be i.i.d. tends to simplify the underlying mathematics of many statistical methods: see mathematical statistics and statistical theory. However, in practical applications of statistical modeling the assumption may or may not be realistic. The generalization of exchangeable random variables is often sufficient and more easily met.”
“The assumption is important in the classical form of the central limit theorem, which states that the probability distribution of the sum (or average) of i.i.d. variables with finite variance approaches a normal distribution.”
Wait a minute, did the above just say “assumption” and “finite variance” and “normal distribution? Why, yes it did.
Well then, so much for that ole’ nasty IID requirement; it’s been met head on and found to be not so necessary if need be. However, as the specious uncertainty paper claims, all sigmas have zero means, as all instances are shown as +/-, meaning symmetric about a zero mean.
If it looks Gaussian, if it feels Gaussian, if it smells Gaussian, if it sounds Gaussian, and if it tastes Gaussian, then it must be Gaussian! Q.E.D.
Now if you want to talk about instrument bias offsets (that vary randomly with time, as they invariably do), I’m game.
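The point about the IID assumption being relaxable can be illustrated with a minimal sketch. The instrument count, per-instrument error widths, and sample sizes below are all assumed for illustration; the only thing carried over from the discussion above is that each instrument's error is independent and symmetric about zero, but the distributions are deliberately *not* identical:

```python
import random
import statistics

random.seed(42)

# Each "instrument" gets its own symmetric, zero-mean error distribution
# with a different width, so the errors are independent but NOT
# identically distributed -- the situation discussed above.
def instrument_error(i):
    width = 0.1 + 0.05 * (i % 7)          # per-instrument spread (assumed)
    return random.uniform(-width, width)   # symmetric about zero

# Average the errors of 100 such instruments, many times over.
n_instruments = 100
averages = [
    sum(instrument_error(i) for i in range(n_instruments)) / n_instruments
    for _ in range(5000)
]

mean_of_averages = statistics.mean(averages)
spread = statistics.stdev(averages)
print(f"mean of averages ~ {mean_of_averages:.4f}, stdev ~ {spread:.4f}")
```

Despite the non-identical distributions, the averages cluster tightly around zero with a roughly normal spread, which is the Lindeberg-style generalization of the central limit theorem at work. A constant bias offset per instrument, by contrast, would shift that mean and would not average away.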

David A. Evans
January 22, 2011 7:36 pm

Dave Springer says:
January 22, 2011 at 7:03 pm
My only time in a light aircraft was a jolly in a DH Chipmunk. I insisted on a spin, loop wingover & Immelmann. Loved it! I flew back to the airfield & commented on turbulence. Instructor said PIO. I think he was surprised when I told him I wasn’t moving the stick & was making corrections on trim. I was making no corrections for turbulence.
DaveE.

Dave Springer
January 22, 2011 7:42 pm

u.k.(us) says:
January 22, 2011 at 6:23 pm
Dave Springer says:
January 22, 2011 at 5:49 pm
==============
I reiterate that it is B.S. that, as a student pilot, you felt things the test pilots missed.
We’re going to get cut off for off-topic, but I don’t understand that at all. I noticed something wrong on the first takeoff. The plane was trimmed for takeoff and it lifted off the runway at the same point as all the other C172’s I’d flown. Takeoffs are all the same: full throttle, elevator trimmed for takeoff, and let it lift off by itself without pulling back on the yoke. The indicated speed was noticeably higher. Climbout works the same way: full throttle and enough elevator to get a desired rate of climb. Again I noticed the indicated airspeed was noticeably higher for the rate of climb. At that point I thought the engine might have been more powerful. Then my indicated cruise speed was noticeably higher for 2/3 throttle, yet engine RPM was the same. At that point I concluded I had a funky airspeed indicator that was reading high, and I mentally adjusted downward for critical landing speeds.
Engine RPM was the final giveaway. If you are accustomed to the aircraft, the instruments all connect together in predictable fashion: throttle vs. RPM vs. rate of climb vs. pitch vs. indicated airspeed. All the instruments read what I expected to see, except for the ASI being out of whack with the others.
I’m still perplexed as to how I could have possibly failed to notice the ASI had MPH on it instead of KNOTS. It’s rather small print on the face of the dial, but still… it’s just an example of how the brain works. Things that always stay the same and aren’t important become “invisible”. In this case it turned out to be something important, but it remained outside my notice nonetheless, and I didn’t find out my mistake until I was back on the ground. Instead I figured out approximately how much the ASI was off the mark and compensated for it by subtracting five to ten knots from the indicated speed.

Dave Springer
January 22, 2011 7:58 pm

Jeff Alberts says:
January 22, 2011 at 5:53 pm
“So, would gradual, but constant, encroachment of urbanization upon the site be considered systematic?”
Not from the thermometer’s point of view. That’s a systematic error in the interpretation of the readings, not a faulty reading, i.e. blaming the increase on CO2 instead of urban heat. The data is fine; it’s the explanation of the data that’s wrong.

Dinostratus
January 22, 2011 8:04 pm

Here is an error that may be introduced into the data. I think I once saw it posted as an article.
Say you have a weather station and, over time, the paint deteriorates, exposing the wood. Assuming the wood has a higher emissivity than the paint, over time the temperature of the station increases during the day and decreases at night. Note they don’t quite move by the same amount: the day is a wee bit hotter than the night is colder. This would increase the spread of the day/night data and increase the average of the two (if the daytime shift upward is hotter). Now let’s say one weekend you go out and paint the station, and the temperature shifts by 0.3 C. Back at the lab you look at the shift and say, “Oh, that isn’t right, we need to make them the same,” so you “adjust” the old data downward or the new data upward. Voilà! One now has a temperature station reporting warming over decades instead of a station trending up until it’s painted, shifting discretely down, then trending back up.
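The scenario above can be sketched numerically. Everything here is assumed for illustration — the drift rate, the 0.3 C repaint step, the years, and a flat true climate signal — the point is only to show how removing the repaint step converts a sawtooth bias into a spurious monotone trend:

```python
# Hypothetical station: paint darkens slowly (warm bias drifts up), then a
# repaint drops the bias back by a step. True climate signal assumed flat.
years = list(range(1960, 2000))
drift_per_year = 0.015     # assumed warming bias from darkening paint, C/yr
repaint_year = 1980
step = -0.3                # assumed drop when the shelter is repainted, C

raw = []
bias = 0.0
for y in years:
    if y == repaint_year:
        bias += step       # discrete downward shift at the repaint
    bias += drift_per_year
    raw.append(bias)       # recorded anomaly = bias (true signal is 0.0)

# "Adjustment" as described above: see the step, shift the earlier segment
# down so the series is continuous across the repaint.
adjusted = [v + step if y < repaint_year else v for y, v in zip(years, raw)]

print(f"raw net change:      {raw[-1] - raw[0]:+.3f} C")
print(f"adjusted net change: {adjusted[-1] - adjusted[0]:+.3f} C")
```

The raw series rises, snaps back at the repaint, and rises again — a sawtooth around a fixed level. The adjusted series is strictly increasing, roughly doubling the apparent warming, even though nothing climatic happened at all.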

Dave Springer
January 22, 2011 8:16 pm

Mark T says:
January 22, 2011 at 5:38 pm
“Could it be related to the fact that you have continually avoided every question I’ve asked? Afraid of admitting that you really don’t understand the theory that you are applying? You are applying the theory, even if you don’t know how or why.”
No Mark. It’s related to the fact that when 10,000 computers a day (no exaggeration; I was senior R&D in Dell laptops 1993-2000, after which I retired because I never needed to work for a living again) that you designed are supposed to be rolling off the production line, and an unacceptable number of them are failing some bit of manufacturing test, which results in the line being shut down and millions of dollars per day going out the window, you don’t sit on your ass making bar graphs with error bars to figure out what’s gone wrong. Well, maybe academics do that. Someone with decades of experience uses practical knowledge and uses it quickly, or starts thinking about whether his resume is up to date.
Instead of being obtuse why don’t you use your real name and state what your experience actually is instead of giving me a trail of breadcrumbs to follow to figure it out? What’s up with that? What are YOU hiding?

January 22, 2011 8:27 pm

Dave Springer says:
“…you don’t sit on your ass making bar graphs with error bars to figure out what’s gone wrong.”
…says the unemployed dilettante who spends his time defending the indefensible.☺

Hoser
January 22, 2011 8:30 pm

Dave Springer says:
January 22, 2011 at 1:52 pm
Well, thanks, but that didn’t address my point. Ya think I believe in CAGW? Nah.

Dave Springer
January 22, 2011 8:44 pm

Mark T.
Liquid-in-glass thermometers are analog instruments, not digital.
You remind me of the guys that resort to quantum mechanics to explain how CO2 works (or doesn’t work, if that’s your belief) to raise surface temperatures. QM gets so complicated that even the experts disagree. That’s unfortunate, because classical mechanics is all that’s needed to explain it, and it’s much simpler — so simple that they knew how it worked 200 years ago, and John Tyndall proved it experimentally 150 years ago.
Don’t make this overly complicated. The instrumental record in and of itself is adequate for the use to which it’s being put. You of course are quite welcome to try proving otherwise, but if I could bet, I’d bet against you ever becoming famous for succeeding in that proof; you’ll just keep spinning your wheels and going nowhere.

Colonial
January 22, 2011 8:46 pm

Nothing can be done to improve the accuracy of historical data (other than getting it out of the Hockey Team’s control). It is possible, however, to make more accurate measurements going forward. Check out the Analog Devices AN-0970 application note, RTD Interfacing and Linearization Using an ADuC706x Microcontroller. Its first paragraph reads:
The platinum resistance temperature detector (RTD) is one of the most accurate sensors available for measuring temperature within the –200°C to +850°C range. The RTD is capable of achieving a calibrated accuracy of ±0.02°C or better. Obtaining the greatest degree of accuracy, however, requires precise signal conditioning, analog-to-digital conversion, linearization, and calibration.
(Search the http://www.analog.com website for “AN-0970” to find the application note.)
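The "linearization" step the app note mentions can be sketched with the standard Callendar-Van Dusen relation for a platinum RTD. This is a minimal illustration using the published IEC 60751 coefficients for a PT100, not the ADuC706x's actual firmware; the quadratic form below is valid for temperatures at or above 0 °C:

```python
# Callendar-Van Dusen linearization for a PT100 (t >= 0 C), using the
# standard IEC 60751 coefficients. Sketch only, not the AN-0970 code.
R0 = 100.0       # PT100 resistance at 0 C, ohms
A = 3.9083e-3    # IEC 60751 coefficient, 1/C
B = -5.775e-7    # IEC 60751 coefficient, 1/C^2

def pt100_resistance(t_c):
    """Resistance (ohms) of a PT100 at temperature t_c, for t_c >= 0 C."""
    return R0 * (1.0 + A * t_c + B * t_c * t_c)

def pt100_temperature(r_ohms):
    """Recover temperature (C) from resistance by inverting the quadratic."""
    # Solve R0*B*t^2 + R0*A*t + (R0 - r) = 0, taking the physical root.
    disc = (R0 * A) ** 2 - 4.0 * R0 * B * (R0 - r_ohms)
    return (-R0 * A + disc ** 0.5) / (2.0 * R0 * B)

print(f"PT100 at 100 C reads {pt100_resistance(100.0):.3f} ohms")
```

The round trip (temperature to resistance and back) is exact to well below the ±0.02 °C calibrated accuracy the app note quotes; in a real instrument the dominant errors come from the excitation current, lead resistance, and ADC, which is why the note stresses the signal conditioning.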
Combine a chip optimized for making temperature measurements with a platinum RTD and other components to create a system that can measure temperature accurately and communicate with portable calibration equipment. Create a portable calibration device with a calibration head designed to slide over the temperature probe to create a controlled environment for running the tests. At the risk of further exacerbating Continuous Anthropocentric Global Whining (CAGW), include a CO2 cartridge to allow cooling of the temperature probe and a heater to allow calibration over the expected temperature range.
Periodically calibrate the calibration device to a NIST-traceable field standard. At the weather station calibration interval, drive around to the various sites. Slide the calibration head over the temperature probe, connect a communication cable to the temperature measurement electronics (unless it communicates via RF or infrared, in which case this step would be unnecessary), and let it run a complete characterization and recalibration. The characterization that’s run prior to recalibration will allow tracking of thermometer drift by serial number.
The thermometer systems wouldn’t have to be overwhelmingly expensive (the government has spent more for toilet seats) and the readings would be at least an order of magnitude more accurate than provided by current measurement methods. Calibration could be done by someone with minimal training. Properly designed, the calibration unit would insist on being recalibrated at appropriate intervals, and individual temperature measurement units would report the time from last calibration along with the data.
Then all we’ll have to do is wait 150 years to accumulate an instrumental temperature record equivalent in length to the one that’s relied on today.