The Metrology of Thermometers

For those who didn’t notice: this is about metrology, not meteorology, though meteorology uses metrology’s final product. Metrology is the science of measurement.

Since we recently had the paper from Pat Frank dealing with the inherent uncertainty of temperature measurement, which established a new minimum uncertainty of ±0.46°C for the instrumental surface temperature record, I thought it valuable to review the uncertainty associated with the act of temperature measurement itself.

As many of you know, the Stevenson Screen, aka Cotton Region Shelter (CRS), such as the one below, houses Tmax and Tmin recording mercury and alcohol thermometers.

Hanksville, UT USHCN climate monitoring station with Stevenson Screen - sited over a gravestone. Photo by surfacestations.org volunteer Juan Slayton

They look like this inside the screen:

NOAA standard issue max-min recording thermometers, USHCN station in Orland, CA - Photo: A. Watts

Reading these thermometers would seem to be a simple task. However, that’s not quite the case. Adding to the statistical uncertainty derived by Pat Frank, as we see below in this guest re-post, measurement uncertainty in both the long and short term is also an issue. The following appeared on the blog “Mark’s View”, and I am reprinting it here in full with permission from the author. There are some enlightening things to learn about the simple act of reading a liquid-in-glass (LIG) thermometer that I didn’t know, as well as some long-term issues (like the hardening of the glass) whose magnitudes are about as large as the climate change signal for the last 100 years, ~0.7°C. – Anthony

==========================================================

Metrology – A guest re-post by Mark of Mark’s View

This post is actually about the poor quality and processing of historical climatic temperature records rather than metrology.

My main points are that in climatology many important factors that are accounted for in other areas of science and engineering are completely ignored by many scientists:

  1. Human errors in the accuracy and resolution of historical data are ignored
  2. Mechanical thermometer resolution is ignored
  3. Electronic gauge calibration is ignored
  4. Mechanical and electronic temperature gauge accuracy is ignored
  5. Hysteresis in modern data acquisition is ignored
  6. Conversion from degrees Fahrenheit to degrees Centigrade introduces false resolution into the data

Metrology is the science of measurement, embracing both experimental and theoretical determinations at any level of uncertainty in any field of science and technology. Believe it or not, the metrology of temperature measurement is complex.

It is actually quite difficult to measure things accurately, yet most people just assume that the information they are given is “spot on”. A significant number of scientists and mathematicians also do not seem to realise that the data they are working with is often not very accurate. Over the years, as part of my job, I have read dozens of papers based on pressure and temperature records where no reference is made to the instruments used to acquire the data or to their calibration history. The result is that many scientists frequently reach incorrect conclusions about their experiments and data because they do not take into account the accuracy and resolution of their data. (It seems this is especially true in the area of climatology.)

Do you have a thermometer stuck to your kitchen window so you can see how warm it is outside?

Let’s say you glance at this thermometer and it indicates about 31 degrees centigrade. If it is a mercury or alcohol thermometer you may have to squint to read the scale. If the scale is marked in 1°C steps (which is very common), then you probably cannot interpolate between the scale markers.

This means that this particular thermometer’s resolution is 1°C, which is normally stated as plus or minus 0.5°C (±0.5°C).
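That quantisation is easy to illustrate in a few lines. This is an invented sketch (the scan range and step are arbitrary), not anything from the original post:

```python
# Sketch (not from the post): how a scale graduated in 1 deg C steps
# quantises a reading, giving the +/-0.5 deg C resolution quoted above.

def read_scale(true_temp_c: float, step_c: float = 1.0) -> float:
    """Return the nearest scale marking to the true temperature."""
    return round(true_temp_c / step_c) * step_c

# Scan true temperatures from 25.00 to 34.99 C in 0.01 C steps: no reading
# is ever more than half a graduation away from the truth.
worst = max(abs(read_scale(t / 100.0) - t / 100.0) for t in range(2500, 3500))
print(f"worst-case quantisation error: {worst:.2f} C")
```

The worst case lands exactly halfway between two graduations, hence the ±0.5°C figure.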

This example of resolution assumes you are observing the temperature under perfect conditions and have been properly trained to read a thermometer. In reality you might just glance at the thermometer, or you might have to use a flashlight to look at it, or it may be covered in a dusting of snow, rain, etc. Mercury forms a pronounced meniscus in a thermometer that can exceed 1°C, and many observers incorrectly read the temperature at the base of the meniscus rather than at its peak. (The picture shows an alcohol meniscus; a mercury meniscus bulges upward rather than down.)

Another common major error in reading a thermometer is parallax error.

Image courtesy of Surface Meteorological Instruments and Measurement Practices by G.P. Srivastava (with a mercury meniscus!). This is where refraction of light through the glass thermometer exaggerates any error caused by the eye not being level with the surface of the fluid in the thermometer.


If you are using data from hundreds of thermometers scattered over a wide area, with data recorded by hand by dozens of different people, the assumed observational resolution should be degraded accordingly. In the oil industry, for example, it is common to accept an error margin of 2 to 4% when using manually acquired data.

As far as I am aware, historical raw temperature data from multiple weather stations has never been adjusted to account for observer error.

We should also consider the accuracy of the typical mercury and alcohol thermometers that have been in use for the last 120 years. Glass thermometers are calibrated by immersing them in an ice/water bath at 0°C and a steam bath at 100°C. The scale is then divided equally into 100 divisions between zero and 100. However, a glass thermometer at 100°C is longer than one at 0°C, which means that the equally divided scale gives a false high reading at low temperatures (between 0 and 25°C) and a false low reading at high temperatures (between 70 and 100°C). The same process is followed for weather thermometers with a range of -20 to +50°C.
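The two-fixed-point calibration described above can be illustrated with a toy model. The S-shaped error curve below is invented purely to show the sign of the effect (reading high at the low end, low at the high end, exact at the fixed points); it is not fitted to any real thermometer:

```python
# Toy model of a thermometer calibrated only at 0 C and 100 C. The shape
# and size of the error curve are invented for illustration.

def indicated_temp(true_c: float, bow_c: float = 0.5) -> float:
    """Scale reading with an S-shaped error: exact at the 0 C and 100 C
    calibration points, high at the low end, low at the high end, with a
    peak error of roughly bow_c degrees (an assumed figure)."""
    x = true_c / 100.0
    return true_c + bow_c * 10.0 * x * (1.0 - x) * (1.0 - 2.0 * x)

# Exact at the fixed points, reads high at 15 C and low at 85 C:
for t in (0.0, 15.0, 85.0, 100.0):
    print(f"true {t:5.1f} C -> indicated {indicated_temp(t):7.3f} C")
```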

Twenty-five years ago, very accurate mercury thermometers used in labs (0.01°C resolution) came with a calibration chart or graph to convert the observed temperature on the thermometer scale to the actual temperature.

Temperature cycles harden the glass of a thermometer’s bulb, and the bulb shrinks over time; a 10-year-old -20 to +50°C thermometer will give a false high reading of around 0.7°C.

Over time, repeated high-temperature cycles cause alcohol thermometers to evaporate vapour into the vacuum at the top of the thermometer, creating false low temperature readings of up to 5°C. (Yes, 5.0°C, not 0.5; it’s not a typo.)

Electronic temperature sensors have been used more and more over the last 20 years for measuring environmental temperature. These have their own resolution and accuracy problems. Electronic sensors suffer from drift and hysteresis and must be calibrated annually to remain accurate, yet most weather station temperature sensors are NEVER calibrated after they have been installed.

Drift is where the recorded temperature steadily increases or decreases over time even when the real temperature is static. It is a fundamental characteristic of all electronic devices, a quantum-mechanical effect in the metal parts of the sensor that cannot be compensated for. Typical drift for a -100 to +100°C electronic thermometer is about 1°C per year, and the sensor must be recalibrated annually to remove this error.

Hysteresis is a common problem as well. This is where increasing temperature has a different mechanical effect on the thermometer than decreasing temperature: for example, if the ambient temperature increases by 1.05°C the thermometer reads an increase of 1°C, but when the ambient temperature drops by 1.05°C the same thermometer records a drop of 1.1°C. (This is a VERY common problem in metrology.)
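This hysteresis behaviour can be sketched with a toy sensor model built from the figures above (a hypothetical illustration, not any real instrument):

```python
# Hypothetical hysteresis model using the figures from the text:
# a 1.05 C rise is read as a 1.0 C rise, but a 1.05 C fall as a 1.1 C fall.

RISE_GAIN = 1.0 / 1.05   # indicated change per deg C of rising temperature
FALL_GAIN = 1.1 / 1.05   # indicated change per deg C of falling temperature

def run_sensor(true_temps):
    """Integrate the indicated temperature over a series of true temperatures."""
    indicated = [true_temps[0]]            # assume it starts in agreement
    for prev, cur in zip(true_temps, true_temps[1:]):
        gain = RISE_GAIN if cur >= prev else FALL_GAIN
        indicated.append(indicated[-1] + gain * (cur - prev))
    return indicated

# One full up-and-down cycle does not return the reading to its start:
print(run_sensor([20.0, 21.05, 20.0]))  # ends below 20.0, a residual offset
```

Each full temperature cycle leaves the reading 0.1°C low, so repeated cycling accumulates error rather than cancelling it.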

Here is the behaviour of a typical food temperature sensor compared to a calibrated thermometer, without even considering sensor drift. [Chart: thermometer calibration offset versus measured temperature.] Depending on the measured temperature, even in this high-accuracy gauge the offset ranges from -0.8 to +1°C.

But on top of these issues, the people who make these thermometers and weather stations state the accuracy of their instruments clearly, yet scientists ignore it! The packaging of a -20 to +50°C mercury thermometer will state, for example, that the accuracy of the instrument is ±0.75°C, yet frequently this information is not incorporated into the statistical calculations used in climatology.

Finally we get to the infamous conversion from degrees Fahrenheit to degrees Centigrade. Until the 1960s almost all global temperatures were measured in Fahrenheit. Nowadays all the proper scientists use Centigrade, so all old data is routinely converted: take the original temperature, subtract 32, multiply by 5, and divide by 9.

C= ((F-32) x 5)/9

Example: the original reading from a 1950 data file is 60°F. This figure was eyeballed by the local weatherman and written into his tally book. Fifty years later a scientist takes the figure and converts it to Centigrade:

60-32 =28

28×5=140

140/9= 15.55555556

This is usually (incorrectly) rounded to two decimal places, 15.56°C, without any explanation as to why this level of resolution has been selected.

The correct mathematical way to handle this issue is to look at the resolution of the originally recorded data. Typically, old Fahrenheit data was recorded in increments of 2 degrees F, e.g. 60, 62, 64, 66, 68, 70. Very rarely on old data sheets do you see 61, 63, etc. (although 65 is slightly more common).

If the original resolution was 2°F, the resolution used for the same data converted to Centigrade should be 1.1°C.

Therefore mathematically :

60F = 16C

61F = 16C

62F = 17C

etc.
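The conversion-and-resolution argument above can be sketched in a few lines (an illustration only, not how any particular dataset is actually processed):

```python
# Sketch of the false-resolution argument: whole-degree Fahrenheit readings
# converted to Centigrade and quoted to 0.01 C pretend to a precision the
# original observation never had.

def f_to_c(f: float) -> float:
    """Standard Fahrenheit-to-Centigrade conversion."""
    return (f - 32.0) * 5.0 / 9.0

for f in (60, 61, 62):
    false_res = round(f_to_c(f), 2)   # false 0.01 C resolution
    honest = round(f_to_c(f))         # whole degrees, nearer the ~1.1 C
                                      # resolution of 2-deg-F source data
    print(f"{f} F -> {false_res:.2f} C (false) vs {honest} C (honest)")
```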

In conclusion, when interpreting historical environmental temperature records, one must account for the errors of accuracy built into the thermometer, the errors of resolution built into the instrument, and the errors of observation and recording of the temperature.

In a high-quality glass environmental thermometer manufactured in 1960, the accuracy would be ±1.4°F (2% of range).

The resolution of an astute and dedicated observer would be around +/-1F.

Therefore the total error margin of all observed weather station temperatures would be a minimum of ±2.4°F, or about ±1.3°C…
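The final total above combines the two error terms by simple addition, i.e. a worst case. For independent error sources, metrology practice more often combines them root-sum-square; here is a sketch of both, using the figures quoted in the post:

```python
import math

accuracy_f = 1.4       # instrument accuracy quoted in the post, +/- deg F
observation_f = 1.0    # observer resolution quoted in the post, +/- deg F

worst_case = accuracy_f + observation_f        # simple sum, as in the post
rss = math.hypot(accuracy_f, observation_f)    # root-sum-square, the usual
                                               # rule for independent errors

print(f"worst case:      +/-{worst_case:.1f} F = +/-{worst_case / 1.8:.2f} C")
print(f"root-sum-square: +/-{rss:.2f} F = +/-{rss / 1.8:.2f} C")
```

Either way the combined figure is comparable to the ~0.7°C/century signal under discussion.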

===============================================================

UPDATE: This comment below from Willis Eschenbach, spurred by Steven Mosher, is insightful, so I’ve decided to add it to the main body – Anthony

===============================================================

Willis Eschenbach says:

As Steve Mosher has pointed out, if the errors are random normal, or if they are “offset” errors (e.g. the whole record is warm by 1°), increasing the number of observations helps reduce the size of the error. All that matters are things that cause a “bias”, a trend in the measurements. There are some caveats, however.

First, instrument replacement can certainly introduce a trend, as can site relocation.

Second, some changes have hidden bias. The short maximum length of the wiring connecting the electronic sensors introduced in the late 20th century moved a host of Stevenson Screens much closer to inhabited structures. As Anthony’s study showed, this has had an effect on trends that I think is still not properly accounted for, and certainly wasn’t expected at the time.

Third, in lovely recursiveness, there is a limit on the law of large numbers as it applies to measurements. A hundred thousand people measuring the width of a hair by eye, armed only with a ruler measured in mm, won’t do much better than a few dozen people doing the same thing. So you need to be a little careful about saying problems will be fixed by large amounts of data.

Fourth, if the errors are not random normal, your assumption that everything averages out may (I emphasize may) be in trouble. And unfortunately, in the real world, things are rarely that nice. If you send 50 guys out to do a job, there will be errors. But these errors will NOT tend to cluster around zero. They will tend to cluster around the easiest or most probable mistakes, and thus the errors will not be symmetrical.

Fifth, the law of large numbers (as I understand it) refers to either a large number of measurements made of an unchanging variable (say hair width or the throw of dice) at any time, or it refers to a large number of measurements of a changing variable (say vehicle speed) at the same time. However, when you start applying it to a large number of measurements of different variables (local temperatures), at different times, at different locations, you are stretching the limits …

Sixth, the method usually used for ascribing uncertainty to a linear trend does not include any adjustment for known uncertainties in the data points themselves. I see this as a very large problem affecting all calculation of trends. All that is ever given is the statistical error in the trend, not the real error, which perforce must be larger.

Seventh, there are hidden biases. I have read (but haven’t been able to verify) that under Soviet rule, cities in Siberia received government funds and fuel based on how cold it was. Makes sense, when it’s cold you have to heat more, takes money and fuel. But of course, everyone knew that, so subtracting a few degrees from the winter temperatures became standard practice …

My own bozo cowboy rule of thumb? I hold that in the real world, you can gain maybe an order of magnitude by repeat measurements, but not much beyond that, absent special circumstances. This is because despite global efforts to kill him, Murphy still lives, and so no matter how much we’d like it to work out perfectly,  errors won’t be normal, and biases won’t cancel, and crucial data will be missing, and a thermometer will be broken and the new one reads higher, and …

Finally, I would back Steven Mosher to the hilt when he tells people to generate some pseudo-data, add some random numbers, and see what comes out. I find that actually giving things a try is often far better than profound and erudite discussion, no matter how learned.

w.
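Willis’s closing endorsement of Mosher’s suggestion (generate pseudo-data, add noise, see what comes out) is easy to act on. A minimal sketch, with every number invented for illustration: unbiased noise barely moves a fitted trend, while a modest step bias from a mid-record instrument change inflates it:

```python
import random

# Pseudo-data experiment in the spirit of Mosher's suggestion. The trend,
# noise size, step size, and changeover year are all invented.

random.seed(42)
N_YEARS = 100
TRUE_TREND = 0.007   # deg C per year, i.e. ~0.7 C per century

def fitted_trend(series):
    """Ordinary least-squares slope of a series against its index."""
    n = len(series)
    xm = (n - 1) / 2.0
    ym = sum(series) / n
    num = sum((i - xm) * (y - ym) for i, y in enumerate(series))
    den = sum((i - xm) ** 2 for i in range(n))
    return num / den

# Case 1: large but unbiased reading errors (+/-0.5 C, uniform)
noisy = [TRUE_TREND * t + random.uniform(-0.5, 0.5) for t in range(N_YEARS)]
# Case 2: no noise at all, but a 0.3 C step when the instrument
# is replaced halfway through the record
stepped = [TRUE_TREND * t + (0.3 if t >= 50 else 0.0) for t in range(N_YEARS)]

print(f"true trend:          {TRUE_TREND:.4f} C/yr")
print(f"with unbiased noise: {fitted_trend(noisy):.4f} C/yr")   # near the truth
print(f"with 0.3 C step:     {fitted_trend(stepped):.4f} C/yr") # inflated
```

This is exactly the distinction Willis draws: random errors wash out of the trend, a bias does not.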

Mark T
January 22, 2011 3:46 pm

Oh my God! Anthony, this joker needs to be on the quote of the week.
How can you be a 30-year engineer and not know how science works? Wow… flabbergasted.
Mark

Dave Springer
January 22, 2011 3:57 pm

Mark T says:
January 22, 2011 at 3:35 pm
“What???”
You need to figure out the difference between systematic and random sources of instrument errors and how these are addressed in the real world. Off the top of my head I can describe to you how the computer controlled electro-mechanical flight control surfaces on the Space Shuttle were designed to eliminate both systematic and random errors. The nut of it is redundant systems designed by independent teams who weren’t allowed to crib from each other. Since you suggest people “read up” on things, I suggest you do some reading up yourself, or better yet get out into the real world where people actually do this stuff for a living and when they’re wrong it results in destruction, loss of life, and an instant end to a career. Pilots, for instance (yeah, I’ve got a pilot’s license in case you’re wondering), that don’t know how to read instruments have short careers and sometimes short lives too.

Mark T
January 22, 2011 3:58 pm

I mean Dave Springer… “prove it doesn’t work the way I said it does.” Prove I don’t have a leprechaun in my back yard. I said he’s there and thus, it must be true.
Words fail me.
I was actually ignoring Lazy Teenager, though he did have a few correct statements.
Mark

Mark T
January 22, 2011 4:04 pm

Dave Springer says:
January 22, 2011 at 3:57 pm
Mark T says:
January 22, 2011 at 3:35 pm

You need to figure out the difference between systematic and random sources of instrument errors and how these are addressed in the real world.

I don’t have to figure out anything. You have to prove that the random errors are a) independent and b) taken from identical distributions.
Again, I ask, can you do that?

Since you suggest people “read up” on things I suggest you do some reading up yourself or better yet get out into the real world where people actually do this stuff for a living and when they’re wrong it results in destruction, loss of life, and an instant end to a career.

Don’t worry, I work for a living. I suggest you actually attempt to understand the theory you are applying and why you are applying it incorrectly. When you integrate a large number of ADC values that are corrupted by thermal noise, you will most definitely get a sqrt(N) reduction in noise. Averages across different measuring devices, taken at different times, in different locations, with different error statistics, however, does not provide the same guarantee.
That’s how it really works in the real world.
Mark

RoHa
January 22, 2011 4:04 pm

“Until the 1960′s almost all global temperatures were measured in Fahrenheit. ”
Really? I was around then, and I seem to remember that by the 1960s Fahrenheit was only used in ex-British Empire countries and the US. Everywhere else used Celsius.

David A. Evans
January 22, 2011 4:05 pm

Ian W says:
January 22, 2011 at 11:42 am
E.M.Smith, me and a few others have been arguing the same point. I gave up after a while as you can only bang your head against a wall so many times.
DaveE.

Dave Springer
January 22, 2011 4:10 pm

Speaking of pilots and instrument error, there was this time when I was a student pilot flying solo and I rented a plane that happened to have an air-speed indicator that showed MPH instead of KNOTS. Silly me. I thought they all read out knots and never noticed the small print reading MPH on the dial. Redundancy saved my butt. Takeoff, landing, and cruise performance didn’t feel right. The plane had an indicated speed that just didn’t agree with what I knew should be happening at certain speeds. I presumed the indicator was wrong, but as an inexperienced pilot I knew the seat of my pants wasn’t the most reliable thing in the world either. So I split the difference between what my gut told me was the right speed for various flight conditions and what the instrument told me was the right speed. Upon landing I told my instructor (he was a Marine Corps fighter pilot) about it and he told me that in a rare few aircraft some boneheads install air speed indicators that read out miles per hour instead of knots. After reading me the riot act for not paying closer attention to my instruments he said he was proud of me, as I had demonstrated I was proficient enough to pilot a plane with a faulty air-speed indicator.

island
January 22, 2011 4:16 pm

It may seem off the subject, but there seem to be a lot of senior, experienced people reading and commenting on this blog who may be interested in this.
So, for what it is worth for all those people with high blood pressure: as I recall, many blood pressure cuffs can have an error of plus or minus 20% and still be certified as accurate.
And this is from an Australian report: “Sphygmomanometers can pass validation tests despite producing clinically significant errors that can be greater than 15 mmHg in some individuals.” http://www.racgp.org.au/afp/200710/200710turner.pdf

Mark T
January 22, 2011 4:23 pm

Finally, I would back Steven Mosher to the hilt when he tells people to generate some pseudo-data, add some random numbers, and see what comes out.

I do that all the time. It is verification that when you do meet the i.i.d. requirements the CLT and LLN work. It does not, however, provide you with a warm fuzzy regarding errors that are drawn from distributions with differing statistics. If one type of error is, for example, drawn from a uniform distribution and another from a Gaussian distribution, you cannot expect the errors to cancel. Gaussianity is not required for these two theorems, btw (Gaussianity is actually the result of the CLT.)
The LLN works best when you are measuring the same thing over and over, but that is not required. Simply having the same statistics suffices (hehe, “simply” is hardly the right word, it is very difficult to achieve.) In the absence of identical statistics, your actual error (in an average) will be somewhere between the sqrt(N) you desire and the largest error in your data.
Dave, you MUST have had at least some statistics training while you were getting your engineering degree. I had at least 3 or 4 classes just in my undergrad alone. Surely someone along the line explained this to you?
Mark

Dave Springer
January 22, 2011 4:23 pm

Mark T says:
January 22, 2011 at 3:45 pm
“You’re joking, right? YOU MADE THE CLAIM, you need to prove it, not me.”
Two can play that game, Mark. You claimed I was wrong. You prove your claim.
“Where on earth did you get this from? Who said it is worthless? I only noted that you cannot arbitrarily cancel errors, and I am correct in that statement.”
There is nothing arbitrary in the way that instrument errors cancel out through redundancy.

Mark T
January 22, 2011 4:28 pm

David A. Evans says:
January 22, 2011 at 4:05 pm

E.M.Smith, me and a few others have been arguing the same point. I gave up after a while as you can only bang your head against a wall so many times.

Are you referring to the heat content issue? If so, I definitely agree. The average, though known to be simply a mathematical construct, loses all meaning when averaging different things. Any given temperature can be arrived at from different levels of heat. Why anyone ever argues it makes sense to average temperature has always been beyond me.
Mark

Mark T
January 22, 2011 4:33 pm

Dave Springer says:
January 22, 2011 at 4:23 pm
Mark T says:
January 22, 2011 at 3:45 pm

Two can play that game, Mark. You claimed I was wrong. You prove your claim.

NO! Where on earth did you learn this? You made a claim and I pointed out what YOU have to prove for your claim to be true. You can’t just say “It’s true because I say so.” That’s nonsense. Meet the requirements or shut up.
I have already given you the requirements which you have thus far failed to address. Do you even understand what i.i.d. means?

There is nothing arbitrary in the way that instrument errors cancel out through redundancy.

If they are uncorrelated and drawn from identical distributions, no, there is not. But you need to meet these requirements otherwise you cannot claim they cancel.
So, Dave, are you going to answer any of my questions or are you just going to keep running in circles?
Mark

EFS_Junior
January 22, 2011 4:40 pm

Mark T says:
January 22, 2011 at 2:29 pm
The other thing that is instructive is to compare two thermometers that are within a few km of each other over a period of, say, 100 years. Look at the correlation.
98% plus.
If both thermometers were affected by the same sort of physical process that was causing a degradation of accuracy over time, you would expect them to have highly correlated data, too.
_____________________________________________________________
Please extend this analogue, first to tens of thermometer readings, then to hundreds, and finally to thousands, in terms of having identical systematic errors, and in probabilistic terms (e.g. how likely are thousands of thermometers to have the exact same systematic errors AND show identical low-frequency trendlines?). TIA
_____________________________________________________________
Or you can write a simulation of a sensor with very gross errors. simulate daily data for 100 years. Assume small errors. calculate the trend. Assume large errors. calculated the trend.
If you’re drawing your “errors” using independent trials from the same distribution of course this will work. That is a trivial application of the CLT that proves nothing other than the fact that the CLT works if you meet all the requirements.
In general, I don’t think anybody in here actually understands how the CLT or LLN work. A few came close. You do not need a normal distribution for the CLT to work. You need independent and identically distributed (i.i.d.) error distributions for errors to cancel with the sqrt(N). It is my hope that at some point everyone will figure out how much of a limit i.i.d. really is.
_____________________________________________________________
And I don’t think anyone here understands how to extract a very real low frequency signature from “noisy” data. 🙁
_____________________________________________________________
Generally speaking, independence is not really required (independence is calculated over all time, which is not possible,) just orthogonality (uncorrelatedness) but the errors do need to be drawn from an identical distribution if you want the cancellation property to apply. That implies the same mean and variance, btw. The mean and variance need to exist, obviously, and they also should be stationary (unless they all vary identically over time,) which is not as obvious but easy to figure out. That also implies that if the errors are a function of the thing you’re measuring, e.g., a percentage, then the CLT will not apply. Sorry. Get over it. The same applies to situations in which the error distributions are unknown, which clearly applies to temperature measurements.
Increased uncertainty in the data itself implies increased uncertainty in any calculations done with the data. If the i.i.d. requirement is not met, then you have no choice but to assume the errors do not cancel… anywhere. It sucks, I know, but them’s the breaks. Stay away from statistical endeavors if you cannot wrap your head around this very basic concept.
Mark
_____________________________________________________________
What the heck does LLT stand for?
Standard practice is to spell a term out first and then put it in parentheses (e.g., per http://www.acronymgeek.com/LLT, “Language Learning & Technology (LLT)”) for later reference(s).
http://en.wikipedia.org/wiki/Random_errors
“In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its “theoretical value”. The error of a sample is the deviation of the sample from the (unobservable) true function value; while the residual of a sample is the difference between the sample and the estimated function value.”
Perfect Deconstructionist logic therefore dictates that nothing is knowable. Q.E.D.

Philip Shehan
January 22, 2011 4:42 pm

Mark T:
Dave Springer wrote:
“Thousands of people reading thousands of different thermometers for hundreds of years won’t give you the confidence to say it was 70.2 degrees +-1 degree on April 4th, 1880 in Possum Trot, Kentucky but it will allow you to say the average temperature for April in Kentucky in 1880 was 0.5 degrees +-0.1 degrees cooler in 1880 than it was 1980.”
And you replied:
“No, they won’t, not unless you can prove the errors are drawn from independent and identically distributed distributions.”
If I understand you both correctly, I’m afraid I must agree with Dave Springer. Random errors being, well, random, multiple measurements will cancel them out. And when you are measuring changes in temperature, systematic errors will also cancel out. Take a thermometer that reads 1 degree high. If in 1950 it read 72F, it was actually 71F. If in 2000 it read 74F, it was actually 73F. But the rise is 2F regardless of whether you correct for the true temperature or not.
And temperatures from proxy measurements such as tree ring growth can be used to check past thermometer readings, or even to infer temperatures where no measurements were taken.
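Shehan’s offset argument, and the caveat that it only holds when the offset is constant, can be checked with a toy calculation (numbers invented for illustration):

```python
# Toy check of the offset argument (numbers invented): a constant
# instrument offset cancels when taking the 1950-to-2000 difference,
# but an error that drifts over the interval does not.

true_1950, true_2000 = 71.0, 73.0              # true temperatures, deg F
true_change = true_2000 - true_1950            # 2.0 deg F

# Thermometer reading a constant 1 deg F high: the offset cancels exactly.
offset_change = (true_2000 + 1.0) - (true_1950 + 1.0)

# Thermometer whose error grows from 0.0 to +0.7 deg F over the 50 years
# (the order of the glass-ageing figure quoted in the post): trend corrupted.
drift_change = (true_2000 + 0.7) - (true_1950 + 0.0)

print(f"true {true_change:.1f}  offset {offset_change:.1f}  drift {drift_change:.1f}")
```

So the disagreement in this thread largely turns on whether the errors are fixed offsets (which difference away) or time-varying biases (which do not).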

Mark T
January 22, 2011 4:43 pm

From Wikipedia:

Two different versions of the Law of Large Numbers are described below; they are called the Strong Law of Large Numbers, and the Weak Law of Large Numbers. Both versions of the law state that – with virtual certainty – the sample average converges to the expected value where X1, X2, … is an infinite sequence of i.i.d. random variables with finite expected value E(X1) = E(X2) = … = µ < ∞.

Bold mine. The definition of i.i.d. is:

In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent.

This is taught in engineering school. It is required, btw. You should know this, Dave. Why don’t you?
Mark
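A numerical sketch of the distinction being argued here, with illustrative values only: averaging i.i.d. zero-mean errors drives the sample mean toward zero, but a systematic offset shared by the measurements survives the averaging untouched.

```python
import random

random.seed(42)
N = 100_000

# i.i.d. zero-mean reading errors: the sample mean converges toward zero,
# as the Law of Large Numbers promises for this case.
iid_errors = [random.gauss(0.0, 0.5) for _ in range(N)]
mean_iid = sum(iid_errors) / N
assert abs(mean_iid) < 0.01

# The same random errors with a shared +0.3 systematic offset added:
# averaging any number of readings leaves the offset fully intact.
biased = [e + 0.3 for e in iid_errors]
mean_biased = sum(biased) / N
assert abs(mean_biased - 0.3) < 0.01
```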

Dave Springer
January 22, 2011 4:46 pm

island says:
January 22, 2011 at 4:16 pm

So for what it is worth for all those people with high blood pressure: certification of many blood pressure cuffs has an acceptable error of plus or minus 20% to be certified accurate as I recall.
And this is from an Australian report: “Sphygmomanometers can pass validation tests despite producing clinically significant errors that can be greater than 15 mmHg in some individuals.”

I don’t think I’ve seen the term sphygmomanometer since I took “Human Anatomy and Physiology” 30-some years ago in college. That was my favorite class of all time. It was a required class for registered nurses and we got to play with all the standard medical office instruments in the lab portion. What made the class so great was there were, counting the instructor, only two guys in a class of 30 people. The other 28 were 18-20 year-old girls. Yowzah. That’s a good ratio and unlike the instructor, I was under no professional constraint to avoid intimate contact with the students if you get my drift.
Anyhow, to the best of my recall, we weren’t taught that blood pressure cuffs could be wrong by that much. Of course we didn’t have those automatic jobbies you find in grocery stores and whatnot. Just the regular cuff inflated by a hand squeezed pump and a stethoscope to listen for diastolic and systolic pulse-pressure points. The operator could easily be wrong by 15 points but the instruments could rarely if ever be blamed for that much error. In fact at my last checkup the gal who took my blood pressure before the doctor came in got mine wrong by 20 points. I told her she must be mistaken and she said she’d ask the doctor to take it again. Sure enough when my sawbones took it a few minutes later it was 20 points lower.

Mark T
January 22, 2011 4:51 pm

Philip Shehan says:
January 22, 2011 at 4:42 pm

If I understand you both correctly, I’m afraid I must agree with Dave Springer.

Because you don’t know what you’re talking about, either.

Random errors being, well random, multiple measurements will cancel them out.

Did you bother to read the definition of the LLN before you made this post? Seriously, it’s online, and I have now posted the definition as well as the requirements. This is getting more and more bizarre with every post. Are you the same sort of person that thinks this way, too:

Two can play that game, Mark. You claimed I was wrong. You prove your claim.

Of course, I did prove my claim, though I did not need to: I provided the definition of the LLN.

And when you are measuring changes in temperature, systematic errors will also cancel out. Take a thermometer that reads 1 degree high. If in 1950 it read 72 F, it was actually 71 F. If in 2000 it read 74 F, it was actually 73 F. But the rise is 2 F regardless of whether you correct for the true temperature or not.

Wow. Ignorance is contagious. That’s all I can say regarding this thread.
Mark
And temperatures from proxy measurements such as tree ring growth can be used to check past thermometer readings, or even infer measurements where no measurements were taken.

Mark T
January 22, 2011 4:54 pm

The last bit did not get edited off properly, but it is a rather silly statement. I think I understand why you agree with Dave Springer. Tree rings would be a useful proxy if temperature was the only thing affecting their growth, but sadly, they are actually driven more by water.
Mark

u.k.(us)
January 22, 2011 4:55 pm

Dave Springer says:
January 22, 2011 at 4:10 pm
=============
I call B.S. on this entire entry, and therefore all your recent comments.
From one pilot, to “another”.

Dave Springer
January 22, 2011 4:58 pm

Mark T says:
January 22, 2011 at 4:43 pm
“This is taught in engineering school. It is required, btw. You should know this, Dave. Why don’t you?”
What you’re taught in school and what you learn in the real world are often two different things. That’s why you don’t step out of school into a senior engineering position. I have a pretty good idea of why you don’t know that, Mark.
The thermometers used in the instrument record came from dozens of different manufacturers using different manufacturing methods and different technologies (alcohol vs. mercury for instance) with various sources of error from each due to quality control and whatnot. Over the course of millions of readings recorded by thousands of people the errors, unless they are systematic in nature, will cancel out. The wikipedia article you quoted in effect states just that. I haven’t seen anyone come up with a description of systematic error that would produce gradually rising temperatures in this scenario. No systematic error, random error cancels out, only data of concern is change in temperature over time rather than exact temperature at any one time ergo there is nothing wrong with the raw data. I won’t say the same for the “adjustments” made to the raw data though. That’s what called pencil-whipping and is frowned upon (to say the least) in my circles.

David A. Evans
January 22, 2011 5:01 pm

Mark T says:
January 22, 2011 at 4:28 pm
Correct, I’m referring to energy content. That’s what is really being argued.
OEC is a difficult one because the energy density of water requires more accurate temperature measurement but the temperature is at least more linearly related to energy.
DaveE.

January 22, 2011 5:10 pm

If the trend does not exist beyond the margin of error then the trend does not exist at all.
You cannot measure a 0.7º C average trend if your measuring equipment is, on average, only accurate to 1.3º C.
But you can certainly claim that you can, and no one will be able to ‘prove’ otherwise.
This is why the claimed warming is so small. If it were outside the margin of error and it were indeed fraudulent, proving the fraud would be a mere formality.
The irony of playing it this safe is that 0.7º C in 100+ years does not equate to an anomalous warming event. Particularly if the margin of error is 1.3º C.
In fact it is comforting to know that even after the entire industrial revolution, including the recent and ongoing industrialisation of India and China, the so called “global warming signal” is indistinguishable from the noise of the method of data collection.
I am therefore satisfied that the AGW hoax has been exposed for the fraud that it is.

January 22, 2011 5:12 pm

Willis Eschenbach says:
“The short maximum length of the wiring connecting the electronic sensors introduced in the late 20th century moved a host of Stevenson Screens much closer to inhabited structures. As Anthony’s study showed, this has had an effect on trends that I think is still not properly accounted for, and certainly wasn’t expected at the time.”
If I may add something: in electronic thermometers using thermocouples [which most do], the voltage comes from welded wires of dissimilar metals whose output changes with temperature; the welded bead itself does not provide 100% of the voltage output. The wires themselves have a declining effect according to their length from the weld. Therefore, it matters how deep into the oven, or furnace, or Stevenson screen the thermocouple is placed. This is in addition to the length of the wires as noted by Willis.
Also, routine, periodic calibration is essential. The output of thermocouples is in the millivolt/microvolt range, and a voltmeter reads the output. Voltmeters tend to drift over time, and there is also hysteresis: when a thermocouple is heated, then allowed to return to ambient, the output at the same ambient temperature point is slightly different each time.
In my experience [and I’ve calibrated thousands of temperature devices to NIST primary and secondary standards], a well constructed mercury thermometer is usually superior to an electronic thermometer. That is why electronic thermometers almost always have a shorter calibration recall period than mercury thermometers.
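To put numbers on the millivolt/microvolt point, here is a rough sketch using the nominal type-K sensitivity of about 41 µV per degree C; the 20 µV voltmeter drift figure is purely illustrative, not from any particular instrument.

```python
# Nominal type-K thermocouple sensitivity near room temperature is about
# 41 microvolts per degree C, so a 25 C reading is only ~1 mV of signal.
SEEBECK_UV_PER_C = 41.0

def temp_from_uv(microvolts):
    """Linear approximation; real calibration uses NIST polynomial tables."""
    return microvolts / SEEBECK_UV_PER_C

true_temp = 25.0
signal_uv = true_temp * SEEBECK_UV_PER_C   # 1025 uV

# A voltmeter offset drift of just 20 microvolts (illustrative figure)
# already corresponds to roughly half a degree of temperature error:
error_c = temp_from_uv(signal_uv + 20.0) - true_temp
assert abs(error_c - 20.0 / 41.0) < 1e-9
```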

Oliver Ramsay
January 22, 2011 5:36 pm

Dave Springer said (amongst other things) “…….Over the course of millions of readings recorded by thousands of people the errors, unless they are systematic in nature, will cancel out….”
———————
I’m beginning to warm to this notion; two wrongs don’t make a right but millions of wrongs do.
I’ve been trying to sell this to my wife but she’s not crazy about the smug pomposity that my new belief engenders.
