The Metrology of Thermometers

For those who didn’t notice: this is about metrology, not meteorology, though meteorology uses the final product. Metrology is the science of measurement.

Since we had this recent paper from Pat Frank that deals with the inherent uncertainty of temperature measurement, establishing a new minimum uncertainty value of ±0.46°C for the instrumental surface temperature record, I thought it valuable to review the uncertainty associated with the act of temperature measurement itself.

As many of you know, the Stevenson Screen aka Cotton Region Shelter (CRS), such as the one below, houses Tmax (mercury) and Tmin (alcohol) recording thermometers.


Hanksville, UT USHCN climate monitoring station with Stevenson Screen - sited over a gravestone. Photo by surfacestations.org volunteer Juan Slayton

They look like this inside the screen:

NOAA standard issue max-min recording thermometers, USHCN station in Orland, CA - Photo: A. Watts

Reading these thermometers would seem to be a simple task. However, that’s not quite the case. Adding to the statistical uncertainty derived by Pat Frank, as we see below in this guest re-post, measurement uncertainty in both the long and short term is also an issue. The following appeared on the blog “Mark’s View”, and I am reprinting it here in full with permission from the author. There are some enlightening things to learn about the simple act of reading a liquid-in-glass (LIG) thermometer that I didn’t know, as well as some long-term issues (like the hardening of the glass) with values about as large as the climate change signal for the last 100 years, ~0.7°C – Anthony

==========================================================

Metrology – A guest re-post by Mark of Mark’s View

This post is actually about the poor quality and processing of historical climatic temperature records rather than metrology.

My main points are that in climatology many important factors that are accounted for in other areas of science and engineering are completely ignored by many scientists:

  1. Human errors in the accuracy and resolution of historical data are ignored
  2. Mechanical thermometer resolution is ignored
  3. Electronic gauge calibration is ignored
  4. Mechanical and electronic temperature gauge accuracy is ignored
  5. Hysteresis in modern data acquisition is ignored
  6. Conversion from degrees F to degrees C introduces false resolution into the data

Metrology is the science of measurement, embracing both experimental and theoretical determinations at any level of uncertainty in any field of science and technology. Believe it or not, the metrology of temperature measurement is complex.

It is actually quite difficult to measure things accurately, yet most people just assume that information they are given is “spot on”. A significant number of scientists and mathematicians also do not seem to realise that the data they are working with is often not very accurate. Over the years as part of my job I have read dozens of papers based on pressure and temperature records where no reference is made to the instruments used to acquire the data, or their calibration history. The result is that many scientists frequently reach incorrect conclusions about their experiments and data because they do not take into account the accuracy and resolution of their data. (It seems this is especially true in the area of climatology.)

Do you have a thermometer stuck to your kitchen window so you can see how warm it is outside?

Let’s say you glance at this thermometer and it indicates about 31 degrees centigrade. If it is a mercury or alcohol thermometer you may have to squint to read the scale. If the scale is marked in 1°C steps (which is very common), then you probably cannot interpolate between the scale markers.

This means that this particular thermometer’s resolution is 1°C, which is normally stated as plus or minus 0.5°C (±0.5°C).
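To make the ±0.5°C figure concrete, here is a minimal sketch (not part of the original post; plain Python with illustrative values) that quantises random “true” temperatures to a 1°C scale, the way an observer reading whole-degree markings would:

```python
import random

def read_thermometer(true_temp_c):
    """Simulate reading a scale marked in 1 degree C steps:
    the observer records the nearest whole-degree marking."""
    return float(round(true_temp_c))

random.seed(42)
true_temps = [random.uniform(20.0, 40.0) for _ in range(100_000)]
errors = [read_thermometer(t) - t for t in true_temps]

print(max(abs(e) for e in errors))  # never exceeds 0.5
print(sum(errors) / len(errors))    # close to zero on average
```

The single-reading error is bounded by ±0.5°C, and for uniformly distributed temperatures it averages out to roughly zero; the later discussion in the comments is about when that averaging does and does not help.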

This example of resolution assumes the temperature is observed under perfect conditions, and that you have been properly trained to read a thermometer. In reality you might glance at the thermometer, or you might have to use a flash-light to look at it, or it may be covered in a dusting of snow, rain, etc. Mercury forms a pronounced meniscus in a thermometer that can exceed 1°C, and many observers incorrectly observe the temperature at the base of the meniscus rather than its peak. (This picture shows an alcohol meniscus; a mercury meniscus bulges upward rather than down.)

Another major common error in reading a thermometer is parallax error. This is where refraction of light through the glass of the thermometer exaggerates any error caused by the eye not being level with the surface of the fluid in the thermometer. (Image courtesy of Surface Meteorological Instruments and Measurement Practices by G.P. Srivastava, showing a mercury meniscus.)

If you are using data from hundreds of thermometers scattered over a wide area, with data being recorded by hand by dozens of different people, the assumed observational resolution should be degraded accordingly. In the oil industry, for example, it is common to accept an error margin of 2-4% when using manually acquired data.

As far as I am aware, historical raw temperature data from multiple weather stations has never been adjusted to account for observer error.

We should also consider the accuracy of the typical mercury and alcohol thermometers that have been in use for the last 120 years. Glass thermometers are calibrated by immersing them in ice/water at 0°C and a steam bath at 100°C. The scale is then divided equally into 100 divisions between zero and 100. However, a glass thermometer at 100°C is longer than a thermometer at 0°C. This means that the scale on the thermometer gives a false high reading at low temperatures (between 0 and 25°C) and a false low reading at high temperatures (between 70 and 100°C). This process is also followed for weather thermometers with a range of -20 to +50°C.

25 years ago, very accurate mercury thermometers used in labs (0.01°C resolution) came with a calibration chart/graph to convert the observed temperature on the thermometer scale to the actual temperature.
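A calibration chart of this kind can be applied in software by simple linear interpolation. The chart values below are hypothetical, invented to mimic the scale-stretch error described above (reads high near the low end, reads low near the high end, exact at the fixed points); a real chart would come from the instrument’s calibration certificate:

```python
def correct_reading(observed, chart):
    """Linearly interpolate a calibration chart: a list of
    (scale_reading, true_temperature) pairs, sorted by reading."""
    readings = [r for r, _ in chart]
    truths = [t for _, t in chart]
    if observed <= readings[0]:
        return truths[0]
    for i in range(1, len(readings)):
        if observed <= readings[i]:
            frac = (observed - readings[i - 1]) / (readings[i] - readings[i - 1])
            return truths[i - 1] + frac * (truths[i] - truths[i - 1])
    return truths[-1]

# Hypothetical chart: the scale reads high near 0 C, low near 100 C,
# and is exact at the two calibration fixed points.
chart = [(0.0, 0.0), (15.0, 14.8), (50.0, 50.0), (85.0, 85.2), (100.0, 100.0)]
print(correct_reading(15.0, chart))  # a scale reading of 15.0 is really 14.8
```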

Temperature cycles harden and shrink the glass bulb of a thermometer over time; a 10-year-old -20 to +50°C thermometer will give a false high reading of around 0.7°C.

Over time, repeated high-temperature cycles cause alcohol thermometers to evaporate vapour into the vacuum at the top of the column, creating false low temperature readings of up to 5°C. (That is 5.0°C, not 0.5 – it’s not a typo…)

Electronic temperature sensors have been used more and more in the last 20 years for measuring environmental temperature. These also have their own resolution and accuracy problems. Electronic sensors suffer from drift and hysteresis and must be calibrated annually to stay accurate, yet most weather station temperature sensors are NEVER calibrated after they have been installed.

Drift is where the recorded temperature creeps steadily up or down even when the real temperature is static; it is a fundamental characteristic of electronic sensing devices that cannot simply be compensated for. Typical drift of a -100°C to +100°C electronic thermometer is about 1°C per year, and the sensor must be recalibrated annually to remove this error.

Hysteresis is a common problem as well. This is where increasing temperature has a different mechanical effect on the thermometer than decreasing temperature, so, for example, if the ambient temperature increases by 1.05°C the thermometer reads an increase of 1°C, but when the ambient temperature drops by 1.05°C the same thermometer records a drop of 1.1°C. (This is a VERY common problem in metrology.)
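The 1.05°C example above can be sketched as a toy model. The gain factors here are illustrative assumptions chosen to reproduce the numbers in that example, not measured properties of any real sensor:

```python
def hysteretic_reading(prev_true, new_true, gain_up=0.95, gain_down=1.05):
    """Sketch of hysteresis: the sensor under-responds to rising
    temperature and over-responds to falling temperature."""
    delta = new_true - prev_true
    gain = gain_up if delta >= 0 else gain_down
    return prev_true + gain * delta

# A true rise of 1.05 C is read as roughly 1.0 C ...
print(hysteretic_reading(20.0, 21.05) - 20.0)
# ... while a true fall of 1.05 C is read as roughly 1.1 C.
print(21.05 - hysteretic_reading(21.05, 20.0))
```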

Here is the behaviour of a typical food temperature sensor compared against a calibrated thermometer, without even considering sensor drift: depending on the measured temperature, the offset in this high-accuracy gauge runs from -0.8 to +1°C.

But on top of these issues, the people who make these thermometers and weather stations state clearly the accuracy of their instruments, yet scientists ignore them! The packaging of a -20°C to +50°C mercury thermometer will state the accuracy of the instrument as ±0.75°C, for example, yet frequently this information is not incorporated into the statistical calculations used in climatology.

Finally we get to the infamous conversion of degrees Fahrenheit to degrees Centigrade. Until the 1960s almost all global temperatures were measured in Fahrenheit. Nowadays all the proper scientists use Centigrade, so all old data is routinely converted to Centigrade: take the original temperature, subtract 32, multiply by 5 and divide by 9.
C= ((F-32) x 5)/9

Example: the original reading from a 1950 data file is 60F. This figure was eyeballed by the local weatherman and written into his tallybook. 50 years later a scientist takes it and converts it to centigrade:

60-32 =28
28×5=140
140/9= 15.55555556

This is usually (incorrectly) truncated to two decimal places, 15.55°C, without any explanation as to why this level of resolution has been selected. (Proper rounding would give 15.56°C; either way, the two extra decimal places are false resolution.)

The correct mathematical method of handling this issue of resolution is to look at the original resolution of the recorded data. Typically, old Fahrenheit data was recorded in increments of 2 degrees F, e.g. 60, 62, 64, 66, 68, 70. Very rarely on old data sheets do you see 61, 63, etc. (although 65 is slightly more common).

If the original resolution was 2 degrees F, the resolution used for the same data converted to Centigrade should be 1.1°C.

Therefore, mathematically:
60F = 16C
61F = 16C
62F = 17C
etc.
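The rule being argued for can be sketched as: convert, then round to the coarsest step the original data supports. Rounding to whole degrees C is the nearest practical step to the 1.1°C implied by 2°F data (this simplification is mine, not the author’s):

```python
def f_to_c_whole(f):
    """Convert Fahrenheit to Celsius, then round to whole degrees C,
    since 2-degree-F data cannot support anything finer (2 F ~ 1.1 C)."""
    return round((f - 32) * 5 / 9)

for f in (60, 61, 62):
    print(f, "F ->", f_to_c_whole(f), "C")
```

Note that reporting 15.55°C instead of 16°C manufactures two decimal places of resolution that were never in the observation.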

In conclusion, when interpreting historical environmental temperature records one must account for errors of accuracy built into the thermometer and errors of resolution built into the instrument as well as errors of observation and recording of the temperature.

In a high-quality glass environmental thermometer manufactured in 1960, the accuracy would be ±1.4F (2% of range).

The resolution of an astute and dedicated observer would be around ±1F.
Therefore the total error margin of all observed weather station temperatures would be a minimum of ±2.4F, or about ±1.3°C…
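For what it’s worth, how the two error sources are combined matters. The post adds them linearly, which is a worst-case bound; if the errors were independent and random, the usual convention is to combine them in quadrature, which gives a smaller figure. A sketch of both conventions, using the numbers above:

```python
import math

# Two error sources from the discussion above, in degrees F:
instrument = 1.4   # manufacturer-stated accuracy
observer = 1.0     # reading/resolution error

worst_case = instrument + observer              # simple sum, ~2.4 F
quadrature = math.hypot(instrument, observer)   # root-sum-square, ~1.7 F

print(round(worst_case, 1), round(quadrature, 2))
```

Which convention applies depends on whether the errors are independent and random or systematic; the comments below argue about exactly that point.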

===============================================================

UPDATE: This comment below from Willis Eschenbach, spurred by Steven Mosher, is insightful, so I’ve decided to add it to the main body – Anthony

===============================================================

Willis Eschenbach says:

As Steve Mosher has pointed out, if the errors are random normal, or if they are “offset” errors (e.g. the whole record is warm by 1°), increasing the number of observations helps reduce the size of the error. All that matter are things that cause a “bias”, a trend in the measurements. There are some caveats, however.

First, instrument replacement can certainly introduce a trend, as can site relocation.

Second, some changes have hidden bias. The short maximum length of the wiring connecting the electronic sensors introduced in the late 20th century moved a host of Stevenson Screens much closer to inhabited structures. As Anthony’s study showed, this has had an effect on trends that I think is still not properly accounted for, and certainly wasn’t expected at the time.

Third, in lovely recursiveness, there is a limit on the law of large numbers as it applies to measurements. A hundred thousand people measuring the width of a hair by eye, armed only with a ruler measured in mm, won’t do much better than a few dozen people doing the same thing. So you need to be a little careful about saying problems will be fixed by large amounts of data.
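The hair-and-ruler point can be tested directly. In this sketch (all numbers illustrative: a 0.07 mm hair, roughly 0.2 mm of eye error, a 1 mm grid) the average of 100,000 readings does converge, but to a quantisation-biased value rather than the true width, so piling on more observers stops helping:

```python
import random

random.seed(0)
TRUE_WIDTH_MM = 0.07  # a human hair, roughly 70 microns

def measure_with_mm_ruler(true_mm):
    """One observer: some eye error, then reading a 1 mm grid."""
    return round(true_mm + random.gauss(0, 0.2))

n = 100_000
mean = sum(measure_with_mm_ruler(TRUE_WIDTH_MM) for _ in range(n)) / n
print(round(mean, 4), "mm, vs a true width of", TRUE_WIDTH_MM, "mm")
```

Because the noise is small compared to the 1 mm graduation, nearly every reading rounds to 0, and the sample mean settles far below the true 0.07 mm no matter how many observers are added.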

Fourth, if the errors are not random normal, your assumption that everything averages out may (I emphasize may) be in trouble. And unfortunately, in the real world, things are rarely that nice. If you send 50 guys out to do a job, there will be errors. But these errors will NOT tend to cluster around zero. They will tend to cluster around the easiest or most probable mistakes, and thus the errors will not be symmetrical.

Fifth, the law of large numbers (as I understand it) refers to either a large number of measurements made of an unchanging variable (say hair width or the throw of dice) at any time, or it refers to a large number of measurements of a changing variable (say vehicle speed) at the same time. However, when you start applying it to a large number of measurements of different variables (local temperatures), at different times, at different locations, you are stretching the limits …

Sixth, the method usually used for ascribing uncertainty to a linear trend does not include any adjustment for known uncertainties in the data points themselves. I see this as a very large problem affecting all calculation of trends. All that is ever given is the statistical error in the trend, not the real error, which perforce must be larger.

Seventh, there are hidden biases. I have read (but haven’t been able to verify) that under Soviet rule, cities in Siberia received government funds and fuel based on how cold it was. Makes sense, when it’s cold you have to heat more, takes money and fuel. But of course, everyone knew that, so subtracting a few degrees from the winter temperatures became standard practice …

My own bozo cowboy rule of thumb? I hold that in the real world, you can gain maybe an order of magnitude by repeat measurements, but not much beyond that, absent special circumstances. This is because despite global efforts to kill him, Murphy still lives, and so no matter how much we’d like it to work out perfectly,  errors won’t be normal, and biases won’t cancel, and crucial data will be missing, and a thermometer will be broken and the new one reads higher, and …

Finally, I would back Steven Mosher to the hilt when he tells people to generate some pseudo-data, add some random numbers, and see what comes out. I find that actually giving things a try is often far better than profound and erudite discussion, no matter how learned.
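In that spirit, here is a minimal pseudo-data experiment of the kind being suggested (all numbers illustrative: a 0.7°C/century trend, 0.5°C of random error on each monthly value):

```python
import random

random.seed(1)

def trend_per_decade(series):
    """Ordinary least-squares slope of a monthly series, in degrees/decade."""
    n = len(series)
    mx = (n - 1) / 2
    my = sum(series) / n
    num = sum((x - mx) * (y - my) for x, y in enumerate(series))
    den = sum((x - mx) ** 2 for x in range(n))
    return (num / den) * 120  # per month -> per decade

months = 1200                                      # 100 years, monthly
truth = [0.007 / 12 * m for m in range(months)]    # a 0.7 C/century trend
noisy = [t + random.gauss(0, 0.5) for t in truth]  # add fat random errors

print(round(trend_per_decade(truth), 3))
print(round(trend_per_decade(noisy), 3))
```

With purely random, unbiased errors the recovered trend barely moves, which is exactly Mosher’s point; the interesting failures come from errors that are not like this.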

w.

240 thoughts on “The Metrology of Thermometers”

  1. Very interesting post, thanks.

    As an engineer, most of the ideas here were pretty familiar to me. I find it almost unbelievable that the climate records aren’t being processed in a way which reflects the uncertainty of the data, as is stated in the article. What evidence is there that this is the case (I think I mean: is the processing applied by GISS or CRU clearly described, and is there any kind of audit trail? Have the academics who processed the data published the processing method?).

    I think the F to C conversion issue is tricky. I think the conversion method you favour (using significant figures in C to reflect uncertainty) would lead to a bias which varies with temperature. What is actually needed is proper propagation of the known uncertainty through the calculation, rather than using the implied accuracy of the number of significant figures. So the best conversion from 60F to C would be 15.55+/-1.1C (in your example above). But obviously, promulgating 15.55 is fraught with the danger of the known 1.1C uncertainty being forgotten, and the implied 0.005C certainty used instead. Which would be bad.

  2. I once designed a precision temperature “oven” which had a display showing the temperature to 0.01C. In practice the controller for the device was only accurate to 0.1C at best, and with a typical lab thermometer there was at least another 0.1C error. Then there was the fact that you were not measuring the temperature at the centre of the oven, and drift and even mains supply variation had a significant effect!

    All in all, the error of this device, which might appear to be accurate to 0.01C, could have been as bad as the total so-called “global warming signal”.

    I’ve also set up commercial weather stations using good commercial equipment which I believe is also used by many meteorological stations and the total error is above +/-1C even on this “good” equipment.

    As for your bog-standard thermometer from a DIY shop: go to one and take a reading from them all and see how much they vary … it’s normally as much as 2C or even 3C from highest to lowest.

    Basically, the kind of temperature error being quoted by the climategate team is only possible in a lab with regularly calibrated equipment.

  3. I think Hanksville was just a pretty typical set-up. Some of those stations in the study were in even worse shape from what I saw. I particularly liked the one in the junkyard. Okay… maybe it wasn’t a junkyard, but that was what it looked like.

    As for the Mark’s View post, I read this over at his site and thought it was fascinating. I knew about the calibration requirements for electronic test equipment, but had no idea of the vagaries behind the simple mechanical/visual thermometer.

  4. Are those thermometers nailed horizontally onto the Stevenson Screen?

    Surely a horizontal thermometer, without the gravitational component, will read differently to a vertical one. And if it is touching the wood, then surely there is radiative cooling of the wood at night and warming in the day. Surely the thermometers should be insulated from the screen by a few centimeters. How much error would this give??

  5. The picture of the inside of a Stevenson Screen raised a couple of questions in my mind that perhaps a professional can answer:

    Does the orientation of a glass thermometer affect its reading? I have thought of all thermometers as being used approximately vertically but those in the screen are shown as horizontal.

    Is the vertical position inside the screen relevant? Even though there are ventilation louvres, I would expect some sort of vertical temperature gradient within the enclosure, reporting higher temperatures closer to the top of the enclosure. This, of course would be especially bad when exposed to full sun, with a tendency therefore to over-report temperatures. I would also expect this problem to increase with time as the reflectivity of the white paint drops with flaking, build up of dust and dirt etc.

  6. So in reality the feared 0.6C temperature rise could mean that statistically there has been no temperature change. And all models give the wrong answer because temperature inputs are incorrect.
    Very interesting article – is a pdf copy possible please, Anthony?

  7. As long as the errors don’t trend in a biased way over time, the fact that there are thousands of sensors should make standard errors small (standard deviation divided by the square root of the number of observations).
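That shrinking of the standard error is easy to show with made-up numbers (this sketch is mine, purely illustrative):

```python
import random
import statistics

random.seed(7)
true_temp = 15.0
# 3000 independent readings, each with 1 degree of random error:
readings = [true_temp + random.gauss(0, 1.0) for _ in range(3000)]

sd = statistics.stdev(readings)   # spread of any single reading, ~1.0
se = sd / len(readings) ** 0.5    # uncertainty of the mean, ~0.02
print(round(sd, 2), round(se, 3))
```

The catch, as the rest of the thread argues, is that this only works for independent, unbiased errors.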

    Of course, one biased sensor in the middle of nowhere would have a disproportionate effect – although I’m not clear how the interpolations are done.

  8. As the earth’s surface area is ±196,935,000 square miles and its geological parameters are as diverse as it is possible to imagine, we have intelligent people declaring that the world is heating by as much as 2.0C per 100 years, with official recorded temperatures covering perhaps 1% of the total area of the earth. I fully agree with the above article: we humans are all idiots. Some of us think we understand what we are trying to do, but even reading a temperature looks beyond the scope of those whose job it is.
    Italy is a prime example of human intelligence, knowledge and understanding, there in that country for all to see, and Italians have in the past produced some of the world’s most outstanding thinkers. BUT!!!! if you put 100 Italians into a room and ask them to form a political party, at the end of a month you would find that they have formed 100-plus political parties, and thousands of political ideologies, none of which address the problems facing the country.
    And Italy is one of the world’s better countries; it had indoor baths and running hot water when the English were still learning to make fire. But here we are 3000 years later and 99.9% of the world’s population still cannot read a thermometer. Like I said above, we are all idiots.

  9. Anthony,

    You missed a few big sources of thermometer error that I know only from experience with them:

    Having them measure in direct sunlight: the material around the thermometer absorbs heat and gives a falsely warm reading.
    Some thermometers are press-fitted or loosely fastened and can slide in the sleeve.
    Having the thermometer snow-covered in heavy blowing wind.
    Moisture on the thermometers.

  10. “The correct mathematical method of handling this issue of resolution is to look at the original resolution of the recorded data. Typically old Fahrenheit data was recorded in increments of 2 degrees F, eg 60, 62, 64, 66, 68,70. very rarely on old data sheets do you see 61, 63 etc (although 65 is slightly more common)

    If the original resolution was 2 degrees F, the resolution used for the same data converted to Centigrade should be 1.1c.

    Therefore, mathematically:
    60F = 16C
    61F = 16C
    62F = 17C
    etc.

    In conclusion, when interpreting historical environmental temperature records one must account for errors of accuracy built into the thermometer and errors of resolution built into the instrument as well as errors of observation and recording of the temperature.”

    In GHCN observers recorded F by rounding up or down.
    Tmax to 1 degree F,
    Tmin to 1 degree F.
    Then the result is averaged and rounded: (Tmax+Tmin)/2.

    Now, if you think that changing from F to C is an issue you can do the following.

    calculate the trend in F
    Then
    convert F to C and calculate the trend.

    Also, the “observer error” and transcription errors are all addressed in the literature; see Brohan et al. 2006.

    The other thing that is instructive is to compare two thermometers that are within a few km of each other over a period of, say, 100 years. Look at the correlation:
    98% plus.

    Or you can write a simulation of a sensor with very gross errors. Simulate daily data for 100 years. Assume small errors; calculate the trend. Assume large errors; calculate the trend.

    Result? No difference. The error structure of individual measures doesn’t impact your estimation of the long-term trends. NOW, if many thermometers all had BIASES (not uncertainty), and if those biases were skewed hot or cold, and if those biases changed over time, then your trend estimation would get impacted.
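A minimal version of the simulation described here, with illustrative numbers (a 0.7°C/century trend, 0.5°C random errors, and a +0.5°C step bias appearing halfway through the record, e.g. from an instrument change):

```python
import random

random.seed(3)

def slope(series):
    """Ordinary least-squares slope per time step."""
    n = len(series)
    mx = (n - 1) / 2
    my = sum(series) / n
    num = sum((i - mx) * (y - my) for i, y in enumerate(series))
    den = sum((i - mx) ** 2 for i in range(n))
    return num / den

def per_century(series):
    return slope(series) * 1200  # monthly steps -> degrees per century

months = 1200                                       # 100 years, monthly
truth = [0.007 / 12 * m for m in range(months)]     # 0.7 C/century
noisy = [t + random.gauss(0, 0.5) for t in truth]   # gross but unbiased errors
stepped = [t + (0.5 if m >= 600 else 0.0)           # a bias that switches on
           for m, t in enumerate(truth)]            # halfway through

print(round(per_century(truth), 2))    # the true centennial trend
print(round(per_century(noisy), 2))    # big random errors barely move it
print(round(per_century(stepped), 2))  # a changing bias moves it a lot
```

The random errors leave the trend essentially untouched, while the mid-record step roughly doubles it, which is the distinction between uncertainty and bias being drawn here.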

  11. Stephen;
    You don’t get to just wave away the possibility of systematic error. That’s the whole point of error bars. You can’t know anything about systematic error WITHOUT ACTUALLY CHECKING FOR IT. And such checking has not been done. Ergo ….

  12. Steven Mosher

    Result? No difference. The error structure of individual measures doesn’t impact your estimation of the long-term trends. NOW, if many thermometers all had BIASES (not uncertainty), and if those biases were skewed hot or cold, and if those biases changed over time, then your trend estimation would get impacted.

    I agree with you, but only in the case where all biases occur at the same time, in the same direction, by the same amount. And that is never the case. One doesn’t even know how much, which and where that kind of bias appears until you look at every station and take a deep look at its history, which only Anthony’s volunteers did. But at sea (7/10 of the earth’s surface) the situation is even much worse. The literature is full of this issue. The metadata necessary to estimate – and perhaps luckily correct – that kind of bias are not available. So what is left is: you should mention them and increase your range of uncertainty accordingly.
    Michael

  13. Thanks – a very enlightening post!

    I was most interested in the way that looking at the meniscus from different angles will give you different results. Ages ago, we students were shown how to do this, in an introductory practical about metrology.
    In my naivety I assumed that this care was taken by all who record temperatures, and that the climate scientists using those data were aware of the possibility of measuring errors … apparently not.
    So even if sloppiness here or there doesn’t influence the general trend, as Steven Mosher points out above – should we not ask what other measurements used by the esteemed computer models are equally sloppy, and do not even address the question of metrological quality control?

  14. I saw the uniform 0.46 number for error over the instrumental record, and my first thoughts were of the estimates of sea surface temperature. The oceans cover 70 percent of the globe, yet until about 30-40 years ago temperature measurement followed shipping lanes and the spatial coverage was poor. Land temperature measurements were also weighted heavily toward more developed regions. So to me it is absurd to think the accuracy can be that consistent, given only how the spatial distribution of measurements has changed.

  15. Your article is “spot on”. I have various mercury-bulb and alcohol-bulb thermometers around my house, as well as type K thermocouples and RTDs. On any given day you can’t really get better than 1 or 2 deg F agreement between them all.

    As a person versed in both thermodynamics and statistics I find it amusing every time I see precision and accuracy statements from the climate community.

  16. Besides the paint chipping off over time, if it’s really cold and the thermometer reader takes some time to read the thermometer, or gets really close to it due to 1) bad eyesight or 2) to get that “perfect” measurement in the interest of science, either breathing on the thermometer or body heat would affect the reading to the upside wouldn’t it? The orientation of the thermometer in the picture shows that a person would be facing the thermometer and breathing in that direction for however long it took to take the reading. And if it was very cold and windy that sure would make a nice little shelter to warm up for a moment and scrape the fog off the glasses, maybe have a sip of coffee before taking the reading.

    Another example of measurement errors to the upside. Why oh why are they always to the upside?

  17. Great post. As a practicing engineer involved in design and analysis of precision measurement equipment I am well aware of the challenges in making measurements and interpreting data. This post is spot-on in its observation that many data users put little thought into the accuracy of the measurements.

    Especially in cases where one is searching for real trends, the presence of measurement drift (as opposed to random errors) can create huge problems. The glass hardening issue is therefore huge here.

    Any chance of being allowed to make an icewater measurement with one of these old thermometers? I’m sure that result would be fascinating.

  18. There is one more source of uncertainty which is not mentioned in this excellent article: changes in observation times. Different daily average calculation methods could create a significant warm or cold bias compared to the true 24-hour average temperature of any day. The difference will be different in each station and in each historical measurement site, because the average daily temperature curve is determined by microclimatic effects.

    In most of the cases, climatologists try to account for these biases with monthly mean adjustments calculated from 24-hour readings. However, it is impossible to adjust these errors correctly with a single number, when the deviations are not the same in each observing site.

    Let me show you an example for this issue:

    Between 1780 and 1870, Hungarian sites observed the outdoor temperature at 7-14-21h, 6-14-22h or 8-15-22h Local Time, depending on location. How can anyone compare these early readings with contemporary climatological data? (the National Met. Service defines the daily mean temperature as the average of 24-hourly observations)

    The average annual difference between 7-14-21h LT and 24-hr readings, calculated from over a million automatic measurements, is -0.283°C. This old technique causes a warm bias, which is most pronounced in early summer (-0.6°C in June) and negligible in late winter/early spring. Monthly adjustments are within 0.0 and -0.6°C. The accuracy of these adjustments is different in each month; the 1-sigma standard error varies between 0.109 and 0.182°C. Instead of a single value, we can only define an interval for each historical monthly and annual mean.
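The observation-time effect is easy to reproduce with a toy diurnal cycle. Nothing here uses the Hungarian data; the cosine curve, its 5°C amplitude and its 3 p.m. peak are purely illustrative, and the real, site-specific curve is the whole point of the comment:

```python
import math

def temp_at(hour, t_mean=15.0, amp=5.0, peak_hour=15):
    """A toy diurnal cycle: a pure cosine peaking at 3 p.m."""
    return t_mean + amp * math.cos(2 * math.pi * (hour - peak_hour) / 24)

true_mean = sum(temp_at(h) for h in range(24)) / 24    # the 24-hr average
fixed_mean = sum(temp_at(h) for h in (7, 14, 21)) / 3  # the 7-14-21h scheme

# The three-fixed-hour scheme reads warm relative to the 24-hr mean
# (roughly +0.78 with these toy numbers).
print(round(fixed_mean - true_mean, 2))
```

Changing the amplitude, the peak hour or the three sampling hours changes the size (and even the sign) of the bias, which is why a single adjustment number cannot fit every site.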

    I am wondering what exactly CRU does when they incorporate 19th-century Hungarian data into their CRUTEM3 global land temperature series. Observation time problems (bias and random error) are just one source of uncertainty. What about different site exposures – Stevenson screens didn’t exist at that time – or the old thermometers which were scaled in Réaumur degrees instead of Celsius? These issues are well documented in hand-written station diaries, 19th-century yearbooks and other occasional publications. Such information is only available in Hungarian; did Phil Jones ever read it? :)

  19. Great Post, Mark, and thanks for giving it the air it deserves, Anthony. I read this on Mark’s blog a couple of days ago and was sure it was worth wider promulgation.

  20. My memory from running a university met site 40 yrs ago (University of Papua New Guinea, Port Moresby) is that the ‘almost horizontal’ orientation commented on and queried by a couple of posters is because the max and min thermometers have little indicator rods inside the bore, which get pushed up to the max by the meniscus and stick there (Hg), and pulled down to the min and stick there (alcohol). The weather observer resets them by turning them to the vertical: upright for the Hg (max), and the rod slides back down to again touch the meniscus; or upside down for the alcohol (min), whereupon the rod slides back up, inside the liquid, till it touches the inside of the meniscus. Both are then replaced on their near-horizontal stands.

    By the way, the Stevenson screen pictured is in an atrocious condition, and the surrounds are far out of compliance with the WMO specs, which require, from memory, ‘100 feet of surrounding mown but not watered grass, and no close concrete or other paved surface’ or similar …

    This alone could surely give a degree or so temp enhancement, and I suspect that this sort of deterioration over time from site spec has on average added somewhat to the global average *as recorded* to give a secular trend which is really just reflecting the increasing crappiness of the average met station over time…

    Not helped by the gradual secular transition to electronic sensors, cos they apparently are almost always within spitting distance of a (!!!) building…. thus giving another time trend pushing up the average.

    Mark

  21. Mosher has it summed up pretty well. The overall error for a single instrument would impact local records as has been seen quite a few times. As far as the global temperature record, not so much. Bias is the major concern when instrumentation is changed at a site or the site relocated.

    Adjustments to the temperature record are more of a problem for me. UHI adjustment is pretty much a joke. I still have a problem with the magnitude of the TOBS adjustment. It makes sense in trying to compare absolute temperatures (where the various errors do count) but not so much where anomalies are used for the global average temperature record. Perhaps Mosh would revive the TOBS adjustment debate.

  22. Something that is interesting here,

    Although error does tend to “wash out” when using large amounts of data, this still means that the error bands will tend to be large. The larger impact is in fact human-caused, in the actual measuring. To say that you can detect noise within the noise of measurements such as these boggles the mind. The noise in this case would be the human impact on the climate, which will be indistinguishable from other causes over time. You simply cannot torture the data enough to get that signal, so to speak.

    But the largest issue is that even with the understanding of error here, many people do not understand the actual implications. The odds are that the temperature increase of 0.7 C is correct (this is the actual observed change), but the fact that the error is, say, ±1.3 C does say something that we should be aware of.

    It means that, above all else, we have probably seen warming, which few people doubt. Whether this is natural or man-caused is the actual debate now, and this result seriously puts a damper on the assertion that it is man-caused, since with such large error bands you cannot be as sure of the trends over smaller periods. And since the signal comes from shorter trends as a rule, you may be able to fit a trend over a short period, but whether it is noise or caused by humans? No chance of telling.

    The implications are somewhat large in that sense. If you ran the models many times, randomly adding “possible” error on each reproduction, they should show that if the CO2 signal exists, the measuring error wouldn’t matter. To put that into practice would involve randomly shifting the observed temperatures up and down and re-tuning the GCMs on that new assumption.

    This is something that the GCMs are weak on, since they use temperatures to fine-tune themselves (on other climate variables, ranging from solar influences on down), and to my knowledge the actual error in the instruments has not been propagated through model runs to this date. This is an issue which does bear some exploring.

    Overall, a very good article, although I question the larger possible error. I would hazard a guess that 1.3 C is the actual limit of the error, since we would assume that the observer bias and the C/F conversion have been considered (it’s fairly obvious, and although I do find most climate scientists to be mostly incompetent, I would find it hard to believe that they didn’t figure this one out). The actual error from the instruments would also be obvious, but shucks, it’s something I can see them overlooking, as they simply assume it washes out, so to speak.

    The fact that so much error is possible does bear a large study in itself. If we could make the temperature record more accurate, it would help a lot in our studying of the climate overall.

  23. Steven Mosher

    “The other thing that is instructive is to compare two thermometers that are within a few km of each other.. over the period of say 100 years. Look at the correlation.
    98% plus.”

    I never tried it over a period of 100 years but over about 100 days, there was not much correlation between the trend in the max/min readings from the thermometer at my school and that at my club. But perhaps 20 kms is more than “a few”. Or am I cheating because one was in the desert and the other in an urban area?

  24. When I was at school, I recorded the temperatures for years from the school’s Stevenson screen. We were a sub-recording station for the local RAF base, so I assume it was set up correctly. We had 4 thermometers inside: a high precision Tmax, which was mercury; a high precision Tmin, which was alcohol; and a mid precision wet & dry. The Tmax and Tmin were angled at about 20 degrees, with the bulbs at the bottom of the slope, and the wet & dry was vertical. Two things strike me about the photos of the screen. The thermometers are fixed directly onto a piece of horizontal wood, so there is no free airflow round the bulbs. Standing the screen over a gravestone must make the nighttime minimum temperature totally inaccurate, as it will be affected by re-radiated heat from the block of stone below, the warmth rising straight upwards towards the box in the cool night air. At our school site, the screen was located in a fenced-off, grassed area which the school gardener was told not to cut. The grass was quite long there, and contained a final thermometer, the grass minimum, which was horizontal on two small supports such that the bulb just touched the blades of grass. This often used to record ground frosts which didn’t appear as air frosts.
    An 8 inch rain gauge and a Fortin barometer completed the equipment – we also recorded cloud cover and type, and estimated cloud base. We used to record all this daily, draw up graphs and charts, and work out the humidity from the wet & dry readings, using a book of tables. It was a very good grounding in Physics, Physical Geography, and the methods of recording and presenting data over a long period. Many, many years before computers!!

  25. I second the request for a pdf document. It would be great in my file.

    This post highlights and confirms with numbers something I have believed for a long time. How does this fit with the past discussions of the fact that temperature recording at airports has changed (the M for minus thing)? More and more errors introduced and unaccounted for, I suppose.

  26. In the 1960s I used to wander round the UK with a team engaged in commissioning turbine-generator units before they were handed over to the CEGB.

    We got to one power station where the oil return drains from the turbine-generator bearings were fitted with dial thermometers (complete with alarm contacts) and witnessed yet another round in the endless battle between the “sparks” and “hiss & piss” departments.

    The electrical engineers in the generator department at head office had specified temperature scales in degrees C, whereas the mechanical engineers in the turbine department (on the floor below) had specified scales in degrees F.

    How we laughed (when the CEGB guys were not looking).

  27. Another thing ignored, or never even thought of, is the response time of modern electronic sensors compared to liquid in glass thermometers. A wind shift resulting in a sudden movement of warm air towards the temperature station, say warm air from the airport tarmac, would register a lower peak temperature on a glass thermometer than it would on a modern electronic sensor. This biases the Tmax upwards with modern instrumentation compared with what would have been measured by past instrumentation.

  28. I am appalled to see the claim that the climate scientists do not seem to have included any of these basic metrology considerations in their work. And to give significance to results that have a resolution finer than the basic resolution of the measuring process is not even wrong (to borrow a phrase).

    If you look at just one aspect of the chain of electronic temperature recording – the quantisation of the reading from the analog temperature element – there are potentially serious sources of error which must be understood by any serious user of such systems.

    Any electronics engineer will confirm that analog-to-digital conversion devices are prone to a host of potential error sources … (try googling for them; you will be amazed ….)
    They are very difficult to calibrate across their range, and also across the range of environmental variations, in order to check that the various compensations remain within spec.

    I would add that there is an insidious problem in integrative averaging of many readings, which comes from the non-linearities in each individual measurement system.
    If you use the ‘first-difference’ method to get a long-term average ‘trend’, then non-linearity will cause a continual upward (or downward) drift in the result. The problem worsens with more readings. For example, comparing the ‘average’ from a series of weekly readings with the ‘average’ from daily readings will reveal this source of drift.
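For readers unfamiliar with it, the first-difference method combines station records by averaging year-to-year changes across stations and then re-integrating them. A minimal Python sketch on synthetic data (the station count, noise level, and 0.7 C/century trend are all made-up illustration values, not anyone's actual processing code):

```python
import numpy as np

rng = np.random.default_rng(0)
years = 100
true_signal = 0.007 * np.arange(years)  # assumed 0.7 C/century trend

# Two hypothetical stations seeing the same signal plus independent noise
obs = true_signal + rng.normal(0.0, 0.5, size=(2, years))

# First-difference method: year-to-year differences per station,
# averaged across stations, then cumulatively summed to rebuild a series
diffs = np.diff(obs, axis=1)
mean_diff = diffs.mean(axis=0)
rebuilt = np.concatenate([[0.0], np.cumsum(mean_diff)])

# Slope of the rebuilt series, in C per century (a noisy estimate)
trend = np.polyfit(np.arange(years), rebuilt, 1)[0] * 100
print(round(trend, 2))
```

With a perfectly linear sensor this reduces algebraically to the mean station anomaly, so any drift comes from the data rather than the arithmetic; the commenter's concern is that sensor non-linearity would break that equivalence.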

    And when we come to the Mauna Loa CO2 series …we have another area where the claimed ‘results’ of the manipulation of the instrument readings are given in PPM with decimal places! ( How DO you get 0.1 of a part?)

    Anyway, as I understand it, the measurement and reporting of atmospheric CO2 is dominated by a single linear process, from the production and testing of the calibration gases through to the analysis of the average of the results. I should like to see a thorough critical analysis of every stage of this process, to ensure that we are not looking at an artefactual result.

    Go, Michael!

  29. Being that this subject is sort of in my wheelhouse, I would like to add a few items to the uncertainty picture.
    Calibration of mercury/alcohol thermometers:
    The 0 degree C (ice point) is somewhat dependent on the purity of the water.
    Regular tap water from a municipal supply can throw the ice point off by as much as 2 degrees C with the presence of naturally occurring salts/minerals. The same is true to a lesser extent with the boiling point. Also the air (barometric) pressure has a marked effect on the boiling point.

    The accuracy and stability of electronic temperature measurement devices is largely dependent on the purity of the components involved, as well as the metallurgical chemistry involved. Oxidation and nitrogen embrittlement are both factors over time in metal-based devices.

    You have basically three classes of temperature measurement contrivances.

    Those that rely on the coefficient of thermal expansion.
    Mercury/alcohol thermometers and bi-metal dial thermometers are examples of CTE devices.

    Those that generate an EMF due to temperature differences.
    Thermocouples are the best examples of these.

    Those that vary resistance with temperature.
    Thermistors and RTDs are examples of these.

    Then there are the electronic types that rely on radiation/optics.
    These are non contact and are dependent on the emissivity of the object whose temperature is to be measured. (it is a fourth so sue me) Various techniques are used such as thermopiles and photo detectors. These are generally not as accurate as direct contact devices. Your handy dandy satellites use variants of this technique.

    About the only stable temperature point available to calibrate anything with is the freezing point of gold. It is stable because of its chemistry. Gold does not readily combine chemically with anything.

    You can obtain remarkable resolution from almost all of the contrivances, all it takes is large sums of money, time, and pathological attention to detail.

    Errors for thermocouples are ±1.5 C.
    Errors for RTDs are ±1 C.
    As stated, errors for standard thermometers are ±1 C but can be as large as 3 or 4 C depending on factors.

    Regarding climatology, the error of temperature measurement is somewhat cumulative so that over time the uncertainty levels should increase. This is of course ignored by the climate community. That and the ludicrous claim that they can reconstruct temperature to within 0.1 C is an indication that they do not know what they are talking about and are fumbling around in the dark.
    just my $.02 worth which may not be much due to inflation.

  30. Steven Mosher says:
    January 22, 2011 at 3:23 am

    The point that you have missed here Steve Mosher is that the margin of error is practically twice the claimed “global warming signal” of 0.7º C. Add in some biased agenda driven human homogenisation and what have you got?

    The oldest temperature record in the world is only 352 years old. Based on the Central England Temperature record, over the last 15 years there has been a cooling trend:

    http://c3headlines.typepad.com/.a/6a010536b58035970c0147e1680aac970b-pi

    So Steve Mosher’s point sounds all well and good until one actually examines it more closely, at which point it becomes utterly meaningless. The reason it becomes meaningless is entirely because the so-called “global warming signal” is half the margin of error in the “official” data. It makes no difference which trend you prefer, the warming or the cooling; they are both meaningless because they are both approximately half the margin of error. So the whole temperature issue is a red herring.

    It is an unprovable and un-winnable faux debate that serves the “warmists” and the “gatekeepers” both, by keeping everyone distracted from the real issue, CO2.

  31. Jit says:
    January 22, 2011 at 2:31 am
    As long as the errors don’t trend in a biased way over time, the fact that there are thousands of sensors should make standard errors small (standard deviation divided by the square root of the number of observations).

    Of course, one biased sensor in the middle of nowhere would have a disproportionate effect – although I’m not clear how the interpolations are done

    I have seen this line of reasoning before, and I believe this is an incorrect application of the statistics of large numbers. If you had 100 thermometers measuring temperature in the same small area at the same time, you would be correct. This is how the satellite measurements of sea level can get millimeter accuracy with only 3 cm resolution: they take tens of thousands of measurements of the same area in a short space of time. A time series of measurements from a single thermometer in one location doesn’t, I believe, meet the criteria for this statistical method.
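The distinction drawn here can be illustrated numerically. In this sketch (all numbers invented for illustration), averaging N simultaneous readings of the same quantity shrinks the random error like 1/sqrt(N), while a series of single readings of a changing quantity keeps the full error on each value:

```python
import numpy as np

rng = np.random.default_rng(1)
true_temp = 15.0
sigma = 0.5      # assumed per-reading random error, deg C
trials = 2000

# Case 1: 100 thermometers read the SAME temperature at the same moment.
# The error of their mean shrinks like sigma / sqrt(N).
n = 100
batch = true_temp + rng.normal(0.0, sigma, size=(trials, n))
means = batch.mean(axis=1)
print(round(means.std(), 3))      # close to sigma / sqrt(100) = 0.05

# Case 2: one thermometer reads a DIFFERENT (changing) temperature each
# day. Each daily value is a sample of size 1 and keeps the full sigma.
daily_err = rng.normal(0.0, sigma, size=trials)
print(round(daily_err.std(), 3))  # close to 0.5
```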

  32. In an engineering sense….. The climate scientists’ measurement regime is fine if one were cutting 2×4’s and plywood for shelves in the garage….. You definitely wouldn’t want these guys designing an airframe for a supersonic jet fighter!……:-o

    ….. and to be honest, I think they knocked together the climate science club house that they alone are playing in. All that’s missing is the hand painted, “NO GURLS” sign….(that’d be us skeptics)…..;-)

  33. Thanks for posting this, Anthony. I would like to see a lot more posts on this topic. I do not agree with Mr Mosher’s “Result? no difference.” If we accepted his dismissive argument about the unimportance of calibrating the instruments (all of them), then why do other disciplines make a big fuss about correct readings? Is it not important when we are talking about only .7 deg per century, and the economies of many countries being trashed for the sake of that? If instrument calibration adds a degree or two to the “noise”, then the .7 is meaningless.

  34. John Marshall says:
    January 22, 2011 at 1:35 am
    So in reality the 0.6C feared temperature rise could mean that statistically there has been no temperature change. And all models give the wrong answer because temperature inputs are incorrect.
    Very interesting article is a pdf copy possible please Anthony?
    ____________________________________________________________
    Frank – no need to bother Anthony. Just copy and paste into your word processor and export or save as a pdf. Use MS Word, WordPerfect, Open Source or whatever shareware program you want; they pretty well all do that. I don’t know about copyright.

    I remember having thermometer correction sheets in the labs when calibrating thermometers and old steel survey tapes as measurements all had to be adjusted for temperature, – even temperature. ;-) Even modern electronic distance measuring devices have a temperature bias that needs to be adjusted although newer multifrequency devices have self correcting circuits but still: “Need to know change in elevation of two points (slope correction), air temperature, atmospheric pressure, water vapor amount in air”… –all can have an effect on the measurement.

    In other words, EVERYTHING requires adjustments to correct for site conditions. All instruments and observers have built in biases and inaccuracies and NOTHING is absolute.

  35. It seems to me that there are more positive biases than negative. Consider, glass hardening will always increase over time. At the beginning of a temperature record, the thermometers are new. 100 years later, most of them are old to very old, and are reading high by 0.7 degrees. The enclosures do the same thing. At the beginning all are shiny and new. After 100 years most are in bad shape. Even if they have been repainted, it was with modern paint, not the old white-wash, which was more reflective. You need not consider UHI problems, or siting difficulties to explain all the temperature rise seen over time.

  36. This doesn’t indict the temperature record.

    Accuracy of thermometers matters hardly at all because the acquired data in absolute degrees is used to generate data which is change over time. If a thermometer or observer is off by 10 whole degrees it won’t matter so long as the error is consistently 10 degrees day after day – the change over time will still be accurate.

    Precision is a similar story. There would have to be a bias that changes over the years that somehow makes the thermometers or observers record an ever growing higher temperature as the years go by. Urban heat islands are perfect for that but instrument/recording error just doesn’t work that way. There are thousands of instruments in the network each being replaced at random intervals so the error from age/drift is averaged out because there is an equal distribution of old and new instruments.
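The distinction this comment rests on (a constant bias cancels in trend estimation, a drifting bias does not) is easy to check with a toy series; the numbers below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(100)
true = 0.007 * years + rng.normal(0.0, 0.3, size=years.size)

# A consistent offset: the instrument always reads 10 degrees high
biased = true + 10.0

# Least-squares slope is unchanged; a constant only moves the intercept
slope_true = np.polyfit(years, true, 1)[0]
slope_biased = np.polyfit(years, biased, 1)[0]
print(np.isclose(slope_true, slope_biased))   # True

# A bias that DRIFTS over time (aging sensor, growing UHI) is what
# corrupts the trend: the drift adds directly to the estimated slope.
drifting = true + 0.005 * years
slope_drift = np.polyfit(years, drifting, 1)[0]
print(round(slope_drift - slope_true, 3))     # 0.005
```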

    This might be interesting in an academic way but isn’t productive in falsifying the CAGW hypothesis. The instrumentation and observation methods are adequate and trying to paint them as less than adequate only appears to be an act of desperation – if the job is botched blaming the tools is no excuse.

  37. Another problem comes from taking the average temperature to be halfway between Tmin and Tmax. This may well be the case if temperatures rise and fall in a cyclical fashion in a 24-hour period. However, an anomalous event, such as hot aircraft exhaust blowing in the direction of the station, can increase Tmax considerably, thereby creating a considerable error in the ‘average’. And because such ‘hot’ anomalies are more likely than ‘cold’ ones, and also increasingly likely over time, the bias is likely upwards.

  38. xyzlatin says: January 22, 2011 at 6:57 am
    […] If instrument calibration adds a degree or two to the “noise”, then the .7 is meaningless.

    Many good points above, I think this sentence says it all. Trying to parse fractions of a degree change from the current system is not possible. Temperature is very difficult to measure accurately. I know, I successfully measured and controlled temperatures in an IC process to +/-0.1°F in the eighties when linewidths went sub-micron.

  39. Steven Mosher says:
    January 22, 2011 at 3:23 am

    Result? the error structure of individual measures doesn’t impact your estimation of the long term trends.

    NOW, if many thermometers all had BIASES (not uncertainty), and if those biases were skewed hot or cold, and if those biases changed over time, then your trend estimation would get impacted

    Result? no difference.

    Absolutely right, Steve.

    Skeptics are no better than CAGW alarmists in their willingness to believe anything which supports their own beliefs or disputes the beliefs of the other side. It’s sad. Objectivity is a rare and precious commodity.

  40. Great bunch of real data in one post… First time I ever heard of the following:

    […]”However, a glass thermometer at 100 °C is longer than a thermometer at 0 °C.”

    The 5 degree cold record set in International Falls (posted below), pretty much eliminates the problem of the thermometer’s resolution.

    Look for records like this in the future. Brrr.

    We need a NEW name for the next minimum, we are moving into.
    My vote is for “David Archibald Minimum”.
    Well maybe not, having one’s name attached to miserable weather, may not be the best way to be remembered.

  41. Steven Mosher,

    You say,

    “NOW, if many thermometers all had BIASES (not uncertainty), and if those biases were skewed hot or cold, and if those biases changed over time, then your trend estimation would get impacted”

    Could you confirm that this is what you meant?

    (And, I would be grateful if you could comment on the effect of transducer non-linearities, too)

  42. This is why climatologists work with anomalies rather than estimates of absolute temperature. And they do so on long time scales using lots of instruments. Instrument error is of interest only if a general bias becomes significantly greater over time.

  43. Has anyone investigated the possible impact of observer preconception bias?

    In particular, does positive bias in temperature readings rise and fall with belief in AGW?

    Would this not be a worthy topic for investigation?

  44. John McManus:

    So: the old “trees make lousy thermometers” has now become “thermometers make lousy thermometers”.

    They do, when you’re trying to measure fractions of a degree change over a period of decades to centuries.
    The fact that tree rings etc make such lousy temperature proxies is probably due to the fact that nothing in nature exhibits any great sensitivity to small temperature changes – especially to warming changes. Most plants and animals do better with warmer temperatures. So if nature isn’t particularly perturbed by small temp increases, why should we be?

  45. Anyone interested in measurement standards and calibration of their instruments can call up the NIST website for information that includes certified calibration labs and the traceability of measurements to a national or international standard.

    There is a paper presented in 1999 by Dr Henrik S. Nielsen titled “What is Traceability and Why Do We Calibrate?”
    In it he defines traceability:
    “property of the result of a measurement or the value of a standard whereby it can be related to stated references, usually national or international standards, through an unbroken chain of comparisons all having stated uncertainties.”
    He also adds.
    “The concept of traceability requires more than just a calibration sticker. It requires an uncertainty budget for the measurement process and traceable calibration of all the instrument and environmental attributes that have a significant influence on the uncertainty.
    In addition to identifying the instrument attributes that need to be calibrated to establish traceability, the uncertainty budget also allows for the optimization of measurement processes both in terms of uncertainty and dollars and cents”.

    In short, if the measurements are bad everything determined from the measurements is bad.

  46. @Mark:
    I challenge the validity of this: “a glass thermometer at 100 °C is longer than a thermometer at 0 °C.” You’re talking about a length increase measured in microns.

    I also challenge the idea that glass thermometers shorten over time. That seems very unlikely given the longevity of glass structures. It’s far more likely that mercury picks up impurities from the glass to change its coefficient of expansion. But even that is rampant speculation. What is your source?

    I challenge the idea that high quality thermometer scales are inherently inaccurate (which is what you imply throughout). I’m sure it’s true for some manufacturers, but, is it true for all?

    Also, “a quantum mechanics effect in the metal parts of the temperature sensor ” causing drift sounds like you don’t understand what’s causing it. Can you be a bit more specific?

    I trust the accuracy of my lab thermometers to about 0.5C. Nothing you’ve written changes that. If I had a higher quality long thermometer with gradations in degrees F, I would trust that to about 0.5 F.

    What is your source for this statement: “In a high quality glass environmental thermometer manufactured in 1960, the accuracy would be +/- 1.4F. (2% of range)”? If the range is -40 to 120 F, then the total error is +/-3.2 degrees at 120F, which seems unacceptably high. If the error is 2% of the reading, that also seems like a low quality instrument. So, how is this “2% of range” working?

    I also challenge the idea that error in the thermometer adds with error in the observer. The error in the thermometer is “fixed” and systematic (therefore correctable) while the error in the observer is usually random. If you’re looking for global warming (and who isn’t?) then the systematic error goes away because you are subtracting the baseline of that instrument. Obviously, changing thermometers without overlap (as has been done) will re-introduce this error, possibly compounding it. However, if you change the thermometers enough over time, the error goes away (with certain caveats).

    I think the error induced by variable Stevenson screens has to be larger than the error in the thermometer. Is the paint on these screens standardized? Are they just not repainted for years on end (as in the one you picture)? That has to be good for about 2% error.

  47. Great article. I would bet the guys in the Metrology Dept., at one of my old employers would have loved it.

    The discussion raises a point I had with some “friends” over at RC. While the discussion was about sea levels, I brought up the subject of instrument accuracy. It was maintained by some there that, by taking a large number of readings, it will all average out to a null mean.

    I pointed out that in the manufacturing process, many products are tested and passed if they are within, say, 1.00 deg. of the requirements. In manufacturing, a “batch” or “run” can have a bias, or non-zero mean. So a “batch” of thermometers can have a mean error of 0.75 deg., a 3-sigma spread of 0.25 deg., and still pass inspection. Hence, no matter how many readings, or how many thermometers of that batch, the mean error will still be 0.75 deg.
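The batch-bias argument can be sketched in a few lines of Python (the 0.75 deg. bias and 0.25 deg. 3-sigma spread are the commenter's hypothetical figures, not real manufacturing data):

```python
import numpy as np

rng = np.random.default_rng(3)
true_temp = 20.0

# Hypothetical batch: shared bias +0.75 deg, individual 3-sigma spread 0.25 deg
n = 10_000
readings = true_temp + 0.75 + rng.normal(0.0, 0.25 / 3, size=n)

# Averaging many readings removes the random scatter but NOT the shared bias
error_of_mean = readings.mean() - true_temp
print(round(error_of_mean, 2))   # ~0.75
```

However many units of the batch you average, the shared component of the error survives intact; only the unit-to-unit scatter washes out.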

    That caused some interesting discussions.

  48. I’ve wondered for a long time whether they could determine the earth’s temperature accurately enough to say it has increased by 0.7 C over the last 100 years. First of all, who was measuring the temperature in the Arctic 100 years ago? How about Antarctica? And how about the temperature over the oceans? After all, 70% of the earth’s surface is covered by oceans. During WWII the US launched weather ships, but there weren’t many of them. Before that, there was very little data for those oceans.

    Now add in the UHI effect and all the problems noted in this article, and how can anyone make an accurate determination of the earth’s temperature change over the last 100 years?

    It’s just another example of the government’s use of fearmongering to scare people into allowing things they would not normally accept, like carbon taxes, cap-and-trade schemes, etc. Another example is how the fear of terrorism is being used to deprive us of our rights. The TSA now feels free to feel us up at airports or take naked body scans of us. The American people would not have allowed this in the past, but now they are so scared and so dumb that they accept it as part of living in the 21st century.

  49. @Dave Springer,

    Could you explain or give a link to the ‘change over time’ procedure you mention?
    Is it about summing and averaging the ‘first difference’ over a series?

    Do they do the whole series in one go? or do they do it in a kind of hierarchy of series; weekly, then monthly, then yearly?

    Either way, how do they ensure that they keep the series to an odd number?

  50. It would be interesting to take a random sample of active weather stations that are actually being used (of various types of thermometers) and run a calibration test on them. Then analyse that to see if there is any bias in the errors.

    It seems to me that all theoretical discussions about error aside a real world measurement would answer the question.

  51. J. Hall

    One of the things which I think the nay-sayers are overlooking is that the official measurement sites have changed, changing the micro-climate. In addition, with different sensors used over the years, comparison without looking at the rated precision of each sensor, or the accuracy/frequency of calibration, still skews the results, even if you are looking only at the change of the data over the years.

    Example: during the late 1950s to 1960s, many temperature measurements were moved from the city to the ‘new’ airport locations. At many small airports, the site for measuring temperature was about 20 yards away from the weather office. At the major airports in the U.S., new HO-60 temperature sensors were installed about 20 feet off the runways, and used for both climate data and aircraft operations.

    These sensors had a stated accuracy of ±1 degree F, and used a relatively large sensor which had either a platinum or nickel wire bobbin to measure temperature by resistance change. Then, to cut expenses, the HO-61 instrument was designed, which used the expansion of a fluid to change the time of a pulse. Again ±1 degree F, with both types of sensors checked about once every 30 days to be within calibration. Note that the Electronics Technician would use a mercury-in-glass thermometer, with an accuracy of ±1 degree F, to measure whether it was in calibration.

    With the HO-83 sensor, which used a platinum 100 ohm sensor, the accuracy of the basic system was stated as ±1.8 degrees F (±1 degree C). This was known to be wrong, as they had to ‘fudge’ temperatures in accepting the system for the contractor to meet the standard. Also, due to inadequate ventilation, the system was prone to inaccuracies due to sun on the sensor shield, and wind effects.

    Redesign implemented in the ASOS system has improved accuracy back to ±0.5 degree Celsius over the normal temperature range (-15 °C to +40 °C). However, with a high current sent through the platinum sensor, self-heating places a strain on the sensor if the air flow diminishes even a little bit, causing higher temperature readings. Also, the accuracy of the electronics, and the method used to measure, doesn’t take into account that the sensor readings have a warm bias at low temperatures, and that, if you are in the minus range, the temperature average of the readings is always rounded up if it is 0.5 degree below an even degree, for Max and Min computations. Another non-disclosed item is that, even though the dewpoint sensor has been offloaded to another sensor outside the housing, the dewpoint mirror cooler is left on in all ASOS temperature sensors, which can, if insufficient ventilation occurs, cause the whole temperature sensor reading to be biased upward, depending on air flow/wind conditions. Calibration is normally run every 3 months, against a sensor which has a 4-minute response, like the HO-1088 sensor (a revised HO-83 sensor with more ventilation), and is considered by the technician to be in calibration if it is within about ±2 to ±3 degrees C, although the guiding directive allows up to ±5 degrees C as being within calibration. Again, not using the full correction curve, which is non-linear below zero degrees C, can also bias the readings. The electronics also have their biases, including the analog-to-digital converter and other constant-current electronics used to set current flow through the sensor.

    Just some things to think about. There was an article in one of the AMS journals a few years ago which talked about the accuracy of the electronics used in temperature measurement systems.

  52. Steven Mosher says:
    January 22, 2011 at 3:23 am

    The other thing that is instructive is to compare two thermometers that are within a few km of each other.. over the period of say 100 years. Look at the correlation.
    98% plus.

    Or you can write a simulation of a sensor with very gross errors. Simulate daily data for 100 years. Assume small errors. Calculate the trend. Assume large errors. Calculate the trend.

    Result? the error structure of individual measures doesn’t impact your estimation of the long term trends. NOW, if many thermometers all had BIASES (not uncertainty), and if those biases were skewed hot or cold, and if those biases changed over time, then your trend estimation would get impacted

    Result? no difference.

    _____________________________________________________________

    EXACTLY!

    In this case strong spatial/temporal autocorrelation is indeed your best friend.

    You can take any temperature record (of sufficient duration, say 60-120 years) and round the raw 0.1C temperature precision records (e.g., Canada HAWS) (~10-bit resolution) to either 1C (~7-bit resolution) or 10C (~3-bit resolution) temperature readings and get the exact same low-frequency trendlines (up through 6th-order polynomial trendlines in Excel 2010).

    The same is true for monthly or yearly averages and reporting them to 2-3 decimal points, even though the actual readings are only to the nearest 1C (errors in readings are indeed uniform within 0.5 high and 0.5 low). Remember we are using N = 60 (or 30) or N = 730 (or 365).

    And no, temperature uncertainty does not vary with sigma, but varies as sigma/SQRT(N), regardless of the amount of specious handwaving that ensues in E&E.

    Both of these can be easily confirmed in Excel 2010 (or any other analysis SW for that matter). Note that all readings should be converted to Kelvin/Rankine (and then back again to C/F) so as not to, for example, conflate a mean of 0.1C with a mean of 0.0C (0.1/0.0 = n/0 = infinity issues).

  53. Steven Mosher
    ”The error structure of individual measures doesn't impact your estimation of the long term trends. NOW, if many thermometers all had BIASES (not uncertainty) and if those biases were skewed hot or cold, and if those biases changed over time, then your trend estimation would get impacted. Result? no difference.”

    This is only true if the errors are normally distributed. The idea that sampled and averaged readings map to a normal distribution owing to the CLT, and that therefore so do their errors, is a nice theory, assuming things are iid and stationary. But are they?

    This situation reminds me a lot of the fallacy of Black-Scholes. Black-Scholes asserts that for any financial option of sufficient liquidity, an interest rate can be quantitatively found that completely neutralizes risk. The problem is that small amounts of real-world messiness completely destabilize the risk equation. In the Black-Scholes case, something called matching friction (a very real-world issue) is among the culprits. In our case, I suspect that things like instrument drift, UHI, and TOD adjustments, infilling, etc. completely invalidate the idea that CAM neutralizes the aggregate error.

  55. I mentioned this before in the previous thread on this topic, but in all the down-in-the-weeds discussion on this thread, one very important aspect of this lack of adequate instrument control has been neglected. Specifically, one of the primary reasons for having traceability to National or International standards, and a regular calibration schedule, is for legal and liability reasons. This data is being used to drive many industries to change the way they do business, manufacture product, etc. due to regulatory and contractual requirements put in place that may be based on suspect data. This results in added costs for everyone downstream.

    If the data resulting from these instruments cannot be trusted within known uncertainty as a result of formal and traceable calibration and management then those who own that data may be at risk for substantial liability claims. If I were a lawyer I would absolutely be looking into this.

  55. Does anyone really think that climatologists who claim to be able to measure temperatures using tree rings are concerned in any way about accuracy and precision of modern thermometers? Except of course when the two records are juxtaposed and then they have to ‘hide the decline’ as they have NO correlation in value or in trend.

    And of course – to stop everyone running down this rathole…

    Atmospheric Temperature does NOT EQUAL Atmospheric Heat Content.

    The entire claim of ‘greenhouse gases’ (sic) causing [global warming|Climate Change|climate catastrophe|climate disruption|tipping points] is based on the hypothesis that these gases trap heat in the atmosphere. To show this is the case, the climatologists measure atmospheric temperature.

    BUT

    Atmospheric Temperature does NOT EQUAL Atmospheric Heat Content.

    This reality remains the same however accurately you quantify the incorrect metric as it ignores the huge effect on atmospheric enthalpy from water vapor.

    The heat content of the Earth is far more accurately measured by measuring the temperature of the oceans as ocean temperature is closely equivalent to ocean heat content and the top 2 or 3 meters of ocean holds as much heat as the entire atmosphere.
    So while all the media and climatologists are leaping about saying how this year’s average atmospheric temperature was almost the same as that in 1998, the seas are getting colder.

    Accuracy and precision are important – but the first thing to ensure is that the correct metric is being quantified.

  56. Don’t want to rain on everybody’s parade, but there’s a problem here.

    I’m looking at a plot of 12 Greenland temperature records. They all show the same trends (flat from 1890-1920, 3C of warming from 1920-1930, 2C of cooling from 1930-1990, 2C of warming since).

    Now I’m looking at 25 records from Northern Europe. They also show the same trends (flat from 1890-1920, 1.5C of warming from 1920-1940, 1.5C of cooling from 1940-1970, 2C of warming since. They also all show the same 1.5C “spike” in the early 1970s, and records in the WWII war zone all show the same 2C of cooling in 1940-42.)

    Now I’m looking at 9 records from in and around Paraguay. Again they show the same trends (about 1C of cooling between 1945 and 1975 and flat thereafter.)

    Now I’m looking at 29 records from the Sahara. They show the same trends too (flat to 1975, 1C of warming since.)

    Now I’m looking at 11 records from the Central Pacific. Again, the same trends (1C of cooling from 1945-1975, 0.5C of warming since).

    Now I’m looking at 11 records from Western Australia. Once more they all show the same trends (no warming at all since 1900).

    I don’t have an exact count, but I would guess that 70-80% of the air temperature records in the global data base show the same overall trends as the records around them. We wouldn’t get this result if the temperature records really were seriously distorted by reading errors or other biases.

  57. With this big of a margin of error, are any global temperature records scientifically useful? Before the AGW scare, what were all these temperature readings done for?

    What about the time of day when the thermometers are checked? Do modern thermometers automatically record highs and lows, or do they rely on humans showing up at specific hours and minutes?

    It seems from this article that, theoretically at least, all the claimed warming over the past century could be nothing more than homogenized guesswork. How can James Hansen spend years arbitrarily adjusting and re-writing the historical temperature records to tiny fractions of degrees, over and over again, when the thermometers themselves don’t come anywhere close to such alleged precision? Is it now accepted scientific practice to just make up adjustments and “corrections” and send out press releases about one’s new alarming discoveries?

  58. Dave Springer wrote:

    “Accuracy of thermometers matters hardly at all because the acquired data in absolute degrees is used to generate data which is change over time. If a thermometer or observer is off by 10 whole degrees it won’t matter so long as the error is consistently 10 degrees day after day – the change over time will still be accurate.”

    The change will only be accurate if the thermometer itself does not change with time but Mark pointed out that they do change when he wrote “Temperature cycles in the glass bulb of a thermometer harden the glass and shrink over time, a 10 yr old -20 to +50c thermometer will give a false high reading of around 0.7c.”

    Dave Springer also wrote:

    “This might be interesting in an academic way but isn’t productive in falsifying the CAGW hypothesis. The instrumentation and observation methods are adequate and trying to paint them as less than adequate only appears to be an act of desperation – if the job is botched blaming the tools is no excuse.”

    The purpose of measurement should be to discover what is happening. Accuracy and reliability should be prime considerations irrespective of your views on the CAGW hypothesis.

  59. Dave Springer says:

    “The instrumentation and observation methods are adequate and trying to paint them as less than adequate only appears to be an act of desperation – if the job is botched blaming the tools is no excuse.”

    In other words, everything averages out over time and over the number of stations, so individual accuracy is unimportant.

    So then, why have Stevenson screens at all? Why worry about accuracy? Just any old thermometers in any random locations will eventually average out, and the temperature and trend will be clear.

    Ridiculous. Accuracy matters.

    And please try to explain why the “adjustments” to the temperature record always show either higher current temperatures, or lower past temps – making the rise to the current temperature look scary. Current temperatures are never adjusted downward. What are the odds, eh?

    The answer, of course, is too much grant money and too much taxpayer funding of NOAA, IPCC, GISS, etc.

    Money raises adjusted temperatures.

  60. One method of analyzing the instrumental errors is to find an independent instrument and do a formal cross-calibration.

    The satellite record is independent. But the bulk of the comparisons are ‘correlation studies’, or compare the total surface record with the satellite record. That isn’t the same thing, and of course they’re “well correlated.”

    But a cross-calibration allows the actual quantization of the uncertainties in the errors. Site-change errors, and other errors that occur during the (limited!) period of overlap should even be recognizable and correctable once one has a grasp of the relationships.

  61. David S wrote:

    “During WWII the US launched weather ships, but there weren’t many of them. Before then there was very little data for those oceans.”

    On the contrary, British ships had been making and recording weather observations since the 1780s. It is estimated that there are about 250,000 surviving log books containing some meteorological data in Britain, and there is a project (still in its early days) to extract the data from them. See the website below for details.

    oldWeather

    http://www.oldweather.org/

    I particularly like this quote from the page on “Why Scientists need You”.

    “The Old Weather project isn’t about proving or disproving global warming. We need to collect as much historical data as we can over the oceans, because if we wish to understand what the weather will do in the future, then we need to understand what the weather was doing in the past.”

  62. I’m sure glad that the manufacturers of car, planes, and trains don’t take Mr. Springer’s attitude toward instrument error. I wonder if Mr. Springer would be comfortable flying in a Boeing 777 that was built according to his logic.

  63. Dave Springer and Steven Mosher own a red herring farm in East Anglia, UK. In the engineering world, it is well understood that to be able to control a variable, first one must be able to measure it. “Weather”-station thermometers were developed to measure “Weather” to the satisfaction of “Weather” forecasters, pilots, ship captains, and others concerned with what the “Weather” would be doing in the coming hours or days.

    Attempting to use this data to determine “Climate Change” is ludicrous and meaningless. To control “Climate Change” we must be able to measure “Climate Change.” We probably only started measuring this hugely variable, non-linear, non-coupled phenomenon in the late ’70s with satellites, well, not counting the Farmer’s Almanac.

    Ice records from Greenland clearly show that the Earth has been far warmer than now for most of the last 10,000 years, from the Oxygen Isotope ratios in the ice. This is not controversial. These so-called “Climate Scientists” are frauds, every single one, all know this, none ever ever ever says it out loud…

  64. I think that Steve Mosher is mostly correct – the trend is not as much a problem, but there remains a significant issue w.r.t. bias.

    What this means is that the overall increase in T since 1850 is likely to be about 0.7C +/- a small fraction of that number, but the actual T at any point in the series remains accurate to only within +/- 2C.

    As has been noted above, that the world is likely warmer today than it was 160 years ago is not in dispute. There are, after all, independent measures of this such as receding temperate latitude alpine glaciers. Making claims as to what years are warmer or colder (or ranking them) though is laughable given the wide uncertainty.

    Steven Mosher and a few others…

    You are making a typical mistake by conflating the Central Limit Theorem with the Limit Of Observability.

    What Mark is talking about here is that any thermometer measurement has a limit of observability of T ±1.3°C. That means that the result has EQUAL PROBABILITY of existing anywhere in that range. We do not know what the absolute value of temperature is. All we can say is that it exists in a range.

    The way you go about improving the accuracy is by reducing the limit of observability either by calibrating against a more accurate device or improving the scale measurements on the current device (as in employing Vernier scales) and characterising it.

    After all, that is what the field of metrology is all about.

    So any temperature measurement, and especially a temperature anomaly, is subject to the same ±1.3°C limit of observability (which is often just called an “error” or an accuracy error).

    To say that random errors as per CLT even out is something completely different. That applies to a theory that says that on average a result will most likely be in a normal distribution within the given observable range…

    But this is an extrapolation. It does not account for real characterisation, which often shows step-wise shifts and drifts, exactly as Mark was talking about. To then use this as a basis for an average temperature record TREND is another extrapolation… in this case 2 strikes and you’re out.

    As per your example if one thermometer had a recorded limit of observability of ±1.3°C and 100 or so others had one of 0.1°C then the average observability or measurement limit could fairly be approximated as 0.1°C…The larger error wouldn’t really make a dent.

    But if all thermometers have an average limit of observability/measurement of ±1.3°C then all anomalies and absolute temperature readings are subject to this error.

    Then, to paraphrase the words of Richard Feynman, “I think you have a real problem” with the temperature trend.
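    The distinction being drawn in this exchange, random scatter that averages down versus a shared bias or drift that does not, can be sketched directly (all numbers are illustrative):

```python
# Sketch of the bias-versus-scatter distinction above: averaging many
# readings shrinks random error roughly as sigma/sqrt(N), but a shared
# bias (drift, hardened glass, siting) passes straight through the mean.
import numpy as np

rng = np.random.default_rng(7)
true_t, n = 15.0, 10_000

scatter_only = true_t + rng.normal(0.0, 1.3, n)        # +/-1.3 C scatter
with_bias = true_t + 0.7 + rng.normal(0.0, 1.3, n)     # same scatter + 0.7 C bias

print(abs(scatter_only.mean() - true_t))   # ~1.3/sqrt(10000), i.e. ~0.013
print(abs(with_bias.mean() - true_t))      # stuck near the 0.7 C bias
```

    Which camp is right, sigma/sqrt(N) or limit-of-observability, therefore turns entirely on whether the errors are independent between readings or shared across them.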

    And on one last note. This goes to show once again the lure of theory over cold hard experimental reality. A lure that is beaten out of metrologists, or at least the ones I know at NPL in the UK.

    I appreciate that the ugly nature of the limit of observability/measurement can be unpalatable for some. But that’s just the way it is.

  66. JDN says:
    January 22, 2011 at 8:05 am
    @Mark:
    I challenge the validity of this:

    See National Institute of Standards and Technology.

    http://www.temperatures.com/Papers/nist%20papers/2009Cross_LiG_Validation.pdf

    An interesting comment in this article is that mercury- or organic-filled thermometers should not be used horizontally, as this exposes them to gravity problems and column separation, and they need more frequent calibration as they can drift higher as a result.

    I wonder about how accurate the readings are in some areas – read NOAA instructions: http://www.nws.noaa.gov/om/coop/forms/b91-notes.htm

    In the article below, it is recognized that the glass changes over time and that you get column separation from time to time, which the observer needs to correct. I note in the photo that the alcohol thermometer is sloped up and the mercury thermometer is sloped down, the latter being contrary to recommendations as a downward slope promotes column separation. Note other sources of error in the article.

    http://www.wmo.int/pages/prog/www/IMOP/publications/CIMO-Guide/CIMO%20Guide%207th%20Edition,%202008/Part%20I/Chapter%202.pdf

    Both ordinary thermometers and maximum and minimum thermometers are always exposed in a thermometer screen placed on a support. Extreme thermometers are mounted on suitable supports so that they are inclined at an angle of about 2° from the horizontal position, with the bulb being lower than the stem.

    So unless things have changed, the thermometer in the photo at the top of this article is installed incorrectly?

  67. In all the comments to this point there is only one comment mentioning relative humidity measurement. One other comment mentions relative humidity. It seems to me that if we are to measure temperature for the purpose of estimating global warming, we should be looking for the temperature of dry air at sea level, or the heat contained in one cubic meter of dry air at 1000 mbar. Of course with appropriate 1 sigma error bars for each calculation.

    On another note, I see graphs all the time with varying vertical scales expressing temperature anomaly. It seems to me that these graphs should always use the same range, such as ±2 deg C, so that the visual differences are apparent.

    Over all, I don’t see much change in the global climate during my lifetime (I’m 76).

  68. Dave Springer says, “the error from age/drift is averaged out because there is an equal distribution of old and new instruments”.

    Not necessarily. You set up a new program around 1900 and purchase hundreds of new thermometers. That’s an approximation of governments suddenly deciding to begin weather observation on a large scale. Over time you expand your program and replace some of your thermometers. When does the thermometer age distribution reach equilibrium? We don’t know.

    The age distribution depends on how long the thermometers stay in service. As a first approximation, the average age might be half the age of the oldest thermometers. If your age error is 0.7 C at 100 years, and your average age of thermometers is, say, 50 years, you will have a trend component due to systematic error of 0.35 C.

    Here is where it might get interesting. If you do a temperature adjustment in your analytical procedure to make the new readings match the old readings, then you artificially make the new thermometers read the same as the old ones. That correction potentially skews the distribution so it never reaches equilibrium. Your collection of thermometers effectively just gets older and older no matter whether you replace them with new ones or not.

    Am I missing something?
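    The thought experiment above can be put into a toy model: a network bought new all at once, each instrument drifting high by 0.7 C per century of glass age and replaced after 50 years of service. All figures are illustrative, taken from the comment's own arithmetic:

```python
# Toy model of the fleet-aging argument: every thermometer reads falsely
# high by 0.7 C per 100 years of glass age; the whole network is installed
# new at year zero and each unit is replaced after 50 years of service.
import numpy as np

drift_per_year = 0.7 / 100        # false-high deg C per year of age
service_life = 50                 # years before replacement
ages = np.zeros(100)              # 100 thermometers, all new at once

network_bias = []
for year in range(100):
    network_bias.append(ages.mean() * drift_per_year)
    ages = np.where(ages + 1 >= service_life, 0, ages + 1)  # replace at 50

# The network-mean bias ramps from 0 to ~0.34 C over each 50-year
# generation, roughly 0.07 C/decade of spurious "warming" from drift alone.
print(round(max(network_bias), 3))
```

    And if, as suggested above, adjustments splice new instruments to match the old ones, the reset at replacement effectively never happens and the ramp just continues.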

    Here is what NOAA say about the temperature trend (http://www.ncdc.noaa.gov/oa/climate/globalwarming.html):
    “Global surface temperatures have increased about 0.74°C (plus or minus 0.18°C) since the late-19th century, and the linear trend for the past 50 years of 0.13°C (plus or minus 0.03°C) per decade is nearly twice that for the past 100 years.”

    After reading this metrology post, you just have to laugh. I prefer laughing to bleating.

  69. It is even worse than that. Speaking of thermometers circa 1953, the cross-section of the mercury tube was not perfectly constant. With a tiny bit of “pinch” the readings above it would be high, and a bit of “wow” would make readings above the “wow” lower than real.

    As for it all “averaging out” with many readings, my chem prof demonstrated that it was as likely that the average would be on the low side or the high side.

    People should not treat rounded numbers as if they were integers, as they are no such thing. Rounded numbers represent ranges, not distinct numbers. Write the readings in a center column. Add the high-end margin to each reading and write the result in a column to the left; write the reading less the low margin of error in a column to the right. Add the left column and average it, then add the right column and average it; these two averages represent the range of the reading, the high temperature possibility of the range on the left and the low temperature possibility on the right. The actual temperature will fall somewhere in between.
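    The three-column procedure described above amounts to worst-case interval arithmetic. A sketch, with made-up whole-degree readings and a ±0.5 margin:

```python
# The three-column procedure in code: treat each rounded reading as a range,
# average the high ends and low ends separately, and report the result as a
# range. The readings and the +/-0.5 margin are illustrative.
readings = [14, 15, 15, 16, 14]              # rounded to the nearest degree
margin = 0.5                                 # each reading means r +/- 0.5

highs = [r + margin for r in readings]       # left column: high possibility
lows = [r - margin for r in readings]        # right column: low possibility

avg_high = sum(highs) / len(highs)
avg_low = sum(lows) / len(lows)
print(f"true average lies between {avg_low:.1f} and {avg_high:.1f}")
```

    Note that the averaged range is still a full degree wide: worst-case intervals do not narrow with N. Narrowing requires the statistical assumption that the individual rounding errors are independent, which is exactly what is in dispute on this thread.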

    For sea-surface temperatures, an ordinary seaman, under the direction of a low-ranking ship’s officer, would toss out a bucket on a rope, allow it to sink to an unknown depth (ship moving) and then pull the bucket up, put a thermometer in (and the time of immersion varied considerably), then read the thermometer. It is highly unlikely that the ordinary seaman was ever trained to properly read a thermometer, and it is a toss-up whether the thermometer stayed immersed long enough to stabilize to the temperature of the water in the bucket. Most probably it depended on how busy the ship’s crew was. I can assure all that getting the temperature right was no priority, and very much a bother.

    To actually attempt to read a mercury thermometer to an accuracy of 0.1 degree is foolhardy, and to even think that averaging temperatures with a margin of error of a degree to an accuracy of 0.01 degree will result in any reality is beyond belief stupid. These folks should take grade-school arithmetic over again.

    More than that, ships’ courses through a sea lane, in older days, could vary up to at least 50 miles on each side of the intended course.

  70. Talking of folks in the Yukon back in the 70’s, Oliver Ramsay says with reference to the question I raised about observer preconception bias (OPB):

    “Everybody exaggerated the temperatures. It was always said to be colder than it really was. … So, imagine you’re 21, from Vancouver, you have a job at the Dawson City airport and you’re looking at a thermometer that reads -39. What do you write down?”

    This explains well the kind of underlying psychology. However, to be clear, I was not suggesting fraud, but simply the possibility that unconscious factors, including preconceptions about climate change, will skew observational data when it rests on a tricky judgment call, e.g., when the meniscus is about halfway between two gradations on the thermometer.

    And that raises another question. If we are contemplating the expenditure of $trillions to avert AGW, why not spend a few billion on a new global land-surface meteorological recording network of frequently calibrated, properly sited, high-precision instruments that transmit data in real time via satellite to a computerized analytical system? At least then, if 2011 is reported to be the hottest, or the coldest, or the wettest year on record, we could be fairly sure about it.

  71. John Andrews says:
    January 22, 2011 at 11:09 am
    In all the comments to this point there is only one comment mentioning relative humidity measurement. One other comment mentions relative humidity. It seems to me that if we are to measure temperature for the purpose of estimating global warming, we should be looking for the temperature of dry air at sea level, or the heat contained in one cubic meter of dry air at 1000 mbar. Of course with appropriate 1 sigma error bars for each calculation.

    Thanks for noticing – everyone else is away arguing about the incorrect metric.

    One would almost think that this was a deliberate ploy to ensure everyone is involved in more and more detailed debate about the incorrect metric.

  72. Dave Springer says:
    January 22, 2011 at 7:24 am

    Steven Mosher says:
    January 22, 2011 at 3:23 am

    Result? the error structure of individual measures doesnt impact your estimation of the long term trends.

    NOW, if many thermometers all had BIASES (not uncertainty) and if those biases were skewed hot or cold, and if those biases changed over time, then your trend estimation would get impacted

    Result? no difference.

    Absolutely right, Steve.

    Skeptics are no better than CAGW alarmists in their willingness to believe anything which supports their own beliefs or disputes the beliefs of the other side. It’s sad. Objectivity is a rare and precious commodity.
    ———————————–

    Dave,
    This observation and other similar ones are made with tedious frequency. Please be assured that I believe that you think you are much more objective than everyone you disagree with.
    —————————
    Alfred Burdett says:
    January 22, 2011 at 9:00 am

    Has anyone investigated the possible impact of observer preconception bias?

    In particular, does positive bias in temperature readings rise and fall with belief in AGW?

    Would this not be a worthy topic for investigation.
    —————————-
    I think it’s a very significant factor that is probably impossible to quantify.
    As a young man in Canada’s frozen North in the seventies, I was one of many clueless adventurers working in the mines, mills and such of the Yukon. There was no AGW talk then, but there was a fierce desire in our hearts to see ourselves as latter-day Jack Londons, scorning the bitter cold as much as we disdained those softies in the South.
    Everybody exaggerated the temperatures. It was always said to be colder than it really was. In the winter, at least. In the summer we just bragged about the bugs and how many days we’d paddled on bannock leavened with wood ash.
    To this day, forty below is the legendary temperature that people put in books and songs. Never mind that they’ve never come close to it. Nowadays, we just throw in the “wind chill factor” and everybody’s impressed.
    So, imagine you’re 21, from Vancouver, you have a job at the Dawson City airport and you’re looking at a thermometer that reads -39. What do you write down?

  73. Do things like meniscus really affect trends? Somehow I think not since the meniscus is the same regardless of who is reading the instrument or what the current temperature is.

  74. Since average human height has been increasing since the late 1800s, one would expect an ever-increasing temperature from the parallax error.

    Or are one of my assumptions in error here?
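    Tongue-in-cheek, but the geometry is real. A rough sketch, in which every figure (scale gap, viewing distance, eye offset, scale pitch) is an assumed illustrative value:

```python
# Rough parallax estimate: the engraved scale sits a few mm in front of the
# liquid column, so an eye above or below the meniscus reads the column
# against the wrong gradation. All geometry figures are assumptions.
import math

scale_gap_mm = 3.0           # column-to-scale separation
viewing_distance_mm = 300.0  # eye to thermometer
eye_offset_mm = 100.0        # eye height above/below the meniscus

angle = math.atan2(eye_offset_mm, viewing_distance_mm)
shift_mm = scale_gap_mm * math.tan(angle)    # apparent shift on the scale
error_c = shift_mm * 0.2                     # assuming ~0.2 C per mm of scale
print(round(error_c, 2))
```

    Tenths of a degree, then, for a plausibly sloppy viewing position; though it only biases the *trend* if average eye height actually drifts over time, as the comment wryly suggests.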

  75. About a year ago I asked on one of these threads how often the calibration of these NOAA thermometers was checked. Now I know: Never.

    Man, that would never cut it in the nuclear power industry. In a nuke plant, every single one of the ~10,000 instruments in the plant was required to have its calibration checked periodically – anywhere from monthly to every five years.

  76. The back-end issue that seems to keep getting glossed over as people argue and counter-argue is quite simple.

    Ben D.

    “…although I question the larger possible error and I would hazard to guess that 1.3C is the actual limit of the error since we would assume that the observer bias…”

    Dave Springer

    “…Accuracy of thermometers matters hardly at all because the acquired data in absolute degrees is used to generate data which is change over time. If a thermometer or observer is off by 10 whole degrees it won’t matter so long as the error is consistently 10 degrees day after day – the change over time will still be accurate.”

    There is a lot of assuming going on.

    The point is, that NO ONE seems to have taken the error rates or bias into account.
    Well, at least no one who actually cares that businesses are being driven into extinction because of this flawed theology.

    Leave it to mankind to make Extinction Level Events a practical business model.

  77. Although only partially related to this story, I’ve written a short post on my blog about the nature of a “proxy”. In particular, how is the rate of growth of a tree fundamentally different from a spreadsheet entry from an RTD? They are both recordings of approximations of a temperature over a period of time. For an RTD, the time can be very short. In addition, an RTD is much more precise. Accurate too, if it is calibrated properly. Nonetheless, I would argue the term “proxy” is a misdirection. Tree rings are thermometers. Or at least they are being used as thermometers in Mann’s and others’ various papers.

  78. Ian W says:
    January 22, 2011 at 8:58 am
    “…Atmospheric Temperature does NOT EQUAL Atmospheric Heat Content.

    The entire claim of ‘greenhouse gases’ (sic) causing [global warming|Climate Change|climate catastrophe|climate disruption|tipping points] is based on the hypothesis that these gases trap heat in the atmosphere. To show this is the case the climatologists measure atmospheric temperature

    BUT

    Atmospheric Temperature does NOT EQUAL Atmospheric Heat Content.
    This reality remains the same however accurately you quantify the incorrect metric as it ignores the huge effect on atmospheric enthalpy from water vapor.

    The heat content of the Earth is far more accurately measured by measuring the temperature of the oceans as ocean temperature is closely equivalent to ocean heat content and the top 2 or 3 meters of ocean holds as much heat as the entire atmosphere….”
    ——————————————
    Ian is right. I have long argued that we should ditch the land based temperature record; it was never designed for the purpose of assessing global warming and it is now too corrupted to be relied upon and, in any event, it is essentially of minor importance.

    It is the heat content of the oceans that is important. Given the entire volume of the oceans, they probably account for about 99% of the total heat content of the earth (ignoring the core/mantle). If the oceans are not warming, global warming is not happening. If the oceans are heating, this is most probably due to changes in cloud albedo, since only solar radiation (or geothermal energy) can effectively heat the oceans.

    The oceans are the key to this debate for another reason. Namely, one fundamental problem for AGW is whether back radiation from increased CO2 in the atmosphere can in practice warm the oceans. Due to the wavelength of back radiation, it cannot effectively heat the oceans. It is all fully absorbed within about the first 10 microns and any heat absorbed by this layer either boils off, or is thrown to the air as spray and/or, in any event, cannot transfer its heat downwards to reach the depths required for circulation admixture.

  79. While Mosh is correct that random observation errors will tend to average out when computing long-run trends, problems arise when they are not random.

    Besides the drift issues Mark discusses, changes in rounding rules come to mind: If the temperature is between the 60 mark and the 62 mark, are observers trained to read this as 60, as 62, or to the nearer one? If the rule (or custom) was different in the late 19th century than in the mid-20th century, it could make a big difference in the trend change. And if it’s about halfway, should it be rounded up or down? My father, an engineer, taught me that good engineering practice is to round ties to the nearest even value, so that on average they will cancel out, though that wouldn’t work if the marks are every 2 dF.

    Mark, for his part, rounds 15.5555… to 15.55, even though .0055… is clearly greater than .005, so that he is rounding down rather than off. In this example .01 makes no difference, but the same rule would lead one to round 15.5555… to 15 rather than 16 when rounding to integers.

    But when historical records have been converted from integer F to integer C, it wouldn’t surprise me if the software truncated down to the lower integer in one decade, but rounded to the nearer integer in another decade (with time-varying tie-breaker rules — Matlab, for example, rounds exact ties away from 0).

    When converting from F to C, I would carry at least one decimal place, just so the conversion doesn’t introduce additional meaningful error. The last digit has to be meaningless to prevent loss of precision. But even there it could make a perceptible difference whether the one decimal place is the result of truncation (which is easy in Fortran, say) or rounding (which requires a little more effort).
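The tie-breaking rule discussed above is easy to demonstrate. A small Python sketch (illustrative only; the readings are hypothetical) comparing round-half-up with the round-to-even rule:

```python
# Python's round() already implements round-half-to-even ("banker's
# rounding"), the rule described above; compare it with always rounding up.
half_up = lambda x: int(x + 0.5)           # always rounds .5 upward

ties = [x + 0.5 for x in range(60, 70)]    # hypothetical readings: 60.5 ... 69.5
bias_up = sum(half_up(t) - t for t in ties) / len(ties)
bias_even = sum(round(t) - t for t in ties) / len(ties)
print(bias_up, bias_even)   # 0.5 0.0 -- half-up biases warm, half-even cancels
```

Over many tied readings, always rounding up adds a systematic half-unit warm bias, while rounding ties to even cancels on average.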

  80. I should have ended my last post with a final paragraph:

    This explains why Trenberth cannot find his missing heat in the oceans. Due to the wavelength of back radiation, it is incapable of getting there!

  81. Steve in SC:
    You brought up a good point, but with regard to thermocouples and RTDs you left out a few issues.

    Thermocouples and RTDs have an internationally recognized temperature/voltage curve that is based on statistical samples of many devices of each sensor type. All thermocouples and RTDs are required to be within a predefined error of those curves. These errors can be as much as 1.5 degrees C. Some devices are tested and have tighter tolerances.

    To read a thermocouple or RTD, you need a temperature indicator. These devices have additional errors that get added to the reading.

    I was responsible for the design of a temperature meter some years back. The unit was designed with a ±0.1% of reading ±1 count accuracy relative to the NIST thermocouple and RTD curves.

    What this means is that a system using this meter reading 100°C would have an accuracy of ±2.6°C (±1.5° for the thermocouple, ±0.1° for the % of reading and ±1° for the count uncertainty), while the same unit configured to read 100.0°C would have an accuracy of ±1.7°C (±1.5° for the thermocouple, ±0.1° for the % of reading and ±0.1° for the count uncertainty).

    The only time you could say you actually knew the temperature was 100°C was when you had a precision temperature standard to compare the system to. Any other time, you had system uncertainty as part of the reading.

    When using these devices, you must pay close attention to the accuracy specifications of the device and the accuracy specifications for the indicator. They all add up.

    Based on that experience, I can argue that anyone who claims they know the temperature of anything (except a NIST or BSI traceable standard) to better than ±1° or ±2°C is fooling themselves.
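The stacking of tolerances described in this comment can be sketched in a few lines of Python. This is an illustrative worst-case linear sum using the figures quoted above; real error budgets sometimes combine independent terms in quadrature instead:

```python
def system_accuracy(reading, sensor_tol, pct_of_reading, counts, resolution):
    """Worst-case linear error stack for a sensor-plus-indicator chain."""
    return sensor_tol + reading * pct_of_reading + counts * resolution

# The meter described above: ±1.5 C sensor tolerance, ±0.1% of reading, ±1 count.
print(system_accuracy(100.0, 1.5, 0.001, 1, 1.0))  # 1 C resolution: about ±2.6 C
print(system_accuracy(100.0, 1.5, 0.001, 1, 0.1))  # 0.1 C resolution: about ±1.7 C
```

The point of the sketch is simply that every stage in the measurement chain contributes its own term, and they all add up.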

  82. As soon as I see the “I am a technical expert and i know that all scientists are idiots” theme I begin to suspect how this is going to turn out.

    And after a bit more reading we come up with
    ——-
    If the scale is marked in 1c steps (which is very common), then you probably cannot extrapolate between the scale markers.
    ——-
    The correct word is interpolate not extrapolate.

    The claim “you probably cannot interpolate between the scale markers” is false. Anyone trained to read scales properly can do this. Although I cannot answer for the training and conscientiousness of the people who make meteorology measurements.

    REPLY: According to the info you provide WUWT with comments, you're a software developer who does typing tutor programs and some other educational K-12 programs for the Macintosh; what makes you an expert on thermometers/meteorological measurement and everybody else here not? Provide a citation. – Anthony

  83. Wayne Delbeke says:
    January 22, 2011 at 10:42 am
    See National Institute of Standards and Technology.

    http://www.temperatures.com/Papers/nist%20papers/2009Cross_LiG_Validation.pdf

    There aren’t any experiments or references to experiments in this document that have any bearing on glass properties of thermometers. It is merely asserted. So, I challenge the NIST as well.

    Proof is going to be hard to come by. It requires a thermometer to be made, to be measured accurately w.r.t. volume of bulb, diameter of capillary, calibration, etc. prior to service, to undergo service for ~20-50 years and then to be measured again. Simulated aging is not acceptable. I say that this whole article and the NIST document backing it up are hearsay and could very easily be wrong. And please, let’s not have an argument along the lines of “how dare you argue with a large government agency full of experts”. We’ve seen how well that holds up.

  84. Oliver Ramsay says:
    January 22, 2011 at 11:51 am

    Dave Springer says:
    January 22, 2011 at 7:24 am

    Steven Mosher says:
    January 22, 2011 at 3:23 am

    Result? The error structure of individual measures doesn't impact your estimation of the long-term trends.

    NOW, if many thermometers all had BIASES (not uncertainty), and if those biases were skewed hot or cold, and if those biases changed over time, then your trend estimation would get impacted

    Result? no difference.

    Absolutely right, Steve.

    Skeptics are no better than CAGW alarmists in their willingness to believe anything which supports their own beliefs or disputes the beliefs of the other side. It’s sad. Objectivity is a rare and precious commodity.
    ———————————–

    Dave,
    This observation and other similar ones are made with tedious frequency.

    Tedious frequency huh? I don’t think so. Each side accuses the other of it. That’s a given. But you’ll have to show me where someone in one camp points out that both camps do it.

  85. p says:
    January 22, 2011 at 12:05 pm (Edit)

    Do things like meniscus really affect trends? Somehow I think not since the meniscus is the same regardless of who is reading the instrument or what the current temperature is.

    #####

    for meniscus to affect trend you would have to have the following

    Let us say that the bottom of the curve was read 100% of the time for the first few years of a 100-year record: every day, every month, every year. Then suppose that the top of the curve was read for the last few records. And suppose the difference between these was 1 degree.

    You’d see a false trend. But if the observer always records the top or always records the bottom, you see zero trend. You just have a bias in the absolute temp. If the observer switches back and forth randomly, you’ll also see no trend bias.

    You can write little simulations of this if you like and “model” observer behavior.
    Or you can note that there is no reason to assume anything other than a normal distribution of reading practice and add the proper quantity to your error budget.
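A minimal version of the simulation suggested above (all numbers hypothetical: a flat 100-year monthly record and an assumed 1-degree top-vs-bottom meniscus difference):

```python
import random
random.seed(0)

def trend(series):
    """Least-squares slope per time step."""
    n = len(series)
    xbar = (n - 1) / 2
    ybar = sum(series) / n
    num = sum((i - xbar) * (y - ybar) for i, y in enumerate(series))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

true_temp = [15.0] * 1200   # a flat "truth": 100 years of monthly readings
bias = 1.0                  # assumed top-vs-bottom meniscus difference

# Observer practice switches from bottom to top halfway through the record:
switched = [t + (bias if i >= 600 else 0.0) for i, t in enumerate(true_temp)]
# Observer always reads the top -- an offset, but no trend:
consistent = [t + bias for t in true_temp]
# Observer picks top or bottom at random each time -- no trend bias:
random_read = [t + random.choice([0.0, bias]) for t in true_temp]

print(trend(switched))      # spurious warming trend appears
print(trend(consistent))    # exactly zero
print(trend(random_read))   # negligible
```

Only the record in which observer practice changes over time shows a false trend, exactly as described.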

  86. [snip – Wow, such hypocrisy – see your comment below – simply saying something is “bogus” doesn’t mean it to be so, since you are in the mode of demanding citations, provide one – Anthony]

  87. Laurence M. Sheehan, PE says:
    January 22, 2011 at 11:40 am

    “As for it all “averaging out” with many readings, my chem prof demonstrated that it was as likely that the average would be on the low side or the high side.”

    Did your chem professor teach in a school similar to the school where Michael Mann teaches? Just because someone is a professor doesn’t mean crap.

    Once again – thousands of instruments, changing numbers not absolute numbers, the imprecision averages out and accuracy doesn’t really matter for finding trends.

    Temperature cycles in the glass bulb of a thermometer harden the glass and shrink it over time; a 10-yr-old -20 to 50c thermometer will give a false high reading of around 0.7c
    —–
    I don’t believe you. Provide a citation.

  89. Hoser says:
    January 22, 2011 at 11:11 am

    Thermometer data is corroborated by a lot of different proxies. Ya think whether the meniscus is read at the top or bottom affects arctic ice melt, tree rings, glacier retreat, rising sea levels, ice core gas/isotope ratios, and things of that nature? Ya think the age of the glass affects the temperature recorded by weather satellites? Would a thermometer used one time and discarded in a radiosonde be affected by any of that?

    The instrumental temperature record is not a point of weakness for the CAGW hypothesis. The CAGW hypothesis has more holes in it than swiss cheese, but the surface thermometer record just isn’t one of them. The adjustments to the raw data and lack of adequate accounting for urban heat islands might be weaknesses, but those aren’t problems with the instruments or the manner in which the instruments are read.

  90. John Andrews says:
    January 22, 2011 at 11:09 am

    “Over all, I don’t see much change in the global climate during my lifetime (I’m 76).”

    I’m 54 and I’ve noticed the winters are generally much milder than when I was a kid. When my mom was a kid the river in my hometown completely froze over every winter and they’d plow it for miles and ice skate down it. It hasn’t frozen over like that even once in my lifetime.

    There is no doubt the climate has gotten warmer. The first days of spring weather when certain plants spring up and migrating birds return have been getting earlier in the year and the dates of the first and last frosts have been changing. Some people note these things. Evidently you aren’t one of those people. Not a farmer are ya?

    Over time, repeated high temperature cycles cause alcohol thermometers to evaporate vapour into the vacuum at the top of the thermometer, creating false low temperature readings of up to 5c. (5.0c, not 0.5; it’s not a typo…)
    ———-
    I don’t believe this. The vacuum would be very temporary at the time of manufacture. After that the region above the liquid would be filled with saturated vapour till the end of time.

    The amount of vapour would vary directly with the temperature. This effect should/would be incorporated into the scale since it is reproducible.

  92. Michael Moon.

    the climate does not exist as a phenomenon that can be measured. I should link up the nice video from the newton institute workshop on uncertainty in climate science, because the guy does a really good job of explaining it.

    http://sms.cam.ac.uk/media/1083858;jsessionid=71886089203AF0121AED826A772E901C?format=flv&quality=high&fetch_type=stream

    Average heights do not exist. Average weight does not exist. We never observe averages.
    They are mathematical constructs that serve a useful purpose.

    The temperature in SF today is 60F. That’s observable. That’s the weather.
    What’s the climate for SF on Jan 22? Well, collect the weather for the past, say, 30 years (assume stationarity over those years) and do a thing called averaging. This construct is called the climatology. The digits in this construct have nothing to do with the accuracy of the instrument recording the temp. Let’s say that construct is 54.8888771633784959. Do the math and you have another construct telling you how much warmer it is today than “normal”. And in this context the word “normal” has nothing to do with being normal; that’s just shorthand for the computation of the “average”.

    If you ask me how warm it was 15 years ago, I will estimate “54.8888771633784959”. That estimate will be the best estimate, given no other information than knowledge of the average; that estimate will minimize my error. It will also be wrong. But it will be the best estimate, and in a betting game, if you bet something different you are more likely to lose the bet to me than win it.
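The climatology-as-construct idea above can be made concrete with a toy example (all numbers invented; not actual SF data):

```python
# Invented numbers: a ten-value record of past Jan 22 temperatures.
jan22_temps = [54, 57, 52, 60, 55, 53, 58, 56, 54, 51]

climatology = sum(jan22_temps) / len(jan22_temps)   # the averaged construct: 55.0
today = 60
anomaly = today - climatology                        # today vs "normal": 5.0

# The mean is also the single guess that minimizes squared error, which is
# why it is the "best estimate" in the betting sense described above.
def mse(guess):
    return sum((t - guess) ** 2 for t in jan22_temps) / len(jan22_temps)

print(climatology, anomaly)
```

No matter how many digits the average carries, it remains a construct computed from the observations, not an observation itself.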

    The other thing that is instructive is to compare two thermometers that are within a few km of each other over the period of, say, 100 years. Look at the correlation: 98% plus.

    If both thermometers were affected by the same sort of physical process that was causing a degradation of accuracy over time, you would expect them to have highly correlated data, too.

    Or you can write a simulation of a sensor with very gross errors. Simulate daily data for 100 years. Assume small errors, calculate the trend. Assume large errors, calculate the trend.

    If you’re drawing your “errors” using independent trials from the same distribution of course this will work. That is a trivial application of the CLT that proves nothing other than the fact that the CLT works if you meet all the requirements.

    In general, I don’t think anybody in here actually understands how the CLT or LLN work. A few came close. You do not need a normal distribution for the CLT to work. You need independent and identically distributed (i.i.d.) error distributions for errors to cancel with the sqrt(N). It is my hope that at some point everyone will figure out how much of a limit i.i.d. really is.

    Generally speaking, independence is not really required (independence is calculated over all time, which is not possible), just orthogonality (uncorrelatedness), but the errors do need to be drawn from an identical distribution if you want the cancellation property to apply. That implies the same mean and variance, btw. The mean and variance need to exist, obviously, and they also should be stationary (unless they all vary identically over time), which is not as obvious but easy to figure out. That also implies that if the errors are a function of the thing you’re measuring, e.g., a percentage, then the CLT will not apply. Sorry. Get over it. The same applies to situations in which the error distributions are unknown, which clearly applies to temperature measurements.

    Increased uncertainty in the data itself implies increased uncertainty in any calculations done with the data. If the i.i.d. requirement is not met, then you have no choice but to assume the errors do not cancel… anywhere. It sucks, I know, but them’s the breaks. Stay away from statistical endeavors if you cannot wrap your head around this very basic concept.

    Mark
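The i.i.d. requirement being argued in this exchange can be illustrated with a toy simulation (a sketch, not anyone's actual analysis; the 0.5-degree shared bias is an assumed value):

```python
import random
random.seed(42)

N = 100_000
truth = 20.0

# Case 1: i.i.d. zero-mean errors -- the average converges on the truth,
# with error shrinking like 1/sqrt(N).
iid = [truth + random.gauss(0, 1.0) for _ in range(N)]
err_iid = abs(sum(iid) / N - truth)

# Case 2: every reading shares a common bias (so the errors are not
# identically distributed about the truth) -- no amount of averaging
# removes it.
shared_bias = 0.5   # assumed: the whole network reads half a degree high
biased = [truth + shared_bias + random.gauss(0, 1.0) for _ in range(N)]
err_biased = abs(sum(biased) / N - truth)

print(err_iid)      # tiny, order 0.003
print(err_biased)   # stuck near 0.5 no matter how large N gets
```

Averaging buys you sqrt(N) only against the random component; any common or drifting bias survives intact, which is the crux of the disagreement here.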

  94. “Therefore the total error margin of all observed weather station temperatures would be a minimum of +/-2.5F, or +/-1.30c”
    =======================================

    This seems to be a good first step. Can anyone provide an analysis of how this uncertainty should be translated into the error margin for larger composites of temperature measurements or global anomaly measures? Is the error margin the same no matter how we transform the individual stations’ data?

    Thanks !

  95. Dave Springer says:
    January 22, 2011 at 1:41 pm

    Once again – thousands of instruments, changing numbers not absolute numbers, the imprecision averages out and accuracy doesn’t really matter for finding trends.

    Can you prove all the errors are i.i.d.?

    I dare you to prove it. Until then, you are just plain wrong.

    Mark

    Drift is where a recording error gradually gets larger and larger over time. This is a quantum mechanics effect in the metal parts of the temperature sensor that cannot be compensated for. Typical drift of a -100c to 100c electronic thermometer is about 1c per year, and the sensor must be recalibrated annually to fix this error.
    ———
    “Quantum” is an “I am about to talk rubbish” warning.

    Misleading. Drift can mean different things and this explanation does not capture that.

  97. Here’s the bottom line: if you have a claimed increase of +.7 degree C and a margin of error of +-1.4, that doesn’t mean that “most likely” the temperature change was a positive .7 degree. What it does mean is that the temperature change was +2.1 degree or -.7 degree or anywhere in between and – here’s the punch line – we have no idea what the actual temperature was between those two values. Nothing more, nothing less. When the margin of error is larger than the claimed measurement, that conclusion is possible. And that is the stake in the heart of all the claims of global warming, no matter what term or terms is substituted for it.

    Apologies. The next-to-last sentence should read: “When the margin of error is larger than the claimed measurement, only that conclusion is possible.” The word “only” was omitted in the original post.

  99. I know more than anyone should ever have to know about evolutionary psychology. Talk about a theory that explains everything and hence explains nothing. It’s almost like climate change science in that regard. I bet if we were to ask those EvP boys they could spin us a yarn about how there’s a psychological reason why people 100 years ago used to read a thermometer one way and how they read it differently now. That’s how evolution works… when an explanation is needed it never fails to produce one. Just don’t bring up pesky points like falsification or the scientific method because these sciences are so advanced they don’t need that stuff because it’s just never wrong anymore.

    Here is a typical food temperature sensor behaviour compared to a calibrated thermometer, without even considering sensor drift (see “Thermometer Calibration”): depending on the measured temperature, in this high-accuracy gauge the offset is from -0.8 to 1c
    ———–
    This is not relevant. A food process thermometer is not a professional level meteorological thermometer.

  101. Mark T

    Prove it in a blog post? Hardly. I’ve been an engineer all my life. A very successful one. I know how these things work. If I didn’t I never would have been able to outperform my peers.

  102. yet frequently this information is not incorporated into statistical calculations used in climatology.
    ———–
    Probably because the accuracy is not relevant to the determination of trends. As long as the thermometer is not changed.

  103. Firstly, I hate to nitpick, but since this whole section is about accuracy and precision, the author should not have talked about “extrapolating” between markers. One extrapolates beyond data points, but interpolates between data points.

    The discussion of errors is correct as far as it goes, but in the context of discussing climatic changes they are mostly random errors in readings which cancel out over the long run. For example, for every reading that is in parallax error due to the observer eyeballing from above the parallel there will be another from below the parallel.

    Systematic errors occur in the same direction and are not cancelled. In this example it is claimed that 10 year old mercury thermometers give a 0.7 C high reading but old alcohol thermometers can read 5 C low.

    As for conversion from F to C, quoting too many significant figures at the end of the process gives a misleading impression of the precision (or resolution) of the result, claiming we know the result more precisely than we actually do, but does not affect the accuracy of the measurement, accuracy being how close the quoted figure is to the true figure.

    Strings of zeros immediately before the decimal point are non-significant. All zeros after the decimal point are significant. Thus 1500 has a precision of two sig figs, generally taken as meaning the true figure is between 1450 and 1550. 1503 has 4 figs (1502.5 – 1503.5). 1503.0 has 5 figs (1502.95 – 1503.05). 1503.0026173 – well you get the drift. The number of significant figures is a claim to the precision with which we know the result.

    At the end of a calculation you are not justified in claiming more than the least number of figures in any of the numbers in the calculation.

    The conversion example given actually shows another problem with significant figures.

    The F to C conversion function on my HP calculator gives the following results (to four decimal places).

    60 F = 15.5556 C = 16 C to two significant figures
    61 F = 16.1111 C = 16 C
    62 F = 16.6667 C = 17 C

    Note that my calculator gives a value for 61 F as 16 C, not 17 C as calculated by the author. I assume the difference arises because he has taken the 2 F uncertainty that he says the thermometer readers use, converted that to 1.1 C, and adjusted the final figures accordingly. But that is not the real problem.

    Whereas 61 and 62 F have two significant figures, 60 F has one significant figure (55-65). So should not the centigrade conversion be given to one significant figure; 20 C (15-25)? Strictly speaking yes, but the context makes it clear that in this case the zero is intended to be significant. Such ambiguity is avoided by using power of ten or “scientific” notation:

    60 = 6 x 10^1 is one significant figure.
    60 = 6.0 x 10^1 is two significant figures.
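The truncation-versus-rounding hazard in F-to-C conversion mentioned earlier in the thread is easy to show (a sketch; `f_to_c` is just the standard conversion formula, and note Python's `round()` rounds ties to even):

```python
def f_to_c(f):
    # standard Fahrenheit-to-Celsius conversion
    return (f - 32) * 5 / 9

for f in (60, 61, 62):
    c = f_to_c(f)
    # truncating and rounding to integer C can differ by a whole degree:
    # 60 F is 15.5556 C, which truncates to 15 but rounds to 16.
    print(f, "F =", round(c, 4), "C | truncated:", int(c), "| rounded:", round(c))
```

A record truncated in one decade and rounded in another would pick up a spurious step of up to a degree with no change in the weather at all.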

  104. Dave Springer says:
    January 22, 2011 at 2:51 pm

    Prove it in a blog post? Hardly. I’ve been an engineer all my life. A very successful one.

    Good for you, though I did not realize they were handing out degrees to newborns. Learn something new every day, I guess. I am unimpressed by your “authority.” I have pretty significant qualifications myself, but I have never claimed that is why I am right. I am right because you have not proven i.i.d., nor can you, not in a blog post nor anywhere else. Just knowing that the errors are potentially a function of temperature immediately invalidates any attempt.

    I know how these things work. If I didn’t I never would have been able to outperform my peers.

    Wow, now I’m really unimpressed. So, what, because you are sooooo good you can magically violate the requirements for fairly well established theory and make it work anyway?

    It doesn’t matter how good you are, you clearly do not understand the CLT nor the LLN nor the concept of i.i.d.

    What a joke.

    Mark

  105. Mark T

    On second thought the claim to being an engineer all my life isn’t quite true. From age 18 to 22 I was a metrology technician in the military, responsible for calibration, maintenance, and repair of all the weather forecasting gear at USMCAS El Toro, California. So I know a lot more than the average engineer about all the gimcracks used by meteorologists. For the 30-odd years since then I’ve been a hardware/software design engineer. My life has been consumed by knowing how to read instrumentation and knowing the limits therein.

    Thousands of people reading thousands of different thermometers for hundreds of years won’t give you the confidence to say it was 70.2 degrees +-1 degree on April 4th, 1880 in Possum Trot, Kentucky, but it will allow you to say the average temperature for April in Kentucky was 0.5 degrees +-0.1 degrees cooler in 1880 than it was in 1980. That’s just how these things work out as a practical matter. Trends from thousands of samples from thousands of different instruments are generally reliable. One sample from one instrument can be catastrophically wrong. There’s a continuum of increasing reliability with increasing number of observations, instruments, and observers.

  106. Therefore mathematically:
    60F=16C
    61F=17C
    62F=17C
    —-
    No. Mathematically 61F = 16C if you round it correctly. But that is just nitpicking.

    The problem with the article is that there are two issues being confused in this conversion argument.
    1. The use of significant digits conveys the degree of uncertainty in the final result of a calculation. Hence the author is correct in that sense.
    2. But if values are to be used in subsequent calculations then it is common to retain guard digits to avoid biasing the final result. That is why it is an acceptable practice here.

    The author’s insistence on his superiority relative to climate scientists is not justified.

  107. Thousands of people reading thousands of different thermometers for hundreds of years won’t give you the confidence to say it was 70.2 degrees +-1 degree on April 4th, 1880 in Possum Trot, Kentucky but it will allow you to say the average temperature for April in Kentucky in 1880 was 0.5 degrees +-0.1 degrees cooler in 1880 than it was 1980.

    No, they won’t, not unless you can prove the errors are drawn from independent and identically distributed distributions.

    That’s just how these things work out as a practical matter.

    Wow. My jaw is on the floor. So, what you’re saying is that this happens because you “just know it happens.” You don’t even know the theory behind it? Holy s***t, you really need to educate yourself. You clearly do not know what you are talking about, seriously.

    Trends from thousands of samples from thousands of different instruments are generally reliable. One sample from one instrument can be catastrophically wrong. There’s a continuum of increasing reliability with increasing number of observations, instruments, and observers.

    Again, not unless you meet the requirements of independence and identical distributions. You can certainly do the averages and get all sorts of extra digits, but they are meaningless.

    Mark

    Ike, you are correct that an increase of 0.7 ± 1.4, whether temperature or something else, is not statistically significant. But it does not follow that the true value is equally likely to be anywhere between 2.1 and -0.7, or that 0.7 is no more probable than any other value. Uncertainties are often quoted as 95% confidence limits of a normal or bell-shaped probability curve. In this case that would mean there is a 95% chance the true figure is between 2.1 and -0.7, and the most likely figure is at the top of the bell curve, at +0.7.

    And in terms of global warming, the measurements are the statistical average of thousands of measurements and probably hundreds of studies, using different methods (including satellite) so again the error averages out. Whereas you may get a measurement of +0.7 ± 1.4 for a single measurement at one station (I assume you are taking the error as the quoted one for a glass thermometer) it will simply not carry over to the global picture.
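Under the 95%-confidence reading of 0.7 ± 1.4 suggested in this comment, one can put a number on how likely a positive change is, assuming a normal error model (an illustrative calculation only, not a claim about the actual temperature record):

```python
import math

mean, margin = 0.7, 1.4
sigma = margin / 1.96   # assumes the quoted ±1.4 is a 95% confidence interval

# Probability the true change exceeds zero under a normal model
# centred on the estimate:
z = (0 - mean) / sigma
p_positive = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))
print(round(p_positive, 2))   # ~0.84: probably positive, but far from certain
```

So a result inside its own error bar is not meaningless, but neither is it anywhere near the usual 95% standard for significance.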

    Therefore the total error margin of all observed weather station temperatures would be a minimum of +/-2.5F, or +/-1.30c…
    ———
    This claim is ambiguously expressed in multiple ways and makes no sense.

    What exactly is the total error margin of all observed weather stations?

    Why “observed”?
    Why “total”?
    Why minimum instead of maximum?
    Why is this relevant to climatology?
    Why assume 1960 thermometers are relevant to the current network?

    REPLY: Try to collect all of your thoughts into one post instead of serial thread bombing – penalty box assigned to you – first warning – Anthony

  110. Ike says:
    January 22, 2011 at 2:46 pm

    “Here’s the bottom line: if you have a claimed increase of +.7 degree C and a margin of error of +-1.4, that doesn’t mean that “most likely” the temperature change was a positive .7 degree. What it does mean is that the temperature change was +2.1 degree or -.7 degree or anywhere in between and – here’s the punch line – we have no idea what the actual temperature was between those two values. Nothing more, nothing less. When the margin of error is larger than the claimed measurement, that conclusion is possible. And that is the stake in the heart of all the claims of global warming, no matter what term or terms is substituted for it.”

    That’s all sorts of wrong. It applies to single measurements. It’s a different ballgame when you have thousands of measurements from thousands of instruments and thousands of observers with dozens of different instrument manufacturers and changing technologies over the course of hundreds of years and then on top of that you have proxies totally unrelated to the instruments and those proxies are in general agreement. THAT’s the bottom line.

  111. And in terms of global warming, the measurements are the statistical average of thousands of measurements and probably hundreds of studies, using different methods (including satellite) so again the error averages out.

    What???

    For chrissakes… go back and read the couple of posts I just made regarding the conditions that are required for this to be true. Then go find a suitable text and read up on the LLN and the CLT, which will demonstrate that what I just wrote is indeed required. Then, probably in the same text, read up on and understand the concept of independence (really orthogonality) and attempt to understand the concept of identical distributions.

    Really, c’mon folks. Where on earth do you get this nonsense?

    Mark

  112. Mark T

    Prove it doesn’t work the way I said it does. Maybe you can share a Nobel Peace prize for proving that the instrumental temperature record for the past 200 years is worthless and you can, all by your lonesome, end the biggest scientific hoax in history. Good luck.

  113. Dave Springer says:
    January 22, 2011 at 3:38 pm

    Prove it doesn’t work the way I said it does.

    You’re joking, right? YOU MADE THE CLAIM, you need to prove it, not me.

    I have already given you the requirements for the LLN, do you deny that?

    Maybe you can share a Nobel Peace prize for proving that the instrumental temperature record for the past 200 years is worthless and you can, all by your lonesome, end the biggest scientific hoax in history. Good luck.

    Where on earth did you get this from? Who said it is worthless? I only noted that you cannot arbitrarily cancel errors, and I am correct in that statement.

    Mark

  114. Oh my God! Anthony, this joker needs to be on the quote of the week.

    How can you be a 30-year engineer and not know how science works? Wow… flabbergasted.

    Mark

  115. Mark T says:
    January 22, 2011 at 3:35 pm

    “What???”

    You need to figure out the difference between systematic and random sources of instrument errors and how these are addressed in the real world. Off the top of my head I can describe to you how the computer-controlled electro-mechanical flight control surfaces on the Space Shuttle were designed to eliminate both systematic and random errors. The nut of it is redundant systems designed by independent teams who weren’t allowed to crib from each other. Since you suggest people “read up” on things, I suggest you do some reading up yourself, or better yet get out into the real world where people actually do this stuff for a living and when they’re wrong it results in destruction, loss of life, and an instant end to a career. Pilots, for instance (yeah, I’ve got a pilot’s license in case you’re wondering), that don’t know how to read instruments have short careers and sometimes short lives too.

  116. I mean Dave Springer… “prove it doesn’t work the way I said it does.” Prove I don’t have a leprechaun in my back yard. I said he’s there and thus, it must be true.

    Words fail me.

    I was actually ignoring Lazy Teenager, though he did have a few correct statements.

    Mark

  117. Dave Springer says:
    January 22, 2011 at 3:57 pm
    Mark T says:
    January 22, 2011 at 3:35 pm

    You need to figure out the difference between systematic and random sources of instrument errors and how these are addressed in the real world.

    I don’t have to figure out anything. You have to prove that the random errors are a) independent and b) taken from identical distributions.

    Again, I ask, can you do that?

    Since you suggest people “read up” on things I suggest you do some reading up yourself or better yet get out into the real world where people actually do this stuff for a living and when they’re wrong it results in destruction, loss of life, and an instant end to a career.

    Don’t worry, I work for a living. I suggest you actually attempt to understand the theory you are applying and why you are applying it incorrectly. When you integrate a large number of ADC values that are corrupted by thermal noise, you will most definitely get a sqrt(N) reduction in noise. Averages across different measuring devices, taken at different times, in different locations, with different error statistics, however, do not provide the same guarantee.

    That’s how it really works in the real world.

    Mark
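The sqrt(N) noise reduction for repeated samples from a single noisy sensor, as described above, is straightforward to verify empirically (a sketch with unit-variance Gaussian noise standing in for ADC thermal noise):

```python
import random
random.seed(7)

def std_of_mean(n, trials=2000):
    """Empirical standard deviation of the average of n noisy samples
    (unit-variance Gaussian noise standing in for ADC thermal noise)."""
    means = [sum(random.gauss(0, 1) for _ in range(n)) / n for _ in range(trials)]
    mu = sum(means) / trials
    return (sum((m - mu) ** 2 for m in means) / trials) ** 0.5

print(std_of_mean(1))     # ~1.0
print(std_of_mean(100))   # ~0.1, i.e. reduced by sqrt(100)
```

This is the benign case both sides agree on: one instrument, one stationary noise distribution. The dispute is over whether a network of different instruments with different error statistics qualifies.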

  118. “Until the 1960′s almost all global temperatures were measured in Fahrenheit. ”

    Really? I was around then, and I seem to remember that by the 1960s Fahrenheit was only used in ex-British Empire countries and the US. Everywhere else used Celsius.

  119. Ian W says:
    January 22, 2011 at 11:42 am

    E.M.Smith, me and a few others have been arguing the same point. I gave up after a while as you can only bang your head against a wall so many times.

    DaveE.

120. Speaking of pilots and instrument error, there was this time when I was a student pilot flying solo and I rented a plane that happened to have an air-speed indicator that showed MPH instead of KNOTS. Silly me. I thought they all read out knots and never noticed the small print reading MPH on the dial. Redundancy saved my butt. Takeoff, landing, and cruise performance didn’t feel right. The plane had an indicated speed that just didn’t agree with what I knew should be happening at certain speeds. I presumed the indicator was wrong, but as an inexperienced pilot I knew the seat of my pants wasn’t the most reliable thing in the world either. So I split the difference between what my gut told me was the right speed for various flight conditions and what the instrument told me was the right speed. Upon landing I told my instructor (he was a Marine Corps fighter pilot) about it and he told me that in a rare few aircraft some boneheads install air speed indicators that read out miles per hour instead of knots. After reading me the riot act for not paying closer attention to my instruments, he said he was proud of me as I had demonstrated I was proficient enough to pilot a plane with a faulty air-speed indicator.

  121. As Steve Mosher has pointed out, if the errors are random normal, or if they are “offset” errors (e.g. the whole record is warm by 1°), increasing the number of observations helps reduce the size of the error. All that matters are things that cause a “bias”, a trend in the measurements. There are some caveats, however.

    First, instrument replacement can certainly introduce a trend, as can site relocation.

    Second, some changes have hidden bias. The short maximum length of the wiring connecting the electronic sensors introduced in the late 20th century moved a host of Stevenson Screens much closer to inhabited structures. As Anthony’s study showed, this has had an effect on trends that I think is still not properly accounted for, and certainly wasn’t expected at the time.

    Third, in lovely recursiveness, there is a limit on the law of large numbers as it applies to measurements. A hundred thousand people measuring the width of a hair by eye, armed only with a ruler measured in mm, won’t do much better than a few dozen people doing the same thing. So you need to be a little careful about saying problems will be fixed by large amounts of data.

    Fourth, if the errors are not random normal, your assumption that everything averages out may (I emphasize may) be in trouble. And unfortunately, in the real world, things are rarely that nice. If you send 50 guys out to do a job, there will be errors. But these errors will NOT tend to cluster around zero. They will tend to cluster around the easiest or most probable mistakes, and thus the errors will not be symmetrical.

    Fifth, the law of large numbers (as I understand it) refers to either a large number of measurements made of an unchanging variable (say hair width or the throw of dice) at any time, or it refers to a large number of measurements of a changing variable (say vehicle speed) at the same time. However, when you start applying it to a large number of measurements of different variables (local temperatures), at different times, at different locations, you are stretching the limits …

Sixth, the method usually used for ascribing uncertainty to a linear trend does not include any adjustment for known uncertainties in the data points themselves. I see this as a very large problem affecting all calculation of trends. All that is ever given is the statistical error in the trend, not the real error, which perforce must be larger.

    Seventh, there are hidden biases. I have read (but haven’t been able to verify) that under Soviet rule, cities in Siberia received government funds and fuel based on how cold it was. Makes sense, when it’s cold you have to heat more, takes money and fuel. But of course, everyone knew that, so subtracting a few degrees from the winter temperatures became standard practice …

    My own bozo cowboy rule of thumb? I hold that in the real world, you can gain maybe an order of magnitude by repeat measurements, but not much beyond that, absent special circumstances. This is because despite global efforts to kill him, Murphy still lives, and so no matter how much we’d like it to work out perfectly, errors won’t be normal, and biases won’t cancel, and crucial data will be missing, and a thermometer will be broken and the new one reads higher, and …

    Finally, I would back Steven Mosher to the hilt when he tells people to generate some pseudo-data, add some random numbers, and see what comes out. I find that actually giving things a try is often far better than profound and erudite discussion, no matter how learned.

    w.
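[Willis’s fourth point, together with his closing endorsement of pseudo-data experiments, can be tried directly. The sketch below uses Python with a made-up, skewed observer-error distribution; it shows the scatter shrinking with N while the mean settles on the bias rather than on the truth:]

```python
import random
import statistics

random.seed(0)

def observer_error():
    """Hypothetical skewed reading error: most observers read correctly,
    but the easiest mistake (rounding up to the next gradation) is three
    times more common than the opposite one."""
    r = random.random()
    if r < 0.60:
        return 0.0      # read correctly
    elif r < 0.90:
        return +0.5     # the easy mistake
    else:
        return -0.5     # the rare opposite mistake

true_temp = 20.0
readings = [true_temp + observer_error() for _ in range(100_000)]
mean = statistics.fmean(readings)

# The scatter shrinks with N, but the mean settles on the bias of the
# error distribution (about +0.10 C here), not on the true value.
print(round(mean - true_temp, 3))
```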

  122. It may seem off the subject, but there do seem to be a lot of senior experienced people reading this blog and commenting and hence may be interested in this.

So, for what it is worth, for all those people with high blood pressure: as I recall, many blood pressure cuffs can be certified as accurate with an acceptable error of plus or minus 20%.

    And this is from an Australian report: “Sphygmomanometers can pass validation tests despite producing clinically significant errors that can be greater than 15 mmHg in some individuals.” http://www.racgp.org.au/afp/200710/200710turner.pdf

  123. Finally, I would back Steven Mosher to the hilt when he tells people to generate some pseudo-data, add some random numbers, and see what comes out.

    I do that all the time. It is verification that when you do meet the i.i.d. requirements the CLT and LLN work. It does not, however, provide you with a warm fuzzy regarding errors that are drawn from distributions with differing statistics. If one type of error is, for example, drawn from a uniform distribution and another from a Gaussian distribution, you cannot expect the errors to cancel. Gaussianity is not required for these two theorems, btw (Gaussianity is actually the result of the CLT.)

The LLN works best when you are measuring the same thing over and over, but it is not required. Simply having the same statistics suffices (hehe, “simply” is hardly fair; it is very difficult to achieve). In the absence of identical statistics, your actual error (in an average) will be somewhere between the sqrt(N) you desire and the largest error in your data.

    Dave, you MUST have had at least some statistics training while you were getting your engineering degree. I had at least 3 or 4 classes just in my undergrad alone. Surely someone along the line explained this to you?

    Mark

  124. Mark T says:
    January 22, 2011 at 3:45 pm

    “You’re joking, right? YOU MADE THE CLAIM, you need to prove it, not me.”

    Two can play that game, Mark. You claimed I was wrong. You prove your claim.

    “Where on earth did you get this from? Who said it is worthless? I only noted that you cannot arbitrarily cancel errors, and I am correct in that statement.”

    There is nothing arbitrary in the way that instrument errors cancel out through redundancy.

  125. David A. Evans says:
    January 22, 2011 at 4:05 pm

    E.M.Smith, me and a few others have been arguing the same point. I gave up after a while as you can only bang your head against a wall so many times.

    Are you referring to the heat content issue? If so, I definitely agree. The average, though known to be simply a mathematical construct, loses all meaning when averaging different things. Any given temperature can be arrived at from different levels of heat. Why anyone ever argues it makes sense to average temperature has always been beyond me.

    Mark

  126. Dave Springer says:
    January 22, 2011 at 4:23 pm
    Mark T says:
    January 22, 2011 at 3:45 pm

    Two can play that game, Mark. You claimed I was wrong. You prove your claim.

    NO! Where on earth did you learn this? You made a claim and I pointed out what YOU have to prove for your claim to be true. You can’t just say “It’s true because I say so.” That’s nonsense. Meet the requirements or shut up.

    I have already given you the requirements which you have thus far failed to address. Do you even understand what i.i.d. means?

    There is nothing arbitrary in the way that instrument errors cancel out through redundancy.

    If they are uncorrelated and drawn from identical distributions, no, there is not. But you need to meet these requirements otherwise you cannot claim they cancel.

    So, Dave, are you going to answer any of my questions or are you just going to keep running in circles?

    Mark

  127. Mark T says:
    January 22, 2011 at 2:29 pm
The other thing that is instructive is to compare two thermometers that are within a few km of each other, over the period of say 100 years. Look at the correlation.
    98% plus.

    If both thermometers were affected by the same sort of physical process that was causing a degradation of accuracy over time, you would expect them to have highly correlated data, too.
    _____________________________________________________________

Please extend this analogy, first to 10’s of thermometer readings, then to 100’s of thermometer readings, and finally to 1000’s of thermometer readings, in terms of having identical systematic errors and in probabilistic terms (e. g. how likely are 1000’s of thermometers to have the exact same systematic errors AND show identical low frequency trendlines?). TIA
    _____________________________________________________________

Or you can write a simulation of a sensor with very gross errors. Simulate daily data for 100 years. Assume small errors; calculate the trend. Assume large errors; calculate the trend.

    If you’re drawing your “errors” using independent trials from the same distribution of course this will work. That is a trivial application of the CLT that proves nothing other than the fact that the CLT works if you meet all the requirements.

    In general, I don’t think anybody in here actually understands how the CLT or LLN work. A few came close. You do not need a normal distribution for the CLT to work. You need independent and identically distributed (i.i.d.) error distributions for errors to cancel with the sqrt(N). It is my hope that at some point everyone will figure out how much of a limit i.i.d. really is.
    _____________________________________________________________
And I don’t think anyone here understands how to extract a very real low frequency signature from “noisy” data. :(
    _____________________________________________________________

    Generally speaking, independence is not really required (independence is calculated over all time, which is not possible,) just orthogonality (uncorrelatedness) but the errors do need to be drawn from an identical distribution if you want the cancellation property to apply. That implies the same mean and variance, btw. The mean and variance need to exist, obviously, and they also should be stationary (unless they all vary identically over time,) which is not as obvious but easy to figure out. That also implies that if the errors are a function of the thing you’re measuring, e.g., a percentage, then the CLT will not apply. Sorry. Get over it. The same applies to situations in which the error distributions are unknown, which clearly applies to temperature measurements.

    Increased uncertainty in the data itself implies increased uncertainty in any calculations done with the data. If the i.i.d. requirement is not met, then you have no choice but to assume the errors do not cancel… anywhere. It sucks, I know, but them’s the breaks. Stay away from statistical endeavors if you cannot wrap your head around this very basic concept.

    Mark

    _____________________________________________________________
What the heck does LLT stand for?

Standard practice is to spell it out first, then put it in parentheses (e. g. per http://www.acronymgeek.com/LLT “Language Learning & Technology (LLT)”) for later reference(s).

    http://en.wikipedia.org/wiki/Random_errors

    “In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its “theoretical value”. The error of a sample is the deviation of the sample from the (unobservable) true function value; while the residual of a sample is the difference between the sample and the estimated function value.”

    Perfect Deconstructionist logic therefore dictates that nothing is knowable. Q.E.D.

  128. Mark T:
    Dave Springer wrote:

    “Thousands of people reading thousands of different thermometers for hundreds of years won’t give you the confidence to say it was 70.2 degrees +-1 degree on April 4th, 1880 in Possum Trot, Kentucky but it will allow you to say the average temperature for April in Kentucky in 1880 was 0.5 degrees +-0.1 degrees cooler in 1880 than it was 1980.”

    And you replied:

    “No, they won’t, not unless you can prove the errors are drawn from independent and identically distributed distributions.”

If I understand you both correctly, I’m afraid I must agree with Dave Springer. Random errors being, well, random, multiple measurements will cancel them out. And when you are measuring changes in temperature, systematic errors will also cancel out. Take a thermometer that reads 1 degree high. If in 1950 it read 72 F, it was actually 71 F; if in 2000 it read 74 F, it was actually 73 F. But the rise is 2 F regardless of whether you correct for the true temperature or not.

And temperatures from proxy measurements such as tree ring growth can be used to check past thermometer readings, or even infer measurements where no measurements were taken.
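[Philip’s arithmetic does hold for a truly constant offset; the disagreement downthread is over biases that change with time. A sketch with purely illustrative numbers, not from any real station:]

```python
# A fixed offset drops out of a temperature difference; a drift does not.

true_1950, true_2000 = 71.0, 73.0     # actual temperatures, deg F
true_rise = true_2000 - true_1950     # 2.0 F

# Case 1: thermometer reads a constant 1 F high in both years.
offset = 1.0
rise_constant = (true_2000 + offset) - (true_1950 + offset)

# Case 2: calibration drifts, e.g. 0.2 F high in 1950, 1.0 F high in 2000.
rise_drift = (true_2000 + 1.0) - (true_1950 + 0.2)

print(rise_constant)           # the 2.0 F rise survives a constant offset
print(round(rise_drift, 1))    # a drifting bias inflates the apparent rise
```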

  129. From Wikipedia:

    Two different versions of the Law of Large Numbers are described below; they are called the Strong Law of Large Numbers, and the Weak Law of Large Numbers. Both versions of the law state that – with virtual certainty – the sample average converges to the expected value where X1, X2, … is an infinite sequence of i.i.d. random variables with finite expected value E(X1) = E(X2) = … = µ < ∞.

    Bold mine. The definition of i.i.d. is:

    In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent.

    This is taught in engineering school. It is required, btw. You should know this, Dave. Why don’t you?

    Mark

  130. island says:
    January 22, 2011 at 4:16 pm

    So for what it is worth for all those people with high blood pressure: certification of many blood pressure cuffs has an acceptable error of plus or minus 20% to be certified accurate as I recall.

    And this is from an Australian report: “Sphygmomanometers can pass validation tests despite producing clinically significant errors that can be greater than 15 mmHg in some individuals.”

I don’t think I’ve seen the term sphygmomanometer since I took “Human Anatomy and Physiology” 30-some years ago in college. That was my favorite class of all time. It was a required class for registered nurses and we got to play with all the standard medical office instruments in the lab portion. What made the class so great was there were, counting the instructor, only two guys in a class of 30 people. The other 28 were 18-20 year-old girls. Yowzah. That’s a good ratio and unlike the instructor, I was under no professional constraint to avoid intimate contact with the students if you get my drift.

    Anyhow, to the best of my recall, we weren’t taught that blood pressure cuffs could be wrong by that much. Of course we didn’t have those automatic jobbies you find in grocery stores and whatnot. Just the regular cuff inflated by a hand squeezed pump and a stethoscope to listen for diastolic and systolic pulse-pressure points. The operator could easily be wrong by 15 points but the instruments could rarely if ever be blamed for that much error. In fact at my last checkup the gal who took my blood pressure before the doctor came in got mine wrong by 20 points. I told her she must be mistaken and she said she’d ask the doctor to take it again. Sure enough when my sawbones took it a few minutes later it was 20 points lower.

  131. Philip Shehan says:
    January 22, 2011 at 4:42 pm

    If I understand you both correctly, I’m afraid I must agree with Dave Springer.

    Because you don’t know what you’re talking about, either.

    Random errors being, well random, multiple measurements will cancel them out.

Did you bother to read the definition of the LLN before you made this post? Seriously, it’s online and I have now posted the definition as well as the requirements. This is getting more and more bizarre with every post. Are you the same sort of person that thinks this way, too:

    Two can play that game, Mark. You claimed I was wrong. You prove your claim.

    Of course, I did prove my claim, though I did not need to: I provided the definition of the LLN.

    And when you are measuring changes in temperature, systematic errors will also cancel out. Take a thermometer than reads 1 degree high. If in 1950 it read 72 F it was actually 71 F. In 2000 it read 74 F it was actually 73F. But the rise is 2F regardless of whether you you correct for the true temperature or not.

    Wow. Ignorance is contagious. That’s all I can say regarding this thread.

    Mark

    And tempratures from proxy measurements such as tree ring growth can be used to check past thermometer readings, or even infer measurements where no measurements were taken.

  132. The last bit did not get edited off properly, but it is a rather silly statement. I think I understand why you agree with Dave Springer. Tree rings would be a useful proxy if temperature was the only thing affecting their growth, but sadly, they are actually driven more by water.

    Mark

  133. Dave Springer says:
    January 22, 2011 at 4:10 pm
    =============
    I call B.S. on this entire entry, and therefore all your recent comments.
    From one pilot, to “another”.

  134. Mark T says:
    January 22, 2011 at 4:43 pm

    “This is taught in engineering school. It is required, btw. You should know this, Dave. Why don’t you?”

    What you’re taught in school and what you learn in the real world are often two different things. That’s why you don’t step out of school into a senior engineering position. I have a pretty good idea of why you don’t know that, Mark.

The thermometers used in the instrument record came from dozens of different manufacturers using different manufacturing methods and different technologies (alcohol vs. mercury for instance) with various sources of error from each due to quality control and whatnot. Over the course of millions of readings recorded by thousands of people the errors, unless they are systematic in nature, will cancel out. The wikipedia article you quoted in effect states just that. I haven’t seen anyone come up with a description of systematic error that would produce gradually rising temperatures in this scenario. No systematic error, random error cancels out, only data of concern is change in temperature over time rather than exact temperature at any one time, ergo there is nothing wrong with the raw data. I won’t say the same for the “adjustments” made to the raw data though. That’s what’s called pencil-whipping and is frowned upon (to say the least) in my circles.

  135. Mark T says:
    January 22, 2011 at 4:28 pm

    Correct, I’m referring to energy content. That’s what is really being argued.
    OEC is a difficult one because the energy density of water requires more accurate temperature measurement but the temperature is at least more linearly related to energy.

    DaveE.

  136. If the trend does not exist beyond the margin of error then the trend does not exist at all.

You cannot measure a 0.7º C average trend if your measuring equipment is, on average, only accurate to ±1.3º C.

    But you can certainly claim that you can and no one will be able to ‘prove’ otherwise.

This is why the claimed warming is so small. If it were outside the margin of error and it was indeed fraudulent, proving that it was a fraud would be a mere formality.

    The irony of playing it this safe is that 0.7º C in 100+ years does not equate to an anomalous warming event. Particularly if the margin of error is 1.3º C.

    In fact it is comforting to know that even after the entire industrial revolution, including the recent and ongoing industrialisation of India and China, the so called “global warming signal” is indistinguishable from the noise of the method of data collection.

    I am therefore satisfied that the AGW hoax has been exposed for the fraud that it is.

  137. Willis Eschenbach says:

    “The short maximum length of the wiring connecting the electronic sensors introduced in the late 20th century moved a host of Stevenson Screens much closer to inhabited structures. As Anthony’s study showed, this has had an effect on trends that I think is still not properly accounted for, and certainly wasn’t expected at the time.”

If I may add something: in electronic thermometers using thermocouples [which most do], the sensor is a pair of welded wires of dissimilar metals with a voltage output that changes with temperature, but the welded bead does not provide 100% of the voltage output. The wires themselves have a declining effect according to their length from the weld. Therefore, it matters how deep into the oven, or furnace, or Stevenson screen the thermocouple is placed. This is in addition to the length of the wires as noted by Willis.

Also, routine, periodic calibration is essential. The output of thermocouples is in the millivolt/microvolt range. A voltmeter reads the output. Voltmeters tend to drift over time, and there is also hysteresis: when a thermocouple is heated, then allowed to return to ambient, the output is slightly different each time at the same ambient temperature point.

    In my experience [and I’ve calibrated thousands of temperature devices to NIST primary and secondary standards], a well constructed mercury thermometer is usually superior to an electronic thermometer. That is why electronic thermometers almost always have a shorter calibration recall period than mercury thermometers.

  138. Dave Springer said (amongst other things) “…….Over the course of millions of readings recorded by thousands of people the errors, unless they are systematic in nature, will cancel out….”
    ———————
    I’m beginning to warm to this notion; two wrongs don’t make a right but millions of wrongs do.
    I’ve been trying to sell this to my wife but she’s not crazy about the smug pomposity that my new belief engenders.

  139. Dave Springer says:
    January 22, 2011 at 4:58 pm

    What you’re taught in school and what you learn in the real world are often two different things. That’s why you don’t step out of school into a senior engineering position.

    Hehe. In other words, you don’t understand the theory, do you? I do. I also put it in practice. Make some of these claims to your signal processing buddies. They will laugh at you a bit before explaining that the theory of the LLN works in the real world, too.

    I have a pretty good idea of why you don’t know that, Mark.

    Really… so what would that be then, Dave? You seem hung up on authority… go check my background at the Air Vent (there is a thread.) Certainly there are enough clues out there to glean that I’m not just some joe blow that spends his life in academia pondering philosophical things. If one was astute enough to pick it up, that is.

    Could it be related to the fact that you have continually avoided every question I’ve asked? Afraid of admitting that you really don’t understand the theory that you are applying? You are applying the theory, even if you don’t know how or why.

    The thermometers used in the instrument record came from dozens of different manufacturers using different manufacturing methods and different technologies (alcohol vs. mercury for instance) with various sources of error from each due to quality control and whatnot.

    So, in other words, they do NOT have identical error distributions? You just proved my point, Dave. Thanks.

    Over the course of millions of readings recorded by thousands of people the errors, unless they are systematic in nature, will cancel out.

    If they meet the requirements I already laid out above, sure.

    The wikipedia article you quoted in effect states just that.

    I know exactly how it works, I use it in my job extracting signals from noise. It is a very powerful law when used properly. The article says the errors need to be i.i.d. Did you conveniently avoid that part? You just admitted the errors aren’t i.i.d. above, so are you now changing your tune? Wikipedia is right but no, it’s not right, wait, it’s right.

    I haven’t seen anyone come up with a description of systematic error that would produce gradually rising temperatures in this scenario.

    Show me where I ever said anything even remotely close to this. Really… I dare you.

    No systematic error, random error cancels out, only data of concern is change in temperature over time rather than exact temperature at any one time ergo there is nothing wrong with the raw data.

    If you can prove the random errors are a) independent and b) identically distributed, then I will believe you. Until then, sorry, but you are wrong.

    Mark

    PS: it is easy to test this, btw. Generate a bunch of random data using different means, variances, and distributions then average them together and look at your variance. It will be somewhere between the maximum and minimum and will not decrease asymptotically. If you have MATLAB, play around with rand and randn.
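[For anyone without MATLAB, here is a rough Python equivalent of the rand/randn experiment Mark describes; the three error distributions are arbitrary illustrative choices:]

```python
import random
import statistics

random.seed(1)

N = 2_000     # "instruments" averaged per trial
TRIALS = 300

def mixed_error():
    """Errors drawn from distributions with different means, variances,
    and shapes (arbitrary illustrative choices)."""
    kind = random.randrange(3)
    if kind == 0:
        return random.gauss(0.3, 0.2)      # biased, tight Gaussian
    elif kind == 1:
        return random.uniform(-1.0, 1.0)   # unbiased, wide uniform
    return random.uniform(0.0, 0.5)        # one-sided uniform

trial_means = [statistics.fmean(mixed_error() for _ in range(N))
               for _ in range(TRIALS)]

# The average settles down, but onto the mixture's overall bias
# ((0.3 + 0.0 + 0.25) / 3, about 0.18), not onto zero.
print(round(statistics.fmean(trial_means), 3))
```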

  140. Mark Sonter says:
    January 22, 2011 at 5:40 am

“…My memory from running a university metsite 40 yrs ago (University of Papua New Guinea, Port Moresby), is that the ‘almost horizontal’ orientation commented on and queried by a couple of posters, is because the max and min thermometers have little indicator rods inside the bore, which get pushed up to max by the meniscus and stick there (Hg); and pulled down to the min and stick there (alcohol). The weather observer resets them by turning to the vertical (upright for the Hg (max), and the rod slides back down to again touch the meniscus); or upside down, for the alcohol (min), whereupon the rod slides back up, inside the liquid, till it touches the inside of the meniscus. Both are then replaced on their near-horizontal stands…”

So wait – there’s a small movable rod in the bore of a thermometer. If the bore is not even, they would necessarily have to make the rod just a tiny bit smaller than the bore of the thermometer.

Yet this glass “hardening” would reduce the size of that bore. There’s a possibility of the rod sticking, giving the same reading for a certain number of days (till the observer, or someone, suspects the thermometer is faulty and changes it out).

  141. Oliver Ramsay says:
    January 22, 2011 at 5:36 pm

    I’m beginning to warm to this notion; two wrongs don’t make a right but millions of wrongs do.

    I hope you are just being sarcastic. If not… sigh.

    Mark

  142. David A. Evans says:
    January 22, 2011 at 5:01 pm

    Correct, I’m referring to energy content. That’s what is really being argued. OEC is a difficult one because the energy density of water requires more accurate temperature measurement but the temperature is at least more linearly related to energy.

Indeed. Pielke Sr. argues this regularly. I’ve always wondered why they use this metric instead of one that actually makes physical sense. Hell, OEC isn’t even an average, it’s a measure of total energy in the oceans. If that is changing then certainly something is happening.

    Mark

  143. u.k.(us) says:
    January 22, 2011 at 4:55 pm

    Dave Springer says:
    January 22, 2011 at 4:10 pm
    =============
    “I call B.S. on this entire entry, and therefore all your recent comments.
    From one pilot, to “another”.”

    Nope. True story. I only flew Cessna 172’s. At takeoff and cruise I thought wow, this thing must have a more powerful engine in it because it “accelerated” faster on takeoff and cruise speed was substantially faster than I’d seen in any other C172’s for the throttle setting. Gimme a break. That was like my second cross-country solo and I had maybe 30 hours in the left seat total. Takeoff and cruise speeds usually aren’t that critical as you’re trimmed for takeoff and let the plane lift off of its own accord with full throttle and except for navigation, which in this case was visual by landmark during the day, it doesn’t much matter if cruise speed is 110 or 130 knots. My landmarks came up a minute or two late. No big deal. Being on a student cross country solo I wasn’t practicing stalls or doing any low speed high bank angle turns. Except for landings of course starting with turn onto short final. That’s when I got worried and that’s where I split the difference between what felt like the right speed and what the airspeed indicator was reading. I pretty much figured out there was something amiss with the air speed indicator and how the hell I failed to notice it was MPH instead of knots is both embarrassing and scary to this day 20 years later. If I’d been flying by instruments at night that mistake might have ended up with a crash, die, and burn or worse a crash, burn, and die (as my instructor said the order is critical – you really want to die BEFORE you burn up).

The following link discusses when and where and how often you find an MPH indicator in small Cessnas. I didn’t know this before now but MPH was standard in Cessnas manufactured before 1974 and knots were standard after that. The year this happened to me was 1991. The rental fleet at the small airport I was flying out of must have had just one oddball in the group that was pre-1974. My instructor was only about 25 years old and was still in the Corps. He was moonlighting as an instructor accumulating hours required for a commercial license so it wasn’t like that was his regular job nor had he been doing it for a long time. Corona, CA municipal airport was where this took place. The solo was cross country through the desert landing at about 3 or 4 small uncontrolled runways. Corona is about 25 miles from USMCAS El Toro where he was stationed then and where I was stationed from 1975-1978. I lived in Corona at the time so it was convenient. All the other convenient airports nearby were big ones and you’d waste a lot of your expensive rental time waiting and taxiing instead of flying. At Corona once you started rolling the runway was 50 yards away and seldom anyone in front of you.

  144. The issue of whether a measurement of +/-0.5 can be used to construct a mean value with a much smaller “error” depends on what you mean by “error”.

    In statistics it is usual to interpret this as a 95 or 99% confidence interval. As others have said, this varies with 1/sqrt(N) so with enough observations N you can narrow the confidence interval to well below the error in the original measurement. But why choose (say) a 99% CI rather than a 100% CI? Because for most statistical distributions the 100% CI is +/-infinity. But if you could have a 100% CI you would prefer it over a 99% CI, no?

Well, in the case of a uniform distribution, as with reading a thermometer, you can have a 100% CI and it’s a piece of cake to determine it. For readings R1±0.5 and R2±0.5 the mean is (R1+R2)/2 ± (0.5+0.5)/2; i.e. ±0.5. Proceeding in this way with more observations it’s easy to see that the 100% confidence interval is the same as the “error” in the original observations.

    So take your pick: 99% or 100%.
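    The distinction drawn above is easy to check numerically. The following is a sketch, with assumed numbers (readings carrying a uniform ±0.5 error, 20,000 trials): the ~99% interval of the mean error narrows roughly as 1/sqrt(N), while the worst-case (100%) bound on the mean error never exceeds the original ±0.5.

```python
import random

def simulate(n_readings, n_trials=20000, half_width=0.5):
    """Average n_readings, each with a uniform error in [-0.5, +0.5],
    and return the empirical ~99% interval and worst case of the mean error."""
    means = []
    for _ in range(n_trials):
        errs = [random.uniform(-half_width, half_width) for _ in range(n_readings)]
        means.append(sum(errs) / n_readings)
    means.sort()
    ci99 = (means[int(0.005 * n_trials)], means[int(0.995 * n_trials)])
    worst = max(abs(m) for m in means)
    return ci99, worst

for n in (1, 10, 100):
    ci99, worst = simulate(n)
    # The 99% interval narrows ~1/sqrt(n); the worst case stays within 0.5.
    print(n, ci99, worst)
```

Both readings of “error” are visible in the output: the statistical interval shrinks with more observations, but the bound that holds with certainty does not.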

  145. Mark said “I hope you are just being sarcastic. If not… sigh.”

    —-
    No, I’m the one doing the sighing. I was aiming for irony but somehow hit ambiguity, apparently completely missing wit on the way.
    On the bright side: that’s another screw-up, so I’m getting closer to perfection.

  146. Dave Springer says:
    January 22, 2011 at 5:49 pm
    ==============
    I reiterate that it is B.S. that, as a student pilot, you felt things the test pilots missed.

  147. Jeff Alberts: yes and no. That’s not really an error in the sense of the point. The increase is real. It is, however, a bias increasing with time.

    Davidc: you are only considering the recording error as a result of the minimum gradations. That is not the whole of the error distribution.

    Mark

  148. Thanks for posting Mark’s work. Mark, take a break from arguing with idiots; the work stands pretty well on its own, and many of the assumptions stipulated before arguing against your work negate the argument offered. Essentially you say the error range of instruments is site- and device-specific; having calibrated control-system devices, I understand that well enough. I agree the failure to list the device and its error range is an indication of sloppy work. Also, slow down in your responses, as you are abbreviating to the limits of my comprehension. Thanks for the posting and glad you made it to WUWT.

  149. u.k.(us) says:
    January 22, 2011 at 4:55 pm

    Dave Springer says:
    January 22, 2011 at 4:10 pm
    =============
    I call B.S. on this entire entry, and therefore all your recent comments.
    From one pilot, to “another”.

    I love google. Evidently it’s not too uncommon an experience with Cessna ASIs calibrated in MPH vs. knots, causing scary problems for the student pilot who happens to rent one of the former:

    http://answers.yahoo.com/question/index?qid=20070923203647AAqLOZ3

    Back in the mid to late ‘70s, Cessna quit installing airspeed indicators calibrated in mph and started using ones with kts in their C-150s and 172s. Occasionally it would confuse the solo student if there was a mixed fleet, but they soon learned to pay attention to detail after they scared themselves sufficiently.

    I’d call it unnerving but not particularly scary. About the same as when my instructor sneakily turned the fuel selector to OFF, the engine quit, and instead of figuring out my fuel was cut off I set up for no power landing on a dirt road in one of our regular remote practice areas. He let me get lined up on it and didn’t turn the fuel back on until we were down to about 100 feet off the ground. I thought it was a for-real failure but remained quite calm. I got ripped a new arse for that too. But hey, at least I did good on setting up for the emergency landing. That practice area was also where we did spin training. If you’ve never been in a spin (optional with some instructors) in a small plane it’s quite something. One second you’re angled up into the sky with the stall horn screaming at about 35 knots and the next second you’re pointed straight at the ground spinning and accelerating like a bat out of hell. Like an elevator where the cable just broke only worse.

  150. Mark:

    “Davidc: you are only considering the recording error as a result of the minimum gradations. That is not the whole of the error distribution.”

    Yes, but if you took this interpretation the “error” could not be less than the individual measurement error.

  151. After looking over the discussions for the types of thermometers used, techniques….

    there ARE some fairly easy ways to get a much better trend signal out of the error noises to see if it really exists.

    The primary cause, by far, of drift in temperature measurement systems is due to the thermal cycling over time of the thermometer system. This includes both mercury thermometers and all electronic thermometers. A mercury thermometer is very stable over time if it is not thermally cycled very much. Same with electronic thermometers, as both the sensor tips and the electronic components, resistors, etc decay with thermal cycling over time. (Basically that is how automotive components are lifetime tested…by lots of thermal cycling to simulate thermal cycling over the lifetime desired of the component.)

    So, if you concentrate on historical temp data from land on small islands only, you will have a much better shot at seeing whether a true trend exists, and its magnitude. The reason is that temperatures on islands are dominated by the surrounding seas that greatly reduce the temperature swings over time.

    For example, I live near Miami. The normal yearly air temp delta T is only about 25 F. Daily delta T is normally only about 10 F or so. In other words, our environment here is very much dominated by the surrounding waters. Thus, the thermal cycling stress on the temperature systems is extremely low compared to inland systems.

    Inland systems are thermally cycled with a yearly delta T more in the 50 F to 60 F range, and a delta T of 20 F or more range on a daily basis.

    So, the small island thermometers are much more stable over time.

    Because they are more stable, it reduces the need for delving into calibration history errors, etc.

    It may also serve as a good proxy for ocean temperature changes, as the air temperatures on small islands are completely dominated by the surrounding waters.

  152. http://en.wikipedia.org/wiki/Iid

    “In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent.”

    AFAIK all distributions of errors in temperature measurements are Gaussian with zero mean. Therefore, all Gaussian distributions have the same probability distribution (e. g. Gaussian).

    Each station’s temperature measurements are independent from all other stations; if they were not, then the two stations would, in fact, be the same station.

    In fact, no two instruments will ever have the exact same “identically distributed” distribution (exact to infinite precision), not even remotely possible, again only possible if, in fact, the two instruments were one and the same instrument.

    But wait, there’s more;

    “The abbreviation i.i.d. is particularly common in statistics (often as iid, sometimes written IID), where observations in a sample are often assumed to be (more-or-less) i.i.d. for the purposes of statistical inference. The assumption (or requirement) that observations be i.i.d. tends to simplify the underlying mathematics of many statistical methods: see mathematical statistics and statistical theory. However, in practical applications of statistical modeling the assumption may or may not be realistic. The generalization of exchangeable random variables is often sufficient and more easily met.”

    “The assumption is important in the classical form of the central limit theorem, which states that the probability distribution of the sum (or average) of i.i.d. variables with finite variance approaches a normal distribution.”

    Wait a minute, did the above just say “assumption” and “finite variance” and “normal distribution”? Why, yes it did.

    Well then, so much for that ole’ nasty IID requirement; it’s been met head on and found to be not so necessary if need be. However, as the specious uncertainty paper claims, all sigmas have zero means, as all instances are shown as +/-, meaning symmetric about a zero mean.

    If it looks Gaussian, if it feels Gaussian, if it smells Gaussian, if it sounds Gaussian, and if it tastes Gaussian, then it must be Gaussian! Q.E.D.

    Now if you want to talk about instrument bias offsets (that vary randomly with time, as they invariably do), I’m game.

  153. Dave Springer says:
    January 22, 2011 at 7:03 pm

    My only time in a light aircraft was a jolly in a DH Chipmunk. I insisted on a spin, loop wingover & Immelmann. Loved it! I flew back to the airfield & commented on turbulence. Instructor said PIO. I think he was surprised when I told him I wasn’t moving the stick & was making corrections on trim. I was making no corrections for turbulence.

    DaveE.

  154. u.k.(us) says:
    January 22, 2011 at 6:23 pm
    Dave Springer says:
    January 22, 2011 at 5:49 pm
    ==============
    I reiterate that it is B.S. that, as a student pilot, you felt things the test pilots missed.

    We’re going to get cut off for off-topic but I don’t understand that at all. I noticed something wrong on the first takeoff. The plane was trimmed for takeoff and it lifted off the runway at the same point all the other C172’s I’d flown. Takeoffs are all the same – full throttle, elevator trimmed for takeoff, and let it lift off by itself without pulling back on the yoke. The indicated speed was noticeably higher. Climbout works the same way – full throttle and enough elevator to get a desired rate of climb. Again I noticed the indicated airspeed was noticeably higher for the rate of climb. At that point I thought the engine might have been more powerful. Then my indicated cruise speed was noticeably higher for 2/3 throttle yet engine RPM was the same. At that point I concluded I had a funky airspeed indicator that was reading high and mentally adjusted downward for critical landing speeds. Engine RPM was the final giveaway. If you are accustomed to the aircraft the instruments all connect together in predictable fashion. Throttle vs. RPM vs. rate of climb vs. pitch vs. indicated airspeed. All the instruments had what I expected to see except for the ASI being out of whack with the others. I’m still perplexed as to how I could have possibly failed to notice the ASI had MPH on it instead of KNOTS. It’s rather small print on the face of the dial but still… it’s just an example of how the brain works. Things that always stay the same and aren’t important become “invisible”. In this case it turned out to be something important but it remained outside my notice nonetheless, and I didn’t find out my mistake until I was back on the ground. Instead I figured out approximately how much the ASI was off the mark and compensated for it by subtracting five to ten knots from the indicated speed.

  155. Jeff Alberts says:
    January 22, 2011 at 5:53 pm

    “So, would gradual, but constant, encroachment of urbanization upon the site be considered systematic?”

    Not from the thermometer’s point of view. That’s a systematic error in interpretation of the readings, not a faulty reading, i.e. blaming the increase on CO2 instead of urban heat. The data is fine; it’s the explanation of the data that’s wrong.

  156. Here is an error that may be introduced into the data. I think I once saw it posted as an article.

    Say you have a weather station and over time the paint deteriorates exposing the wood. Assuming the wood has a higher emissivity than the paint, over time the temperature of the station increases during the day and decreases at night. Note they don’t quite move by the same amount. Day is a wee bit hotter than night is colder. This would increase the spread of the day/night data and increase the average of the two (if the day data shift up is hotter). Now let’s say one weekend you go out and paint the station and the temperature shifts 0.3C. Back at the lab you look at the shift and say, “Oh, that isn’t right, we need to make them the same,” so you “adjust” the old data downward or the new data upward. Voilà! One now has a temperature station reporting warming over decades instead of a station trending up until it’s painted, shifting discretely down, then trending back up.

  157. Mark T says:
    January 22, 2011 at 5:38 pm

    “Could it be related to the fact that you have continually avoided every question I’ve asked? Afraid of admitting that you really don’t understand the theory that you are applying? You are applying the theory, even if you don’t know how or why.”

    No Mark. It’s related to the fact that when 10,000 computers a day (no exaggeration; I was senior R&D in Dell laptops 1993-2000, after which I retired because I never needed to work for a living again) that you designed are supposed to be rolling off the production line and an unacceptable number of them are failing some bit of manufacturing test which results in the line being shut down and millions of dollars per day are going out the window, you don’t sit on your ass making bar graphs with error bars to figure out what’s gone wrong. Well, maybe academics do that. Someone with decades of experience uses practical knowledge and uses it quickly, or starts thinking about whether his resume is up to date.

    Instead of being obtuse why don’t you use your real name and state what your experience actually is instead of giving me a trail of breadcrumbs to follow to figure it out? What’s up with that? What are YOU hiding?

  158. Dave Springer says:

    “…you don’t sit on your ass making bar graphs with error bars to figure out what’s gone wrong.”

    …says the unemployed dilettante who spends his time defending the indefensible.☺

  159. Dave Springer says:
    January 22, 2011 at 1:52 pm

    Well, thanks, but that didn’t address my point. Ya think I believe in CAGW? Nah.

  160. Mark T.

    Liquid in glass thermometers are analog instruments not digital.

    You remind me of the guys that resort to quantum mechanics to explain how CO2 works (or doesn’t work if that’s your belief) to raise surface temperatures. QM gets so complicated that even the experts disagree. That’s unfortunate because classical mechanics is all that’s needed to explain it and it’s much simpler – so simple that they knew how it worked 200 years ago and John Tyndall proved it experimentally 150 years ago.

    Don’t make this overly complicated. The instrumental record in and of itself is adequate for the use to which it’s being put. You of course are quite welcome to try proving otherwise, but if I could bet, I’d bet against you ever becoming famous for succeeding in that proof – you’ll just keep spinning your wheels and going nowhere.

  161. Nothing can be done to improve the accuracy of historical data (other than getting it out of the Hockey Team’s control). It is possible, however, to make more accurate measurements going forward. Check out the Analog Devices AN-0970 application note, RTD Interfacing and Linearization Using an ADuC706x Microcontroller. Its first paragraph reads:

    The platinum resistance temperature detector (RTD) is one of the most accurate sensors available for measuring temperature within the –200°C to +850°C range. The RTD is capable of achieving a calibrated accuracy of ±0.02°C or better. Obtaining the greatest degree of accuracy, however, requires precise signal conditioning, analog-to-digital conversion, linearization, and calibration.

    (Search the http://www.analog.com website for “AN-0970” to find the application note.)

    Combine a chip optimized for making temperature measurements with a platinum RTD and other components to create a system that can measure temperature accurately and communicate with portable calibration equipment. Create a portable calibration device with a calibration head designed to slide over the temperature probe to create a controlled environment for running the tests. At the risk of further exacerbating Continuous Anthropocentric Global Whining (CAGW), include a CO2 cartridge to allow cooling of the temperature probe and a heater to allow calibration over the expected temperature range.

    Periodically calibrate the calibration device to a NIST-traceable field standard. At the weather station calibration interval, drive around to the various sites. Slide the calibration head over the temperature probe, connect a communication cable to the temperature measurement electronics (unless it communicates via RF or infrared, in which case this step would be unnecessary), and let it run a complete characterization and recalibration. The characterization that’s run prior to recalibration will allow tracking of thermometer drift by serial number.

    The thermometer systems wouldn’t have to be overwhelmingly expensive (the government has spent more for toilet seats) and the readings would be at least an order of magnitude more accurate than provided by current measurement methods. Calibration could be done by someone with minimal training. Properly designed, the calibration unit would insist on being recalibrated at appropriate intervals, and individual temperature measurement units would report the time from last calibration along with the data.

    Then all we’ll have to do is wait 150 years to accumulate an instrumental temperature record equivalent in length to the one that’s relied on today.
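    To illustrate the kind of conversion and linearization the application note describes, here is a minimal sketch using the standard Callendar–Van Dusen coefficients for a Pt100 (the published IEC 60751 values for alpha = 0.00385, valid for T ≥ 0 °C); the coefficients are the standard ones and are not taken from the AN-0970 note itself:

```python
import math

# Standard IEC 60751 / Callendar-Van Dusen coefficients for a Pt100
# (alpha = 0.00385); published standard values, assumed here.
R0 = 100.0       # resistance at 0 C, ohms
A = 3.9083e-3
B = -5.775e-7

def pt100_resistance(t_c):
    """R(T) = R0 * (1 + A*T + B*T^2), valid for T >= 0 C."""
    return R0 * (1.0 + A * t_c + B * t_c * t_c)

def pt100_temperature(r_ohm):
    """Invert R(T) with the quadratic formula (T >= 0 C branch)."""
    return (-A + math.sqrt(A * A - 4.0 * B * (1.0 - r_ohm / R0))) / (2.0 * B)

r = pt100_resistance(100.0)
print(round(r, 2))                     # 138.51 ohms at 100 C
print(round(pt100_temperature(r), 6))  # recovers 100.0 C
```

In a real instrument the hard part is, as the note says, the signal conditioning and ADC ahead of this arithmetic; the linearization itself is just the inversion above.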

  162. Smokey says:
    January 22, 2011 at 8:27 pm

    “…says the unemployed dilettante who spends his time defending the indefensible.☺”

    As opposed to say Michael Mann the employed expert who spends his time manufacturing phoney statistics.

    I’ll take that as a compliment. Thanks.

  163. The insane thing about the entire argument over the distribution of the error in the surface stations is:

    This is something that should actually be directly determinable with a cross calibration with satellites from 1978-now.

    Not a correlation “Global mean surface temperature vs. Global mean satellite temperature”, but each individual surface stations vs. the satellite estimate for that same lat/long.

  164. As a mechanical engineer, whose 38-year career has been in forensic engineering and metrology (not meteorology), I can truly say I’ve enjoyed reading this thread more than any other I’ve followed in recent times.

    At last, someone has pin-pointed the climate change subject [measurement uncertainty] on which I have often been tempted to comment.

    I have usually hesitated to raise this issue because I believed that what has developed here would happen: some ‘highly qualified’ people have ended up arguing somewhat emotionally, without being clear about whether they are arguing about ‘a fact’ or ‘an opinion’.

    In the above bruising “MARK T vs DAVE SPRINGER” bout, which has now been through several gruelling rounds, I score Mark T a hands-down winner on his main points. Dave fought bravely, but not well. Too many ‘low blows’ and very poor technique contributed to his ultimate defeat.

    Dave did not need to mention his experience or expertise (i.e. his self-perceived level of ‘authority’) if his factual point was correct and defensible. Mark’s mathematical counter-punches stood up very well against the ham-fisted hay-makers of Dave.

    For my own contribution to the debate, I would simply remind readers of Ross McKitrick’s brilliant contribution: he (a statistician) actually reminded me (an engineer) that temperature is an intensive variable, whose average is NOT a physical variable, which is something I had ‘learned at school’ but ‘forgotten in my wider world-experience’; the average of two temperatures is not the temperature of anything; it is not even a temperature, it is a statistical concept.

    To Mark: ‘Well done’.
    To Dave: ‘Study a replay of the bout and pick up some very good boxing tips.’

  165. Hoser says:
    January 22, 2011 at 8:30 pm

    “Well, thanks, but that didn’t address my point. Ya think I believe in CAGW? Nah.”

    Sorry. I certainly didn’t mean to imply you were among the CAGW faithful.

    I didn’t address the point because I didn’t disagree with it. I don’t believe it’s practically possible to figure out the age and manufacturer of every thermometer used over the last 100 years to make an entry that was swept up into a global database. I think it can be safely said it’s a mixed lot of different brands, quality, and ages. But that’s not a bad thing. If you go down to Walmart and buy one of every different thermometer they carry (probably 50 different ones) and set them all in one place and average the readings, you’ll get an accurate precision temperature as a result, better than the resolution and accuracy of any individual instrument in the lot (barring having ones that are obviously broken and reading so far off the mark you know it can’t be right). I presume the global network of thermometers over the past 100 years is pretty much just like that.

  166. Mark T,
    With regard to your response to my post.

    Your remarks “Because you don’t know what you’re talking about, either” and “Wow. Ignorance is contagious. That’s all I can say regarding this thread” are, alas, typical of the level of civility that is unfortunately too common on these sites.

    I have a PhD and over 30 years research experience so I actually do know something about experimental errors. In this regard I am unfamiliar with the tactic of personal abuse substituting for reasoned argument.

    If you would be so kind, would you explain in simple terms what is wrong with this analysis.

    “And when you are measuring changes in temperature, systematic errors will also cancel out. Take a thermometer that reads 1 degree high. If in 1950 it read 72 F, it was actually 71 F. If in 2000 it read 74 F, it was actually 73 F. But the rise is 2 F regardless of whether you correct for the true temperature or not.”
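    The quoted arithmetic is easy to verify, and easy to extend to show where it breaks down: a constant bias cancels exactly in a temperature difference, but a bias that drifts over time (the hardening-glass problem discussed in the post) does not. A toy sketch, with the comment's numbers plus an assumed 0.5 F of drift:

```python
def trend(read_start, read_end):
    """Temperature change between two readings."""
    return read_end - read_start

true_1950, true_2000 = 71.0, 73.0   # actual temperatures, F

# Constant +1 F bias: cancels exactly in the difference.
bias = 1.0
assert trend(true_1950 + bias, true_2000 + bias) == trend(true_1950, true_2000)

# A bias that drifts over the decades (say +0.5 F by 2000) does NOT cancel:
biased_trend = trend(true_1950 + 0.0, true_2000 + 0.5)
print(biased_trend)   # 2.5 F, versus the true rise of 2.0 F
```

So the analysis is correct as far as it goes; it holds only for biases that are constant over the whole record.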

  167. If you have not already read this, it’s highly relevant:

    http://kenskingdom.wordpress.com/2010/11/22/what%e2%80%99s-the-temperature-today/

    It would be interesting to obtain the code used to extract daily maxima and minima from records that are collected many times a day: and compare them with Hg max and min thermometers.

    It is even conceivable that the lack of global warming in the past decade can largely be explained by this type of difference. (Except that it seems to affect satellites too).

    Thanks, Ken Stewart.

  168. EFS_Junior says:
    January 22, 2011 at 7:30 pm

    http://en.wikipedia.org/wiki/Iid

    AFAIK all distributions of errors in temperature measurements are Gaussian with zero mean. Therefore, all Gaussian distributions have the same probability distribution (e. g. Gaussian).

    Wow, and it continues.

    First of all, not all Gaussian distributions have the same probability distribution. If they don’t have the same variance, or the same mean, then they are not identical. I should note, btw, that the error associated with the minimum gradation is actually uniformly distributed, not Gaussian. Duh.

    If you can prove that all of the errors are a) Gaussian, b) with the same mean, and c) with the same variance, then I will believe you, btw. As it stands, based on some of your other comments, I’m guessing you would not even know where to begin. Hey, consult with Dave Springer and maybe the two of you can publish something.

    Each station’s temperature measurements are independent from all other stations; if they were not, then the two stations would, in fact, be the same station.

    Um, no, that is not what independence means. I’m not even sure how to tackle this one because you clearly do not have sufficient background to understand. Independence simply means that their distribution functions are independent of each other, i.e., F(x,y) = F(x)F(y) and likewise for densities f(x,y) = f(x)f(y), which is analogous to probabilities in which two events are independent if P(AB) = P(A)P(B).

    In fact, no two instruments will ever have the exact same “identically distributed” distribution (exact to infinite precision), not even remotely possible, again only possible if, in fact, the two instruments were one and the same instrument.

    Exactly. Thanks for pointing this out.

    Wait a minute, did the above just say “assumption” and “finite variance” and “normal distribution”? Why, yes it did.

    I’m getting the impression you’re attempting to learn statistics as you go. Nothing wrong with that, but you missed the point of what this quote said. What you quoted said exactly what I have been saying: if the RVs are i.i.d., then their average will approach the actual expected value (the true mean.) The error, of course, shrinks as 1/sqrt(N), which I have also acknowledged. But you need to be able to support the assumption of i.i.d. in order for this to work.

    Well then, so much for that ole’ nasty IID requirement; it’s been met head on and found to be not so necessary if need be. However, as the specious uncertainty paper claims, all sigmas have zero means, as all instances are shown as +/-, meaning symmetric about a zero mean.

    How do you know that all errors associated with temperature are zero mean?

    If it looks Gaussian, if it feels Gaussian, if it smells Gaussian, if it sounds Gaussian, and if it tastes Gaussian, then it must be Gaussian! Q.E.D.

    QED means you proved something by demonstration when all you did was assume a Gaussian distribution. Pretty silly, actually.

    Now if you want to talk about instrument bias offsets (that vary randomly with time, as they invariably do), I’m game.

    You aren’t even in the right ballpark.

    Mark
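    The point above, that the error associated with the minimum gradation is uniformly distributed rather than Gaussian, can be demonstrated directly: round simulated “true” temperatures to the nearest whole-degree gradation and the reading errors come out with the mean (0) and variance (1/12 ≈ 0.083) of a uniform(-0.5, 0.5) distribution. A sketch, with assumed ranges:

```python
import random

def quantize(x, step=1.0):
    """Round to the nearest gradation, as when reading a scale."""
    return round(x / step) * step

# Draw "true" temperatures, quantize them, and look at the errors.
random.seed(1)
errors = []
for _ in range(50000):
    t = random.uniform(0.0, 40.0)
    errors.append(quantize(t) - t)

# A uniform(-0.5, 0.5) distribution has mean 0 and variance 1/12.
mean = sum(errors) / len(errors)
var = sum((e - mean) ** 2 for e in errors) / len(errors)
print(round(mean, 3), round(var, 3))   # mean ~ 0, variance ~ 0.083
```

This is the one error component for which the uniform/i.i.d. assumption is easy to defend; the dispute in this thread is over all the others.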

  169. Willis writes:

    Finally, I would back Steven Mosher to the hilt when he tells people to generate some pseudo-data, add some random numbers, and see what comes out. I find that actually giving things a try is often far better than profound and erudite discussion, no matter how learned.

    w.

    So you’re suggesting creating a numerical model of reality and see what the results are? Isn’t that where all the trouble starts in the first place?

    My suggestion was buy 50 different kinds of thermometers from WalMart, put them all in the same place, read them as best you can, average the readings, and you’ll get a result where the accuracy and resolution is better than any individual instrument in the whole lot. That’s an experiment, not a model. There’s a big difference. Experiment is what’s lacking in climate science. It’s all models. Like the proverbial Texas cowboy who’s all hat and no cattle.

  170. Whew. What a long discussion.

    I think a significant potential error in the record is station adjustments (made much later) to make new instruments smoothly agree with the old ones, given that the old ones may all show an upward drift over time (the glass problem). This would transform a sawtooth error (with a period set by thermometer replacement or recalibration) that contains no long-term trend into a continuous upward slope.

    This thread also makes me want to throw up my hands and say we’d be better off using pendulums to measure temperature, using a reversed grid-iron to magnify the coefficient of thermal expansion instead of cancelling it. Counting things (like swings) and measuring time to millisecond or microsecond accuracy is one of the easier things we do. Measuring voltage, current, or the height of a bubble of mercury – not so much. Plus, physicists just love pendulums.
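    The sawtooth-to-trend mechanism described above can be simulated in a few lines. Every number here is an assumption for illustration only: 0.05 °C/yr of upward instrument drift, a 10-year replacement cycle, and a constant true temperature (zero real trend):

```python
# Toy sketch: each thermometer drifts upward, is replaced periodically,
# and adjustments shift older data to remove the step at each replacement,
# turning a trendless sawtooth into a continuous upward slope.
DRIFT = 0.05      # assumed instrument drift, C per year
LIFETIME = 10     # assumed years between replacements
YEARS = 50
TRUE_TEMP = 15.0  # constant true temperature: zero real trend

# Raw record: a trendless sawtooth (drift resets at each replacement).
raw = [TRUE_TEMP + DRIFT * (year % LIFETIME) for year in range(YEARS)]

# "Homogenize" by shifting all earlier data to close the downward
# step seen at each replacement.
adjusted = list(raw)
for year in range(1, YEARS):
    step = raw[year] - raw[year - 1]
    if step < -DRIFT:              # replacement-sized discontinuity
        for k in range(year):
            adjusted[k] += step    # old data adjusted downward

spurious = adjusted[-1] - adjusted[0]
print(round(raw[-1] - raw[0], 2))  # 0.45: just the current instrument's drift
print(round(spurious, 2))          # 2.25: the adjusted series shows steady "warming"
```

Under these assumed numbers the raw sawtooth contains no long-term trend, while the adjusted series rises 2.25 °C over 50 years from nothing but drift plus splicing.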

  171. I should first note that I am NOT the Mark from Mark’s view, i.e., I did not post this original thread.

    Back to Dave Springer:

    First, you have yet to actually reply to any of my questions that are technical/theoretical in nature. I’m guessing that is because you never actually got an engineering degree, correct? You seem to have a complex regarding that, quick to point out how good you are. That’s an interesting psychology, IMO. Maybe you have an AS degree, or a BSEET, but I’m getting the impression you actually don’t have the math background which implies no BS or anything advanced. If you did, I’m sorry, it just does not seem so – I’d expect you to have answers to my questions if you had even one statistics class under your belt.

    You warmists are always so enamored with authority, and qualifications. Just the fact that I clearly understand this topic should be sufficient, but alas, you guys need something to attack since you can’t get anybody on the technical.

    No Mark. It’s related to the fact that when 10,000 computers a day (no exageration, I was senior R&D in Dell laptops 1993-2000 after which I retired because I never needed to work for a living again) that you designed are supposed to be rolling off the production line and an unacceptable number of them are failing some bit of manufacturing test which results in the line being shut down and millions of dollars per day are going out the window you don’t sit on your ass making bar graphs with error bars to figure out what’s gone wrong. Well maybe academics do that. Someone with decades of experience uses practical knowledge and uses it quickly or starts thinking about whether his resume is up to date.

    Statements like these are what makes me believe you don’t have a degree, btw. People that become engineers through hard work but while lacking a solid educational background tend to have issues with us learned types, like a Napoleon complex or something. You feel a need to point out that your experience is worth more than my education. Quite frankly, I agree that experience is worth more in the long run, but the math and statistics I learned in school isn’t something people are going to pick up by themselves. And, unfortunately for you I guess, I have both experience and education – experience and education in this exact field, btw. I teach, too, though I can’t say I enjoy it and never more than one class at a time. Extra income and a hassle once a week is what I get out of teaching.

    Instead of being obtuse why don’t you use your real name and state what your experience actually is instead of giving me a trail of breadcrumbs to follow to figure it out? What’s up with that? What are YOU hiding?

    How am I being obtuse? I understand how the LLN works and when you can actually apply it. I’m sorry if you don’t agree, but you’re wrong until you can prove the things I have pointed out – I did my job, now you do yours if you want to have any chance at credibility.

    Who I am is none of your business, btw. I don’t need to “hide” anything, but I am smart enough to know that putting my full name on the internet is foolish. I did note that I have listed my basic background over at tAV in the reader background thread, which you are free to peruse, but none of that makes any difference anyway. I could be an uneducated, inexperienced fop for all it’s worth and still be right. The assumption is in the theory for a reason.

    Mark

  172. davidc says:
    January 22, 2011 at 7:25 pm

    Yes, but if you took this interpretation the “error” coud not be less than the individual measurent error.

    Definitely true. The uniform distribution of gradation error is the best you can ever do. Such errors, if they were the only ones, WOULD be i.i.d., btw, because they are purely an artifact of the resolution of the measurement itself, and the LLN/CLT would apply nicely.

    Mark

    I think the issue here is that statistics is so hard to understand. As humans it’s hard to realize that error can head downwards when, say, looking at a trend…

    But when you take a large number of stations and keep the same type of thermometer, the only thing you need to look out for, as far as statistical issues go, is trends that get larger. Let’s look at just one comment:

    “Another problem comes from taking the average temperature to be halfway between Tmin and Tmax”

    This is a problem if you are using temperature for anything but figuring out trends. You can measure temperature in about 50 different ways, but statistically it will normally create the same trend no matter how you measure it. This is because you use multiple stations and you have numerous data points all over the place. The error over larger periods of time will head down farther and farther.

    However, I tried to say this earlier, but maybe was not clear enough about it. If you take stations out over time, for instance, it’s difficult if not impossible to figure out the error this creates. If you add, say, urban heat effects, this is difficult if not impossible to figure out as well.

    These trends add error to the actual trends, which is something you can argue about when you are talking climate. Observer bias, station removal and other events will also tend to do the same thing. This is difficult in statistics, and this is why the field is so difficult to understand.

    In a nutshell: we have this comment:
    Jeff Alberts says:
    January 22, 2011 at 5:53 pm
    So, would gradual, but constant, encroachment of urbanization upon the site be considered systematic?

    Yes, this is an important systematic change.

    Like I was saying before, if you use fewer stations and fewer data points, the trend will tend to be wrong by higher amounts, but as you add stations and add other areas, the error will be minimized. I am trying to explain this, and I probably failed both times, but it should be noted that since modern recording of temperatures started… we have probably warmed by roughly 0.7 C.

    Now, to say that a certain year is the warmest ever when it’s so close to another is probably stupid, since you are removing data points and adding error like I said previously… The two years are probably close enough to call the same, and if one year is higher than another, it’s better than even odds that it is warmer, but you can never say that one year is warmer than another without taking error into consideration.

    Where I was trying to go with my first post is that metrology is important because it is possible to determine the error using statistics. In addition, it is also possible to check the temperature record for any changes that might be artifacts of measurement error, so to speak… the trends do not change, but the error does. As you eliminate bits of noise from data, other trends emerge that might not have been visible previously…

    And just a side note, like I was trying to say, there are probably an infinite number of ways we can measure temperature trends, and none of them are really wrong as long as:

    The method does not change.
    The method is applied uniformly.
    Weather events are not even considered for climate. (Even one month long events are going to have large amounts of error.) The more data the better.

    In conclusion, the method we have now is probably fine. Are there other methods that might be better? Sure, they might show trends better due to different error configurations and allow different noise patterns to rise above the level of detection.
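
    Ben D.’s “warmest year” caveat can be made concrete with a back-of-the-envelope sketch (the anomalies and the uncertainty below are hypothetical, not real data): two years whose difference is smaller than the uncertainty of that difference cannot honestly be ranked.

```python
import math

# Hypothetical yearly anomalies (deg C) and an assumed per-year
# 1-sigma measurement uncertainty -- illustrative numbers only.
year_a, year_b = 0.54, 0.56
sigma = 0.05

diff = year_b - year_a
# Uncertainty of a difference of two independent values.
sigma_diff = math.sqrt(2) * sigma

verdict = ("distinguishable" if abs(diff) > 2 * sigma_diff
           else "statistically a tie")
print(f"difference = {diff:.2f} +/- {sigma_diff:.3f} -> {verdict}")
```

    With these numbers the 0.02 C gap sits well inside the ~0.07 C uncertainty of the difference, so the honest call is a tie.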

  174. Dave Springer says:
    January 22, 2011 at 9:55 pm

    So you’re suggesting creating a numerical model of reality and see what the results are? Isn’t that where all the trouble starts in the first place?

    Actually, he’s suggesting that you perform a simulation with i.i.d. data to demonstrate to yourself how the errors cancel.

    My suggestion was buy 50 different kinds of thermometers from WalMart, put them all in the same place, read them as best you can, average the readings, and you’ll get a result where the accuracy and resolution is better than any individual instrument in the whole lot.

    No kidding. Then you’d likely meet the i.i.d. requirement. Assuming the thermometers don’t all have some bias, the “average” would probably be closer to the true temperature than any individual reading. If any of them have some bias, however, then your average, while very precise, will not be near the actual temperature.

    This analogy does not apply to the global temperature readings. Their error distributions are unknown (perhaps even unknowable). All this means is that you cannot assume the LLN applies, even if it may actually apply. Nothing more, nothing less, nor have I said otherwise. The results may actually be more accurate, but accuracy is not the problem: the results have an increased uncertainty without such knowledge.
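
    A minimal simulation of that precision-versus-accuracy distinction, with an invented error model (50 instruments, each given a fixed warm-leaning bias plus fresh reading noise): the averages come out tightly clustered, i.e. very precise, yet sit a fixed distance from the truth.

```python
import random
import statistics

random.seed(7)

TRUE_TEMP = 20.0
N = 50  # thermometers

# Invented error model: each instrument gets a fixed unknown bias
# (drawn once, leaning warm) plus fresh reading noise every survey.
biases = [random.gauss(0.3, 0.2) for _ in range(N)]

def survey():
    # Average one reading from each of the N instruments.
    return statistics.mean(
        TRUE_TEMP + b + random.gauss(0, 0.5) for b in biases)

runs = [survey() for _ in range(1000)]
precision = statistics.stdev(runs)            # tight: ~0.5/sqrt(50)
accuracy = statistics.mean(runs) - TRUE_TEMP  # stuck near the mean bias

print(f"spread of the averages: {precision:.3f}")
print(f"offset from truth:      {accuracy:+.3f}")
```

    Averaging collapses the noise term but leaves the mean of the biases untouched, which is the whole argument in two numbers.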

    The stationarity issue Pat points out is really bad since that means the error distributions may change over time and across temperature readings. The error plot in Mark’s post above indicates a thermometer that does not have i.i.d. errors even in its own data, for example.

    That’s an experiment, not a model. There’s a big difference. Experiment is what’s lacking in climate science. It’s all models.

    No kidding. Models serve useful purposes, however, by enabling insight into a process that is otherwise difficult to test. Models are the fundamental means systems engineers get from initial requirements to design requirements. Assuming Gaussian noise for your temperature data errors is applying a model to your measurement process. Models are inescapable.

    Mark

  175. Ben D. says:
    January 22, 2011 at 10:24 pm

    I think the issues here are that statistics is so hard to understand.

    Very true…

    As humans its hard to realize that error can head downwards when say looking at a trend…

    Then you immediately provide a demonstration of said truth… sigh.

    Mark

  176. Dave Springer:
    “My suggestion was buy 50 different kinds of thermometers from WalMart”

    I think it’s an excellent suggestion to have more than one thermometer of each type at each weather station. If they disagree, they should be immediately calibrated. And they should be calibrated regularly even if they agree.
    The data from any weather station that does not have records of at least yearly calibration should be treated with suspicion. Of course that would be the majority of weather stations.

  177. Willis said:

    “Finally, I would back Steven Mosher to the hilt when he tells people to generate some pseudo-data, add some random numbers, and see what comes out. I find that actually giving things a try is often far better than profound and erudite discussion, no matter how learned.”

    I’ve done two numerical experiments (one with random uniformly distributed noise with 2 < N < 65536, the other with a ~120 year long record (Hay River, Canada, an inland station)). I've also completed 23 High Arctic Weather Station (HAWS) stations, all using the original raw daily records (Canadian HAWS all along the Canadian Arctic coastlines).

    The same low frequency temperature trend line always shows up for all 23 HAWS, and all are quite similar in appearance; the bottom line is the HAWS have seen a ~3C rise since the early 70's (or a ~4C rise starting from the early 20's).

    It does not matter what the thermometer accuracy actually is, as I can take the 0.1C raw data and round it to either 1C increments or even 10C increments; the same low frequency trend line always shows up.

    The period of the calculated anomaly is 1951-2010 (N = 60). Integrating the anomaly curves always results in R^2 ~ 0.99, so the low frequency signal is very real; otherwise the integrated anomaly graphs would fluctuate about their mean with no apparent trend line and R^2 would approach zero.

    So the whole thermometer accuracy argument is a red herring and a moot point as far as I'm concerned.

    23 HAWS stations all with the exact same systematic errors (note the definition of a systematic error, as defined by the author of the specious uncertainty article, is always +/- sigma, so there can be no bias offset corrections to be made there, as bias offsets are not the subject matter of that specious uncertainty article to begin with)?

    I think NOT!

    So we have a large bunch of numbers representing temperature from a bunch of widely spaced thermometers, with the readings taken over a year, 50 years ago, and we think the uncertainty is, say, +/- 1 deg C on each individual reading.

    Now we have a large bunch of numbers representing temperature from a bunch of widely spaced thermometers with the readings taken over the last year again with the same uncertainty of +/- 1 deg C.

    We’ve taken averages of both sets of readings and one has the average come out 0.5 deg C higher.

    Trouble is, they are different thermometers with different error sources and aging characteristics. Not only that, there are different numbers of thermometers in the two cases, and the locations aren’t even necessarily the same???

    Yet some people say all this doesn’t matter because there are large numbers of thermometers and readings in both cases and confidently state that it has become warmer because of these readings???

    ROTFL!

    I’m no longer a fan of any of the instrumental record. Surface temps for the reason above and I’ll place some faith in the satellite record when someone recovers a satellite and recalibrates the platinum RTD and associated electronics on the ground after prolonged exposure to temperature cycling and radiation. This doesn’t detract from the satellites as daily weather monitors but makes them problematical as climate monitors.

    Just use biology as climate markers. This integrates all climate change but has its own problems as life adapts and is always expanding to the physical limits of its range.

    BTW, Dave Springer: it helps to check the placards of the aircraft you are about to fly, or read the POH or Flight Manual. Then you would a) know what the calibration of the ASI was, and b) it wouldn’t matter, as you would have the correct numbers to fly by.

    It is also my understanding that the folks who run real production lines DO in fact make fancy bar graphs etc to keep an eye on what they are making and fix it before the line has to be shut down.

  179. Alfred Burdett says
    ——-
    This explains well the kind of underlying psychology. However, to be clear, I was not suggesting fraud, but simply the possibility that unconscious factors, including preconceptions about climate change,
    ——
    I don’t think it’s a sensible question because the people who make the measurements likely do not have an opinion about climate change.
    After all, most of the measurements are historical and pre-AGW.

    The occasional recent observer who might have one knows climate change is real and has no need to fake data.

    Of more concern is normal human laziness and incompetence. This, and many other issues, is why temperature measurement is done by satellite these days.

  180. Dave Springer says:
    January 22, 2011 at 8:44 pm

    To be clear, Tyndall did not prove a damn thing about CO2 absorption. His equipment was far too primitive to distinguish between absorption, reflection, refraction, diffusion, scattering or anything else. He incorrectly concluded that all energy missing between the source and the pile in his half baked experiments had been absorbed by CO2. Above all he ignored Kirchhoff’s law and that was his biggest mistake. The conservation of energy falsifies the “greenhouse effect” because as per Kirchhoff’s law that which absorbs, equally emits. This fact is absent from Tyndall’s ramblings and exposes him for what he was.

    Anyone who quotes John Tyndall as the man who proved the “physics” of the “greenhouse effect” displays nothing short of sheer ignorance. It is the ultimate in the bogus appeal to authority. John Tyndall was fool and a fraud. Above all he was an insider.

  181. This article is profoundly important.

    Also, what Willis said++

    I’ve taken more rocks than I care to think about over the issue of what precision to use. One point I’d add: the official guidance of NOAA for years was that if it was, for some reason, not convenient or was impossible to make an actual observation, the observer was directed to guess and enter that value on the report. I’d originally linked to a NOAA page with that statement on it, but the AGW Langoliers have had that page erased. Yet the data remain in the record…

    So, if you are depending on the Law Of Large Numbers to give you precision of 1/100C as the AGW Faithful believe, then you are making a load of assumptions. Many are illustrated as false in the posting above. But add to that the point that measuring A thing 100 times is different from measuring 100 things one time each, and we begin to understand the statistical problem. An AVERAGE can be computed to a very fine precision indeed. BUT it has little meaning. IFF the underlying numbers are +/- 1 F, then the average of them can be computed to a very much finer degree, but the meaning of what the actual temperature might have been is not improved. Eyeballs might have been consistently looking up, or down, depending on average height of the observers. Meniscus might have been bulging up or down depending on materials used. How old were the thermometers and their glass? When was each site converted from LIG to semiconductors? What is the aging characteristic of a semiconductor system over decades?

    So yes, you can take all those numbers and make an average that can be known to 1/10000 C. But the MEANING of that number is lost. It is not saying the actual temperature is higher by 1/10000 C; it is saying that something in the process of recording all the numbers, and we don’t know what, is higher by 1/10000 C.

    Somehow I’ve not yet found a good way to encode that understanding into words.

    If you measured A spot on the planet ( or each spot on the planet) 10 times in the same moment, then that “law of large numbers” increase in precision would MEAN something. But if you measure 1,000,000,000,000 places ONE TIME EACH with a precision of +/- 1 C you can NOT say if things have warmed or cooled by 1/100 C based on the average. You simply do not know in which direction the error terms lie, and to assert that they “all average out” is just another kind of lie, for you do not know.

    I lack the skill to make this clear, and for that I am truly sorry. But it is simply not possible to average away the error terms by making a larger number of errors.
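
    One way to encode it numerically: the sketch below (invented numbers) contrasts measuring one place 100 times, where noise averages out, with measuring many places once each under a shared systematic offset the analyst never sees — the network average stays wrong by roughly that offset, no matter how many stations are added.

```python
import random
import statistics

random.seed(3)

NOISE = 1.0  # 1-sigma reading noise, illustrative

# Case 1: ONE place measured 100 times -- the noise averages out.
truth = 17.4
reads = [truth + random.gauss(0, NOISE) for _ in range(100)]
err_repeated = abs(statistics.mean(reads) - truth)

# Case 2: 1000 DIFFERENT places measured once each, sharing an
# unknown systematic offset (say every observer sights the
# meniscus low).  OFFSET is invented; the analyst never sees it.
OFFSET = -0.3
truths = [random.uniform(-10, 30) for _ in range(1000)]
singles = [t + OFFSET + random.gauss(0, NOISE) for t in truths]
err_network = abs(statistics.mean(singles) - statistics.mean(truths))

print(f"one place, 100 reads:       {err_repeated:.3f}")
print(f"1000 places, shared offset: {err_network:.3f}")
```

    The repeated-measurement error shrinks toward zero; the network error converges to the shared offset instead.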

  182. (min+max)/2 = Average temperature – yeah, right.
    We have a local weather station, sited in a rural location along the road east of Newport. The weather data is displayed in real time on the website isleofwightweather.co.uk, and tabulated data at 5 minute intervals is downloadable in 7 day chunks.
    As an exercise, I wasted an hour of company time analysing 25 days of this information. The bit I was interested in was how well taking midway between a days max and min represents a true reading of the average temperature.
    Well, would you believe it! No correlation whatsoever! The differences between the “real” average and the (min+max)/2 figure (I really can’t bring myself to say average, and I can’t find a mathematical function name for it) vary between -1.6 and +1.7, a spread of error of 3.3°C. And the errors showed no pattern: -0.1, +0.2, +0.1, +0.0, +1.7, -0.7, +0.1, -0.1, +0.1,….. -1.6, -0.4, -0.3, -0.1, -0.6, +0.1, -0.9, +0.8.
    Yet we are supposed to believe these guys know the world’s average temperature to the nearest tenth of a degree!
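
    The same mismatch falls out of a toy model. The sketch below builds a deliberately asymmetric daily cycle from two sine harmonics (purely illustrative, not fitted to any real station), then compares the mean of 5-minute readings with the (Tmin+Tmax)/2 midrange:

```python
import math

# A crude asymmetric daily cycle in deg C -- two sine harmonics,
# invented for illustration, not fitted to any station.
def temp(minute):
    t = minute / 1440.0
    return (12
            + 6 * math.sin(2 * math.pi * (t - 0.3))
            + 2 * math.sin(4 * math.pi * (t - 0.1)))

samples = [temp(m) for m in range(0, 1440, 5)]  # 5-minute readings
true_mean = sum(samples) / len(samples)
midrange = (min(samples) + max(samples)) / 2    # the (Tmin+Tmax)/2 figure

print(f"true daily mean: {true_mean:.2f}")
print(f"(Tmin+Tmax)/2:   {midrange:.2f}")
print(f"difference:      {midrange - true_mean:+.2f}")
```

    Whenever the warm-up and cool-down are not mirror images of each other, the midrange lands a substantial fraction of a degree away from the true mean — which is what the station data above show.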

  183. OMG it’s worse than we thought! I mean, the error :-) Metrology is exceedingly important (when you use real world data, that is — you can do without in models). Never enter a lab without it.

  184. Oliver Ramsay:

    Please be assured that I believe that you think you are much more objective than everyone you disagree with.

    Thanks for brightening up an otherwise dull morning. I love it! :-)

  185. SimonJ says:

    “(min+max)/2 = Average temperature – yeah, right.
    We have a local weather station, sited in a rural location along the road east of Newport. The weather data is displayed in real time on the website isleofwightweather.co.uk, and tabulated data at 5 minute intervals is downloadable in 7 day chunks.
    As an exercise, I wasted an hour of company time analysing 25 days of this information. The bit I was interested in was how well taking midway between a days max and min represents a true reading of the average temperature.
    Well, would you believe it! No correlation whatsoever! The differences between the “real” average and the (min+max)/2 figure (I really can’t bring myself to say average, and I can’t find a mathematical function name for it) vary between -1.6 and +1.7, a spread of error of 3.3°C. And the errors showed no pattern: -0.1, +0.2, +0.1, +0.0, +1.7, -0.7, +0.1, -0.1, +0.1,….. -1.6, -0.4, -0.3, -0.1, -0.6, +0.1, -0.9, +0.8.
    Yet we are supposed to believe these guys know the world’s average temperature to the nearest tenth of a degree!”

    Thanks SimonJ. This is precisely the point which I raised with Steve Mosher in a post on a different thread a week or two ago. The reply was that over a sufficient period of time (Tmin + Tmax)/2 – Tmean was as near to zero as could be. And anyway it was the trends that counted, and the trends in Tave (= (Tmin + Tmax)/2) and Tmean were always the same.

    I asked for any paper that had been published which supported these assertions and got no reply.

    As a start it might be interesting to plot the trends in Tmax and Tmin separately. Is there any evidence that these slopes are correlated, let alone identical?

  186. “Oliver Ramsay says:
    January 22, 2011 at 5:36 pm

    …….I’m beginning to warm to this notion; two wrongs don’t make a right but millions of wrongs do……”

    ROFL! Priceless!

  187. I hate to bust some long held beliefs, but more observations will NOT increase accuracy – not in this case. What Willis and others are discussing is an increased number of measurements of the same condition – close in time and space and general conditions. This is not the case for daily temp readings over the last century.

    If I am measuring a mountain, yes, the more measurements we take the more error is reduced. Each temp measurement has the same error bars each time unless you combine measurements from the same locale and time. There is no removal of error on a sparsely measured phenomenon with high dynamics.

    A quick and easy example is satellite orbit measurement. I can take 20 measurements in an hour from one source and remove a lot of the error because the forces that disturb the orbit do not act on this time scale. Or I can take 20 measurements over an hour from a couple of different sources and really drive out error.

    But if I take 20 measurements over 20 days I do nothing to reduce the error in the computed orbit. At this cadence the system is changing due to inherent forces and I can no longer distinguish system changes from measurement error (or uncertainty or precision – whatever you favorite version is).

    It is the spatial and temporal density of the measurements which drives out error, not the number.

    Sorry folks, but that is the reality. Anthony, as you know I have been preaching from the error bar altar for years. This post cracked open the entire mathematical foundation of AGW and showed it to be shoddy and wrong. There is no reduction of error from daily measurements of a highly changing combination of forces (daily local temperature, long term global climate). Therefore the local temp error discussed in this post is the BEST one would ever get from the data taken in this manner.

    On a global scale it will be even worse – 3-5°C, as I have predicted for a long, long time. Therefore the 0.7°C/century ‘rise’ in global temp is a statistical ghost, not a reality. Could be lower, could be higher – we don’t have the data to know.

    Great post Mark. Everything you’ve written is spot on. Measurements are all about accuracy and precision – and, when computing means, about bias and calibration.

    With the (lack of) precision and accuracy of current and past weather instruments being well known, you cannot confidently express measurement results in a notation that goes beyond the precision of the instrument used to compile the results – Period. Those “.0#” temperature results are “falsely precise” and further, they falsely imply more accuracy in the results than is possible with the instrumentation.

    Some say they can “Sigma” the data to death, but in the absence of a precise reference value the bias is unknown, calibration is not possible and therefore, those results are also not beyond suspicion. Those that do not agree with you are deluding themselves.

    Best,

    Jose

  189. Lazy Teenager said:

    Re: Observer Preconception Bias (OPB)

    “… the people who make the measurements likely do not have an opinion about climate change.
    Afterall most of the measurements are historical and pre AGW.”

    The first presumption is bizarre. Do you know anyone without an opinion on climate change?

    As for most of the measurements being historical and therefore being free of OPB, that only confirms the potential for OPB to create a difference between recent and more remote past temperature records that is unrelated to any actual difference in temperature.

    Elaborating on the arguments Mark T has already put forward, and putting things in a slightly different way to make them more readily understandable by us ‘engineering types’:

    In any measurement system you have:

    1) High-frequency noise – noise which is well above the frequency of the signal you’re trying to measure, and is the only type of noise which is relatively easily filtered out by averaging loads of measurements

    2) Low-frequency noise – noise which is well below the frequency of the signal you’re trying to measure (periods of thousands of years or more). This type of noise doesn’t really apply to temperature measurements, so it can be ignored.

    3) Pass-band noise – noise which is of similar frequency to the signal you’re trying to measure. This is the most problematic of all, as there are no good ways of removing it without also degrading the signal. Most of the long-term errors, such as drift, ageing, urban creep etc., fall into this category, unfortunately.

    4) Discontinuities – such as the unavoidable boundaries of the measurement period. Attempts to remove these introduce noise of a frequency well within the pass-band.
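
    The difference between categories 1 and 3 is easy to demonstrate numerically (the noise sigma and drift rate below are invented): averaging a record shrinks the high-frequency noise toward zero, while an in-band drift of ~1 C over 50 years — comparable to the claimed climate signal — passes through the average untouched.

```python
import random

random.seed(11)

YEARS = 50
# Type 1: high-frequency noise, a fresh draw per reading (sigma invented).
hf = [random.gauss(0, 0.5) for _ in range(YEARS)]
# Type 3: in-band drift, e.g. ageing glass -- ~1 C over 50 years (invented).
drift = [0.02 * y for y in range(YEARS)]

avg_hf = sum(hf) / YEARS
avg_all = sum(h + d for h, d in zip(hf, drift)) / YEARS

print(f"average of HF noise alone: {avg_hf:+.3f}  (shrinks as data grows)")
print(f"average with drift added:  {avg_all:+.3f}  (drift survives averaging)")
```

    No amount of extra sampling at the same cadence removes the second term; that is what makes pass-band errors the hard case.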

  191. There is an additional complicating factor: measurement error introduced by framing.

    That’s a pun – intentional since expectation plays a big role in what the individual reporter sees and writes down, but the particular frame I have in mind right now is the Stevenson Screen.

    These are made of wood, and as the paint chips off they become increasingly effective as humidity moderators – in the same way that a stone fireplace holds heat long after the fire goes out, the wood retains humidity (positive and negative relative to the surrounding air) for some time after ambient conditions change.

    As a result, measurements taken inside a Stevenson Screen right after a rain squall tend to overestimate relative to “real” ambient, while those taken a bit later underestimate ambient.

    This does not average out over time or multiple locations – the actual “average error” (a dubious term) for a particular location and period depends on how often it rains, during what time of day, for how long, at what start/end temperatures, the state of the paint, how the measuring devices are mounted, and wind direction relative to the mounting points inside the box, among other factors.

    Bottom line? One measurement is a guess, one million measurements amount to one million guesses – and either way there’s absolutely nothing to support change hypotheses couched in partial degrees.

  192. Mark T
    I commend your patience at trying to extract any coherent signal from the noise being generated by certain “contributors” to this thread.

    I also thank you for the information imparted. I have learnt a lot, even if others refuse to.

  193. Yep, there’s the infamous iron rake head and antlers on top of the box to give us that warm feeling.

  194. SimonJ says:
    January 23, 2011 at 2:08 am

    “the (min+max)/2 figure (I really can’t bring myself to say average, and I can’t find a mathematical function name for it)”

    The term is used in two senses, but how about “median”?

    me·di·an

    3.
    Arithmetic, Statistics . the middle number in a given sequence of numbers,

    5.
    Also called midpoint. a vertical line that divides a histogram into two equal parts.

    The second version, “midpoint”, seems to match what is being done.

  195. To continue on with all this IID and LLN foolishness:

    Mark T says:
    January 22, 2011 at 9:50 pm
    EFS_Junior says:
    January 22, 2011 at 7:30 pm

    http://en.wikipedia.org/wiki/Iid

    AFAIK all distributions of errors in temperature measurements are Gaussian with zero mean. Therefore, all Gaussian distributions have the same probability distribution (e. g. Gaussian).

    Wow, and it continues.

    First of all, not all Gaussian distributions have the same probability distribution. If they don’t have the same variance, or the same mean, then they are not identical.
    _____________________________________________________________

    IID does not state that the PDFs have the EXACT same identity (1:1 relationship);

    “In signal processing and image processing the notion of transformation to IID implies two specifications, the “ID” (ID = identically distributed) part and the “I” (I = independent) part:

    (ID) the signal level must be balanced on the time axis;
    (I) the signal spectrum must be flattened, i.e. transformed by filtering (such as deconvolution) to a white signal (one where all frequencies are equally present).”

    So it would seem to me that this implies symmetry of the PDF about the mean (ID).

    To test this I ran another numerical experiment, with variances of 100, 1, 0.01 and 0.0001 for four uniform distributions (RAND() in Excel 2010).

    Note that the distributions are symmetric about their mean, but the variances are NOT EQUAL!

    These four time series were then averaged, and guess what? The resulting distribution was uniform, with sigma equal to the RMS sum of squares over N (i.e. SQRT(sigma1^2+sigma2^2+sigma3^2+sigma4^2)/N).

    Therefore variance1 .NE. variance2 .NE. variance3 .NE. variance4.

    Thus the IID requirement (variances must be equal, but actually you would need to claim that all statistical moments would need to be the same as per your spurious identity argument) as you stated is incorrect.

    Therefore say bye-bye to IID!

    Now on to LLN.

    Continuing on with this 4th numerical experiment, I also varied N as follows: 16 < N < 65536. So if 65536 (2^16) is a large number, is 16 a large number? Actually YES, statistically speaking.

    The results were the same for all N, for my example the final sigma is 1/N (1/4 = 0.25) for all N, for a handful of trials. Even a single trial was almost always spot on.

    Therefore, say bye-bye to LLN!
    _____________________________________________________________

    I should note, btw, that the error associated with the minimum gradation is actually uniformly distributed, not Gaussian. Duh.

    If you can prove that all of the errors are a) Gaussian, b) with the same mean, and c) with the same variance, then I will believe you, btw.
    _____________________________________________________________

    See comment above, (C) is NOT a requirement (You also left out all statistical moments above N = 2, why?).
    _____________________________________________________________
    As it stands, based on some of your other comments, I’m guessing you would not even know where to begin. Hey, consult with Dave Springer and maybe the two of you can publish something.

    Each station’s temperature measurements are independent from all other stations’; if they were not, then the two stations would, in fact, be the same station.

    Um, no, that is not what independence means. I’m not even sure how to tackle this one because you clearly do not have sufficient background to understand. Independence simply means that their distribution functions are independent of each other, i.e., F(x,y) = F(x)F(y) and likewise for densities f(x,y) = f(x)f(y), which is analogous to probabilities in which two events are independent if P(AB) = P(A)P(B).
    _____________________________________________________________

    Actually I do know what independence is, it's a statistical ASSUMPTION!

    Kind of like assuming a Gaussian distribution, it's a statistical ASSUMPTION!

    So for a temperature record, if I record; 0, 0, 0, 0, ad infinitum, perhaps they would not be considered statistically independent?

    And so for a temperature record, if I record; RAND(), RAND(), RAND(), RAND(), ad infinitum, perhaps they would be considered statistically independent?

    Methinks you are NOT the statistical "scholar" you claim to be. :)

  196. Solomon Green:
    “As a start it might be intresting to plot the trends in Tmax and Tmin separately.”

    NIWA in NZ have downloadable figures over about 40 years for 9am temperatures and Tmin, Tmax and something undefined that they call Mean which is probably (Tmin + Tmax)/2.
    Tmin in NZ has increased slightly, while 9am, Tmax and Mean have stayed about the same.

    Here are the 9am and (Tmin + T max)/2 temperatures over nearly 40 years at Ruakura in NZ:

    Tmin has increased over that time, almost certainly because of the UHI effect of the nearby city at night.

    I suggest that 9am temperature is a better measure than (Tmin + Tmax)/2 considering that NIWA and others are unable to work out a true mean for a day since they don’t make dozens of readings per day.
    And obviously we have to work with whatever readings have been made over many years.

  197. For some time I have been troubled by the effort to glean future predictions of temperature from extrapolations of temperature data from multiple geographical regions and locations within a region. The tacit assumptions made by the “data manipulators” have been that the time-wise trends do not have any systematic errors, and that therefore everything averages out.

    The above discussion shows that the climate community did not attempt to correct the temperature data and assumed that each measurement was absolute, or sufficiently precise that it could be considered an absolute value. Had they done so, they would have come to realize that the data are not of sufficient quality to construct a statistical correlation, because the errors in the actual data do not reflect the total uncertainty. It is like measuring the width of a human hair with a ruler of one mm precision in order to observe the growth of the hair’s diameter with time.

    If the AGW scientists believe that the temperature trends manifest the effect of increased CO2 in the atmosphere from man-made sources, and that all other sources of warming can be ignored, they still have to convince the scientific community that they have properly accounted for all the systematic errors in the temperature data identified by the many contributors above. They haven’t yet!

  198. jayman:

    “Tmin in NZ has increased slightly, while 9am and Tmax and Mean has stayed about the same. ”

    as predicted by AGW theory.

    just sayin.

  199. Here is a test that you all might consider. Again, note that I stress the importance of actually doing some computational work. ( WUWT needs more guys like Willis )

    Take a look at CRN.

    http://www.ncdc.noaa.gov/crn/

    That’s a pretty good reference point for well done accurate measurement.

    You probably have 100+ stations with up to 8 years of data.

    Ok. That’s your baseline for good measurement. ( triple redundant)

    Now, I want you to create a model of bad data collection, with all the kinds of errors you are worried about. That is, take the CRN as “truth” and then simulate the addition of all the errors you imagine.

    For each of those 100 stations you will then have an ensemble of stations, and an envelope of what “might be” if the measurements are as screwed up as you fear.

    Then realize that every CRN is PAIRED with an old station, so you can actually go look and see how close those ‘bad’ stations are to a superb station.

    You’ll find that the old stations track the superb stations quite well and that your error estimations are too wide. This is NOT to say that the error estimations of Jones are correct, they are too narrow, but by looking at 700 station years of data (from CRN) along with the old stations they are paired with you can actually put numbers on your doubts.
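The numerical experiment proposed above can be sketched without the actual CRN downloads — a synthetic "truth" series stands in for a triple-redundant station, and the error model is whatever you fear (here, an assumed independent 0.5 C per reading):

```python
import random
import statistics

random.seed(42)

def ols_slope(ys):
    """Least-squares slope of ys against sample index 0..n-1."""
    n = len(ys)
    xbar = (n - 1) / 2
    ybar = sum(ys) / n
    num = sum((i - xbar) * (y - ybar) for i, y in enumerate(ys))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

# Stand-in for a "superb" reference series: 8 years of monthly anomalies
# carrying a 0.02 C/year trend (synthetic truth, not actual CRN data).
truth = [(0.02 / 12) * m for m in range(96)]
true_slope = ols_slope(truth)

# Ensemble: re-measure the same truth many times with the feared error
# model, and look at the envelope of recovered trends.
slopes = []
for _ in range(500):
    corrupted = [t + random.gauss(0.0, 0.5) for t in truth]  # 0.5 C per reading
    slopes.append(ols_slope(corrupted))

spread = statistics.stdev(slopes)
```

Independent per-reading error this large still leaves the recovered trends clustered around the true slope; what this kind of ensemble cannot reveal is an error shared by every reading, which is exactly why the CRN/legacy station pairing is the more telling check.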

  200. Dallas Tisdale (No relation to Bob, er that Bob) says:
    January 22, 2011 at 5:42 am (Edit)

    Mosher has it summed up pretty well. The overall error for a single instrument would impact local records as has been seen quite a few times. As far as the global temperature record, not so much. Bias is the major concern when instrumentation is changed at a site or the site relocated.

    Adjustments to the temperature record are more a problem for me. UHI adjustment is pretty much a joke. I still have a problem with the magnitude of the TOBS. It makes sense in trying to compare absolute temperatures (where the various errors do count) but not so much where anomalies are used for the global average temperature record. Perhaps Mosh would revive the TOBS adjustment debate.

    #########

    TOBS. Let me review the issues with TOBS.

    1. The adjustment IS required. If you don’t adjust for TOBS you have a corrupt ‘raw’ record. When the time of observation changes you will infect the record with a bias. That bias depends upon local factors; the adjustment is made by doing long-range empirical studies.

    2. All TOBS adjustments are EMPIRICAL models. The model needs to be developed from empirical data and properly validated.

    3. Every empirical model comes with an error of prediction. For example, if you have a 9AM observation you will be estimating the temp at midnight from this 9AM observation. That estimation comes with an SE of prediction; in the US the SE is on the order of 0.25C. Every site needs and gets its own model.

    For as long as I’ve been around, people have been complaining about TOBS. The complaint goes like this: “TOBS raises the temperatures, therefore it is suspect.” This is wrong, and when you make this complaint no one in science will listen. However, here are the real problems that people MIGHT listen to.

    1. We only have the TOBS calculations for the USA, and the USA is 2% of the land. How did other countries do TOBS adjustments? Karl’s paper on TOBS only concerns itself with USA validation. The model is empirical; you cannot generalize it for use in Asia without doing a validation for Asia (it has input parameters for lat/lon). So we need to ask the question about TOBS outside the USA.

    2. The SE of prediction is larger than the instrument error. Consequently error budgets need to be calculated differently for a station that has been TOBS corrected versus one that has not been corrected.

    So basically, you have two issues with TOBS. Both of them are valid concerns. First, where is the documentation for how TOBS is employed outside the US? and second, how is the error of prediction propagated.

    Personally, I think the claim “TOBS is wrong” is weak, and climate scientists can rightly ignore unsupported claims like this. But it’s harder to ignore questions.

    1. How are records outside the US TOBS corrected?
    2. Where are the published validation studies? (Karl covers the US only.)
    3. How is the SE of prediction handled?

    And realize that with #3.. jones, hansen etc.. NONE of them account for an uncertainty due to adjustment. Sadly, most people focus their criticism on the wrong point.
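For readers unfamiliar with why time of observation matters at all, here is a minimal sketch of the classic warm bias from an afternoon reset — a synthetic station with an assumed diurnal shape and weather noise, not Karl's actual TOBS model:

```python
import math
import random

random.seed(0)

# One year of synthetic hourly temperatures: a diurnal cycle (coldest 5am,
# warmest 5pm) riding on AR(1) day-to-day "weather".
daily_mean, temps = 0.0, []
for day in range(365):
    daily_mean = 0.8 * daily_mean + random.gauss(0.0, 1.5)
    for hour in range(24):
        temps.append(daily_mean + 15.0
                     - 5.0 * math.cos(2 * math.pi * (hour - 5) / 24))

def mean_tmax(series, obs_hour):
    """Average daily Tmax when the max/min thermometer is read and reset
    at obs_hour: each 'day' is the 24 h ending at the observation."""
    n_days = len(series) // 24
    maxima = [max(series[(d - 1) * 24 + obs_hour : d * 24 + obs_hour])
              for d in range(1, n_days)]
    return sum(maxima) / len(maxima)

# Same weather, two observers: one reads at 5 pm, one at 9 am.
bias = mean_tmax(temps, 17) - mean_tmax(temps, 9)
```

Reading and resetting at 5 pm re-arms the thermometer near the day's peak, so one hot afternoon can top two successive "days"; a station that moved its observation from afternoon to morning therefore shows a spurious cooling step unless a TOBS-style correction is applied.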

  201. Dave Springer has made another error even larger than his average error. He is 54. He reports that winters now are milder than when he was young. Dave, Dave, Dave, HOW OLD IS THE PLANET? Oh, is it four billion years old or so? Gosh those anecdotal records of the “Weather,” wherever he was, for a few decades, are the telling blow, reveals all.

    Steven Mosher, “a phenomena”? Some problems with plurals. Not germane, sorry. “Significant digits” seems to be the issue here. Simply put, an instrument designed and manufactured to be accurate to a certain level, when recorded and analyzed to a finer level, gives MEANINGLESS results. The money was not spent to acquire data to that accuracy. Climate science, claims to be a science, based on proving that the global average temperature has risen a few tenths, of a degree C, in the last 130 years, or so, or something. Much of the data is extrapolated, as we have no thermometers in the Arctic, not enough covering the oceans, and not very many in the Southern Hemisphere. And yet, with so much Adjusting of Archives, and so much Extrapolating of Records, and a really Bizarre group of “computer models,” we are asked to accept poverty of energy so that a nebulous potential “Climate Disaster” won’t happen in a hundred or more years.

    When Mr. Obama loses in a couple of years, the new President will be tasked with ensuring that such clowns get no grants…………

  202. Solomon Green.

    You simply cannot do the test with one station.
    Try 190 stations for 10 years. I pointed you at that data.
    try CRN 100+ stations for 5+ years.
    they do part of the calculation for you

    http://www.ncdc.noaa.gov/crn/newmonthsummary?station_id=1008&yyyymm=201101&format=web

    The (Tmax+Tmin)/2 is an estimator of Tave. Since 1845 we have known that this estimator has errors. The essential thing to know is that as long as you don’t change your estimator, the average trend bias will be zero.

    Start here:
    Kaemtz LF. 1845. A Complete Course of Meteorology.

    Then GIYF

    You’ll end up reading a whole bunch of stuff from agriculture. You’ll even find studies that compare all the methods of computing Tave and comparisons of the bias.

    Suppose I measure your height in the morning in shoes. Call this a heel bias
    Suppose I measure you at night in shoes. same heel bias.

    What’s the trend in your height?

    Suppose I measure your height in the morning in bare feet.
    Suppose I measure you at night in shoes.

    What’s the trend in your height?

    As the GHCN documentation points out, there are about 101 ways to estimate the mean. What matters MOST is that you use the same estimator WITHIN a station’s history. Some stations give readings every 3 hours, some every hour, some once a day. If you’re estimating trend, you keep the method the same and you don’t introduce a trend bias.

    When you have a trend bias to show, you’ll have a publishable work worthy of attention. Until then you have a reading assignment that starts in 1845.
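The heel-bias analogy is easy to make concrete — toy numbers, with a flat "truth" series and a 0.3 offset standing in for the shoes:

```python
def ols_slope(ys):
    """Least-squares slope of ys against sample index 0..n-1."""
    n = len(ys)
    xbar = (n - 1) / 2
    ybar = sum(ys) / n
    num = sum((i - xbar) * (y - ybar) for i, y in enumerate(ys))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

truth = [15.0] * 40                  # nothing is actually trending

# Shoes both times: a constant "heel bias" in every reading.
same_bias = [t + 0.3 for t in truth]

# Estimator switched halfway through the record: the bias changes mid-stream.
switched = [t + (0.3 if i < 20 else 0.0) for i, t in enumerate(truth)]
```

The constant bias is invisible in the slope; switching the estimator mid-record manufactures a spurious cooling out of nothing, which is the whole point about keeping the method the same within a station's history.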

  203. Thanks Mark,
    My first attempt at a career was in metrology, but I was (justly) sacked after eighteen months because of my clumsiness.
    However, I understood and still understand the principles well.
    (After picking myself up, which took a while, I went on to success in another field).

    Your post rings a bell with me.
    You have highlighted yet another reason to doubt claims about “the hottest year evah”.
    Good work.

  204. EFS_Junior, “as the specious uncertainty paper claims, all sigmas have zero mean as all instances are shown as +/-, meaning symmetric about a zero mean.”

    That’s pretty funny, confusing the results of an empirical Gaussian fit with a claim of pure stationarity.

    AFAIK all distributions of errors in temperature measurements are Gaussian with zero mean. Therefore, all Gaussian distributions have the same probability distribution (e. g. Gaussian).

    That one’s pretty scary, coming from a climate scientist. Field calibration of surface station thermometers against precision standards show the distribution of errors can be very far from Gaussian and very far from symmetrical about the empirical mean. You really need to read the methodological literature.
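A quick illustration of why the zero-mean Gaussian assumption has teeth — here an assumed one-sided error model (say, radiative heating that can only warm the sensor), not a claim about any particular station:

```python
import random

random.seed(1)

true_temp = 20.0

# Hypothetical one-sided error: exponentially distributed, mean 0.3 C,
# and always positive -- the sensor can only be warmed, never cooled.
errors = [random.expovariate(1.0 / 0.3) for _ in range(100_000)]
readings = [true_temp + e for e in errors]

avg = sum(readings) / len(readings)
```

Averaging a hundred thousand readings beats the scatter down almost completely, yet the average converges on 20.3, not 20.0: only the symmetric part of an error distribution cancels.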
    +++++++++++++++++++++++++++++++++++++

    Mark T, congratulations, you’re doing a great job.

  205. Steven Mosher says:
    January 23, 2011 at 7:47 pm

    “jayman:

    “Tmin in NZ has increased slightly, while 9am and Tmax and Mean has stayed about the same. ”

    as predicted by AGW theory.

    just sayin.”

    As I have already pointed out, the oldest temperature record in the world, The Central England Temperature Record, shows the last 15 years as a cooling trend.

    http://c3headlines.typepad.com/.a/6a010536b58035970c0147e1680aac970b-pi

    Does AGW theory predict that?

    Just sayin.

  206. Hot diggity!

    What a great post Anthony. I wanted to be in the room watching it all, catching the mud slings. And what a great paper; thanks to Pat Frank and then Mark for your well researched, written and informative article. Fantastic.

    And these posters:
    Ian W says: January 22, 2011 at 8:58 am
    richard verney says: January 22, 2011 at 12:46 pm
    Philip Shehan says: January 22, 2011 at 2:56 pm
    David A. Evans says: January 22, 2011 at 4:05 pm
    Alfred Burdett ………………. and

    E.M.Smith says: January 23, 2011 at 2:02 am ………….. This article is profoundly important.
    Michael Moon says: January 23, 2011 at 9:11 pm

    It’s about sensitivity and specificity, and as I understand it – if the instrument chosen (that’s specificity) is the wrong one, then no amount of measuring will answer your hypothesis. If one states their hypothesis! And one doesn’t need to take multiple ‘data’, if in fact they existed, for the aim of overwhelming. Type I and/or Type II errors?

    So I reckon, like Monckton partially wrote in The Australian (national newspaper) this weekend, that it might be sufficient, it might be necessary but it ain’t temporal.
    Climate Crisis ain’t necessarily so

    http://www.theaustralian.com.au/national-affairs/climate/earths-climate-crisis-aint-necessarily-so/story-e6frg6xf-1225992476627

    But the northern mob don’t have penguins I think. These little flippered scuba-birds have neat camouflage. Some call it mimesis. Some evolution.

    Speaking of evolution, getting back to an earlier WUWT quibble on Jane Goodall and REDD

    http://findarticles.com/p/articles/mi_qa3724/is_200206/ai_n9130751/

    (c/- cartoonist Gary Larson, wiki)

    I worked in the desert, near Giles meteorological station, Surveyor Generals corner, Western Australia. Long way from nowhere, planes was pretty well the way in n out, just like the mines (and the PNG and Yukon posters) and for our great soldiers overseas. It was work, not academic tourism.
    Some of the Giles gals and fellas also worked in the Antarctica.
    In the desert shade some days it was 50C. We got really adept at shaking down Hg oral thermometers- cos that is all we had for diagnostic instruments for the like of infections like kids pneumonia etc. So we had a reasonable baseline to work from (human body temp) and then measured +/- from there. Anyway, it was pointless for many months of the year as the thermometer would shoot up to 42 and even with 3 minutes under the tongue there was no accurate reading to be had. And in winter (mid-year) the mornings were freezing!
    So after a while, in the cooler months when our primitive instruments actually did work (and we had them calibrated yearly or when they were dropped), we started to look at anomalies and combinations of our other observations thereof. We got pretty accurate. In fact we also tested against the pathology companies, who seemed to regularly change their parameters and instruments. Our predictions and subsequent prescriptive regimes worked pretty well in the time lag of pathology pecimens gettin gout an dresult sgettin gin by plane. Digital came after I left. But that doesn’t work so well in the heat either. And none of the measurements actually really improved these peoples’ health in the long term. Short term, plenty of lives were saved.

    That’s lives of people.

  207. Oops.. spelling and grammer!

    ‘Our predictions and subsequent prescriptive regimes worked pretty well in the time lag of pathology specimens getting out and results getting in by plane.’
    That was a two week time lag! We had to make [informed] decisions on the spot.

  208. Good article, thanks! (To Eschenbach as well.)

    Regarding the rounding example of “rounding” 15.55555556 to 15.55, that is actually truncating – rounding would give 15.56. (The rule is half or more is rounded up, less down. Rounding is supposed to average out, truncating will bias low.)
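The rounding-versus-truncation point checks out numerically — a small sketch (note that Python's built-in round() banker-rounds, so a half-up helper is written out):

```python
import math
import random

random.seed(3)

def round_half_up(x, places=2):
    """'Half or more rounds up' -- the conventional rule from the comment."""
    f = 10 ** places
    return math.floor(x * f + 0.5) / f

def truncate(x, places=2):
    """Drop digits beyond `places` without rounding."""
    f = 10 ** places
    return math.floor(x * f) / f

# The example from the comment: 15.55555556 to two places.
r, t = round_half_up(15.55555556), truncate(15.55555556)

# Bias check over many uniformly scattered values: rounding averages out,
# truncation sits about half a last digit low.
vals = [random.uniform(10.0, 20.0) for _ in range(100_000)]
round_err = sum(round_half_up(v) - v for v in vals) / len(vals)
trunc_err = sum(truncate(v) - v for v in vals) / len(vals)
```

Truncation averages roughly 0.005 low (half of the last retained digit), exactly the low bias described; half-up rounding is unbiased over uniformly scattered values.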

    I think people are keeping more significant places to avoid introducing more error by the conversion process, but that should average out anyway.

    A fundamental is that precision becomes more critical as the two values being compared get closer to each other, because inherent inaccuracies become a very large proportion of the result. (Or even swamp it – are larger than the result, so the result is meaningless. Comparison is usually subtraction of one value from the other.)

    Since we are talking about small changes in temperature over time, where does that leave the claimed climate temperature trend?

    Regarding funding bias (Eschenbach’s Soviet Example), Doctor Randy Knipping from ON told of a tribal group somewhere in the hills (perhaps South Asia) that he helped, who were renowned for living to quite old ages. Knipping noted the elders had a good diet and stayed active by taking care of children while parents worked the fields, so might be expected to live longer. But he also found that they inflated their age – because age was prestige. (Source: his presentation to pilots at a COPA annual event, Lethbridge AB, early 2000s.)

    Otherwise, the newer study may be interesting for dendrology, which has been discussed extensively in this forum and Climate Audit. My understanding and memory is that access to water by roots may be a substantial factor in growth rate, even at the northern locations chosen to sample. (Hmm – if vegetation shifts due to average precipitation rate, might that affect water retention that trees depend on? (I’m thinking of low vegetation, including moss, that tends to retain moisture locally. Precipitation also affects erosion, which may remove useful soil, though that is probably most affected by peak precipitation rate which may depend on severity of storms. An example of the impact of moisture retention and soil may be the areas of Lilloet and Dawson Creek B.C., the latter having subsoil structure that tends to retain moisture – the difference in average rainfall is not great between the two places, yet the amount of vegetation is.)

    My modest understanding of precipitation patterns is that moisture may increase with altitude initially, such as on the wet coast (like the Cascade mountains in WA state and coastal mountains north of Vancouver BC where rainfall rate is much higher at least partway up due to moisture-laden winds hitting the mountains – some ferocious rainfall rates occur north of Vancouver, such as near Lions Bay). I guess that may not hold true at much higher elevations.
    (Rate probably depends a great deal on local topography – and the downwind side of the mountains is usually drier (as in the Cascades – Ellensburg WA for example).

    AusieDan makes a good point on January 20 at 5:48am. Did I hear there were plants showing up in this very wet period that hadn’t been seen for decades?

    I do like to laugh occasionally. Dianne, George Turner, and “jphn S.”: thank you!

  209. Dave Springer says:
    “My suggestion was buy 50 different kinds of thermometers from WalMart, put them all in the same place, read them as best you can, average the readings, and you’ll get a result where the accuracy and resolution is better than any individual instrument in the whole lot.”

    You might want to read my previous post. If all the thermometers were made in the same “batch”, they would most likely have a bias error. Normally this is due to the manufacturing process, materials, and human testing. So no matter how many readings you take, you will still have the bias error, assuming you even have a normal distribution.
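The batch-bias point is worth a five-line sketch — 50 instruments with assumed independent unit-to-unit noise plus a common 0.4 C manufacturing offset:

```python
import random

random.seed(7)

true_temp = 25.0
batch_bias = 0.4  # hypothetical shared manufacturing offset in the whole batch

# 50 thermometers side by side: independent unit noise PLUS the common bias.
readings = [true_temp + batch_bias + random.gauss(0.0, 0.5) for _ in range(50)]
avg = sum(readings) / len(readings)
```

Averaging shrinks the independent part to roughly ±0.5/√50 ≈ ±0.07 C but leaves the shared offset untouched: more instruments from the same batch buy resolution, not accuracy.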

  210. My thanks to SteveMosher. I have visited one of the sites that he recommended http://www.ncdc.noaa.gov/crn/ and if one period of 23 days suffices (I think that it does) I am satisfied that as far as the US climate is concerned (Tmax + Tmin)/2 is a good approximation for Taverage (to within 1 degree C, which with rounding is as close as you are going to get). I am happy to accept his assurance that it applies throughout the globe.

    The period is far too short but it looks, at first glance, as though for some stations the correlation between the min and max trends deviates.

    As a matter of curiosity I wonder why there are 23 stations sited in Colorado and 21 in New Mexico but only 7 in each of California and Texas.

    PS. Mr. Mosher, as someone who graduated in statistics and has often needed to source data and apply statistical tests throughout his working life, I have come across a number of instances where (Xmax + Xmin)/2 does not give a good approximation to Xmean, no matter how long the duration. As some of the correspondents on this site have indicated, it is a question of the distribution.

  211. Pat Frank says:
    January 23, 2011 at 11:31 pm

    Mark T, congratulations, you’re doing a great job.

    Thanks, but my head couldn’t take it anymore. My favorite was (paraphrased) “I think it must be Gaussian QED.” Unbelievable.

    I am fortunate that most of the data sets I work with (real world data, folks) contain i.i.d. noise and/or measurement errors. Not all, of course. The conversion of analog signals (typically radar or comm in my case) to digital data suffers from some of the same problems as measuring temperature would. Quantization error, for example, is analogous to the minimum gradation problem and results in a uniformly distributed error between two levels of quantization… usually.

    The quantization process (as well as the sampling process itself) is actually non-linear resulting in differences across the full dynamic range of the part, e.g., the error between 10 and 11 may be different than the error between 60 and 61. As a result, not all errors cancel (we refer to it as a reduction in noise bandwidth, not canceling errors, btw) when averaging or integrating over time. There are a wide variety of other noises/interferences, though originating from different locations along the entire link, that do not cancel. In the end, you wind up with various spurs in the data, some related to the sampling process itself, others due to the environment in which the signal propagates, and others related to various factors contained within the rest of the “system” itself.

    If you’re good, or just lucky, you can eliminate or work around most of the errors induced by your own system (EMI/EMC precautions in particular,) even some of the environmental impediments (through filtering, adaptive cancellation/equalization, etc.,) but some of them are there to stay and may corrupt your results. Knowing which are caused by i.i.d. processes is key to understanding how to deal with them.

    Mark
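Mark's quantization point can be seen in miniature — a generic half-degree quantizer, not his radar hardware:

```python
import random

random.seed(11)

Q = 0.5  # quantizer step: think of a thermometer graded in half degrees

def quantize(x):
    """Round to the nearest code; the residual is the quantization error."""
    return round(x / Q) * Q

true_val = 21.13

# DC input: every sample lands on the same code, so averaging fixes nothing.
const_avg = sum(quantize(true_val) for _ in range(10_000)) / 10_000

# Dithered input: noise spreads samples across codes and the average
# converges on the sub-step value (the "reduction in noise bandwidth").
dither_avg = (sum(quantize(true_val + random.gauss(0.0, 0.5))
                  for _ in range(10_000)) / 10_000)
```

A constant input parks every sample on the same wrong code, so no amount of averaging recovers the sub-step value; adequate dither spreads the samples across codes and the average converges — and that is the benign case, since as noted above not all real interference behaves so obligingly.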

  212. Solomon Green says:

    PS. Mr. Mosher, as someone who graduated in statistics and has often needed to source data and apply statistical tests throughout his working life, I have come across a number of instances where (Xmax + Xmin)/2 does not give a good approximation to Xmean, no matter how long the duration. As some of the correspondents on this site have indicated, it is a question of the distribution.

    Yes, this is true. This is particularly true if the waveform/distribution changes over time. For example, the typical “waveform” for day/night temperature cycles is somewhat sinusoidal. If, however, it changes such that the low portion lingers for a longer period of time, then using this equation will induce a bias because the “mean” will begin to adjust downward.

    For the most part it does seem reasonable to assume this does not happen, or if it does, it happens slowly (as the orbit changes, for example, which is a pretty slow process.) Using the true mean would probably resolve this, but I don’t think that is currently feasible and certainly we would not be able to apply such a procedure to past data, so we would be starting over.

    Mark

  213. Mark T says:

    If, however, it changes such that the low portion lingers for a longer period of time, then using this equation will induce a bias because the “mean” will begin to adjust downward.

    This is stated in an unclear manner. What I meant was that using (Tmax + Tmin)/2 may result in the same answer (because Tmax and Tmin may be the same) even though the true mean may be going down (or up, for that matter) over time.

    Mark
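Mark T's scenario — identical Tmin and Tmax while the true mean moves — can be constructed directly (hypothetical 24-hour shapes):

```python
import math

def day_profile(hours_cold):
    """24 hourly temps with min 10 and max 20 either way, but with the cold
    phase lingering for hours_cold extra hours (hypothetical shapes)."""
    profile = [10.0] * hours_cold
    rest = 24 - hours_cold
    for i in range(rest):
        profile.append(15.0 - 5.0 * math.cos(2 * math.pi * i / rest))
    return profile

normal = day_profile(0)
lingering = day_profile(6)   # the low portion lingers six extra hours

minmax_normal = (min(normal) + max(normal)) / 2
minmax_lingering = (min(lingering) + max(lingering)) / 2
mean_normal = sum(normal) / 24
mean_lingering = sum(lingering) / 24
```

Both days report (Tmin + Tmax)/2 = 15.0, yet the lingering cold phase has dragged the true mean down to 13.75: the estimator is blind to changes in the shape of the day.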

  214. It would seem to be simple to build a glass thermometer reading instrument that would be able to read the fluid level exactly the same way day in and day out …

  215. Pat Frank says:
    January 24, 2011 at 10:16 am
    Anthony’s post about my article is now off the first page at WUWT. However, for those interested, I’ve posted a point-by-point refutation of EFS_Junior’s criticism, here.

    _____________________________________________________________

    And I’ve responded with a one point refudiation of that post. Quite airtight I might add, so further commentary will not be necessary at my end going forward. I’ve already devoted much more time to this topic than I should have.

    It has been a good discussion though, and the time devoted was not wasted time. It has given me pause for thought, the numerical experiments were indeed helpful (to me at least).

    IMHO, these discussions have only strengthened my technical opinion on the subject matter at hand.

    In short, I’ve learned quite a lot, in spite of the deep differences in technical opinions we all have.

    Thanks all. :)

  216. Everybody seems to assume that the people who take the observations are always HONEST.
    US observers are volunteers, unpaid. The Met Service in Russia in the 1980s suddenly had no money. The observers were unpaid. Why should they get out of bed on an unusually cold morning? What happens if they are ill, there is a blizzard, or they want to go to a football match? Who checks the observations for accuracy or believability? Many observers know that the bosses want to believe that temperatures are rising, so if they fake the results they had better not show that it is getting cooler. They might even be activists who slightly exaggerate, even unconsciously.

    Automatic results are submitted only hourly, not continuously, so they no longer record the true maximum and minimum.

    There is an inbuilt tendency throughout the entire system to make all the trends the same. They do this with the models. It is called intercomparison. This can also be unconscious, or justified by elimination of “outliers” or “noise”. They do this with CO2 measurements too.

    I suspect that some of the people who participate in smoothing out different records to give uniform, desired results have justified themselves in some of the earlier comments above.

    On top of this there is the confusion documented in Climategate.

  217. I may have missed it, but I saw no reference to the fact that temperatures were recorded to the nearest 0.5° until thermocouples appeared.

    Also I didn’t see any reference to the fact that mercury freezes at -38.8°C. It becomes increasingly sluggish as it approaches that temperature, which makes low-temperature readings from mercury thermometers of no value. The 18th century observers of the Hudson’s Bay Company using thermometers provided by the Royal Society were unaware of the problem. They replaced them with spirit thermometers and carried out experiments with the freezing of mercury. They did compare the results of mercury and spirit readings. In January 1822 they record, “Some quick silver that had been out some time ago for trying the cold was observed to be frozen while the thermometer was only 36 below zero which proves the weather to have been six degrees colder than per the thermometer.”

  218. As I have said here:

    http://wattsupwiththat.com/2011/01/22/the-metrology-of-thermometers/#comment-580475

    Because the so called “global warming signal” is half the margin of error it is impossible to know if this claimed signal is genuine, or as Dr. Gray points out, human bias, or indeed as Dr. Ball highlights, instrumental bias. Therefore the argument is a faux debate and cannot be resolved.

    As I have said here:

    http://wattsupwiththat.com/2011/01/22/the-metrology-of-thermometers/#comment-580783

    “You cannot measure a 0.7º C average trend if your measuring equipment is not on average accurate to 1.3º C.”

    Unless you meet certain criteria, as patiently and correctly maintained by Mark T and Pat Frank in this thread: that all errors in the measuring equipment are uniform to the extent that over time they cancel. This criterion has not been and cannot be met.

    Which renders the so called “global warming signal” meaningless.

    I have also said,

    “The irony of playing it this safe is that 0.7º C in 100+ years does not equate to an anomalous warming event. Particularly if the margin of error is 1.3º C.”

    There is no definable “global warming signal”.

    Which means:

    There is no real world scientific evidence of “man made” CO2 induced “global warming”.

    There is no real world scientific evidence that trace amounts of CO2 can force the 99% of the atmosphere into equilibrium with itself. In fact, quite obviously, it is the exact opposite which actually occurs.

    See here:

    http://wattsupwiththat.com/2011/01/22/the-metrology-of-thermometers/#comment-580999

    I’d say this hoax is dead. It’s time to move on to the next hoax.

    Bring on the ALIENS.

  219. I meant to say: “Tmin in NZ has increased slightly, while 9am and Tmax has stayed about the same.”
    Obviously Mean will have increased since Tmin has increased, and NIWA seem to be calculating Mean from Tmin and Tmax.

    9am temperatures have actually reduced over NIWA’s preferred nine NZ sites:

    Go and get the data for yourself if you don’t believe that. The data is available for download free.

    In NZ, Tmin has been increasing only in winter in urban areas, i.e. the places that NIWA has decided to choose for their temperature measurements.

    By what process does AGW predict that Tmin will increase in urban areas in winter? Is AGW now admitting to an Urban Heat Island effect?

  220. Will says:
    January 25, 2011 at 3:38 am
    Bring on the ALIENS.

    Thank you Willis for your responses. However, no sooner said than done…
    They just have, and making money to boot.

    http://www.theaustralian.com.au/travel/news/crop-circle-in-indonesian-rice-paddy/story-e6frg8ro-1225994485945

    Less rice for the starving, but at least the in-pocket petrol price will jump for the sight-seeing on motorbikes and the villagers will have developed a new form of Grameen Banking. Investment in laser landforming and CAD instead of women and water buffalo?

    http://www.rga.org.au/rice/growingau.asp

  221. @ David Springer: There is no reason to believe that any mathematical operation one may care to perform on the collection of instrumental temperature records that exist will give a more – or less – accurate temperature reading for that time, date and location than the illustration I used. (I attempted to use the overall numbers the author had written in the article; perhaps I failed to do so.) Since each instrumental measurement of temperature has some unknown – and in the case of some of the records, unknowable – margin of error, nothing can be done using math, statistics or any other abstract discipline to calculate what the “actual” temperature reading was for each of the times, dates and locations. After reading the article, that dawned on me and I expressed it in my first comment. To claim that somehow all of the measurements necessarily will “average out” or all the errors will “cancel each other out” over a long enough period of time is an assertion unsupported by any evidence. Most of the math being referred to is not something which has been tested experimentally against real world events; rather it consists of mathematical constructs, assumed to be valid in the real world because they were derived according to the “rules” of mathematics.

    Whether the “confidence” level is 90% or 95% or 98%, there is always going to be a margin of error in each and every measurement of any real world quantity. The higher the confidence, the smaller the margin of error, but in this case, despite the high confidence levels, the point of the article is that the margins of error are larger than the math-based assertions. How? Because there is a missing 2% or 5% or 10% which is a consequence of using mathematical abstractions, rather than simple arithmetical aggregations of the data. Further, it is each measurement which is a source of error and so the errors in measurement compound with one another in the production of the “global temperature” figures widely published as being authoritative and being sufficiently accurate and precise for use in guiding public policy. At the end of whatever processes are used to determine that temperature figure, there is a margin of error of unknown – and perhaps unknowable – size which is directly applicable to the final figure claimed, just as if it were an instrumental temperature measurement from a single station.

    Your response is sophisticated and well-written, sir, but it is likewise sophistic and inapplicable to the matter of margins of error in instrumental temperature measurements.

Comments are closed.