Guest essay by Pat Frank
Presented at World Federation of Scientists, Erice, Sicily, 20 August 2015
This is a version of the talk I gave about uncertainty in the global average air temperature record at the 48th Conference of the World Federation of Scientists on “Planetary Emergencies and Other Events,” at Erice, Sicily, in August of 2015.
It was a very interesting conference and, as an aside, for me the take home message was that the short-term emergency is Islamic violence while the long-term emergency is some large-scale bolide coming down. Please, however, do not distract conversation into these topics.
Abstract: I had a longer abstract, but here’s the short form. Those compiling the global averaged surface air temperature record have not only ignored systematic measurement error, but have even neglected the detection limits of the instruments themselves. Since at least 1860, thermometer accuracy has been magicked out of thin air. Also since then, and at the 95% confidence interval, the rate or magnitude of the global rise in surface air temperature is unknowable. Current arguments about air temperature and its unprecedentedness are speculative theology.
1. Introduction: systematic error
Systematic error enters into experimental or observational results through uncontrolled and often cryptic deterministic processes. [1] These can be as simple as a consistent operator error. More typically, error emerges from an uncontrolled experimental variable or instrumental inaccuracy. Instrumental inaccuracy arises from malfunction or lack of calibration. Uncontrolled variables can impact the magnitude of a measurement and/or change the course of an experiment. Figure 1 shows the impact of an uncontrolled variable, taken from my own published work. [2, 3]
Figure 1: Left, titration of dissolved ferrous iron under conditions that allowed an unplanned trace of air to enter the experiment. Inset: the incorrect data precisely followed equilibrium thermodynamics. Right, the same experiment but with the appropriately strict exclusion of air. The data are completely different. Inset: the correct data reflect distinctly different thermodynamics.
Figure 1 shows that the inadvertent entry of a trace of air was enough to completely change the course of the experiment. Nevertheless, the erroneous data display coherent behavior and follow a trajectory completely consistent with equilibrium thermodynamics. To all appearances, the experiment was completely valid. In isolation, the data are convincing. However, they are completely wrong because the intruded air chemically modified the iron.
Figure 1 exemplifies the danger of systematic error. Contaminated experimental or observational results can look and behave just like good data, and can rigorously follow valid physical theory. Without care, such data invite erroneous conclusions.
By its nature, systematic error is difficult to detect and remove. Methods of elimination include careful instrumental calibration under conditions identical to the observation or experiment. Methodologically independent experiments that access the same phenomena provide a check on the results. Careful attention to these practices is standard in the experimental physical sciences.
The recent development of a new and highly accurate atomic clock illustrates the extreme care physicists take to eliminate systematic error. Critical to achieving its 10⁻¹⁸ accuracy was removal of the systematic error produced by the black-body radiation of the instrument itself. [4]
Figure 2: Close-up picture of the new atomic clock. The timing element is a cluster of fluorescing strontium atoms trapped in an optical lattice. Thermal noise is removed using data provided by a sensor that measures the black-body temperature of the instrument.
As a final word, systematic error does not average away with repeated measurements. Repetition can even increase error. When systematic error cannot be eliminated and is known to be present, uncertainty statements must be reported along with the data. In graphical presentations of measurement or calculational data, systematic error is represented using uncertainty bars. [1] Those uncertainty bars communicate the reliability of the result.
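A quick numerical sketch makes the point. In the hypothetical Python example below, the random component of the error shrinks as measurements accumulate, while an assumed constant 0.4 C systematic offset survives any amount of averaging:

```python
# Illustrative sketch: random error averages away, systematic error does not.
import numpy as np

rng = np.random.default_rng(0)
true_temp = 20.0   # C, assumed true value
bias = 0.4         # C, hypothetical constant systematic error

for n in (10, 1_000, 100_000):
    readings = true_temp + bias + rng.normal(0.0, 0.5, size=n)
    print(f"n = {n:>6}: mean error = {readings.mean() - true_temp:+.3f} C")
# The mean error converges to +0.4 C, the systematic offset, not to zero.
```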
2. Systematic Error in Surface Temperature Measurements
2.1. Land Surface Air Temperature
During most of the 20th century, land surface air temperatures were measured using a liquid-in-glass (LiG) thermometer housed in a box-like louvered shield (a Stevenson screen or Cotton Region Shelter (CRS)). [5, 6] After about 1985, thermistors or platinum resistance thermometers (PRTs) housed in unaspirated cylindrical plastic shields replaced the CRS/LiG sensors in Europe, the Anglo-Pacific countries, and the US. Beginning in 2000, the US Climate Reference Network deployed sensors consisting of a trio of PRTs in an aspirated shield. [5, 7-9] An aspirated shield includes a small fan or impeller that ventilates the interior of the shield with outside air.
Unaspirated sensors rely on the prevailing wind for ventilation. Solar irradiance can heat the sensor shield, warming the atmosphere inside it. In winter, upward radiance reflected from a high-albedo snow-covered surface can also produce a warm bias. [10] Significant systematic measurement error occurs when air movement is less than 5 m/sec. [9, 11]
Figure 3: Alpine Plaine Morte Glacier, Switzerland, showing the air temperature sensor calibration experiment carried out by Huwald, et al., in 2007 and 2008. [12] Insets: close-ups of the PRT and the sonic anemometer sensors. Photo credit: Bou-Zeid, Martinet, Huwald, Couach, 2.2006 EPFL-ENAC.
In 2007 and 2008 calibration experiments carried out on the Plaine Morte Glacier (Figure 3) tested the field accuracy of the RM Young PRT housed in an unaspirated louvered shield, situated over a snow-covered surface. In a laboratory setting, the RM Young sensor is capable of ±0.1 C accuracy. Field accuracy was determined by comparison with air temperatures measured using a sonic anemometer, which takes advantage of the impact of temperature on the speed of sound in air and is insensitive to irradiance and wind-speed.
Figure 4: Temperature trends recorded simultaneously on Plaine Morte Glacier during February – April 2007 by the sonic anemometer and the RM Young PRT probe.
Figure 4 shows that under identical environmental conditions, the RM Young probe recorded significantly warmer winter air temperatures than the sonic anemometer. The slope of the RM Young temperature trend is also more than 3 times greater. Referenced against a common mean, the RM Young error would enter a spurious warming trend into a global temperature average. The larger significance of this result is that the RM Young probe is very similar in design and response to the more advanced temperature probes in use world-wide since about 1985.
Figure 5 shows a histogram of the systematic temperature error exhibited by the RM Young probe.
Figure 5. RM Young probe systematic error on Plaine Morte Glacier. Daytime error averages 2.0±1.4 C; night-time error averages 0.03±0.32 C.
The RM Young systematic errors mean that, absent an independent calibration instrument, any given daily mean temperature has an associated 1σ uncertainty of 1±1.4 C. Figure 5 shows this uncertainty is neither randomly distributed nor constant. It cannot be removed by averaging individual measurements or by taking anomalies. Subtracting the average bias will not remove the non-normal 1σ uncertainty. Entry of the RM Young station temperature record into a global average will carry that average error along with it.
Before inclusion in a global average, temperature series from individual meteorological stations are subjected to statistical tests for data quality. [13] Air temperatures are known to show correlation R = 0.5 over distances of about 1200 km. [14, 15] The first quality control test for any given station record includes a statistical check for correlation with temperature series among near-by stations. Figure 6 shows that the RM Young error-contaminated temperature series will pass this most basic quality control test. Further, the erroneous RM Young record will pass every single statistical test used for the quality control of meteorological station temperature records worldwide. [16, 17]
Figure 6: Correlation of the RM Young PRT temperature measurements with those of the sonic anemometer. Inset: Figure 1a from [14] showing correlation of temperature records from meteorological stations in the terrestrial 65-70° N, 0-5° E grid. The 0.5 correlation length is 1.4×10³ km.
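The failure of correlation screening against this kind of error is easy to demonstrate numerically. The sketch below (all magnitudes invented) adds a skewed, irradiance-like warm bias to one of two stations sharing the same regional signal; the pair still correlates far above the 0.5 screening level:

```python
# Illustrative sketch: a skewed systematic bias does not break inter-station
# correlation, so correlation-based quality control cannot detect it.
import numpy as np

rng = np.random.default_rng(1)
days = np.arange(365)
regional = 10 + 8 * np.sin(2 * np.pi * days / 365)   # shared seasonal signal
station_a = regional + rng.normal(0, 1, 365)         # clean station
station_b = (regional + rng.normal(0, 1, 365)
             + rng.gamma(2.0, 0.7, 365))             # skewed warm bias added

print(f"r = {np.corrcoef(station_a, station_b)[0, 1]:.2f}")  # ~0.95, passes QC
```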
Figure 7: Calibration experiment at the University of Nebraska, Lincoln (ref. [11], Figure 1); E, MMTS shield; F, CRS shield; G, the aspirated RM Young reference.
Figure 7 shows the screen-type calibration experiment at the University of Nebraska, Lincoln. Each screen contained the identical HMP45C PRT sensor. [11] The calibration reference temperatures were provided by an aspirated RM Young PRT probe, rated as accurate to <±0.2 C below 1100 W m⁻² solar irradiance.
These independent calibration experiments tested the impact of a variety of commonly used screens on the fidelity of air temperature measurements from PRT probes. [10, 11, 18] Screens included the traditional Cotton Region Shelter (CRS, Stevenson screen) and the MMTS screen now in common use in the US Historical Climate Network, among others.
Figure 8: Average systematic measurement error of an HMP45C PRT probe within an MMTS shelter over a grass (top) or snow-covered (bottom) surface. [10, 11]
Figure 8, top, shows the average systematic measurement error an MMTS shield imposed on a PRT temperature probe, found during the calibration experiment displayed in Figure 7. [11] Figure 8, bottom, shows the results of an independent PRT/MMTS calibration over a snow-covered surface. [10] The average annual systematic uncertainty produced by the MMTS shield can be estimated from these data as 1σ = 0.32±0.23 C. The skewed warm-bias distribution of error over snow is similar in magnitude to that of the unaspirated RM Young shield in the Plaine Morte experiment (Figure 5).
Figure 9 shows the average systematic measurement error produced by a PRT probe inside a traditional CRS shield. [11]
Figure 9. Average day-night 1σ = 0.44 ± 0.41 C systematic measurement error produced by a PRT temperature probe within a traditional CRS shelter.
The warm bias in the data is apparent, as is the non-normal distribution of error. The systematic uncertainty from the CRS shelter was 1σ = 0.44 ± 0.41 C. The HMP45C PRT probe is at least as accurate as the traditional LiG thermometers housed within the CRS shield. [19, 20] The PRT/CRS experiment therefore provides an estimated lower limit of the systematic measurement uncertainty present in the land-surface temperature record covering all of the 19th and most of the 20th century.
2.2 Sea-Surface Temperature
Although considerable effort has been expended to understand sea-surface temperatures (SSTs), [21-28] there have been very few field calibration experiments of sea-surface temperature sensors. Bucket- and steamship engine cooling-water intake thermometers provided the bulk of early and mid-20th century SST measurements. Sensors mounted on drifting and moored buoys have come into increasing use since about 1980, and now dominate SST measurements. [29] Attention is focused on calibration studies of these instruments.
The series of experiments reported by Charles Brooks in 1926 are by far the most comprehensive field calibrations of bucket and engine-intake thermometer SST measurements carried out by any individual scientist. [30] Figure 10 presents typical examples of the systematic error in bucket and engine intake SSTs that Brooks found.
Figure 10: Systematic measurement error in one set of engine-intake (left) and bucket (right) sea-surface temperatures reported by Brooks. [30]
Brooks also recruited an officer to monitor the ship-board measurements after Brooks himself concluded his experiments and disembarked. The errors after Brooks had departed the ship were about twice as large as when he was aboard. The simplest explanation is that care deteriorated, perhaps back to normal, when no one was looking. This result violates the standard assumption in the field that temperature sensor errors are constant for each ship.
In 1963 Saur reported the largest field calibration experiment of engine-intake thermometers, carried out by volunteers aboard twelve US military transport ships operating off the US central Pacific coast. [31] The experiment included 6826 pairs of observations. Figure 11 shows the experimental results from one voyage of one ship.
Figure 11: Systematic error in recorded engine intake temperatures aboard one military transport ship operating June-July, 1959. The mean systematic bias and uncertainty represented by these data are 1σ = 0.9±0.6 C.
Saur reported Figure 11 as, “a typical distribution of the differences” reported from the various ships. The ±0.6 C uncertainty about the mean systematic error is comparable to the values reported by Brooks, shown in Figure 10.
Saur concluded his report by noting that, “The average bias of reported sea water temperatures as compared to sea surface temperatures, with 95 percent confidence limits, is estimated to be 1.2±0.6 F [0.67±0.33 C] on the basis of a sample of 12 ships. The standard deviation of differences [between ships] is estimated to be 1.6 F [0.9 C]. Thus, without improved quality control the sea temperature data reported currently and in the past are for the most part adequate only for general climatological studies. [bracketed conversions added]” Saur’s caution is instructive, but has apparently been mislaid by consensus scientists.
Measurements from bathythermograph (BT) and expendable bathythermograph (XBT) instruments have also made significant contributions to the SST record. [32] Extensive BT and XBT calibration experiments revealed multiple sources of systematic error, principally stemming from mechanical problems and calibration errors. [33-35] Relative to a reversing thermometer standard, field BT measurements exhibited a 1σ error of 0.34±0.43 C. [35] This standard deviation is more than twice as large as the manufacturer-stated accuracy of ±0.2 C and reflects the impact of uncontrolled field variables.
The SST sensors in deployed floating and moored buoys were never field-calibrated during the 20th century, allowing no general estimate of systematic measurement error.
However, Emery estimated a 1σ = ±0.3 C error by comparison of SSTs from floating buoys co-located to within 5 km of each other. [28] SST measurements separated by less than 10 km are considered coincident.
A similar ±0.26 C buoy error magnitude was found relative to SSTs retrieved from the Advanced Along-Track Scanning Radiometer (AATSR) satellite. [36] The error distributions were non-normal.
More recently, Argo buoys were field calibrated against very accurate CTD (conductivity-temperature-depth) measurements and exhibited average RMS errors of ±0.56 C. [37] This is similar in magnitude to the reported average ±0.58 C buoy-Advanced Microwave Scanning Radiometer (AMSR) satellite SST difference. [38]
3. Discussion
Until recently, [39, 40] systematic temperature sensor measurement errors were neither mentioned in reports communicating the origin, assessment, and calculation of the global averaged surface air temperature record, nor were they included in error analyses. [15, 16, 39-46] Even now that systematic error has entered the published literature, however, the Central Limit Theorem is adduced to assert that such errors average to zero. [36] However, systematic temperature sensor errors are neither randomly distributed nor constant over time, space, or instrument. There is no theoretical reason to expect that these errors follow the Central Limit Theorem, [47, 48] or that they are reduced or removed by averaging multiple measurements, even when the measurements number in the millions. A complete inventory of contributions to uncertainty in the surface air temperature record must include, indeed must start with, the systematic measurement error of the temperature sensor itself. [39]
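The point about the Central Limit Theorem can be made concrete. In the sketch below (all distributions and magnitudes assumed for illustration), every instrument carries its own bias drawn from a skewed distribution; the grand mean of millions of readings converges to the mean bias, not to the true temperature:

```python
# Illustrative sketch: averaging millions of readings from instruments with
# skewed, non-zero-mean biases converges to the mean bias, not to the truth.
import numpy as np

rng = np.random.default_rng(2)
true_temp = 15.0                                   # C, assumed true value
biases = rng.gamma(2.0, 0.25, size=5_000)          # per-instrument bias, mean 0.5 C

total, count = 0.0, 0
for b in biases:
    r = true_temp + b + rng.normal(0, 0.3, 1_000)  # 1000 readings per instrument
    total += r.sum()
    count += r.size

print(f"{count} readings, grand mean error = {total / count - true_temp:+.3f} C")
# Prints an error near +0.5 C: the Central Limit Theorem tightens the mean
# around the biased expectation; it does not remove the bias.
```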
The World Meteorological Organization (WMO) offers useful advice regarding systematic error. [20]
“Section 1.6.4.2.3 Estimating the true value – additional remarks.
“In practice, observations contain both random and systematic errors. In every case, the observed mean value has to be corrected for the systematic error insofar as it is known. When doing this, the estimate of the true value remains inaccurate because of the random errors as indicated by the expressions and because of any unknown component of the systematic error. Limits should be set to the uncertainty of the systematic error and should be added to those for random errors to obtain the overall uncertainty. However, unless the uncertainty of the systematic error can be expressed in probability terms and combined suitably with the random error, the level of confidence is not known. It is desirable, therefore, that the systematic error be fully determined.”
Thus far, in production of the global averaged surface air temperature record, the WMO advice concerning systematic error has been honored primarily in the breach.
Systematic sensor error in air and sea-surface temperature measurements has been woefully under-explored and field calibrations are few. Nevertheless, the reported cases make it clear that the surface air temperature record is contaminated with a very significant level of systematic measurement error. The non-normality of systematic error means that subtracting an average bias will not discharge the measurement uncertainty about the global temperature mean.
Further, the magnitude of the systematic error bias in surface air temperature and SST measurements is apparently as variable in time and space as the magnitude of the standard deviation of systematic uncertainty about the mean error bias. For example, the mean systematic bias was 2 C over snow on the Plaine Morte Glacier, Switzerland, but 0.4 C over snow at Lincoln, Nebraska. Similar differences accrue to the engine-intake systematic error means reported by Brooks and Saur. Therefore, removing an estimate of mean bias will always leave a residual mean-bias uncertainty of ambiguous magnitude. In any complete evaluation of error, that residual uncertainty in the mean bias combines with the 1σ standard deviation of measurement uncertainty into the uncertainty total.
A complete evaluation of systematic error is beyond the analysis presented here. However, to the extent that the above errors are representative, a set of estimated uncertainty bars due to systematic error in the global averaged surface air temperature record can be calculated, Figure 12.
The uncertainty bars in Figure 12 (right) reflect a 0.7:0.3 SST:land-surface ratio of systematic errors. Combined in quadrature, bucket and engine-intake errors constitute the SST uncertainty prior to 1990. Over the same interval, the systematic error of the PRT/CRS sensor [39, 49] constituted the uncertainty in land-surface temperatures. Floating buoys made a partial contribution (0.25 fraction) to the uncertainty in SST between 1980 and 1990. After 1990 the uncertainty bars steadily shrink further, reflecting the increasing contribution and smaller errors of MMTS (land) and floating buoy (sea-surface) sensors.
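As a rough illustration of the weighting just described (the input values and the weighting convention here are assumptions for the sketch, not the published calculation), the pre-1990 combination might look like:

```python
# Illustrative sketch of the quadrature weighting described in the text.
# All input values and the weighting convention are assumptions.
import math

w_sst, w_land = 0.7, 0.3
sigma_bucket, sigma_intake = 0.6, 0.6    # C, illustrative (cf. Brooks, Saur)
sigma_crs = 0.41                         # C, PRT/CRS spread quoted above

sigma_sst = math.sqrt(sigma_bucket**2 + sigma_intake**2)  # buckets + intakes
sigma_total = math.sqrt((w_sst * sigma_sst)**2 + (w_land * sigma_crs)**2)
print(f"combined 1-sigma ~ ±{sigma_total:.2f} C")         # ~±0.61 C
```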
Figure 12: The 2010 global average surface air temperature record obtained from website of the Climate Research Unit (CRU), University of East Anglia, UK. http://www.cru.uea.ac.uk/cru/data/temperature/. Left, error bars following the description provided at the CRU website. Right, error bars reflecting the uncertainty width due to estimated systematic sensor measurement errors within the land and sea surface records. See the text for further discussion.
Figure 12 (right) is very likely a more accurate representation of the state of knowledge than is Figure 12 (left), concerning the rate or magnitude of change in the global averaged surface air temperature since 1850. The revised uncertainty bars represent non-normal systematic error. Therefore the air temperature mean trend loses any status as the most probable trend.
Finally, Figure 13 pays attention to the instrumental resolution of the historical meteorological thermometers.
Figure 13 caused some angry shouts from the audience at Erice, followed by some very rude approaches after the talk, and a lovely debate by email. The argument presented here prevailed.
Instrumental resolution defines the measurement detection limit. For example, the best-case historical 19th to mid-20th century liquid-in-glass (LiG) meteorological thermometers included 1 C graduations. The best-case laboratory-conditions reportable temperature resolution is therefore ±0.25 C. There can be no dispute about that.
The standard SST bucket LiG thermometers from the Challenger voyage on through the 20th century also had 1 C graduations. The same resolution limit applies.
The very best American ship-board engine-intake thermometers included 2 F (~1 C) graduations; on British ships they were 2 C. The very best resolution is then about ±(0.25 – 0.5) C. These are known quantities. Resolution uncertainty, like systematic error, does not average away. Knowing the detection limits of the classes of instruments allows us to estimate the limit of resolution uncertainty in any compiled historical surface air temperature record.
Figure 13 shows this limit of resolution. It compares the historical instrumental ±2σ resolution with the ±2σ uncertainty in the published Berkeley Earth air temperature compilation. The analysis applies equally well to the published surface air temperature compilations of GISS or CRU/UKMet, which feature the same uncertainty limits.
Figure 13: The Berkeley Earth global averaged air temperature trend with the published ±2σ uncertainty limits in grey. The time-wise ±2σ instrumental resolution is in red. On the right in blue is a compilation of the best resolution limits of the historical temperature sensors, from which the global resolution limits were calculated.
The globally combined instrumental resolution was calculated using the same fractional contributions as were noted above for the lower-limit estimate of systematic measurement error. That is, 0.30:0.70 land:sea-surface instruments, and the published historical fractional use of each sort of instrument (land: CRS vs. MMTS; sea surface: buckets vs. engine intakes vs. buoys).
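For a hypothetical year with, say, an 80:20 bucket-to-engine-intake split at sea, the fractional combination might be sketched as follows (the split and the combination rule are assumptions for illustration; the resolution values follow from the graduation limits above):

```python
# Illustrative sketch of combining instrumental resolution limits by
# fractional use. The 80:20 sea-surface split is a hypothetical example.
import math

res_crs, res_bucket, res_intake = 0.25, 0.25, 0.5  # C, from 1 C / 2 C graduations

res_sea = math.sqrt(0.8 * res_bucket**2 + 0.2 * res_intake**2)
res_global = math.sqrt(0.7 * res_sea**2 + 0.3 * res_crs**2)
print(f"global resolution limit ~ ±{res_global:.2f} C (1-sigma)")  # ~±0.30 C
```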
The record shows that during the years 1800-1860, the published global uncertainty limits of field meteorological temperatures equal the accuracy of the best possible laboratory-conditions measurements.
After about 1860 through 2000, the published resolution is smaller than the detection limits — the resolution limits — of the instruments themselves. From at least 1860, accuracy has been magicked out of thin air.
Does anyone find the published uncertainties credible?
All you engineers and experimental scientists out there may go into shock after reading this. I was certainly shocked by the realization. Espresso helps.
The people compiling the global instrumental record have neglected an experimental limit even more basic than systematic measurement error: the detection limits of their instruments. They have paid it no attention.
Resolution limits and systematic measurement error produced by the instrument itself constitute lower limits of uncertainty. The scientists engaged in consensus climatology have neglected both of them.
It’s almost as though none of them have ever made a measurement or struggled with an instrument. There is no other rational explanation for that sort of negligence than a profound ignorance of experimental methods.
The uncertainty estimate developed here shows that the rate or magnitude of change in global air temperature since 1850 cannot be known within ±1 C prior to 1980 or within ±0.6 C after 1990, at the 95% confidence interval.
The rate and magnitude of temperature change since 1850 is literally unknowable. There is no support at all for anything “unprecedented” in the surface air temperature record.
Claims of highest air temperature ever, based on even 0.5 C differences, are utterly insupportable and without any meaning.
All of the debates about highest air temperature are no better than theological arguments about the ineffable. They are, as William F. Buckley called them, “Tedious speculations about the inherently unknowable.”
There is no support in the temperature record for any emergency concerning climate. Except, perhaps an emergency in the apparent competence of AGW-consensus climate scientists.
4. Acknowledgements: Prof. Hendrik Huwald and Dr. Marc Parlange, Ecole Polytechnique Federale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland, are thanked for generously providing the Plaine Morte sensor calibration data entering into Figure 4, Figure 5, and Figure 6. This work was carried out without any external funding.
5. References
[1] JCGM, Evaluation of measurement data — Guide to the expression of uncertainty in measurement 100:2008, Bureau International des Poids et Mesures: Sevres, France.
[2] Frank, P., et al., Determination of ligand binding constants for the iron-molybdenum cofactor of nitrogenase: monomers, multimers, and cooperative behavior. J. Biol. Inorg. Chem., 2001. 6(7): p. 683-697.
[3] Frank, P. and K.O. Hodgson, Cooperativity and intermediates in the equilibrium reactions of Fe(II,III) with ethanethiolate in N-methylformamide solution. J. Biol. Inorg. Chem., 2005. 10(4): p. 373-382.
[4] Hinkley, N., et al., An Atomic Clock with 10⁻¹⁸ Instability. Science, 2013. 341: p. 1215-1218.
[5] Parker, D.E., et al., Interdecadal changes of surface temperature since the late nineteenth century. J. Geophys. Res., 1994. 99(D7): p. 14373-14399.
[6] Quayle, R.G., et al., Effects of Recent Thermometer Changes in the Cooperative Station Network. Bull. Amer. Met. Soc., 1991. 72(11): p. 1718-1723; doi: 10.1175/1520-0477(1991)072<1718:EORTCI>2.0.CO;2.
[7] Hubbard, K.G., X. Lin, and C.B. Baker, On the USCRN Temperature system. J. Atmos. Ocean. Technol., 2005. 22: p. 1095-1101.
[8] van der Meulen, J.P. and T. Brandsma, Thermometer screen intercomparison in De Bilt (The Netherlands), Part I: Understanding the weather-dependent temperature differences). International Journal of Climatology, 2008. 28(3): p. 371-387.
[9] Barnett, A., D.B. Hatton, and D.W. Jones, Recent Changes in Thermometer Screen Design and Their Impact, in Instruments and Observing Methods, WMO Report No. 66, J. Kruus, Editor. 1998, World Meteorological Organization: Geneva.
[10] Lin, X., K.G. Hubbard, and C.B. Baker, Surface Air Temperature Records Biased by Snow-Covered Surface. Int. J. Climatol., 2005. 25: p. 1223-1236; doi: 10.1002/joc.1184.
[11] Hubbard, K.G. and X. Lin, Realtime data filtering models for air temperature measurements. Geophys. Res. Lett., 2002. 29(10): p. 1425 1-4; doi: 10.1029/2001GL013191.
[12] Huwald, H., et al., Albedo effect on radiative errors in air temperature measurements. Water Resources Res., 2009. 45: p. W08431, 1-13.
[13] Menne, M.J. and C.N. Williams, Homogenization of Temperature Series via Pairwise Comparisons. J. Climate, 2009. 22(7): p. 1700-1717.
[14] Briffa, K.R. and P.D. Jones, Global surface air temperature variations during the twentieth century: Part 2 , implications for large-scale high-frequency palaeoclimatic studies. The Holocene, 1993. 3(1): p. 77-88.
[15] Hansen, J. and S. Lebedeff, Global Trends of Measured Surface Air Temperature. J. Geophys. Res., 1987. 92(D11): p. 13345-13372.
[16] Brohan, P., et al., Uncertainty estimates in regional and global observed temperature changes: A new data set from 1850. J. Geophys. Res., 2006. 111: p. D12106, 1-21; doi: 10.1029/2005JD006548; see http://www.cru.uea.ac.uk/cru/info/warming/.
[17] Karl, T.R., et al., The Recent Climate Record: What it Can and Cannot Tell Us. Rev. Geophys., 1989. 27(3): p. 405-430.
[18] Hubbard, K.G., X. Lin, and E.A. Walter-Shea, The Effectiveness of the ASOS, MMTS, Gill, and CRS Air Temperature Radiation Shields. J. Atmos. Oceanic Technol., 2001. 18(6): p. 851-864.
[19] MacHattie, L.B., Radiation Screens for Air Temperature Measurement. Ecology, 1965. 46(4): p. 533-538.
[20] Rüedi, I., WMO Guide to Meteorological Instruments and Methods of Observation: WMO-8 Part I: Measurement of Meteorological Variables, 7th Ed., Chapter 1. 2006, World Meteorological Organization: Geneva.
[21] Berry, D.I. and E.C. Kent, Air–Sea fluxes from ICOADS: the construction of a new gridded dataset with uncertainty estimates. International Journal of Climatology, 2011: p. 987-1001.
[22] Challenor, P.G. and D.J.T. Carter, On the Accuracy of Monthly Means. J. Atmos. Oceanic Technol., 1994. 11(5): p. 1425-1430.
[23] Kent, E.C. and D.I. Berry, Quantifying random measurement errors in Voluntary Observing Ships’ meteorological observations. Int. J. Climatol., 2005. 25(7): p. 843-856; doi: 10.1002/joc.1167.
[24] Kent, E.C. and P.G. Challenor, Toward Estimating Climatic Trends in SST. Part II: Random Errors. Journal of Atmospheric and Oceanic Technology, 2006. 23(3): p. 476-486.
[25] Kent, E.C., et al., The Accuracy of Voluntary Observing Ships’ Meteorological Observations-Results of the VSOP-NA. J. Atmos. Oceanic Technol., 1993. 10(4): p. 591-608.
[26] Rayner, N.A., et al., Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century. Journal of Geophysical Research-Atmospheres, 2003. 108(D14).
[27] Emery, W.J. and D. Baldwin. In situ calibration of satellite sea surface temperature. in Geoscience and Remote Sensing Symposium, 1999. IGARSS ’99 Proceedings. IEEE 1999 International. 1999.
[28] Emery, W.J., et al., Accuracy of in situ sea surface temperatures used to calibrate infrared satellite measurements. J. Geophys. Res., 2001. 106(C2): p. 2387-2405.
[29] Woodruff, S.D., et al., The Evolving SST Record from ICOADS, in Climate Variability and Extremes during the Past 100 Years, S. Brönnimann, et al. eds, 2007, Springer: Netherlands, pp. 65-83.
[30] Brooks, C.F., Observing Water-Surface Temperatures at Sea. Monthly Weather Review, 1926. 54(6): p. 241-253.
[31] Saur, J.F.T., A Study of the Quality of Sea Water Temperatures Reported in Logs of Ships’ Weather Observations. J. Appl. Meteorol., 1963. 2(3): p. 417-425.
[32] Barnett, T.P., Long-Term Trends in Surface Temperature over the Oceans. Monthly Weather Review, 1984. 112(2): p. 303-312.
[33] Anderson, E.R., Expendable bathythermograph (XBT) accuracy studies; NOSC TR 550 1980, Naval Ocean Systems Center: San Diego, CA. p. 201.
[34] Bralove, A.L. and E.I. Williams Jr., A Study of the Errors of the Bathythermograph 1952, National Scientific Laboratories, Inc.: Washington, DC.
[35] Hazelworth, J.B., Quantitative Analysis of Some Bathythermograph Errors 1966, U.S. Naval Oceanographic Office Washington DC.
[36] Kennedy, J.J., R.O. Smith, and N.A. Rayner, Using AATSR data to assess the quality of in situ sea-surface temperature observations for climate studies. Remote Sensing of Environment, 2012. 116(0): p. 79-92.
[37] Hadfield, R.E., et al., On the accuracy of North Atlantic temperature and heat storage fields from Argo. J. Geophys. Res.: Oceans, 2007. 112(C1): p. C01009.
[38] Castro, S.L., G.A. Wick, and W.J. Emery, Evaluation of the relative performance of sea surface temperature measurements from different types of drifting and moored buoys using satellite-derived reference products. J. Geophys. Res.: Oceans, 2012. 117(C2): p. C02029.
[39] Frank, P., Uncertainty in the Global Average Surface Air Temperature Index: A Representative Lower Limit. Energy & Environment, 2010. 21(8): p. 969-989.
[40] Frank, P., Imposed and Neglected Uncertainty in the Global Average Surface Air Temperature Index. Energy & Environment, 2011. 22(4): p. 407-424.
[41] Hansen, J., et al., GISS analysis of surface temperature change. J. Geophys. Res., 1999. 104(D24): p. 30997–31022.
[42] Hansen, J., et al., Global Surface Temperature Change. Rev. Geophys., 2010. 48(4): p. RG4004 1-29.
[43] Jones, P.D., et al., Surface Air Temperature and its Changes Over the Past 150 Years. Rev. Geophys., 1999. 37(2): p. 173-199.
[44] Jones, P.D. and T.M.L. Wigley, Corrections to pre-1941 SST measurements for studies of long-term changes in SSTs, in Proc. Int. COADS Workshop, H.F. Diaz, K. Wolter, and S.D. Woodruff, Editors. 1992, NOAA Environmental Research Laboratories: Boulder, CO. p. 227–237.
[45] Jones, P.D. and T.M.L. Wigley, Estimation of global temperature trends: what’s important and what isn’t. Climatic Change, 2010. 100(1): p. 59-69.
[46] Jones, P.D., T.M.L. Wigley, and P.B. Wright, Global temperature variations between 1861 and 1984. Nature, 1986. 322(6078): p. 430-434.
[47] Emery, W.J. and R.E. Thomson, Data Analysis Methods in Physical Oceanography. 2nd ed. 2004, Amsterdam: Elsevier.
[48] Frank, P., Negligence, Non-Science, and Consensus Climatology. Energy & Environment, 2015. 26(3): p. 391-416.
[49] Folland, C.K., et al., Global Temperature Change and its Uncertainties Since 1861. Geophys. Res. Lett., 2001. 28(13): p. 2621-2624.
Add this to a total lack of understanding of the scientific method and you have fully defined the current state of consensus climate research.
That is a great point, Joe. Everything I have seen out of “climate scientists”, especially the government funded ones, indicates they do not understand the scientific method or at least refuse to honor it.
Am I missing something? I don’t see the problem.
This is why anomalies are used, instead of absolute temperatures. As long as the errors remain constant over many years, the anomaly plot will give the correct information. The absolute temperature may be wrong on many occasions, but it will remain consistently wrong and predictably wrong, and so the anomaly differential will remain constant over time. And consistent with other stations, if they have the same instrument giving the same constant and predictable errors.
Systematic errors are only introduced into the system when something changes – like increasing UHI effects; like aircraft taxiing past; like someone placing an air conditioning unit next to the sensor; or the sensor being replaced by something with a different set of constant errors, etc. Or, if the scientists systematically adjust historic data down, and recent data up.
So why is a consistent instrument error a problem, in this particular case?
R
Ralf:
You’re correct. It isn’t.
The errors are cancelled given that the same instruments are used and they do not develop a fault. LIG thermometers won’t.
Also, I seem to recall a fuss made about the correction for the TOBs systematic error in the U.S. GISS dataset.
Ah well.
ralfellis and ToneB, it’s not immediately clear to me that either using anomalies or making a large number of measurements will alleviate issues associated with the resolution of an instrument.
Let’s do a simple thought experiment and consider a thermometer with a resolution of 1K. If an observer records the temperature as 283K, it means that the temperature lies with equal probability at any value between 282.5 and 283.5K. Now let’s say the actual temperature is 282.7K. No matter how many times we measure it and average the results we will get 283K. This is different from the true temperature. Similarly the anomaly will also be incorrect and limited by the resolution of the thermometer.
If one were to do a full budget of the errors then the resolution contribution could be determined by dividing the resolution by the square root of 12, i.e. approximately 1K/3.5, or a little under 0.3K. Interestingly if we had a digital instrument based on A-D conversion then the error would be 1K/sqrt 3 or about 0.6K. This is because of the differential non-linearity of A-D systems.
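For anyone who wants to check the two divisors, here is the arithmetic above in a few lines of Python:

```python
# The two resolution divisors mentioned above, computed directly.
import math

w = 1.0  # K, thermometer resolution
print(f"rectangular (uniform) case: {w / math.sqrt(12):.3f} K")  # 0.289
print(f"A-D conversion case:        {w / math.sqrt(3):.3f} K")   # 0.577
```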
The results are not very intuitive and it takes a considerable amount of thinking to fully grasp the implications of low resolution. I only know because I have designed a measuring instrument, in this case a mass spectrometer, and needed to understand the accuracy and precision of the detection systems. Ironically the mass spectrometer is to help measure paleo-temperatures!
I have not read Pat Frank’s article in full but have sympathy with the frustration of people using data without an understanding of the measurement system.
Paul Dennis: Resolution is a different issue than systematic error. Use of anomalies compensates for the latter. Taking a large number of measurements compensates for the former (read up on the Law of Large Numbers).
Tom Dayton, I think you are wrong. If one’s resolution is high compared to measurement precision then one can repeat the measurement many times and take the standard error of the mean. However, if the resolution is poor then no amount of repeated measurements allows one to recover a precision that isn’t there.
The reductio ad absurdum of your statement is this: with a thermometer of 1K resolution, if I make a thousand measurements of a signal with random noise varying between 282.5 and 283.1K and a mean of 282.8K, I measure it as 283K +/- 0K! That is neither accurate nor a representation of the noise of the measurement.
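The reductio is easy to reproduce numerically (an illustrative sketch, using the noise band just described):

```python
# Noise confined within one rounding bin is erased by 1 K quantization.
import numpy as np

rng = np.random.default_rng(3)
signal = rng.uniform(282.5, 283.1, size=1_000)   # true mean ~282.8 K
readings = np.round(signal)                      # 1 K resolution thermometer

print(readings.mean(), readings.std())           # 283.0 and 0.0: noise gone
```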
Please read Pat Frank’s article because he is not just discussing systematic errors. Figure 13 and the accompanying discussion is all about instrument resolution and its impact on measurement precision.
“If we do a simple thought experiment and consider a thermometer with a resolution of 1K.”
This simply means the thermometer gives an output that is constrained to whole degrees. It says nothing about its precision.
“If an observer records the temperature as 283K it means that the temperature lies with equal probability at all values between 282.5 and 283.5K.”
Only if the thermometer can differentiate between 282.49 and 282.51 reliably, which would mean it has a very high precision relative to the whole degree output.
“Now let’s say the actual temperature is 282.7K. No matter how many times we measure it and average the results we will get 283K.”
A more realistic scenario would be that the (whole-degree-output) thermometer says 283 in 7 out of 10 cases and 282 in 3 out of ten cases. And even this is oversimplified. However I hope you will start looking into your misconceptions.
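That scenario is also easy to check with a few lines of code (a sketch, under the assumption that the noise is wide enough to cross the 282.5 K boundary):

```python
# When noise straddles the rounding boundary, averaging the quantized
# readings recovers a sub-resolution estimate of the mean (dithering).
import numpy as np

rng = np.random.default_rng(4)
true_temp = 282.7                                  # K
noise_sd = 0.5                                     # K, assumed
readings = np.round(true_temp + rng.normal(0, noise_sd, size=10_000))

print(f"mean of 1 K-quantized readings: {readings.mean():.2f} K")  # ~282.70
```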
And yes, anomalies filter out systematic biases (as long as the bias is constant and does not itself depend on the absolute, for instance when a temperature measurement bias gets higher when the absolute temperature is higher).
” This is different from the true temperature. Similarly the anomaly will also be incorrect and limited by the resolution of the thermometer.”
As shown above precision of a temperature measuring device is different from its resolution.
Met instruments are not required to be accurate to better than 0.1C. In climate we would like them to be.
However it does not matter that they may be inaccurate because they are consistent.
Mercury/alcohol-in-glass will be wrong by a consistent amount – and in addition the UKMO uses platinum resistance thermometers, for instance, which have excellent stability and only require calibration every 8 years.
http://www.metoffice.gov.uk/guide/weather/observations-guide/how-we-measure-temperature
In addition there are tens of thousands of thermometers around the world used in GMT datasets, and I would suggest that they do not all err in the same direction. Just as with tossing a coin or rolling a die, the probability of a bias towards heads or towards 6s converges to zero.
This is a non-issue.
Wagen,
again I am sorry you are wrong here. I suggest you read some texts on measurement theory. Take your example of a precise thermometer with resolution of 1 degree. As it’s precise it will always read 283K for a temperature of 282.7K and not 7 times out of 10!
Where did I say precision was the same as resolution? My example was to highlight that it is not. I’m sorry your example seems to confuse them.
The errors are not constant, Toneb. Not in single instruments, not across instruments. That’s the impact of uncontrolled environmental variables.
pauldennis, nice to see a post from you here. 🙂
I believe the square root of 12 adjustment comes from treating resolution as triangular, rather than rectangular. I’ve never been happy with that, because it presumes a higher weight on the center of the range.
I’ve never seen a good rationale for that, and have always taken the rectangular resolution to be a more forthright statement of uncertainty.
Tom Dayton, taking anomalies does not remove the varying systematic error stemming from uncontrolled variables.
No large number of measurements improves the limits of instrumental detection. If the data are not there in one measurement, they’re not there in a sum of such measurements, no matter the number.
Toneb, it appears the meaning of the graphical calibration displays of non-normal, time-and-space-varying systematic errors is lost on you.
Pat, thank you for the welcome. I think you are right that the sqrt 12 does derive from a triangular distribution. I agree this weights for the centre of the range and I’m not sure that this is a best estimate of the uncertainty. My inclination would be that to use a rectangular distribution would be more robust. The only reason why an observer would weight for the centre of the range is that towards the limit they may bin data in the lower or upper range. However this is a supposition and unquantifiable so it seems that using sqrt 3, as you say, is more forthright and a better assessment of the uncertainty.
“Now let’s say the actual temperature is 282.7K. No matter how many times we measure it and average the results we will get 283K.”
I don’t see what situation in climate science this refers to. It is very unlikely that the exact same temperature would be repeatedly measured, especially with min/max thermometers. A much more reasonable thing to look at is a month average of varying temperatures by that thermometer. Then the number of times this 1° thermometer will round up is about equal to the number of times it will round down, cancelling in the sum. In fact, you can easily emulate this situation. Average 100 random numbers between 0 and 1. You’ll get close to 0.5. Then average 100 integer-rounded such numbers. You’ll get a mean with a binomial distribution, as if tossing a coin 100 times: 0.5 ± 0.05. Effectively, your thermometer does this modulo 1.
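The emulation takes only a few lines (a direct Python transcription of the experiment described):

```python
# Average 100 uniform [0,1] numbers, then the same numbers rounded to integers.
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 1, size=100)
print(f"full-resolution mean: {x.mean():.3f}")            # ~0.5 +/- 0.029
print(f"integer-rounded mean: {np.round(x).mean():.3f}")  # ~0.5 +/- 0.05
```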
Nick, the average may or may not be close to 0.5. I have just tried and my first attempt with 100 numbers at 0.01 and 0.1 resolution gave 0.5289 (0.01) and 0.5572 (0.1). These means differ by nearly 0.03 and are at the 1 sigma uncertainty given by the resolution (0.1/sqrt 12). Of course a month is just 30 days, with every possibility that the rounding up or down is not truly random, and differences between true means and the estimated mean could be greater.
Paul,
I presume you are rounding to respectively two figures and one. The problem there is that the resolution is too good; the se of the mean with perfect resolution is .1/sqrt(12)= .0289, and degrading the resolution to .01 makes virtually no difference, and .1 not much more. But resolution 1 (integer rounding) shows a difference. I got in my first three tries, 0.58, 0.38 and 0.51. Obviously poorer se than the hi-res se of 0.0289 (the theoretical is .05). But a lot better than resolution 1.
Paul,
Just expanding on the math there, I think it is that the expected standard error of the mean of 100 numbers, randomly distributed between 0 and 1 but measured to resolution r, is
sqrt((1+r)(1+r/2)/12)
So here is a table of the expected se of the mean of 100 readings, from the formula above:

res r: 0.01    0.1     1
se:    0.0291  0.0310  0.0500
So at resolution 0.01, the se far exceeds the res. But at 0.1, there is a definite improvement, and for 1 the improvement is very great. Of course, if you average 10000, the SE’s divide by 10, so then even at res 0.1, the SE is far more accurate at 0.0031.
sqrt((1+r)(1+r/2)/12)
should be sqrt((1+r)(1+r/2)/12/N) where N is number of readings (actually, probably should strictly be N-2)
“sqrt((1+r)(1+r/2)/12/N) where N is number of readings (actually, probably should strictly be N-2)”
What’s r, the reading?
Never mind Nick.
I see up thread it’s resolution. I read these through email, so it was only after posting I saw the answer.
Nick Stokes, the issue is resolution not random offsets. Resolution means the instrument is not sensitive to magnitude differences within the limits of that resolution.
It’s not that the data are noisy. It’s that the data are nonexistent. There is no way to recover missing data by taking averages of measurements with missing data.
Nick Stokes, here’s what JCGM Guide to Uncertainty in Measurement (2 MB pdf) says about instrumental resolution (transposing their argument into temperature):
“If the resolution of the indicating device is δx, the value of the stimulus that produces a given indication X can lie with equal probability anywhere in the interval X − δx/2 to X + δx/2. The stimulus is thus described by a rectangular probability distribution of width δx with variance u^2 = (δx)^2/12, implying a standard uncertainty of u = 0.29δx for any indication.
Thus a thermometer whose smallest significant digit is 1 C has a variance due to the resolution of the device of u² = (1/12)(1 C)² and a standard uncertainty of u = [1/sqrt(12)](1 C) = 0.29 C.”
This is typical of the standard applied to reading LiG thermometers, where the resolution is taken to be 0.25 of the smallest division.
The 1/12 above results from treating the resolution as an a priori rectangular probability distribution, rather than as hard ±δx/2 bounds; a relaxation of conservative rigor.
No average of any number of ±0.29 C resolution temperatures will reduce that uncertainty.
Pat Frank,
You say “the issue is resolution not random offsets” but then quote JCGM saying:
“the value of the stimulus that produces a given indication X can lie with equal probability anywhere in the interval X − δx/2 to X + δx/2. The stimulus is thus described by a rectangular probability distribution of width δx with variance u^2 = (δx)^2/12,”
Sounds exactly like specifying random offsets. In fact, it describes exactly the arithmetic I was doing. You read one day, offset σ 0.289. Read the next day, that’s an independent offset. The variances sum, so the sum has variance 1/6, or σ = sqrt(1/6). But the average has half that σ, or sqrt(1/24) = 0.204. And so on.
Here is a full example. On pages like this, BoM shows the daily max for each recent month in Melbourne, to one decimal place. Here is last month (Mar):
Suppose we had a thermometer reading to only 1°C – so all these were rounded, as in the JCGM description. For the last 13 months, here are the means for the BoM and for that thermometer:
Month: Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec   Jan   Feb   Mar
1 dp:  22.72 19.24 17.13 14.43 13.29 13.85 17.26 24.33 22.73 27.45 25.98 25.1  24.86
0 dp:  22.77 19.27 17.13 14.37 13.29 13.84 17.33 24.35 22.67 27.48 26    25.17 24.84
diff:  0.05  0.03  0.00  -0.06 0.00  -0.01 0.08  0.03  -0.06 0.03  0.02  0.08  -0.02

The middle row, measured by day to 1°C, has a far more accurate mean than that resolution. As a check, the sd of the difference (bottom row) is expected to be sqrt(1/12/31) (slight approx for days in month), which is 0.052. The sd of the diffs shown is 0.045.
You can check.
Nick Stokes, you wrote, “You read one day, offset σ 0.289. Read the next day, that’s an independent offset. ”
Not correct. The meaning of the ±0.289 resolution is that one does not know where the true temperature lies within ±0.289 of the reading.
It’s not an offset. It’s a complete absence of data within that ± range.
The next day’s reading includes an identical ignorance width. The resolution uncertainties are 100% correlated. The mean of two readings has an uncertainty of ±0.41 C.
In your BOM example, given a rectangular resolution of ±0.289 C, all those daily readings would be appended with ±0.3 C. The monthly readings would be immediately tagged as having one too many significant figures.
The resolution propagated as a rectangular uncertainty into a 30-day monthly mean is sqrt[(0.289)^2*(30/29)] = ±0.3 C.
Your 1 C rounding example includes a subtle but fatal error. That is, you assume that the significant digits are truly accurate to 0.1 C and that their rounding is therefore consistent with true accuracy.
However, limited resolution means that the temperatures are not accurate to that limit. They are accurate only to ±0.289 C. This means the recorded numbers have a cryptic error relative to the true air temperature.
So, even though the rounded values average to near the mean of the unrounded values, both means include the cryptic error due to the ±0.289 C resolution detection limit.
There’s no getting around it, Nick. Resolution is a knowledge limit, and nothing gets it back.
“The resolution uncertainties are 100% correlated.”
100% wrong. They are independent. Take the March max’s I showed above. The rounding residues, multiplied by 10 for display, are:
-3 -3 -1 0 -3 2 -1 -1 5 1 1 3 2 3 -2 4 5 -5 -2 3 -5 3 -2 2 4 2 -1 2 -1 -3 -3
The lag 1 autocorrelation coefficient is -0.057. That small value is consistent with zero and certainly not with 1. And the sd (without x10) is 0.289; coincidentally almost exactly the theoretical sqrt(1/12).
“However, limited resolution means that the temperatures are not accurate to that limit. They are accurate only to ±0.289 C. This means the recorded numbers have a cryptic error relative to the true air temperature.”
Where on Earth do you get this from? There is no reason to believe the BoM numbers are limited to 1° resolution. But in any case, there will be some other set of accurate underlying values which will give similar arithmetic. The point of my example is that degrading any similar set of T to integer resolution creates a 0.05 (sqrt(1/12/31)) uncertainty in the mean of 31.
“both means include the cryptic error due to the ±0.289 C resolution “
Again, how on Earth can you assign such an error to the BoM values? But there are no “cryptic errors” here. I have just formed the means directly and shown their observed distribution.
I have written a post on this stuff here.
Nick, the resolution uncertainties are 100% correlated because every single one of your instruments comes with an identical limit of resolution.
Your analysis compares the recorded temperatures. You’re treating them as though they are physically accurate. You’re looking at their lags and difference SDs and proceeding as though those quantities told you something about accuracy. They don’t. Internal comparisons tell you nothing about accuracy. They tell you only about precision.
Resolution — the limit of detection — tells us that all the BOM numbers are erroneous with respect to the physically correct temperatures. But we do not know the physically correct temperature, so that the true magnitude of resolution-limited measurement error is forever unknown.
You wrote, “There is no reason to believe the BoM numbers are limited to 1° resolution.” Except there is every reason to believe that, Nick, because in your example the instrumental detection limit is invariably ±0.289 C. There’s no getting away from that fact of grim instrumental reality.
The point of your example is that it misses the meaning of instrumental resolution.
You wrote, “Again, how on Earth can you assign such an error to the BoM values? But there are no “cryptic errors” here. ” Cryptic errors arise from the limits of detection, Nick. Resolution limits means there is an unknown difference between the instrumental reading and the physically true magnitude. That difference is the error that follows from the instrumental detection limit — the resolution.
Every single BOM temperature includes that error within its recorded value. The mean of those erroneous temperatures will include the mean resolution error, whatever that is. We can only estimate the resolution error in the mean as the resolution uncertainty, ±0.289 C.
All you’ve demonstrated is that perfect rounding of an erroneous series yields a mean with about the same error as the mean of the unrounded series.
Your demonstration has nothing whatever to do with the physical accuracy of the temperatures or of their mean.
Their “observed distribution” is about precision, Nick, not about accuracy.
“Except there is every reason to believe that, Nick, because in your example the instrumental detection limit is invariably ±0.289 C.“
This is just nuts. No, in my example I took the actual BoM data and a hypothetical thermometer which could be (or was) read to 1°C accuracy. The latter might have that limit; there is no reason to say that it applies to the BoM instruments.
Nick, it’s not “just nuts.”
Rather, it’s that you apparently do not understand limits of resolution. Data — information — ceases at the detection limit, Nick.
The BOM values include both systematic error and resolution error with respect to the unknown true air temperatures.
For a 1 C thermometer, resolution error means that the true air temperature can be anywhere within ±0.289 C of the recorded temperature. The recorded temperature will include some magnitude of error that will forever remain unknown.
The reliability of the measurement can be given only as the ±uncertainty of the known instrumental resolution (assuming systematic error is zero).
Absent a knowledge of the true air temperatures, we (you) do not know the true magnitude of the error within the BOM values.
No amount of internal analysis will reveal that error.
When you rounded to 1 C, you perfectly rounded erroneous values. Of course the rounded means will correspond to the unrounded means. But both means will include the original resolution-limited measurement error.
There’s no getting around it.
I tried posting a reply at Nick Stokes’s site, where he posted his full BOM temperature analysis that tries to show instrumental resolution doesn’t matter in averages, and which he advertised at WUWT here.
However, my reply always vanished into the ether, no matter whether I posted as “Anonymous” or under my OpenID URL.
So, I’ve decided to post the reply here at WUWT, where valid comments always find a home. One hopes the track-back notice will show up on Nick’s site.
If you visit there to take a look, notice the generous personal opinions expressed by the company.
Here’s the reply:
There seems to be little apparent understanding of instrumental resolution — limits of detection — within the comments here [at Nick’s site — P], and certainly so in Nick’s analysis.
Also, Eli’s sampling rejoinder misses the point.
Look, Nick’s BOM example claims to show that measurements limited to 1 C divisions, yield means of much higher accuracy.
However, resolution uncertainty means, e.g., a LiG thermometer with 1 C graduations cannot be read to temperature differences smaller than its detection limit of ±0.289 C (LiG = liquid-in-glass).
In such a case, all we can know is that the physically real air temperature is somewhere within ±0.289 C of the temperature readout on the thermometer.
Applying this concept to the BOM temperatures: every BOM temperature measurement has some unknown error hidden within it because of instrumental resolution. Limited resolution — the detection limit — means there is always a divergence of unknown magnitude between the readout value and the true air temperature.
Nick has shown that rounding those temperature measurements to 1 C yields a mean value very close to the mean of the unrounded values.
However, when the recorded temperatures include a cryptic error arising from limited resolution, the total set of rounded values will include erroneous roundings. These encode the error into the set of rounded values. The cryptic error is then propagated into the mean of the rounded values. The rounded temperatures then converge to a mean with the same error as the mean of the unrounded temperatures.
That is, rounding erroneous values means the rounding itself is erroneous. There is no cause to think the rounding errors are randomly distributed because there is no cause to think the resolution errors are randomly distributed.
The thermometer cannot resolve temperature differences smaller than its resolution. That puts errors into the measurements, and we have no idea of the true error magnitudes or of their true distribution.
The fact that Nick’s two means are close in value doesn’t imply that resolution doesn’t matter. It implies that the mean of large numbers of perfectly rounded values converges to the mean of the original unrounded values.
That is, Nick’s example merely shows that perfectly rounded erroneous temperatures converge to the same incorrect mean as the original unrounded erroneous temperatures.
Rounding does not remove the resolution-derived measurement error present in the unrounded temperature record. The rounded mean converges to the same error as the unrounded mean.
The true error distribution of those measurements cannot be appraised because the true air temperatures are not known. This is obviously true for a LiG thermometer, where the sensor detection limit is evident by inspection.
The same is true for digital thermometers as well, though. Suppose, for example, one has a digital thermometer of resolution ±1 C. That resolution does not refer to the readout, which can have as many digits as one likes. Resolution instead refers to the sensitivity of the instrumental electronics to changes in temperature, where “electronics” includes the sensing element.
When the instrument itself is not sensitive to temperature changes of less than 1 C, no amount of dithering will improve the ±1 C resolution because no higher accuracy information about the true air temperature is resident in the instrument.
Now suppose we have a digital sensor of ±0.1 C precision but of unknown accuracy, and so carry out a calibration experiment using a higher-accuracy thermometer and a water bath. We construct a calibration curve for our less accurate instrument — water bath temperature from the high-accuracy instrument vs. low-accuracy readout.
The calibration curve need not be linear; merely smooth, univariate, and replicable. We can use the calibration curve to correct the measured temperatures. We are now able to measure temperatures in our lab to ±0.1 C, by reference to the calibration curve.
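A sketch of that static calibration, with entirely hypothetical calibration points (a quadratic is used here only as an example of a smooth, univariate, replicable curve):

```python
import numpy as np

# Hypothetical calibration data: water-bath temperature from the
# high-accuracy reference thermometer vs. readout of the sensor
# under test, both in C (all values invented for illustration).
reference = np.array([5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0])
readout   = np.array([5.4, 10.3, 15.1, 20.2, 25.4, 30.6, 35.9])

# Fit reference temperature as a smooth function of readout.
correct = np.poly1d(np.polyfit(readout, reference, deg=2))

# Correct a new measurement -- valid only at the SAME ambient
# conditions under which the calibration was done.
print(f"readout 22.7 C -> corrected {correct(22.7):.2f} C")
```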
Now the instrument goes outside, where the air temperature is variable. What happens to the response function of the instrument when the electronics themselves are exposed to different temperatures?
They no longer are bathed in 25 C air. Does the calibration curve constructed at 25 C — the lab air temperature — also apply at -10 C, at 0 C, at 10 C, or at 35 C? We don’t know because it’s not been determined. What do we do with our prior lab calibration curve? It no longer applies, except at outside air temperatures near 25 C.
This question has been investigated. X. Lin and K.G. Hubbard (2004) “Sensor and Electronic Biases/Errors in Air Temperature Measurements in Common Weather Station Networks” J. Atmos. Ocean. Technol. 21, 1025-1032 show that significant errors creep into MMTS measurements from the temperature sensitivity of the electronics.
Here’s what they say about laboratory calibrations: “It is not generally possible to detect and remove temperature-dependent bias and sensor nonlinearity with static calibration.” Any experimental scientist or engineer will agree with their caution.
From electronics limitations alone, the best-case detection limit of the MMTS sensor is ±0.2 C. This means the state of the sensor electronics does not register or convey any information about air temperature more accurate than ±0.2 C.
Following from their analysis, Lin and Hubbard noted that, “Only under yearly replacement of the MMTS thermistor with the calibrated MMTS readout can errors be constrained within ±0.2 C under the temperature range from -40 C to +40 C.” Below -40 C and above +40 C, errors are greater.
Their findings mean that under the best of circumstances, MMTS sensors cannot distinguish temperatures that differ by ±0.2 C or less. No amount of averaging will improve that condition because the recorded temperatures lack any information about magnitudes inside that limit.
And there is no greater sensitivity to temperature changes available from within the instrument itself. That is, the electronic state of the instrument does not follow the air temperature to better than ±0.2 C. Dithering the readout gets you nothing.
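A toy model of that point: if the quantization happens in the sensor electronics rather than at the readout, repeated readings of the same air temperature are all identical, and no amount of averaging recovers the sub-resolution residual. The 0.2 C step and the deterministic quantizer are illustrative assumptions, not a model of any particular instrument:

```python
import numpy as np

def sensor_reading(true_temp_c, resolution=0.2):
    """Illustrative sensor whose ELECTRONICS change state only in
    0.2 C steps; the readout could display any number of digits."""
    return np.round(true_temp_c / resolution) * resolution

true_temp = 20.07   # constant true air temperature, C
readings = np.array([sensor_reading(true_temp) for _ in range(10_000)])

# Every reading is 20.0: the instrument holds no information about
# the 0.07 C residual, so averaging cannot create it.
print(readings.mean())   # 20.0, no matter how many readings
```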
Instrumental resolution — detection limits — means that averaging large numbers of resolution-limited temperature readings, each of which carries an unresolved error, with all of the errors conforming to a distribution of unknown shape, cannot produce a mean with a smaller uncertainty than an individual reading.
Supposing so is to magic data out of thin air.
My friend, Carl W. (Ph.D. Stanford University) found an excellent explanation of the differences among instrumental resolution (detection limits), precision (repeatability), and accuracy (correspondence with physical reality).
It’s at Phidgets here.
For all who may still be reading, and care. 🙂
I think the point of the post is that they aren’t consistent error problems.
Alex, the author of this post claimed to be focusing on “systematic” error, which indeed does mean “consistent.”
You’re right, Alex.
Tom Dayton, systematic error means the error derives from some systematic cause, rather than from a source of random noise. It does not mean ‘constant offset.’
Uncontrolled systematic impacts can skew a measurement in all sorts of ways. When the error derives from uncontrolled variables (note: not uncontrolled constants), the error varies with the magnitude and structure of the causal forces.
That’s the case with unaspirated surface temperature sensors. Measurement error varies principally with variable wind speed and irradiance.
The errors he is talking about are not removed by using an anomaly because they do not "average out." Averaging only cancels errors that are distributed around zero. These errors have a right-skewed distribution with a non-zero mean, so averaging converges to that bias rather than to zero. The anomaly argument invokes the law of large numbers to justify averaging across the site, but the law of large numbers only guarantees convergence to the mean of the error distribution; when that mean is not zero, the law does not make the error go away.
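A quick numerical illustration of that point, using a lognormal draw purely as a stand-in for a right-skewed error distribution: the average of even a million such errors converges to the distribution's mean, not to zero.

```python
import numpy as np

rng = np.random.default_rng(1)

# Right-skewed measurement errors (lognormal, for illustration only).
errors = rng.lognormal(mean=-1.0, sigma=0.8, size=1_000_000)

# Averaging converges to the mean of the error distribution, which
# is ~0.51 here -- not to zero. The skewed bias does not average out.
print(f"average of 10^6 errors: {errors.mean():.3f}")
```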
Richard Molineux (April 19, 2016 at 9:43 am)
Thanks for the link, Richard. Actually, that is a very bad statement of the law of large numbers (LoLN), no surprise given that it is Wikipedia … here's a better one, which starts:
From this we see that the law of large numbers (LoLN) only works on uncorrelated “i.i.d.” distributions, where “i.i.d.” stands for “independent identically distributed”. If the variables are NOT uncorrelated i.i.d., then we cannot assume that the law of large numbers will work. It might … but we can’t be sure of that. And even if the LoLN does work in general, in that the average of the numbers approaches the true average, it may approach the average much more slowly than the approach rate calculated by the LoLN.
And of course, a bunch of temperature measurements from some area are NOT uncorrelated, and NOT independent, and are most likely NOT identically distributed. As a result, the error values given by the Law of Large Numbers will underestimate the true error.
w.
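Willis's point about non-i.i.d. data is easy to demonstrate. A sketch using an AR(1) series as a crude stand-in for correlated temperature data (the correlation value is an illustrative assumption): the actual spread of the sample mean is several times the naive sigma/sqrt(N):

```python
import numpy as np

rng = np.random.default_rng(2)

def spread_of_mean_ar1(n, phi, trials=1000):
    """Standard deviation of the mean of an AR(1) series with unit
    marginal variance -- correlated data, hence NOT i.i.d."""
    means = []
    for _ in range(trials):
        x = np.empty(n)
        x[0] = rng.standard_normal()
        for t in range(1, n):
            x[t] = phi * x[t - 1] + np.sqrt(1 - phi**2) * rng.standard_normal()
        means.append(x.mean())
    return np.std(means)

n = 1000
print(f"naive sigma/sqrt(N)       : {1 / np.sqrt(n):.4f}")                 # 0.0316
print(f"actual spread (phi = 0.9) : {spread_of_mean_ar1(n, 0.9):.4f}")     # ~0.14
```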
Owen, you’re obviously unqualified to be a climate scientist. 🙂
More seriously, your understanding appears nowhere in the published literature.
Try building a house with no measuring instrument that has a resolution finer than 1 inch. You cannot measure a piece of wood, or cut it, to any length other than exact 1-inch increments without guessing the fractions. Yes, every wall stud can be cut to exactly 84 inches, and the wall could be exactly 16 feet long, but what happens when you nail on the sheathing, then the siding, plasterboard, molding, etc.?
Did you look at the figures? The systematic error is not constant, nor is it Gaussian, so taking an anomaly will not filter it out.
A big part of the problem is that we assume the thermometers are measuring air temperature. Reading this article, I've realised that's a false assumption: the thermometers actually measure the temperature of the thermometer, which is influenced by the air temperature, the wind speed, whether the enclosure is unventilated, passively ventilated, or force-ventilated, and whether, and how much, the enclosure is exposed to sunlight. Because all of these variables change, the thermal equilibrium achieved and the time the thermometer takes to reach it constantly change. In short, the instrumental error is inconsistent and unpredictable.
Exactly right, Paul Jackson. Air temperature sensors measure the temperature inside their enclosure. When the enclosure is well-aspirated, the sensor can approach the outside temperature pretty well.
Without aspiration, significant, non-normal, and variable errors are produced.
The entire land-surface historical temperature record up to about year 2000 was obtained using unaspirated sensors. Today, world wide, that’s still mostly true.
ralfellis, you’re missing the point that the systematic errors in air temperature measurements are produced by uncontrolled environmental variables.
That means the errors are not constant in time or space. They are not mere offsets, they are not removed at all by taking anomalies.
The only way to deal with persistent and variable systematic errors is to evaluate their average magnitude by a series of calibration experiments, and then report that average as an uncertainty attached to every field measurement.
In fact, taking an anomaly by subtracting a measurement contaminated with systematic error, u1, from a mean that also has a systematic error contamination, u2, produces a greater uncertainty in the anomaly, u3 = sqrt[(u1)^2+(u2)^2].
That is, with non-normal and variable systematic errors, taking differences produces an anomaly with increased uncertainty.
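In numbers, with illustrative uncertainty values:

```python
import math

u1 = 0.5   # illustrative 1-sigma uncertainty of a measurement, C
u2 = 0.3   # illustrative uncertainty of the baseline mean subtracted from it, C

# Uncertainties add in quadrature under subtraction, so the anomaly
# is LESS certain than either input, never more certain.
u3 = math.sqrt(u1**2 + u2**2)
print(f"anomaly uncertainty = +/-{u3:.3f} C")   # +/-0.583 C, larger than u1 or u2
```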
This article showed that the systematic errors AREN'T consistent in the required sense. They vary. They don't vary equally around zero, so averaging doesn't help, but since they DO vary with time, place, and individual instrument, anomalies don't help either. It really was quite clear from the graphs.
Good luck with calibrating LIG thermometers. A calibration service will probably not calibrate it if it’s over 5 years old. You get a piece of paper that tells you where on the thermometer you have inaccuracies and what the factor is. Lab techs file the paper away and use the thermometer as is and then say it is calibrated.
I used to supply scientific equipment and visited labs. I have a reasonable idea of what goes on compared to what is supposed to happen.
Agreed, and from my experience in classified government contracting, it is, in addition, standard procedure to record measurements finer than the instrument manufacturer's stated precision.
I have no doubt that this is true, and I have seen many ways to get an incorrect temperature reading. I am but a humble refrigeration tech, but I would never trust a single temperature reading for something that matters without first calibrating my instrument. For refrigeration temperature readings I would calibrate a pressure gauge against atmospheric pressure and then use the observed pressure of the saturated gas to derive a temperature. Always more accurate. The idea of using ship intake temps as accurate measurements is laughable to me. Are all the intake pipes insulated? Is the intake at the same depth for all readings? Daytime or nighttime? How far does the pipe run through the ship, and at what ambient temperature? Were any of the thermometers or sensors calibrated? How far does the well stick up? What is the depth of the well? Is the inlet water filter clean? Ridiculous!
Very good paper. Now render it down to the ten main points for those whose attention spans are less than two minutes.
Unjustified claims of precision and accuracy are much more common than anyone thinks.
Type B errors ignored
Thanks Pat for a superb presentation and clear discussion.
It appears the “Climate Consensus” willfully ignores the international guidelines for evaluating uncertainties formally codified under true scientific consensus among the national standards labs.
See:
Evaluation of measurement data – Guide to the expression of uncertainty in measurement. JCGM 100:2008, BIPM (GUM 1995 with minor corrections), corrected version 2010
This details the Type A and Type B uncertainty errors.
Type A: those which are evaluated by statistical methods;
Type B: those which are evaluated by other means.
See the diagram on p. 53, D-2, “Graphical illustration of values, error, and uncertainty.”
Type B errors are most often overlooked. E.g.
See NIST’s web page Uncertainty of Measurement Results
International and US Perspectives on measurement uncertainty
Barry N. Taylor and Chris E. Kuyatt, Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, NIST TN1297 PDF
Thanks for the link on “Type B” errors. I’ve been beating this drum for a long time, saying “Yes, but that is just the STATISTICAL ERROR, not the total error.”
w.
Thanks, Willis. For reference, the GUM is maintained by the JCGM Working Group on the Expression of Uncertainty in Measurement and hosted by the BIPM, the Bureau International des Poids et Mesures.
JCGM news:
20th anniversary of the GUM Metrologia, Volume 51, Number 4, August 2014
Revision of the ‘Guide to the Expression of Uncertainty in Measurement’ Metrologia, Volume 49, Number 6 , 5 October 2012
News: Standards for monitoring atmospheric carbon dioxide
Thanks, David. I’ve been using those references and have found them very very helpful. I think it may have been you who referred me to them originally, lo, these many years ago.
Pat, this is an excellent analysis of measurement error associated with estimation of global temperatures and anomalies. However, as you probably realize, this aspect is only part of the overall uncertainty and there are other important parts. Perhaps the most important part of the uncertainty is representativeness of the measurements, which includes microscale and mesoscale influences on the measurement site relative to the general characteristics of the typically large spatial area represented by a monitor and changes in those influences over time. I have provided a more detailed overview here, mainly pertaining to land based measurements:
https://oz4caster.wordpress.com/2015/02/16/uncertainty-in-global-temperature-assessments/
One other thought on the temperature measurement uncertainty is what happens when the temperature shield is covered in ice or snow? I have been looking at 3-hour temperature reports in the Arctic area north of North America provided by NOAA, including an archive of past data plots:
http://www.wpc.ncep.noaa.gov/html/sfc-zoom.php
On these plots, I have noticed that some of the buoy observations in the Arctic sea show little variation over time, as might be expected if the sensor shield was covered in ice or snow. And yet these observations are being reported as if they are accurate and the data may be used in weather forecast modeling and weather forecast model reanalyses for climatic assessment. One particular buoy has been reporting temperatures nearly constant at 13-15F while other buoys in the area show temperatures well below 0F and show much larger variations over time. I also have noticed a Canadian automated weather station on an island at the edge of the Arctic Ocean near Alaska reporting temperatures nearly constant at about 30F all winter while all other reports in the area are much lower. This type of problem adds considerable uncertainty to measurements over snow and ice, especially in remote locations with automated monitors.
Since the Arctic and Antarctic areas are critical to evaluating climate change, it is very important that accurate measurements are made. I would much rather see the huge amounts of money being funneled into redundant and misleading GCMs be diverted into obtaining better and more accurate coverage of the Arctic and Antarctic areas, and including better data validation and reporting. A lot of bad data are being reported as if they are good data and coverage is relatively poor compared to most populated areas.
Deserts and very complex terrain are additional examples of areas with poor coverage globally and with serious challenges to accurate and representative measurements.
> I also have noticed a Canadian automated weather station on an island at the edge of the Arctic Ocean near Alaska reporting temperatures nearly constant at about 30F all winter
With just a bit more atmospheric CO2, they’ll be able to grow bananas there.
This is interesting but you may have missed the cause. I would suspect from what you have found that these sensors are often reading around the dewpoint of the air. If this is so they probably have condensation on them and if they are reading below freezing then they may have ice accumulation as well. If they continue to acquire condensation on top of the ice they will show a temperature skewed warmer by the heat released by the atmospheric moisture as it condenses. This will be the case for all sensors reading air temps around dew point unless some method has been adopted to correct for that fact. Makes me wonder about NOAA buoys at colder latitudes.
john, I managed to track down data from the buoy in question. It has the WMO ID 48507 and it frequently shows long stretches of constant temperature with only small variations over time. The last 13 observations available for today showed a constant -8.46C spanning a five hour period from 1200 to 1700 UTC. The closest buoy station, WMO ID 48731, showed temperatures ranging from -23.52C to -21.05C over this same period and every one of the 20 observations had a slightly different temperature. You can find the data here:
http://iabp.apl.washington.edu/maps_daily_table.html
So, I’m not sure what is causing the bad data at 48507, but it certainly does not look like valid temperature data and yet it is still being reported for use in weather analyses. As usual, data users beware.
john, I forgot to mention that the Canadian weather station with bad temperature data is CWND at Pelly Island. You can see data from that station on WunderGround here:
https://www.wunderground.com/history/airport/CWND/2016/04/19/DailyHistory.html
It has been reporting a high of 29F or 30F and low of 29F every day for the last 30 days. Seems highly unlikely to be real temperature measurements, especially considering all the nearby stations were around 14F to 17F at last report. Whoever is responsible for data quality control for this weather station is not doing their job to take that data offline until the problem is fixed.
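A simple automated check would catch stations like these: flag any series whose range over a sliding window is implausibly small. A sketch with invented thresholds; nothing here is an operational QC standard:

```python
import numpy as np

def looks_stuck(temps_c, window=24, min_range=0.5):
    """Flag a temperature series whose peak-to-peak range over any
    `window` consecutive reports is below `min_range` C -- e.g., a
    sensor shield buried in snow or ice (thresholds illustrative)."""
    temps = np.asarray(temps_c, dtype=float)
    for start in range(len(temps) - window + 1):
        if np.ptp(temps[start:start + window]) < min_range:
            return True
    return False

print(looks_stuck([-8.46] * 30))                    # True  -> suspect buoy
print(looks_stuck(np.linspace(-23.5, -21.0, 30)))   # False -> plausible neighbor
```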
oz4caster, you're quite right about other sources of error, and weirdness in some sensors. To keep things simple, I've restricted myself to estimating the lower limit of error — that at the instrument itself.
As you rightly pointed out, there are many other sources of error that will not average away, and that must be included in any complete estimate.
We can expect that any full accounting of total error will reveal it to be very large, rendering most of the last 150 years of the air temperature record useless for unprecedentedness studies.
Outstanding presentation. Anyone motivated to put together a calibration suite, and go to randomly-sampled sensor locations to do comparisons? I’ve been toying with the idea of researching printed (i.e., newspaper) temperature records for a major city to see how they compare with the adjusted values.
Steve Fraser, it might be possible to do that by using the USCRN sensors as calibration thermometers for any nearby COOP stations.
One of the money quotes: [there were a few more]
The people compiling the global instrumental record have neglected an experimental limit even more basic than systematic measurement error: the detection limits of their instruments. They have paid no attention to it.
Resolution limits and systematic measurement error produced by the instrument itself constitute lower limits of uncertainty. The scientists engaged in consensus climatology have neglected both of them.
It’s almost as though none of them have ever made a measurement or struggled with an instrument. There is no other rational explanation for that sort of negligence than a profound ignorance of experimental methods.
I would like to hear Steve Mosher's or Nick Stokes's views on this data.
When you add in the uncertainties introduced by the procedure of calculating a “global average”, you have to wonder if there is any meaning in published trends. Not to mention of course the “homogenization” and “adjustments” that occur between the readings and the average.
The gridding process used to calculate an average of randomly distributed points with huge variations in sample density is fraught with potential error. Does anyone know anything about the gridding programs they use in climate science? If they are anything like what we use in geophysics/geochemistry, they will contain user-defined variables that, if you change them, can substantially change the result you get. You have to wonder if there isn't a bit of tweaking involved there too, to produce the desired trends. Stuff like that can even be unconscious. Or conscious, but the perception of the conscious knowledge is suppressed by a sub-conscious desire to see the hoped-for outcome. That may sound a bit vague, but I'm trying to express tendencies I've seen in myself when playing with data sets. If you're honest, you have to sit up, take a deep breath and remind yourself that you're supposed to be rigorous.
“Confirmation bias” is a good term for what I'm incoherently trying to describe. It's so very, very tempting when you realise that you can please the people you're working for with just a little bit of tweaking. In large organizations like those involved in climate science, there should be strict QA/QC protocols in place to keep it under control; I wonder if there are. After reading this post and the very illuminating comments, it looks like they don't even have QA/QC for their raw data.
If GISS had strict QA/QC protocols, Jim Hansen could not have overwritten existing temperature data with his “adjusted” data. To do so, you must assume that your “adjusted” data is not only an improvement, but also that it is so close to perfection that you will never have to revisit the older data.
Some systematic errors can be dealt with by using anomalies. However, suppose your thermometer isn't one degree too high but reads 0.3 percent of absolute (kelvin) temperature too high. At -25 C it reads -24.256, and at 30 C it reads 30.909.
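Checking that arithmetic (the assumption being that "absolute" means the error scales with kelvin temperature):

```python
# A 0.3% error in ABSOLUTE (kelvin) temperature is not a constant
# offset in Celsius, so it does not subtract out of an anomaly.
for t_c in (-25.0, 0.0, 30.0):
    reads = (t_c + 273.15) * 1.003 - 273.15   # thermometer reading 0.3% high
    print(f"true {t_c:+6.1f} C -> reads {reads:+7.3f} C "
          f"(error {reads - t_c:+.3f} C)")
```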
“The RM Young systematic errors mean that, absent an independent calibration instrument, any given daily mean temperature has an associated 1s uncertainty of 1±1.4 C. Figure 5 shows this uncertainty is neither randomly distributed nor constant. It cannot be removed by averaging individual measurements or by taking anomalies. Subtracting the average bias will not remove the non-normal 1s uncertainty. Entry of the RM Young station temperature record into a global average will carry that average error along with it.”
This is why I chose the method I did for looking at the surface data. Obviously, just averaging temperatures and making up what you didn't have all came up with the same basic answer.
I decided the only way to do anything about the systematic errors is to never reference any single station directly against any other station, but only against itself: day to day, and morning to night / night to morning. That is the day-to-day derivative of change at that station. I then average the stations' derivatives for an area together, from 1x1 degree cells up to the entire world.
But the point is, if there's a warm bias during the day, there should be the opposite bias that night. Not the night before, which is what you get if you slice the data by the clock. Each day the planet gets a dose of energy, and at night it cools off.
In the extra-tropics, the length of day changes throughout the year. So we have a daily test where the length of the input changes. Did nobody ever think to look at the effect of that response? But I digress.
It’s the best bias removal possible for the data we have.
https://micro6500blog.wordpress.com/2015/11/18/evidence-against-warming-from-carbon-dioxide/
I forgot to add: what this method allows is to not really care what the actual temps are, because it doesn't matter. It's the change in temp that counts, and it's the only really decent number we have.
If it reads +1 C, min and max are both +1, and the systematic error is removed in the difference: (Tmx day 0 − Tmn day 0) − (Tmx day 0 − Tmn day 1) = (Tmn day 1 − Tmn day 0).
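A two-line check of that cancellation, with made-up temperatures; a constant offset drops out of the difference, though a wind- or sun-dependent error would not:

```python
tmn_day0, tmn_day1 = 12.0, 13.5   # true minimum temperatures, C (invented)
offset = 1.0                      # constant instrument bias, C

# The offset cancels: both differences are 1.5 C.
print(tmn_day1 - tmn_day0)
print((tmn_day1 + offset) - (tmn_day0 + offset))
```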
If there's a solar warm bias, as soon as the sun stops shining on something it's losing heat to space. It loses heat to space all the time; it's just overwhelmed by the Sun. I never realized how cold the optical window is. And remember, sure there's a notch, but surface temperatures are almost all within the window. The other clear night it was still in the 50s (F), the grass in my front yard was in the upper 30s, and the sky measured around -45 F.
You can take the measured temp, use Stefan-Boltzmann to turn it into a flux, then add the CO2 feedback (I think I saw an estimate of 22 W/m2, with a 3.7 W/m2 change), and then turn it back into a temp; it should be close.
And surface temps dump to space all the time, unless it's cloudy, and then the sky reading can swing 70-80 F warmer. Oh, and at -40 F, 3.7 W/m2 is about 1.8 F, so -40 with the change in CO2 forcing is still -37 F or -38 F, and with the full 22 W/m2 it's still 0 F or -10 F (it was a while back).
Again I digress ……
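A hedged sketch of the Stefan-Boltzmann round trip suggested above (3.7 W/m2 is the commonly cited forcing for doubled CO2; the pure blackbody treatment is the commenter's simplification, not a radiative-transfer calculation). For what it's worth, the direct calculation at -40 F gives about 1.3 K, roughly 2.3 F, in the same ballpark as the figure quoted above:

```python
SIGMA = 5.670e-8   # Stefan-Boltzmann constant, W m^-2 K^-4

def temp_after_forcing(t_k, delta_f_wm2):
    """Convert temperature to blackbody flux, add a flux perturbation,
    and convert back (a deliberately crude sketch)."""
    flux = SIGMA * t_k**4
    return ((flux + delta_f_wm2) / SIGMA) ** 0.25

t0 = 233.15                         # -40 F (= -40 C) in kelvin
t1 = temp_after_forcing(t0, 3.7)
print(f"warming: {t1 - t0:.2f} K")  # ~1.28 K, i.e. about 2.3 F
```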
What is the error in CO2 measurements? As discussed by Greg Goodman at Judith Curry's Climate Etc. (March 9, 2016), having errors in both the independent and dependent variables causes major problems in creating and using predictive models: https://judithcurry.com/2016/03/09/on-inappropriate-use-of-least-squares-regression. In the case of the IPCC deterministic models used to forecast out on millennial time scales, aside from the cherry-picking of the data used to fit the model, the uncertainty in both the X and Y variables renders the model basically meaningless.
Though there may not be many Stevenson screens (CRS) still in use, I'm wondering if a different coating approach might reduce thermal emissivity further and reduce the need to recoat. The enclosure's interior would seem to be less subject to weathering. E.g., 'interior' coatings such as: http://www.solec.org/lomit-radiant-barrier-coating/lomit-technical-specifications/
Wrong. Look at the figures. The systematic error of these instruments is not fixed, nor is it Gaussian.
I have been saying it once a week on here 😀 I agree with Lindzen: the global average temp is, first of all, unknown, and secondly a residue, not a metric of anything.
Residues are not drivers of the system they are a remnant of, because that is not possible without temporal adjustments 🙂
I agree with the article; arguing over global temperature is akin to discussing theology.
At least with theology you don’t have government funded agencies rewriting the bible/Koran/book of Mormon/whatever to support a political agenda, so I would say theology is in much better intellectual shape.
Global Average Temperature is a distraction nothing more, watch what the other hand is doing!
The uncertainty estimate developed here shows that the rate or magnitude of change in global air temperature since 1850 cannot be known within ±1 C prior to 1980 or within ±0.6 C after 1990, at the 95% confidence interval.
Is it your contention that, regarding the change in global mean temperature since 1880, 0C and +2C are reasonably and equally supported by the data?
Compared to the amount of effort and the level of detail with which the BEST team has addressed these issues, your presentation is not an improvement over them.
Apparently, BEST has a lot of room for improvement…
(Source: https://wattsupwiththat.com/2016/01/28/a-way-forward-for-best-et-al-and-their-surface-temperature-record/ )
Given Mr. Jonas’ (and others’) analysis and, moreover, given BEST man, S. M0sher’s, non-science remarks in the threads of WUWT, there is good reason to doubt the quality of the BEST product.
Re: “amount of effort and level of detail” — the key is quality of analysis, not quantity. If BEST is biased and reckless in its underlying approach, no amount of finessing the details will save it from being wrong. Further, are you bearing in mind that this is not the full version of Dr. Frank's paper? “This is a version of the talk I gave …”
Mr. Marler,
I spoke a bit sharply and I must apologize for my tone. Your rather harsh and, in my view, unfair, criticism struck a nerve. I hope that all is well with YOUR excellent data analysis. Whenever you share here what you have been working on, I am always impressed with your careful attention to detail and your scientific integrity.
Janice
Janice Moore: If BEST is biased and reckless in its underlying approach, no amount of finessing the details will save it from being wrong.
BEST is not biased or reckless in its underlying approach.
matthewrmarler, it’s my contention that no one knows where the correct temperature is within the uncertainty limits.
As there are many possible temperatures, both 0 C and 2 C are equally low-probability choices.
Nothing in the BEST method removes systematic error or obviates the detection limits determined by thermometer resolution.
I’ve been reading about global warming since 1997.
This is one of the best articles since then, mainly because too few people write about data error — too boring I suppose?
I’m not convinced average temperature is important to know.
I’ve also never found any compelling evidence to prove the average temperature from 1880 to 2016 has a margin of error less than +/- one degree C.
Thermometers from the 1800s tend to read low, exaggerating the warming.
Most of the world was not measured then, and is still not measured.
The accuracy of the sailors throwing buckets over the side of ships to measure 70% of the planet’s surface (if that vision alone is not enough to disqualify the data!) depends mainly on whether or not the sailor hauls up the bucket and stops to have a cigarette before he measures the water temperature.
On top of huge changes in the number of measurements, changes in weather station equipment and locations, failure to send people out in the field to carefully verify station accuracy at least once a year… the same people who make the glorious confuser model climate predictions … are also responsible for the actuals … which they repeatedly change … to show more warming and better match their predictions.
And if I have not already created enough doubt about data quality: NASA reports the average temperature of our planet in hundredths of a degree C … while saying their margin of error is +/- 0.1 degrees C!
They must pull their margin of error numbers out of a hat … just like the IPCC does for their 95% confidence claim!
You don’t have to be a scientist to know the temperature data are a joke.
CO2 data since 1958 are probably accurate — before that I don’t know.
The political influence is so strong I predict if the surface starts cooling in the future, the books will be “cooked” to show warming is still in progress. I am not convinced leftist bureaucrats will EVER allow global cooling to be reflected in “their” average temperature data.
Remember how the “pause” was eliminated from surface data with “adjustments” … and the satellite data are now under attack ?
Apparently there are only two kinds of data for warmunists: Good data that support the climate models, and bad data to be ignored or attacked.
Leftists treat climate change as a religion — they will lie, cheat and steal to keep their CO2 boogeyman and “green” industry alive. They seize more political power to “save the planet” (i.e., tell everyone else how to live).
The climate today is the best it has been for humans and plants in at least 500 years.
More CO2 in the air is great news.
Slight warming of nighttime lows is great news.
Compiling the average temperature is a complete waste of money.
The fact that leftist politicians have destroyed the integrity of science, and the reputations of good scientists, will be a long-term problem. A problem with potentially very serious consequences, for two reasons:
(1) The unjustified attack on fossil fuels is really an attack on economic growth and prosperity, and
(2) Some time in the future scientists may discover a real problem, unlike CO2, but they will be ignored because too many greedy scientists in the past took government money in return for false predictions of a coming environmental catastrophe … that will never come ( I’ve been waiting for 40 years so far … but the climate keeps getting better … and better ! ).
Climate blog for non-scientists
No ads/ No money for me.
A public service
http://www.elOnionBloggle.Blogspot.com
Richard Greene – 11:25 am
Excellent summary. Many of the same things could be said for the sea level boogeyman.
I agree, a fine overview/comment, Richard.
Nice rant Richard. Well said!
Thank you, Pat, for an excellent presentation on measurement error. I have had the impression that the proponents of AGW had not performed error analyses on their measurements, which you confirmed. Over a dozen years ago, in postings on Climate Audit, I was discussing measurement error with an AGW proponent. I asked him how he dealt with measurement error. He responded that they just averaged many measurements together and that reduced the error. I realized that this person had never studied error analysis. You have pointed out that things have not improved.
The warm bias you pointed out in the older SST data further emphasizes the scientific bankruptcy of Karl’s pause buster “adjustments.” The most frightening fact that you discussed is that field calibration is the exception rather than the rule. I keep wondering why people in climate science seem to ignore error analysis of their data. Do universities no longer teach error analysis? Do those in the climate science field assume that all of their instrumentation is without error? They use the presence of errors as the rationale for adjusting data, but they do not seem to understand the errors with which they are dealing.
isthatright, thanks for your thoughtful comments. Like you, I’m totally puzzled by the choices made by the record compilers to assume all measurement error is random.
Whether they are ever taught how to assess physical error or instrumental/measurement error is a very good question.
I have yet to encounter a single climate modeler, either, who understands the first thing about it. I very much doubt they are taught anything of it.
The negligence problem pervades consensus climatology, and I've published about that.
Excellent article Dr. Frank.
Thanks, Thomas.
Pat
Many thanks for posting this very interesting and informative article.
Thanks, nic. I'm glad it passed muster with you. 🙂
“It’s almost as though none of them have ever made a measurement or struggled with an instrument. There is no other rational explanation for that sort of negligence than a profound ignorance of experimental methods….”
Nay, it is almost as though every single one of them were driven by ideology to seek a justification for something outside of science: an ideological goal. That goal is socialistic in nature: to reduce and destroy liberty, to reduce and destroy prosperity. That is the only goal consistent with all the demands that are repeatedly made, no matter what the science says.
Buckwheaton,
It seems so to me as well . . I see virtually no chance that what has been going on is due to mere incompetence.
I don't know, John; I feel the same as you since, to me, it seems so obvious. How could anyone be as stupid as the climate alarmists? We are, as our host so eloquently demonstrates, perfectly capable of admitting the limits of our knowledge, but for some reason we don't? Or, at the very least, for some reason some don't.
But what is that reason? A friend and mentor of mine once passed on the wisdom of his father to me. He said “Never assume malice when ignorance will suffice”. In this example, my conversations with AGW advocates have, almost across the board, led me to believe I’m in the presence of stubborn stupidity, aka “arrogance”. It has caused me despair. Arrogance and stubborn stupidity are deadly; they can’t be overcome by reason.
John says,
“It seems so to me as well . . I see virtually no chance that what has been going on is due to mere incompetence.”
===========
Confirmation bias is a systematic error of a different breed altogether.
From the World Federation of Scientists web page on Climatology:
“Climatology
Members of the Panel:
Chairman:
Christopher Essex (CANADA)
Members:
Associate Panel Members:
(Associate Panel Members are a community of scientists who provide support and expertise for the working of the Permanent Monitoring Panel.)
Summary of the Emergency
Being revised.
Priorities in dealing with the Emergency
Being revised.”
Looks like your presentation may have had some results!
Michael Moon, thanks. Chris Essex has been head of that panel for some time. Apparently, at the end of the WFS 2013 meeting, he declared there is no climate emergency. So, that position has been in flux for some time. But I can hope to have helped it along! 🙂
Many years ago, instrument remote readouts were analog. The display was a needle which rotated through a scale which you could read to obtain a numerical value. Anyone who has used such an analog display will quickly realize that the needle is almost never stationary over one value. The needle swings back and forth depending upon its response to the amplified analog signal that it is receiving from the sensor. This swing of the needle is an indication of the error inherent in that particular instrument.
Today, most readouts are digital. Typically they will show a number which users will dutifully record as an indication of the sensor response. The amplified sensor response may be sending the identical signal to the readout, but in the case of a digital display, the signal passes through an A/D converter and is typically damped to provide a number. The number may vary slightly, but damping will reduce the observed variation. The use of digital displays can give the impression of enhanced precision even though the sensor is producing the same signal.
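A sketch of that damping effect, with an invented noise level and damping constant: an exponentially damped digital display makes a noisy sensor signal look far steadier than it really is.

```python
import numpy as np

rng = np.random.default_rng(3)

# Noisy amplified sensor signal: an analog needle would visibly swing
# across this whole range.
signal = 21.3 + rng.normal(0.0, 0.4, 600)   # illustrative +/-0.4 C noise

# Simple exponential damping, as an A/D-plus-display stage might apply.
display = np.empty_like(signal)
display[0] = signal[0]
alpha = 0.02                                # heavy damping (illustrative)
for i in range(1, len(signal)):
    display[i] = (1 - alpha) * display[i - 1] + alpha * signal[i]

print(f"raw signal spread : {signal.std():.2f} C")    # ~0.40 C
print(f"displayed spread  : {display.std():.2f} C")   # ~0.04 C -- looks precise
```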