Are Claimed Global Record-Temperatures Valid?

The New York Times claims 2016 was the hottest year on record.

Guest essay by Clyde Spencer

Introduction

The point of this article is that one should not ascribe more accuracy and precision to the available global temperature data than is warranted by the limitations of the data set(s). One regularly sees news stories claiming that the most recent year or month was the (first, second, etc.) warmest in recorded history. The claim is reinforced with a stated temperature difference or anomaly that is some hundredths of a degree warmer than some reference, such as the previous year(s). I’d like to draw the reader’s attention to the following quote from Taylor (1982):

“The most important point about our two experts’ measurements is this: like most scientific measurements, they would both have been useless, if they had not included reliable statements of their uncertainties.”

Before going any further, it is important that the reader understand the difference between accuracy and precision. Accuracy is how close a measurement (or series of repeated measurements) is to the actual value, and precision is the resolution with which the measurement can be stated. Another way of looking at it is provided by the following graphic:

[Figure: target-style diagram contrasting accuracy and precision]

The illustration implies that repeatability, or decreased variance, is a part of precision. It is, but more importantly, it is the ability to record, with greater certainty, where a measurement is located on the continuum of a measurement scale. Low accuracy is commonly the result of systematic errors; however, very low precision, which can result from random errors or inappropriate instrumentation, can contribute to individual measurements having low accuracy.

Accuracy

For the sake of the following discussion, I’ll ignore issues with weather station siting problems potentially corrupting representative temperatures and introducing bias. However, see this link for a review of problems. Similarly, I’ll ignore the issue of sampling protocol, which has been a major criticism of historical ocean pH measurements, but is no less of a problem for temperature measurements. Fundamentally, temperatures are spatially-biased to over-represent industrialized, urban areas in the mid-latitudes, yet claims are made for the entire globe.

There are two major issues with regard to the trustworthiness of current and historical temperature data. One is the accuracy of recorded temperatures over the useable temperature range, as described in Table 4.1 at the following link:

http://www.nws.noaa.gov/directives/sym/pd01013002curr.pdf

Section 4.1.3 at the above link states:

“4.1.3 General Instruments. The WMO suggests ordinary thermometers be able to measure with high certainty in the range of -20°F to 115°F, with maximum error less than 0.4°F…”

In general, modern temperature-measuring devices are required to provide a temperature accurate to about ±1.0° F (0.56° C) at their reference temperature, and to be in error by no more than ±2.0° F (1.1° C) over their operational range. Table 4.2 requires a resolution (precision) of 0.1° F (0.06° C) with an accuracy of ±0.4° F (±0.2° C).

The US has one of the best weather monitoring programs in the world. However, the accuracy and precision should be viewed in the context of how global averages and historical temperatures are calculated from records, particularly those with less accuracy and precision. It is extremely difficult to assess the accuracy of historical temperature records; the original instruments are rarely available to check for calibration.

Precision

The second issue is the precision with which temperatures are recorded, and the resulting number of significant figures retained when calculations are performed, such as when deriving averages and anomalies. This is the most important part of this critique.

If a temperature is recorded to the nearest tenth (0.1) of a degree, the convention is that it has been rounded or estimated. That is, a temperature reported as 98.6° F could have been anywhere from 98.55° F up to, but not including, 98.65° F.

The general rule of thumb for addition and subtraction is that the sum should retain no more significant figures to the right of the decimal point than the least precise measurement has. When multiplying or dividing, the conservative rule of thumb is that, at most, one additional significant figure may be retained in the product beyond what the multiplicand with the fewest significant figures contains; the rule usually followed, however, is to retain only as many significant figures as the least precise multiplicand has. [For an expanded explanation of the rules for significant figures and mathematical operations with them, see this Purdue site.]
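As a minimal sketch of the addition rule in Python (`round_to_place` is a helper name of my own, not a library function, and the two measurements are invented for illustration):

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_place(x, place):
    """Round x to the decimal place given as a string, e.g. '0.1' for tenths."""
    return float(Decimal(repr(x)).quantize(Decimal(place), rounding=ROUND_HALF_UP))

# Addition/subtraction: keep no more decimal places than the least precise
# term. 12.3 is known to tenths and 4.56 to hundredths, so the sum is
# reported to tenths.
reported = round_to_place(12.3 + 4.56, "0.1")
print(reported)  # 16.9
```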

Unlike the case with exact integers, a reduction in the number of significant figures in even one of the measurements in a series increases uncertainty in an average. Intuitively, one should anticipate that degrading the precision of one or more measurements in a set will degrade the precision of the result of mathematical operations on it. As an example, assume that one wants the arithmetic mean of the numbers 50., 40.0, and 30.0, where the trailing zeros are the last significant figure. The sum of the three numbers is 120., with three significant figures. Dividing by the integer 3 (exact) yields 40.0, with an implied uncertainty of ±0.05 in the next position.

Now, what if we take into account the implicit uncertainty of all the measurements? In the previously examined set, each measurement has an implied uncertainty: the sum of 50. ±0.5, 40.0 ±0.05, and 30.0 ±0.05 becomes 120. ±0.6. While not highly probable, it is possible that all of the errors have the same sign. That means the average could be as small as 39.80 (119.4/3) or as large as 40.20 (120.6/3), that is, 40.00 ±0.20, which should be rounded to 40.0 ±0.2. Comparing this result with what was obtained previously, the uncertainty has increased. The potential difference between the bounds of the mean value may grow as more data are averaged.
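The worst-case arithmetic above can be reproduced directly (a sketch of simple same-sign interval propagation, using the numbers from the example):

```python
# Worst-case (same-sign) propagation of the implied uncertainties, reproducing
# the arithmetic in the example above.
values = [50.0, 40.0, 30.0]
half_widths = [0.5, 0.05, 0.05]   # implied uncertainty of each measurement

total = sum(values)           # 120.0
total_err = sum(half_widths)  # 0.6 when every error has the same sign
mean = total / 3              # dividing by an exact integer adds no uncertainty
mean_err = total_err / 3

print(f"{mean:.1f} +/- {mean_err:.1f}")  # 40.0 +/- 0.2
```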

It is generally well known, especially amongst surveyors, that the precision of multiple, averaged measurements varies inversely with the square root of the number of readings taken. Averaging tends to remove the random error in rounding when measuring a fixed value. However, the caveats are that the measurements must be taken with the same instrument, on the same fixed parameter, such as an angle turned with a transit. Furthermore, Smirnoff (1961) cautions, “… at a low order of precision no increase in accuracy will result from repeated measurements.” He expands on this with the remark, “… the prerequisite condition for improving the accuracy is that measurements must be of such an order of precision that there will be some variations in recorded values.” The implication is that there is a limit to how much the precision can be increased. Thus, while the Standard Error of the Mean is defined as the standard deviation of the samples divided by the square root of the number of samples, the process cannot be repeated indefinitely to obtain any precision desired!1
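Smirnoff's caveat can be illustrated with a small simulation (a toy sketch; the true value, noise levels, and resolutions here are assumed for illustration only): averaging rounded readings gains precision only when the readings actually vary.

```python
import random
import statistics

random.seed(1)
true_value = 20.37   # the fixed quantity being measured (assumed for illustration)

# Case 1: instrument noise (sd 0.3) exceeds the 0.1-unit reading resolution,
# so repeated readings vary and averaging recovers extra precision.
noisy = [round(true_value + random.gauss(0, 0.3), 1) for _ in range(10_000)]
print(abs(statistics.mean(noisy) - true_value))  # tiny compared with 0.1

# Case 2: a very quiet instrument read at whole-unit resolution never varies:
# every reading is 20, and no amount of averaging recovers the 0.37.
quiet = [round(true_value + random.gauss(0, 0.01)) for _ in range(10_000)]
print(statistics.mean(quiet))  # 20
```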

While multiple observers may eliminate systematic error resulting from observer bias, the other requirements are less forgiving. Different instruments will have different accuracies and may introduce greater imprecision in averaged values.

Similarly, measuring different angles tells one nothing about the accuracy or precision of a particular angle of interest. Thus, measuring multiple temperatures, over a series of hours or days, tells one nothing about the uncertainty in temperature, at a given location, at a particular time, and can do nothing to eliminate rounding errors. A physical object has intrinsic properties such as density or specific heat. However, temperatures are ephemeral and one cannot return and measure the temperature again at some later time. Fundamentally, one only has one chance to determine the precise temperature at a site, at a particular time.

The NOAA Automated Surface Observing System (ASOS) has an unconventional way of handling ambient temperature data. The User’s Guide says the following in section 3.1.2:

“Once each minute the ACU calculates the 5-minute average ambient temperature and dew point temperature from the 1-minute average observations… These 5-minute averages are rounded to the nearest degree Fahrenheit, converted to the nearest 0.1 degree Celsius, and reported once each minute as the 5-minute average ambient and dew point temperatures…”

This automated procedure is performed with temperature sensors specified to have an RMS error of 0.9° F (0.5° C), a maximum error of ±1.8° F (±1.0° C), and a resolution of 0.1° F (0.06° C) in the temperature ranges most likely to be encountered in the continental USA. [See Table 1 in the User’s Guide.] One (1. ±0.5) degree Fahrenheit is equivalent to 0.6 ±0.3 degrees Celsius. Reporting the rounded Celsius temperature as specified in the quote above implies a precision of 0.1° C when only 0.6 ±0.3° C is justified, that is, a precision 3 to 9 times greater than is actually warranted. In any event, even using modern temperature data that are commonly available, reporting temperature anomalies with two or more significant figures to the right of the decimal point is not warranted!
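The reporting chain described in the quote can be sketched as follows (the function name is mine, and the two sample temperatures are arbitrary):

```python
# Sketch of the ASOS reporting chain quoted above: round the 5-minute average
# to the nearest whole degree F, then convert and report to the nearest
# 0.1 degree C.
def asos_report_c(temp_f):
    rounded_f = round(temp_f)
    return round((rounded_f - 32) * 5 / 9, 1)

# Two readings 0.3 F apart collapse to the same 0.1 C report, because the
# whole-degree rounding in Fahrenheit (about 0.56 C) dominates.
print(asos_report_c(70.1), asos_report_c(70.4))  # 21.1 21.1
```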

Consequences

Where these issues become particularly important is when temperature data from different sources, which use different instrumentation with varying accuracy and precision, are consolidated or aggregated into global temperatures. They also matter when comparing historical data with modern data, and particularly when computing anomalies. A significant problem with historical data is that, typically, temperatures were only measured to the nearest degree (as with modern ASOS temperatures!). Hence, the historical data have low precision (and unknown accuracy), and the rule given above for subtraction comes into play when calculating what are called temperature anomalies. That is, data are averaged to determine a so-called temperature baseline, typically for a 30-year period, and that baseline is subtracted from modern data to define an anomaly. A way around the subtraction issue is to calculate the best historical average available and then define it as having as many significant figures as the modern data. Then there is no requirement to truncate or round the modern data, and one can legitimately state modern anomalies with respect to the defined baseline, although it will not be obvious whether a difference is statistically significant. Unfortunately, one is deluding oneself to think that one can say anything about how modern temperature readings compare to historical temperatures when the variations are to the right of the decimal point!
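The arithmetic of that workaround can be sketched in a few lines (all numbers here are invented for illustration; the point is that the baseline is a defined constant, not a measurement):

```python
# Sketch of the workaround: the baseline is a defined constant (precision by
# fiat), so subtraction doesn't force rounding of the modern value.
# All numbers here are invented for illustration.
BASELINE_C = 14.00   # defined 30-year baseline, not a measurement

modern_c = 14.83     # a modern annual mean, reported to 0.01 C
anomaly = round(modern_c - BASELINE_C, 2)
print(anomaly)       # 0.83
```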

Indicative of the problem is that data published by NASA show the same implied precision (±0.005° C) for the late-1800s as for modern anomaly data. The character of the data table, with entries of 1 to 3 digits with no decimal points, suggests that attention to significant figures received little consideration. Even more egregious is the representation of precision of ±0.0005° C for anomalies in a Wikipedia article wherein NASA is attributed as the source.

Ideally, one should have a continuous record of temperatures throughout a 24-hour period and integrate the area under the temperature/time graph to obtain a true average daily temperature. However, one rarely has that kind of temperature record, especially for older data. Thus, we have to do the best we can with the data that we have, which is often a diurnal range. Taking the daily high and low temperatures, and averaging them separately, gives one insight into how station temperatures change over time. Evidence indicates that the high and low temperatures have not been changing in parallel over the last 100 years; until recently, the low temperatures were increasing faster than the highs. That means that even for long-term, well-maintained weather stations, we don’t have a true average of temperatures over time. At best, we have an average of the daily high and low temperatures, and averaging them creates an artifact that loses information.
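The difference between an integrated daily mean and the (Tmax + Tmin)/2 artifact can be demonstrated with a toy, deliberately asymmetric diurnal curve (an assumed shape, not real station data):

```python
import math

# A toy, deliberately asymmetric diurnal cycle sampled once a minute
# (an assumed shape for illustration, not real station data).
temps = [10 + 8 * max(0.0, math.sin(2 * math.pi * m / 1440)) for m in range(1440)]

true_mean = sum(temps) / len(temps)        # integrates under the curve
midrange = (max(temps) + min(temps)) / 2   # the (Tmax + Tmin) / 2 artifact

print(round(true_mean, 2), round(midrange, 2))  # 12.55 14.0
```

Because the curve spends more time near its minimum than its maximum, the midrange overstates the integrated mean by almost a degree and a half.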

When one computes an average for purposes of scientific analysis, it is conventionally presented with a standard deviation, a measure of the variability of the individual samples about the average. I have not seen any published standard deviations associated with annual global-temperature averages. However, utilizing Tchebysheff’s Theorem and the Empirical Rule (Mendenhall, 1975), we can come up with a conservative estimate of the standard deviation for global averages. That is, the range in global temperatures should span roughly ±4 standard deviations about the mean (Range ≈ 8s). With Summer desert temperatures reaching about 130° F and Winter Antarctic temperatures reaching −120° F, Earth has an annual range in temperature of at least 250° F, and thus an estimated standard deviation of about 31° F! Because deserts and the polar regions are so poorly monitored, it is likely that the range (and thus the standard deviation) is larger than my assumptions. One should intuitively suspect that, since few of the global measurements are close to the average, the standard deviation of the data is high! Yet, global annual anomalies are commonly reported with significant figures to the right of the decimal point. Averaging the annual high temperatures separately from the annual lows would considerably reduce the estimated standard deviation, but it still would not justify the precision that is commonly reported. This estimated standard deviation is probably telling us more about the frequency distribution of temperatures than about the precision with which the mean is known. It says that probably a little more than 2/3rds of the recorded surface temperatures are between −26° F and +36° F. Because the midpoint of this range is 5.0° F, and the generally accepted mean global temperature is about 59° F, it suggests that there is a long tail on the distribution, biasing the estimate of the median to a lower temperature.
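The back-of-envelope estimate can be checked in a couple of lines (reading the range as spanning ±4 standard deviations about the mean, i.e. about 8s in total, which reproduces the ~31° F figure):

```python
# Back-of-envelope standard deviation from the range rule used above:
# the full range is taken to span about 8 standard deviations (+/- 4s).
t_max_f = 130.0    # summer desert extreme (from the article)
t_min_f = -120.0   # winter Antarctic extreme (from the article)

temp_range = t_max_f - t_min_f  # 250 F
s_estimate = temp_range / 8     # about 31 F
print(round(s_estimate, 1))     # 31.2
```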

Summary

In summary, there are numerous data-handling practices, generally ignored by climatologists, that seriously compromise the veracity of claims of record average temperatures and reflect poor science. The statistical significance of temperature differences with 3 or even 2 significant figures to the right of the decimal point is highly questionable. One is not justified in using the Standard Error of the Mean to improve precision by removing random errors, because there is no fixed, single value that the random errors cluster about. The global average is a hypothetical construct that doesn’t exist in Nature. Instead, temperatures are changing, creating variable, systematic-like errors. Real scientists are concerned about the magnitude and origin of the inevitable errors in their measurements.


References

Mendenhall, William, (1975), Introduction to probability and statistics, 4th ed.; Duxbury Press, North Scituate, MA, p. 41

Smirnoff, Michael V., (1961), Measurements for engineering and other surveys; Prentice Hall, Englewood Cliffs, NJ, p.181

Taylor, John R., (1982), An introduction to error analysis – the study of uncertainties in physical measurements; University Science Books, Mill Valley, CA, p.6

1Note: One cannot take a single measurement, add it to itself a hundred times, and then divide by 100 to claim an order of magnitude increase in precision. Similarly, if one has redundant measurements that don’t provide additional information regarding accuracy or dispersion, because of poor precision, then one isn’t justified in averaging them and claiming more precision. Imagine that one is tasked with measuring an object whose true length is 1.0001 meters, and all that one has is a meter stick. No amount of measuring and re-measuring with the meter stick is going to resolve that 1/10th of a millimeter.
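The meter-stick limit in the note can be shown numerically (a toy illustration of quantized readings):

```python
# The meter-stick thought experiment: a true length of 1.0001 m read with an
# instrument that resolves only whole millimeters (three decimal places).
true_length_m = 1.0001

readings = [round(true_length_m, 3) for _ in range(100)]  # every reading is 1.0
average = sum(readings) / len(readings)
print(average)  # 1.0 -- the extra 0.1 mm never appears, however often we measure
```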


307 thoughts on “Are Claimed Global Record-Temperatures Valid?”

  1. Claiming accuracy of a hundredth of a degree by averaging instruments only accurate to half a degree? Evidently good enough for government work./s

    • An individual roll of a die is going to be one of {1,2,3,4,5,6}.

      But what’s the expected average of a thousand rolls? 3.5000 +/- 0.0033.

      Notice how the precision is a lot higher when you average the rolls. It doesn’t have to be the same roll of the same die, only that you’re measuring the same thing, whether the average of dice rolls or the average global temperature anomaly.

      • That’s only true when your error is random and normally distributed.
        Neither condition is true for temperature records.

      • With dice, one is recording unchanging numbers. With thermometers, one is measuring changing numbers, so one needs to take the standard deviation and such into account. You will have a group on a graph, not an approximation of a point.

      • You make an interesting point, I hadn’t realised that a measured temperature at a set location was a random measurement between two limits and repeated under the same conditions.

      • Windchasers,
        In your example, you are averaging integers, which are exact. That is, they have infinite precision.

      • Windchasers,
        In looking up some references on the Law of Large Numbers, it appears that Wikipedia (https://en.wikipedia.org/wiki/Law_of_large_numbers) should be acceptable, since it seems to be the source that you used. Note that the explanation says that the average should converge on the EXPECTED value. That is clearly a fixed value when a theorem is involved. However, we have no way of knowing what the “expected” value for Earth is; it changes from moment to moment! Indeed, The Wikipedia article specifically states, “If the expected values change during the series, then we can simply apply the law to the average deviation from the respective expected values. The law then states that this converges in probability to zero.” The “expected” value would be some instantaneous measurement, not a series over an extended period of time. Thus, the average for a particular 30-year period would be different from the average for a series shifted by one or more years. Therefore, I maintain that the correct metric for uncertainty for any given temperature series is the Standard Deviation, and not the SD divided by the SQRT of the number of samples.

        The Wikipedia article further states, “Kolmogorov also showed, in 1933, that if the variables are independent and identically distributed, then for the average to converge almost surely on something (this can be considered another statement of the strong law), it is necessary that they have an expected value (and then of course the average will converge almost surely on that).” Temperatures are not independent variables. They are in fact strongly auto-correlated, suggesting that the conditions for the strong law are not met. How do you propose to demonstrate that all global temperatures have an expected value over any arbitrary time interval? If a station at one elevation is closed, and another opened at a different elevation, the average will change, as with changing the time interval.

      • Clyde, let me help you out here. You say, ”Therefore, I maintain that the correct metric for uncertainty for any given temperature series is the Standard Deviation, and not the SD divided by the SQRT of the number of samples.” You are mixing apples with oranges. First of all, the uncertainty for a single temperature measurement is the standard deviation. When you have more than one temperature measurement, you are dealing with the Standard Error. This is because the Standard Error is a function of the individual measurements’ standard deviation, and it includes the number of observations. The “series” is more than one measurement.

      • Of course, after the data was adjusted by unknown methods, it is impossible to make any conclusions from the data set.

      • Michael Darby,

        Let me help you out here: “In statistics, the standard deviation … is a measure that is used to quantify the amount of variation or dispersion of a SET of data values.” https://en.wikipedia.org/wiki/Standard_deviation

        A single sample may be characterized as being so many standard deviations from the mean, but the standard deviation actually defines the bounds that confine approximately 68% of the total samples.

      • Clyde makes a mistake when he posts: “the standard deviation actually defines the bounds that confine approximately 68% of the total samples.” The mistake you make is that not all random variables are normally distributed. Rookie error… when the error distribution of the measuring instrument is uniformly distributed, your statement is erroneous.

      • Tom Halla
        April 12, 2017 at 9:36 am

        With dice, one is recording unchanging numbers. With thermometers, one is measuring changing numbers, so one needs to take the standard deviation and such into account. You will have a group on a graph, not an approximation of a point.

        Not only are you measuring changing numbers, you are using a thermometer to measure the temperature in New York City, for instance, and a different thermometer to measure the temperature in Boston, which is not only changing as well but also different from the temperature in NYC.

        So, how measuring the temperature in NYC increases the precision of the temperature measured in Boston?

        Just asking.

      • Windchasers,
        After mulling the Wikipedia article over in my mind, I have some additional comments. Your assertion that 1000 rolls of the dice will result in a precision of 3.5000 is wrong. (Also, you should truncate the last “3” and report your claim as “3.500 +/- 0.003”.) You are dealing with integers, which have infinite precision. The problem is that the answer is not ACCURATE until you take many samples. Note the graph at the top right (https://en.wikipedia.org/wiki/Law_of_large_numbers) showing the results of many trials. The results oscillate strongly until about 500 trials. This does not speak to an increase in precision, but rather, speaks to CONVERGING to the correct answer. The calculation of the function predicting the average has infinite precision because you are using integers. Similarly, the average of the trials has infinite precision because you are using integers. The trial averages change, however, approaching the theoretical value of 3.500…0, with a small error that can only be expressed by showing more significant figures.

        Your claim seems to be a misconception of people who cut their teeth on computers that spit out 15 or more digits when a calculation is performed. You haven’t been taught to look at the results to determine which of those digits are meaningless in the context of the inputs and the application.

      • That figure would properly be expressed as 3.500 +/- 0.003. Only one non-zero digit in the uncertainty, and the mean can’t be more precise than the uncertainty.

    • Average Global Temperatures??? The sampling is unbelievably patchy.

      “Please Note: Grey areas represent missing data.” NOAA says.
      https://www.ncdc.noaa.gov/sotc/service/global/map-land-sfc-mntp/201702.gi

      Note that the ocean grid cells are actually land temperatures (i.e. islands) extrapolated out to fill the grid cells. A casual glance tells me that the “Global Average” is really an oversampling of North America, Europe, a swath of parts of Asia, a small fraction of Africa, a South America dominated by Argentina, and Australia’s south and east.

      Check out Greenland and Antarctica, the lands of melting icecaps. Pathetic.

      • Richard G. on April 12, 2017 at 9:50 am

        1. Note that the ocean grid cells are actually land temperatures (i.e. islands) extrapolated out to fill the grid cells.

        No:
        – It is not GHCN’s role to give us information about the oceans: that is ERSST’s domain.
        – All GHCN and ERSST users (e.g., GISTEMP) have land and sea masks with finest resolution allowing them to accurately separate land and sea areas.

        2. Check out Greenland and Antarctica, the lands of melting icecaps. Pathetic.

        The first thing I had to learn, Richard G., when inspecting UAH’s most detailed dataset (a 2.5° grid over the Globe) was that
        – though Roy Spencer claims full coverage in his zonal files, neither 82.5S-90S nor 82.5N-90N is present (all cells there contain an “invalid value”);
        – regions like Greenland, Tibet, and the Andes (in fact, anything clearly above 1,500 m; see their readme file) are “poorly covered”.

        Nevertheless, nobody claims the UAH dataset to be insufficient, me included.

        3. A casual glance tells me that the “Global Average” is really an over sampling of North America, Europe, a swath of parts of Asia, a small fraction of Africa, a South America dominated by Argentina, Australia’s south and east.

        When processing UAH’s grid (9,504 cells), you understand that you can perfectly approximate the Globe by averaging no more than 512 evenly distributed cells, i.e. about 5 % of the whole.

        This means that we need far less information to build a global average than we imagine.

      • Not only that… but the highest increases in temperature observed over the entire planet surface are found in the United States Of America?

        How can this be?

        The extreme record high temperatures for each state are found in the link below.
        https://en.wikipedia.org/wiki/U.S._state_temperature_extremes

        Alaska 100 °F / 38 °C June 27, 1915

        California 134 °F / 57 °C July 10, 1913

        Florida 109 °F / 43 °C June 29, 1931

        Kentucky 114 °F / 46 °C July 28, 1930

        New York 109 °F / 42 °C July 22, 1926

        Texas 120 °F / 49 °C June 28, 1994* (tied with 1933)

      • This has to be put down to either mind-numbingly sheer incompetence, or massive psychedelic drug consumption on a scale never previously witnessed if the true answer is not to be found within an entirely corrupted academic community of ‘Climate scientists’ and those who enable them.

      • They appear to be using 5 degree by 5 degree cells, which is absurdly coarse. You will note that cells defined by equal dimensions of latitude and longitude get smaller as they approach the poles, and become vanishingly small at the poles. Does the computation of average give weight to the actual sizes of the cells or not (i.e. bigger cells near the equator count for more than smaller cells near the poles). Do they tell you? If they do, I haven’t seen it. So right there is a way of getting the results to vary, and you can pick the numbers you like best.

        And how do they calculate the value for cells with lots of data points in them? Homogenizing, that’s what they do. What does it mean? Read it on the NOAA website and see if you can figure it out. I couldn’t.

        So there are two areas where data can be tweaked towards desired values, without even thinking about issues of precision and accuracy. It’s how they calculate the average that is the key to data manipulation.

      • Excuse me Bindidon,
        1- When it says ‘Land-Only’ and gives a grid cell value in the middle of the deep blue sea, and does not say ‘ERSST’, what else can it be but extrapolated values from measurement on land (Island)? Please educate me. My point is that ‘Land-Only’ means to me that there is no data for 70 % of the globe (the aqueous portion) depicted in their map.
        2-I am not talking about UAH. (Diversion, *ding*) All (ALL) of the grid cell values for Greenland (4) and Antarctica (8) are from coastal (maritime) locations. The interiors are gray: “represent missing data”. Unintended perhaps but it nonetheless introduces a bias into any statistical average temperature. A simple acknowledgment would be appreciated.
        3- Again, I am not talking about UAH. I am not even claiming GHCN is in any way saying this map represents a Global Average Temperature. My point is that claiming any sort of Global Average based on a woefully incomplete data record may be statistics, but it is far from good science. Look at the actual data. If the data are not there, beware. If the data are not there, you cannot just make it up.

        (I am reminded of the Harry_Read_Me.txt files from HADCRUT climategate:
        “Here, the expected 1990-2003 period is MISSING – so the correlations aren’t so hot! Yet the WMO codes and station names /locations are identical (or close). What the hell is supposed to happen here? Oh yeah – there is no ‘supposed’, I can make it up. So I have :-)”

        Beware of statisticians bearing gifts.
        I posted this map to illustrate how incomplete the land based temperature record really is.
        Was it Richard Feynman who said ‘first, don’t fool yourself’?

      • Just another thought about the grid: the location of the weather stations.

        Since they changed the transport here, I now have to bike to work, from a small town of 40,000 inhabitants to a village of 5,000 people. This 8 km track runs through an open rural landscape from village center to town center.

        The road also passes through a marshy area for 1 km.

        What I never expected is how much the temperature can vary over such a short 8 km track, especially on the first sunny spring days: the drops and rises in temperature of each area (patch of forest, open fields, marshy area, town, village) are noticeable even by inaccurate and imprecise feel.

        That says enough about how patchy our temperature record is.

      • Richard G. – April 12, 2017 at 9:37 pm

        I posted this map to illustrate how incomplete the land based temperature record really is.

        Richard G, all of the climate “experts” and the partisan believers in/of CAGW have been touting the extreme accuracy of their claimed circa 1880 Global Average Temperatures …… but iffen you want to “display” how grossly inaccurate and utterly deceiving their claims are, …… just post another map similar to this one, to wit:

        And iffen you do, I am positive that 97+% of that circa 1880 map would contain Grey areas representing missing data.

      • Richard G. on April 12, 2017 at 9:37 pm

        Many thanks, Richard G., for your interesting comment (which sounds far more intelligent and intelligible than those around it).

        1. When it says ‘Land-Only’ and gives a grid cell value in the middle of the deep blue sea, and does not say ‘ERSST’, what else can it be but extrapolated values from measurement on land (Island)?

        The answer I gave already in my comment: all GHCN users use ERSST together with land/ocean masks allowing them (or better: their software packages) to determine, for each grid cell, what in it is land and what is sea, in order to have an appropriate land/sea ratio to compute the cell’s average.

        I confess that until now I had never paid attention to any land-only grid-based map displayed on any web site!

        And I now understand, a posteriori, what irritates you: the dumb extrapolation of even the smallest island to grid-cell size, an extrapolation that exists nowhere else. Islands should evidently be restricted here to those exceeding grid size, i.e. about 280 x 280 km.

        See for example a picture displayed month by month on Nick Stokes’s moyhu web site, showing the integration of GHCN and ERSST:

        Be sure that this chart is a thoroughly correct integration of land and ocean data!

        2. I am not talking about UAH.

        But I am, Richard! Just because I wanted to show you that missing data is by no means unique to surface temperature measurement.

        So even if, in UAH, Greenland etc. are (unlike the Poles) not regions with ‘no data at all’, they nevertheless remain grey zones that professional users evaluate with great caution.

        But let us come back to GHCN: everybody in the world (mining engineers, land surveyors, etc etc) extrapolates to obtain data where there were none! So do climate people as well.

        Their problem: climate science is, worldwide, the field subject to the heaviest skepticism toward anything that can be put into doubt. Interpolation through kriging, which has never drawn a hint of objection anywhere else, suddenly becomes the target of harsh attack, especially by people who know neither what it is nor, a fortiori, how it works.

        As an alternative to bare GHCN, please have a look at e.g. Berkeley data (they don’t use GHCN):
        http://berkeleyearth.lbl.gov/regions/greenland

        3. Again, I am not talking about UAH.

        But again, I am. Simply because my reference to sparse UAH data being amazingly akin to the full dataset was an attempt to explain to you that you very often really don’t need full coverage to obtain a reasonable mean:

        See the discussion about that in the comment
        https://wattsupwiththat.com/2017/04/12/are-claimed-global-record-temperatures-valid/#comment-2474619.

        One of the best examples is a comparison of GHCN and UAH at latitudes 80-82.5N: there you have no more than 3 GHCN stations, a ridiculously small number, compared with the 144 UAH grid cells in that latitude band.

        You think these 3 stations could never be the source of an accurate average. But if you average the 3 UAH cells encompassing them, you obtain results quite similar to those from all 144 cells.
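        Whether a sparse, evenly spaced subsample can reproduce a full-grid mean is easy to test on synthetic data. A minimal sketch (the smooth field and the 72 x 132 grid are made up for illustration; this is not real UAH data):

```python
import numpy as np

# Synthetic "anomaly field" on a 72 x 132 grid (~9,504 cells, roughly
# the size of UAH's 2.5-degree grid).  Large-scale smooth structure
# plus small-scale noise -- all numbers are invented.
rng = np.random.default_rng(0)
lat = np.linspace(-90, 90, 72)
lon = np.linspace(-180, 180, 132)
LON, LAT = np.meshgrid(lon, lat)

field = (0.5 * np.sin(np.radians(LAT))
         + 0.3 * np.cos(np.radians(2 * LON))
         + 0.1 * rng.standard_normal(LAT.shape))

full_mean = field.mean()
subsample = field[::4, ::5]          # every 4th row, every 5th column (~5 %)
sub_mean = subsample.mean()

print(f"full grid mean : {full_mean:+.3f}")
print(f"~5 % subsample : {sub_mean:+.3f}")
```

        On a field this smooth the two means land within a few hundredths of each other; a real field with sharper spatial structure would agree less well, which is exactly what the choice of "evenly distributed" cells is meant to mitigate.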

      • Richard G. on April 12, 2017 at 9:50 am [2]

        My first answer, though incomplete, was long enough.

        I had tried to underline that considering GHCN data alone may be somewhat dangerous as soon as you transpose your opinion of it into an opinion of the Globe: you have to add ERSST to it.

        Let me add a bit in the same direction, as GISTEMP is a synthesis not only of GHCN and ERSST, but also of the so-called SCAR data (from the Scientific Committee on Antarctic Research), through which GISTEMP adds the Antarctic stations (over 40) listed below:
        https://legacy.bas.ac.uk/met/READER/surface/stationpt.html

        Don’t tell me “What??? GISTEMP? All adjusted data! They cool the past to warm the present!”.

        In grey you see GHCN unadjusted; in red, GISTEMP land-only; in blue, GISTEMP global.

      • SC on April 12, 2017 at 3:52 pm

        Not only that… but the highest increases in temperature observed over the entire planet surface are found in the United States Of America?

        How can this be?

        SC, the graph presented by Richard G deals with anomalies wrt UAH’s baseline (1981-2010); you tell us about absolute values. That is really quite different.

        I don’t have absolute temps for CONUS on this computer, but I do for the Globe (Jan 1880 – Dec 2016).

        Please compare the 20 highest anomalies (wrt mean of jan 1981 – dec 2010)

        1992 2 2.89
        1991 2 2.87
        1992 1 2.68
        2015 12 2.67
        1998 2 2.45
        2016 2 2.31
        2016 3 2.14
        1991 1 2.01
        1991 12 1.97
        1990 11 1.96
        1991 3 1.95
        1999 2 1.86
        2012 3 1.81
        1992 3 1.69
        1998 1 1.65
        2000 2 1.64
        1994 12 1.61
        1999 11 1.60
        1998 9 1.47
        2015 11 1.39

        with the 20 highest absolute values

        2006 7 22.94
        2012 7 22.90
        2002 7 22.87
        1901 7 22.80
        2010 7 22.68
        2005 7 22.59
        2011 7 22.56
        1998 7 22.53
        1999 7 22.51
        2016 7 22.50
        1995 8 22.42
        2003 7 22.42
        1995 7 22.39
        2003 8 22.38
        2001 8 22.35
        2008 7 22.35
        2010 8 22.31
        2007 8 22.29
        2015 7 22.29
        2016 8 22.28

        You’ll immediately understand what I mean, I guess :-)
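        The pattern in the two lists (anomaly leaders scattered across years and seasons, absolute leaders locked to July and August) falls out of any series with a large seasonal cycle sitting on top of a small trend. A toy sketch with synthetic monthly data (all numbers invented, not NOAA’s):

```python
import numpy as np

rng = np.random.default_rng(1)
m = np.arange(12 * 50)                                   # 50 years, monthly
season = 4.0 * np.sin(2 * np.pi * ((m % 12) - 3) / 12)   # peaks at index 6 (July)
trend = 0.01 * (m / 12)                                  # slow warming per year
absolute = 14.0 + season + trend + 0.05 * rng.standard_normal(m.size)

# Anomalies: subtract each calendar month's own long-term mean.
climatology = np.array([absolute[m % 12 == k].mean() for k in range(12)])
anomaly = absolute - climatology[m % 12]

abs_months = set((np.argsort(absolute)[-10:] % 12).tolist())
anom_months = set((np.argsort(anomaly)[-10:] % 12).tolist())
print("calendar months among 10 highest absolutes:", sorted(abs_months))
print("calendar months among 10 highest anomalies:", sorted(anom_months))
```

        The ten highest absolute values all fall in the month where the seasonal cycle peaks, while the ten highest anomalies come from late years in assorted months: ranking absolutes ranks the seasonal cycle, ranking anomalies ranks the departures from it.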

      • Smart Rock on April 12, 2017 at 8:14 pm

        They appear to be using 5 degree by 5 degree cells, which is absurdly coarse. You will note that cells defined by equal dimensions of latitude and longitude get smaller as they approach the poles, and become vanishingly small at the poles.

        Do you really believe, Smart Rock, that you are the only one on Earth able to think of such trivial matters and to integrate them into your work?

        Excuse me please: you behave here like an old teacher standing in front of 12-year-old students…
        Why don’t you suppose a priori that “they” do it right?

        Never heard of latitude cosine weighting, Smart Rock?
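        For readers who haven’t met it: latitude cosine weighting simply multiplies each cell by cos(latitude), since equal-angle cells shrink toward the poles. A minimal sketch on a hypothetical 5° x 5° grid, with a made-up field that is warmest at the poles so the effect is visible:

```python
import numpy as np

lat_centers = np.arange(-87.5, 90, 5)          # 36 latitude bands
lon_centers = np.arange(-177.5, 180, 5)        # 72 cells per band
w = np.cos(np.radians(lat_centers))            # cell area ~ cos(latitude)

# Hypothetical field: 0 at the equator, 1 at the poles.
field = (np.abs(lat_centers) / 90.0)[:, None] * np.ones(
    (lat_centers.size, lon_centers.size))

naive = field.mean()                                     # treats all cells alike
weighted = np.average(field, axis=0, weights=w).mean()   # area-weighted
print(f"naive cell mean   : {naive:.3f}")
print(f"cos-weighted mean : {weighted:.3f}")
```

        The naive mean over-counts the tiny polar cells; the cosine-weighted mean is noticeably lower here because the small polar cells carry the extreme values.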

      • “Bindidon April 12, 2017 at 1:37 pm

        “Richard G. on April 12, 2017 at 9:50 am

        1. Note that the ocean grid cells are actually land temperatures (i.e. islands) extrapolated out to fill the grid cells.”

        “No:
        – It is not GHCN’s role to give us information about the oceans: that is ERSST’s domain.
        – All GHCN and ERSST users (e.g., GISTEMP) have land and sea masks with finest resolution allowing them to accurately separate land and sea areas.”

        Ah, Bindy is back with condescending thread bombing and slinging red herring distractions.
        Just bin his falsehoods; absolutely nothing will be lost.
        (though one can be amused by binned’s long lists as thread bomb tactic)

        “Bindidon April 12, 2017 at 1:37 pm
        2. Check out Greenland and Antarctica, the lands of melting icecaps. Pathetic.

        The first thing I had to learn, Richard G., when inspecting UAH’s most detailed dataset (a 2.5° grid over the Globe) was that
        – though Roy Spencer claims full coverage in his zonal files, neither 82.5S-90S nor 82.5N-90N is present (all cells there contain an “invalid value”);
        – regions like Greenland, Tibet, the Andes (in fact, anything clearly above 1,500 m, see their readme file) are “poorly covered”.

        Nevertheless, nobody claims the UAH dataset to be insufficient, me included.

        How nice! A sweet binned personal history story that ends with nonsense.

        Technically, binned does end with two logical fallacies:
        A) argumentum ad ignorantiam: “regions like Greenland, Tibet, the Andes (in fact, anything clearly above 1,500 m, see their readme file)”
        – – This particular example also depends on circular reasoning

        B) Sophistic argument contradiction introducing paradox.
        – – “are “poorly covered”
        – – “Nevertheless, nobody claims the UAH dataset to be insufficient”

        By binned’s own words, a dataset is insufficient, then binned claims nobody claims the UAH dataset is insufficient.

        The strawman approach; introduce a contradictory strawman, argue against said strawman.

        “Bindidon April 12, 2017 at 1:37 pm
        3. A casual glance tells me that the “Global Average” is really an over sampling of North America, Europe, a swath of parts of Asia, a small fraction of Africa, a South America dominated by Argentina, Australia’s south and east.

        When processing UAH’s grid (9,504 cells), you understand that you can perfectly approximate the Globe with an averaging of no more than 512 evenly distributed cells, i.e. about 5 % of the whole.

        This means that we need far less information to build a global average than we imagine.

        Ah, more logical fallacies!
        -Composition Fallacy
        -Confirmation bias
        -Statistical Generalization
        -Statistical Average

        Base fallacy!
        “This means that we need far less information to build a global average than we imagine”

        Casually with condescension, binned substitutes gibberish for reality then makes a false claim; without parameters or confidence bounds.

        From all of this illogic and sophistry binned feels free to thread bomb using his fallacies.

        BOGUS!

      • ATheoK on April 14, 2017 at 12:17 pm

        Ah, more logical fallacies!
        -Composition Fallacy
        -Confirmation bias
        -Statistical Generalization
        -Statistical Average

        Base fallacy!
        “This means that we need far less information to build a global average than we imagine”

        Casually with condescension, binned substitutes gibberish for reality then makes a false claim; without parameters or confidence bounds.

        From all of this illogic and sophistry binned feels free to thread bomb using his fallacies.

        BOGUS!

        Typical nonsense produced by people who write a 30 cm long comment without even one mg of scientific contradiction.

        The perfect political antithesis to persons having interest in Science.

        Download UAH’s grid data, ATheoK, and start on some heavy working, instead of eructating your redundant, complacent and egocentric blah blah!

      • Bindidon April 13, 2017 at 9:02 am
        But let us come back to GHCN: everybody in the world (mining engineers, land surveyors, etc etc) extrapolates to obtain data where there were none! So do climate people as well.

        Extrapolation and interpolation work well when the trend of the data is well-established and the relationship can be expressed in an equation. Neither condition applies when dealing with temperatures between two points, or with temperatures beyond the end of the collected data. The result can’t be considered “data” either, but only a prediction. One doesn’t get to plug such predictions back into the calculation to “refine” the precision.
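        Whether interpolating between two stations is trustworthy depends on how smooth the field really is between them, and the two readings alone cannot tell you that. A toy sketch (synthetic 1-D fields in arbitrary units; the correlation lengths are invented):

```python
import numpy as np

rng = np.random.default_rng(2)

def make_field(corr_len, n=20000):
    """Synthetic 1-D 'temperature' field: white noise smoothed over corr_len."""
    white = rng.standard_normal(n + corr_len)
    kernel = np.ones(corr_len) / np.sqrt(corr_len)
    return np.convolve(white, kernel, mode="valid")[:n]

def interp_rmse(f, spacing=40):
    """RMSE of linearly interpolating the midpoint between 'stations'."""
    a = f[:-spacing]
    b = f[spacing:]
    mid = f[spacing // 2 : spacing // 2 + a.size]
    return np.sqrt(np.mean((mid - 0.5 * (a + b)) ** 2))

rmse_smooth = interp_rmse(make_field(corr_len=200))   # smooth vs station spacing
rmse_patchy = interp_rmse(make_field(corr_len=10))    # patchy vs station spacing
print(f"smooth field, midpoint RMSE: {rmse_smooth:.2f}")
print(f"patchy field, midpoint RMSE: {rmse_patchy:.2f}")
```

        When the field varies slowly relative to the station spacing, linear interpolation is nearly harmless; when the field’s correlation length is shorter than the spacing, the interpolated midpoint is little better than a guess, and nothing in the two endpoint readings warns you which regime you are in.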

      • Bindidon April 13, 2017 at 9:02 am
        “The Extended Reconstructed Sea Surface Temperature (ERSST) dataset is a global monthly sea surface temperature analysis derived from the International Comprehensive Ocean–Atmosphere Dataset with missing data filled in by statistical methods.”-Extended Reconstructed Sea Surface Temperature (ERSST) v3b.
        This is the description from the NOAA website: **with missing data filled in by statistical methods.**
        The description goes on to read:
        “This monthly analysis begins in January 1854 continuing to the present and includes anomalies computed with respect to a 1971–2000 monthly climatology. The newest version of ERSST, version 3b, is optimally tuned to exclude under-sampled regions for global averages. In contrast to version 3, ERSST v3b does not include satellite data, which were found to cause a cold bias significant enough to change the rankings of months.”
        When they say “tuned to exclude under-sampled regions for global averages”, this is a de facto admission of under-sampling in the historical data set. Read as *missing data*.

        You state: “See for example a picture displayed month by month on Nick Stokes’s moyhu web site, showing the integration of GHCN and ERSST:

        Be sure that this chart is a thoroughly correct integration of land and ocean data!”

        Let me correct your statement: this chart is a thoroughly correct integration of land and ocean data and MISSING data. (There, I fixed it for you.) It just doesn’t show where the missing data locations reside. It does pretend to show data where none exists. We know this from the GHCN mapping which is being honest about their deficiencies.

      • James Schrumpf on April 15, 2017 at 4:52 am

        Extrapolation …

        1. Extrapolation? Where did I mention that in the text you refer to?

        … and interpolation work well when the trend of the data is well-established and the relationship can be expressed in an equation.
        Neither of these apply when dealing with temperatures between two points…

        2. Wow! You are claiming strange things here! What about citing some credible source that a mining engineer specialised e.g. in kriging would immediately subscribe to?

        … or temps out beyond the end of the collected data. It can’t be considered “data” either, but only a prediction. One doesn’t get to plug that data into the calculation to “refine” the precision.

        3. Nobody spoke about that, especially didn’t I.

        My guess: you seem to have read my little comment in ‘ultradiagonal mode’. You didn’t react to what I wrote, but rather to what you thought I wrote.

      • Richard G. April 15, 2017 at 2:21 pm

        Let me correct your statement: this chart is a thoroughly correct integration of land and ocean data and MISSING data. It just doesn’t show where the missing data locations reside. It does pretend to show data where none exists.

        Yes, Richard.

        But interestingly, you are ONLY interested in missing data originating from SURFACE temperature measurements.

        Otherwise, you CERTAINLY would have shown some appropriate reaction to the information contained in my last post concerning the 90 % similarity between a full UAH grid dataset and laughable 5 % of it.

        But since it is SATELLITE data, everything seems to be OK for Richard G. :-)

        If I had time enough to do such an incredibly stoopid job, Richard G., I would carefully thin out the GHCN V3 and the ERSST V4 data, keeping about 10 % of them, and show you the result and how little it differs from the original.

        Unluckily, that’s a huge job compared with solely selecting 512 evenly distributed cells out of an array of 9,504!

    • With statistics anything is possible! And “journalists” are too dumb to know when they’re being snow-jobbed (or they just like reporting drama-filled stories and don’t care about truth).

      • Windchasers: I think you’re talking about discrete vs continuous distributions. The math is, I believe somewhat different although the difference(s) can often be safely ignored.

      • Windchasers: I think you’re talking about discrete vs continuous distributions. The math is, I believe somewhat different although the difference(s) can often be safely ignored.

        The math for estimating a probability distribution is a bit different; for a continuous distribution you can take ever-smaller intervals whereas you can’t for discrete.

        But generally? Yeah, they’re pretty similar. And things like the Laws of Large Numbers apply either way, as you’re talking about averages.

      • the Laws of Large Numbers apply either way
        ================
        The law of large numbers only applies for a constant average and distribution, such as a coin toss. Time-series data rarely satisfy this requirement.

    • Claiming accuracy of a hundredth of a degree by averaging instruments only accurate to half a degree?

      Well, the instruments are precise to 0.1 K, but their accuracy is questionable. In principle the precision increases with more measurements, but the accuracy may still be lacking. To be fair, the concept of temperature sounds well-defined, but in practice you measure it near a sunny and windy runway. Your results would be several degrees different in the nearby forest. So, what you need to do is adjust the forest’s cooling effect away. Using virtual temperatures the warming can easily be seen. /sarc

      • I think you misunderstand. Precision and accuracy are two different things. A measuring device with a precision of +-0.1 will generate a circle around a given point with a radius of 0.1. It doesn’t matter whether you take 10, 100, 1000, or 1,000,000 measurements. You can kid yourself that the center of the circle is the actual measurement, but it is not. Any given measurement is only somewhere within the circle. Accuracy is a whole different animal. That circle of precision may give you a reading of 5 degrees +-0.1, but the actual temperature is 0 degrees. This means the accuracy sucks, while the precision is really pretty good. Regardless this paper describes the problems with not using proper measurement techniques.

        After examining this over the years, I’ve come to the same conclusions as the author. A “global temperature” is a joke. It is someone’s idea that probably started out as a simplistic way of describing the earth; the media blew it up, and then government said, hey, let’s get involved too. I have never seen any definition of what it is in physical terms. If it is 75 degrees in Kansas, what should the temperature in Timbuctoo be? I know it takes a supercomputer to manipulate the data, but that is all it is: manipulation. There is no unique, defined way to say what a global temperature actually is. It can change on someone’s whim as to how the data should be manipulated.
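        The circle analogy above is easy to simulate. A minimal sketch (all numbers hypothetical): an instrument with a 5-degree systematic bias (poor accuracy) but only 0.1-degree random scatter (good precision):

```python
import numpy as np

rng = np.random.default_rng(3)
true_temp = 0.0          # the actual temperature
bias = 5.0               # systematic error: poor accuracy
sigma = 0.1              # random scatter: good precision

readings = true_temp + bias + sigma * rng.standard_normal(100_000)
mean_reading = readings.mean()

print(f"scatter of readings (precision): {readings.std():.4f}")
print(f"error of the mean   (accuracy) : {mean_reading - true_temp:.4f}")
```

        Averaging 100,000 readings pins the mean down to better than a thousandth of a degree, yet that mean is still five degrees wrong: averaging beats down random scatter, but no amount of it removes a systematic bias.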

    • “””””….. 1Note: One cannot take a single measurement, add it to itself a hundred times, and then divide by 100 to claim an order of magnitude increase in precision. …..”””””

      And one cannot take 100 totally separate events, and add them together, and divide by 100 and claim that as the AVERAGE event.

      I can recall two explosions which took place among a whole raft of explosions that happened a long time ago in the Japanese Islands.

      I suppose on average, those two really didn’t do much damage; on average.

      Well, tropical storm Sandy, over its lifetime, really didn’t do a whole lot of damage. It is only if you cherry-pick that part of its existence when it was in the vicinity of the US East Coast that it did anything much of note.

      Different events are different, because they are supposed to be different; they are NOT anomalous occurrences of some cookie cutter event.

      G

    • Well climatists don’t deal in Temperatures; they only deal in anomalies.

      Ergo, accuracy is not even a requirement; precision is sufficient.

      You are only dealing in what the thermometer says today relative to the average of what it has read during some base period. The real Temperature could be 10% off and not bother the anomaly results much at all.

      G

      • Unfortunately, the climastrologists are not working with precision either.

        One instrument’s temperature records may have some precision.
        The accuracy of that particular instrument is never validated or verified at any portion of diurnal, seasonal or solar cycles.
        Nor does the NOAA perform any sort of measurement tracking comparisons or calibration, certification when replacing equipment, changing landscape, burning nearby, structures reflecting sunlight during diurnal or solar cycles, etc.

        Isn’t it amazing that NOAA presumes to advise governments while using instruments that are not regularly calibrated or certified in place?
        Engineers in critical industries must send their measurement equipment in for calibration and recertification regularly.

        Then to hide NOAA’s error recognition avoidance, NOAA happily accumulates all of the disparate unique temperature measurement devices into one gross temperature field.

        None of the equipment installed is checked for accuracy or precision.
        Aggregating multitudes of uncalibrated, uncertified measurements yields sums of unknown value, without accuracy or known error bounds.

        The old “Do not look a gift horse in the mouth” approach to quasi-science.

      • george e. smith on April 13, 2017 at 4:07 pm

        Well climatists don’t deal in Temperatures; they only deal in anomalies.
        Ergo, accuracy is not even a requirement; precision is sufficient.

        You are only dealing in what the thermometer says today relative to the average of what it has read during some base period. The real Temperature could be 10% off and not bother the anomaly results much at all.

        Are you sure?

        1. Even UAH’s engineer-in-chief Roy Spencer publishes, every month, only anomaly-based temperature series. A man of science like him I wouldn’t like to call a ‘climatist’: that is imho too impolite. And above all, that does not at all mean UAH wouldn’t store any absolute data!

        2. NOAA, conversely, publishes no anomaly-based data for GHCN or for the IGRA radiosonde network. All the data there is absolute.

        3. You may at any time, as you certainly know, reconstruct absolute values out of anomalies: you just add the baseline’s absolute values to the anomalies (a yearly average to yearly anomalies, a monthly average to monthly anomalies, etc.).

        You must of course ensure that
        – you choose the correct reference or baseline to obtain the absolute values; getting absolute temperatures for the Southern Hemisphere on the base of a baseline computed for the Globe is of course nonsense;
        – the uncertainty of the baseline value does not exceedingly differ from that of the anomaly it is added to.

        4. What would be the sense of anomaly-based storage of e.g. GHCN data? GHCN’s users (NOAA itself, GISTEMP) all have different baselines, which would force each of them to shift from GHCN’s baseline to their own.

        5. According to ISO 5725, accuracy denotes how close a measurement is to a true value, while precision is the closeness of agreement among a set of results. Thus accuracy has nothing to do with either anomaly-based or absolute representation of what has been measured. Accuracy, as far as temperature measurement is concerned, is a matter of calibration.

        6. Important when dealing with temperature series is the uncertainty of trend estimates, i.e. the deviation from the series’ mean. And here it gets interesting, because this deviation increases with the difference between minima and maxima in annual cycles.

        Thus a trend estimate computed on the base of anomalies shows less deviation from the mean than it would be when absolute values are used.

        BEST Globe land-only trend estimate 1880-2013
        – using anomalies: 0.11 ± 0.02 °C / decade
        – using absolute values: 0.11 ± 0.28 °C / decade

        You see here that using absolute values, the standard error is more than twice the value to which it applies.

        It gets even worse when you compare e.g. Antarctic temperatures or its sea ice extent in the same way.

        7. Last but not least, temperature series and their trend estimates aren’t used by computers only: we like, or need, to have a look at them.

        Did you ever see a chart superposing for CONUS, the plots of
        – the full rural GHCN stations
        – all other ones
        – UAH6.0 USA48 aka CONUS
        using absolute data instead of anomalies wrt UAH’s baseline?
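        The effect described in point 6 can be reproduced on synthetic data: fit the same trend to a series with its seasonal cycle left in, and then to its anomalies. A sketch (the 0.011 °C/yr trend, 8 °C seasonal amplitude, and 0.2 °C noise are invented, chosen only to mimic the flavor of the BEST numbers quoted above):

```python
import numpy as np

rng = np.random.default_rng(4)
n_years = 130
t = np.arange(12 * n_years) / 12.0                 # time in years
season = 8.0 * np.sin(2 * np.pi * t)               # large annual cycle
absolute = 9.0 + 0.011 * t + season + 0.2 * rng.standard_normal(t.size)

month = np.arange(t.size) % 12
climatology = np.array([absolute[month == k].mean() for k in range(12)])
anomaly = absolute - climatology[month]            # seasonal cycle removed

def trend_and_se(y, t):
    """OLS slope and its standard error, converted to units per decade."""
    X = np.vstack([np.ones_like(t), t]).T
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = (resid ** 2).sum() / (len(t) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return 10 * beta[1], 10 * np.sqrt(cov[1, 1])

slope_abs, se_abs = trend_and_se(absolute, t)
slope_anom, se_anom = trend_and_se(anomaly, t)
print(f"absolute values: {slope_abs:.2f} ± {se_abs:.2f} per decade")
print(f"anomalies      : {slope_anom:.2f} ± {se_anom:.2f} per decade")
```

        Both fits recover essentially the same slope, but the absolute-value fit carries the whole seasonal cycle in its residuals, so its standard error is an order of magnitude larger, qualitatively the same contrast as the 0.11 ± 0.02 versus 0.11 ± 0.28 figures quoted above.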

  2. Having been involved with AWOS-3 calibration for a number of years (15+), I always thought it was unreasonable for anyone to use more than one digit right of the decimal for any purpose, given that up to the late 1990s our external standards were still analog thermometers, with the attendant bias of the reader judging the curve.

    • @ Wyatt

      Given that up to the late 1990s, our external standards were still analog thermometers and the potential bias of the reader judging the curve.

      And the literal fact is that only a few dozen people are willing to publicly admit that the “near-surface air temperature data” stored somewhere in the “bowels” of the NWS and referred to as the “Historical Temperature Record” is of no value whatsoever to anyone anywhere other than local, regional, or national weather reporters/forecasters, for the purpose of “appeasing their audiences” by citing the “date and location of record-breaking temperatures”.

  3. The implication here is that there is a limit to how much the precision can be increased. Thus, while the definition of the Standard Error of the Mean is the Standard Deviation of samples divided by the square-root of the number of samples, the process cannot be repeated indefinitely to obtain any precision desired!

    Sorry, but this is mathematically wrong. The Law of Large Numbers explains how with enough trials, you can get to an arbitrary degree of precision.

    Fundamentally, one only has one chance to determine the precise temperature at a site, at a particular time.

    Sure. But when you’re trying to measure the global temperature anomaly over, say, a year, then all those individual measurements add up. You have both spatial and temporal variations that largely wash out. (If you have a cold front today, it’ll probably warm up tomorrow. Or if it’s colder than average here, it’s generally warmer than average somewhere else.)
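    The textbook case behind this claim is n independent measurements of one fixed quantity, where the standard error of the mean shrinks as 1/sqrt(n). A minimal sketch (hypothetical numbers); whether field temperatures actually satisfy the independence assumption is exactly what is disputed in this thread:

```python
import numpy as np

rng = np.random.default_rng(5)
true_value = 20.0      # the single quantity being measured
sigma = 0.5            # scatter of one measurement

results = {}
for n in (10, 1000, 100000):
    sample = true_value + sigma * rng.standard_normal(n)
    sem = sample.std(ddof=1) / np.sqrt(n)   # standard error of the mean
    results[n] = (sample.mean(), sem)
    print(f"n={n:>6}: mean={results[n][0]:.4f}, SEM={results[n][1]:.4f}")
```

        Each hundredfold increase in n cuts the standard error of the mean by about a factor of ten, exactly the 1/sqrt(n) behavior under the stated independence assumption.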

    • The claim that spatial and temporal variations wash out is a complete lie. They don’t. Spatial and temporal variations actually decrease the accuracy of your assumed readings, making the error bars bigger, not smaller.

      • The claim that spatial and temporal variations wash out is a complete lie. They don’t. Spatial and temporal variations actually decrease the accuracy of your assumed readings, making the error bars bigger, not smaller.

        Of course spatial and temporal variations will increase the error bars relative to the case where there are no variations. Yeah, sure, the temperature would be easier to measure consistently if it was literally the same everywhere at every time.

        But measuring and averaging spatial and temporal variations still decreases the uncertainty. I.e., the uncertainty of a given day is going to be higher than the uncertainty of a given year. Local variation is higher than global variation. Etc.

        Yes? You agree that annually-averaged variations are less than daily variations. The error bars are less for the former. They literally have to be, mathematically.

      • Not even close to being true.

        Great, then show it. Should be easy, neh?

        Take a sample of time-varying data, distributed however you like. I can show you that the standard deviation of the average will decrease as the length of the timespan you’re averaging over increases. Heck, it’s already shown; that’s what the Law of Large Numbers is.
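        Windchaser’s claim is straightforward to check on a synthetic stationary series. A sketch (a made-up daily series with a seasonal cycle and weather noise and no trend; a trending series would behave differently):

```python
import numpy as np

rng = np.random.default_rng(6)
days = 365 * 200
doy = np.arange(days)
# Synthetic daily series: seasonal cycle plus weather noise (no trend).
x = 10.0 * np.sin(2 * np.pi * doy / 365) + 3.0 * rng.standard_normal(days)

def sd_of_block_means(series, block):
    """Standard deviation of non-overlapping block averages."""
    m = (series.size // block) * block
    return series[:m].reshape(-1, block).mean(axis=1).std()

sds = {L: sd_of_block_means(x, L) for L in (1, 30, 365)}
for L, s in sds.items():
    print(f"averaging window {L:>3} days: SD of averages = {s:.2f}")
```

        The spread of the averages drops as the window lengthens, and collapses once the window spans a full seasonal cycle; that is the sense in which daily variation is larger than annual variation.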

      • Windchaser, I’ve already done that. Shown why what you seek to do is invalid.

        Nah, you just said that it was invalid.

        You need to mathematically back up your claim that spatial and temporal variations don’t “wash out”; that an averaged set of measurements over some time period doesn’t have a smaller standard deviation than individual measurements.

        Spatial variations in temperature anomaly are actually anti-correlated, such that averaging them reduces the standard deviation faster than you’d expect from normally-distributed variations. And this makes sense, of course – often, one place being warmer than usual is offset by someplace else being cooler, due to physical phenomena like weather fronts, the change of seasons, etc.

      • Windchasers,
        A requirement for the use of the Law of Large Numbers is that the data be IID. We know that temperature data is not, so the LLN is not applicable.

      • A requirement for the use of the Law of Large Numbers is that the data be IID. We know that temperature data is not, so the LLN is not applicable.

        No, the strong law of large numbers does not require independent and identically distributed samples. Not having IID just changes the rate of convergence; it doesn’t change the fact that convergence exists. So you use timespans or spatial lengths long enough that the correlation is low.

        On Earth, the fact that spatial variations are anti-correlated increases the rate of convergence compared to the individual measurements. Again: you can demonstrate this mathematically.
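        The rate-of-convergence point can be illustrated with an AR(1) series, a standard model for positively autocorrelated data (the parameters here are invented). For large n the variance of the mean is inflated by roughly (1+phi)/(1-phi) relative to the IID case, i.e. the effective sample size shrinks, but the mean still converges:

```python
import numpy as np

rng = np.random.default_rng(7)
n, phi, trials = 2000, 0.7, 500

means = np.empty(trials)
for t in range(trials):
    eps = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = eps[0] / np.sqrt(1 - phi**2)   # start in the stationary distribution
    for i in range(1, n):
        x[i] = phi * x[i - 1] + eps[i]    # AR(1): each value drags the last
    means[t] = x.mean()

sigma2 = 1.0 / (1 - phi**2)                # stationary variance of x
var_iid = sigma2 / n                       # what IID samples would give
var_ar1 = var_iid * (1 + phi) / (1 - phi)  # large-n AR(1) correction
emp_var = means.var()
print(f"IID prediction  : {var_iid:.5f}")
print(f"AR(1) prediction: {var_ar1:.5f}")
print(f"observed        : {emp_var:.5f}")
```

        With phi = 0.7 the 2,000 correlated points behave like roughly 2000 × (1−phi)/(1+phi) ≈ 350 independent ones: convergence is slower, not absent. (Anti-correlation, the case Windchasers describes for spatial anomalies, pushes the correction the other way.)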

      • Windchasers,
        The LLN, both the weak and strong versions, assume that the data samples are IID. There are some special exceptions where identical distributions are not required, however the temperature data sets used in this case do not seem to meet the conditions of any of them. Perhaps you could elaborate on why you think they do. Also, in order to determine the effect of the non-identical distributions, you would need to first characterize all the distributions involved, which has not been done, and I don’t think can be done. In my opinion you are on very weak footing here, and if you are going to resort to special pleading, you had better have your facts (and data) lined up and well presented. Otherwise it’s just hand-waving and guessing.

      • “often, one place being warmer than usual is offset by someplace else being cooler”

        That’s just garbage. With weather, cloud cover often cools large areas at the same time. Look at the power generation figures from wind power. Often large areas are left without any wind at all.

      • Windchasers
        April 12, 2017 at 10:53 am

        .. often, one place being warmer than usual is offset by someplace else being cooler due to physical phenomena like weather fronts, the change of seasons, etc.

        That could be true for heat content, but not for temperature.

        You can argue that the total heat stored on Earth does not change (or that it changes steadily through changes in solar output, the amount of energy radiating out to space, etc.), that this energy is what is being transferred from one place to another, and that you can calculate Earth’s heat content by measuring heat content in different places and averaging the result, so the more values you have, the better the precision.

        But that is not true in the case of temperature. A gram of ice at 0 °C has the same temperature as a gram of water at 0 °C, but they do not have the same heat content. The same happens with air at low humidity and air at the same temperature but high humidity: same temperature, but different heat content, different values for energy.

      • “often, one place being warmer than usual is offset by someplace else being cooler”

        This is an assumption on your part. An actual review of the climate network showed that the vast majority of contaminations warmed the record.

      • PS, your assumption that one place would be warmer while another place was colder would only be true if we had an adequate number of sensors distributed over the entire surface of the earth.
        The fact that less than 5% of the world’s surface is adequately instrumented, and that 85% of the planet is close enough to totally uninstrumented that the difference isn’t measurable, puts the lie to that claim.

      • Windchasers
        April 12, 2017 at 9:29 am

        MarkW
        April 12, 2017 at 9:25 am
        The claim that spatial and temporal variations wash out is a complete lie. They don’t. Spatial and temporal variations actually decrease the accuracy of your assumed readings, making the error bars bigger, not smaller.

        Of course spatial and temporal variations will increase the error bars relative to the case where there are no variations. Yeah, sure, the temperature would be easier to measure consistently if it was literally the same everywhere at every time.

        But measuring and averaging spatial and temporal variations still decreases the uncertainty. I.e., the uncertainty of a given day is going to be higher than the uncertainty of a given year. Local variation is higher than global variation. Etc.

        Yes? You agree that annually-averaged variations are less than daily variations. The error bars are less for the former. They literally have to be, mathematically.

        The Law of Large Numbers works when one is taking multiple measurements of the same thing, such as a thousand measurements of the length of a board — or a thousand measurements of the temperature in a room. It doesn’t work if one takes a single measurement of a thousand different boards and then tries to claim one has the average length of “a board.” One can claim to have measured the average length of “a board,” but even so, since they are all different boards with one measurement each, one can’t use the multiple measurements to improve the precision. One has to use the precision of one measurement.
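        The board example separates cleanly into two simulations. A sketch (hypothetical numbers) of what averaging does and does not buy in each case:

```python
import numpy as np

rng = np.random.default_rng(8)
sigma_meas = 0.5     # scatter of a single measurement (hypothetical units)
n = 1000

# Case 1: one board measured n times -- the mean homes in on its true length.
board = 600.0
repeats = board + sigma_meas * rng.standard_normal(n)
err_one = abs(repeats.mean() - board)

# Case 2: n different boards, one measurement each.  The average pins
# down the *population* mean, but each individual board is still known
# only to +/- sigma_meas.
true_lengths = 600.0 + 20.0 * rng.standard_normal(n)   # real board variation
measured = true_lengths + sigma_meas * rng.standard_normal(n)
err_pop_mean = abs(measured.mean() - true_lengths.mean())
mean_err_each = np.abs(measured - true_lengths).mean()

print(f"one board, n repeats: error of its mean      = {err_one:.3f}")
print(f"n boards, one each  : error of population mean = {err_pop_mean:.3f}")
print(f"n boards, one each  : typical error per board  = {mean_err_each:.3f}")
```

        Averaging improves the precision of the *mean* in both cases; what it never improves is the precision of any single board (or any single site’s temperature), which stays at the single-measurement level. Which of those two quantities a “global average” claim actually refers to is the crux of this exchange.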

    • Not if you violate identicality. The limiting factors of each measurement, as in the article, must be negligible with respect to the magnitude and variation of the thing you are measuring. Otherwise you cannot apply large-number processes such as the CLT.

    • Windchasers,
      I refer you back to the quote by Smirnoff (1961) regarding a need for variability to avoid redundancy. You don’t address the estimate of the standard deviation for global temperatures being 3 or 4 orders of magnitude greater than the precision commonly quoted.

      • You don’t address the estimate of the standard deviation for global temperatures being 3 or 4 orders of magnitude greater than the precision commonly quoted.

        The precision quoted is for individual thermometers. As long as you’re using more than one thermometer, the average global temperature anomaly has greater precision due to the Law of Large Numbers.

        Basically: when you gather more data around the globe, you can get a more-precise estimate of the global temperature.

      • Windchasers- while you can measure temperatures anywhere you want, global average air temperature is not a useful number, no matter how accurate and precise the measurements are. Temperature is an intensive property. The climate is a heat engine, driven by energy differences: extensive properties. You can put buckets of water half filled with ice, each with a thermometer, anywhere on earth and measure some average temperature very close to 0 °C. It won’t tell you anything at all about whether or not the ice in any or all of the buckets is melting.

        Clyde is entirely right. The temperature record is not nearly as accurate, extensive, or useful as most presentations imply. The way the record was developed, and why, has resulted in many attempts to “improve” the data with questionable, ham-handed corrections after the fact.

      • Windchasers April 12, 2017 at 10:56 am
        The precision quoted is for individual thermometers. As long as you’re using more than one thermometer, the average global temperature anomaly has greater precision due to the Law of Large Numbers.

        Basically: when you gather more data around the globe, you can get a more-precise estimate of the global temperature.

        No!
        Errors accumulate! Adding up large numbers of measurements from disparate, unique temperature measurement devices, ranging from eyeballed thermometer readings up to badly installed modern platinum thermistors, not forgetting NOAA’s happy habit of substituting data in lieu of missing, corrupted, or unacceptable readings, is wrong!

        One temperature instrument’s reading undergoes multiple stages from instrument to NOAA database.
        Each stage introduces error possibilities.

        Nothing in industry is manufactured without engineers thoroughly testing each stage for error rates. Unacceptable error rates force stage re-engineering. Low or transient errors are tolerated, but tracked.

        NOAA has not performed accuracy or error analysis for any of their stages from instrument to presentation.

        In Engineering, it is unacceptable to aggregate disparate measurements without rigorous verification.
        I already mentioned engineers having to send in their measurement equipment regularly for calibration and recertification. Just as gasoline pumps have calibration/certification stickers dated and signed; so do weight scales throughout retail and industry, personal cars are inspected, as are elevators and many other important items in use.
        NOAA has not performed equipment calibration or certification! The accuracy levels stated by NOAA are laboratory-conditions accuracy, not the accuracy of an instrument installed in a badly sited temperature measurement station.

        One temperature station, without error determination, can generate approximate temperatures for one very small spot. Averages for that geographical spot are only viable when stated with their diurnal and seasonal ranges. Much as realtors use a general temperature graphic for their location area.

        Pick any temperature station, hold a temperature monitor and walk the surrounding area and you will find a range of temperatures. The differences might be small, but they exist.

        Assumption that one nice anomaly number can represent the globe is only valid when every point of surface through all altitudes is carefully measured, errors constrained or controlled, 360° by 360°.

        NOAA’s anomaly approach is a joke in comparison; embarrassing in reality.
        For amusement, read NOAA’s newsletter “What’s in that MMTS beehive anyway”; not long ago I also read about an ocean-based maintenance staff’s efforts to clear buoys of algae, barnacles, pelican and gull feces, etc.

        Accurate or precise ocean or surface temperature averages? Yeah, right.

        Nor do I see how the “strong law of large numbers” works for data sets containing negative numbers with large error variances, or for NOAA’s datum manipulations.

      • PS Windchasers:

        I like most of your comments in this thread. My above response is the only quibble with your comments so far.

        Philo: Excellent comment!
        I especially agree with your praise for Clyde Spencer. (should we wonder about last names?)

    • you can get to an arbitrary degree of precision

      Who would care if the numbers still need large adjustments for their uncertain biases?

      Or the other way around, walking around a sunny day you spot hot and cold places, and then someone defines the temperature anomaly to a hundredth of a Kelvin using measurements from elsewhere in space and time, and a climate model. What good does that representation do? It tells nothing about your sunny day.

    • Windchasers: I think you are misinterpreting the law of large numbers and conflating it with statistical sampling methods. If I have measurements of some property of 100 samples taken from a population of 10,000 items, I might get an average of 50 and a Standard Deviation of 10. Assuming a normal distribution, there would be about 95 samples in the range of 30 to 70. But this data also allows me to estimate the likely mean of the population from which the sample was drawn. This is where we divide the sample SD by the square root of the number of samples. In this case the population mean is estimated at 50 +/- 2 @ 95% confidence (50 +/- 2 x SD/SQRT(N)). BUT, this is only applicable if the sample was drawn truly at random, i.e. every one of the 10,000 members of the population had an equal chance of being selected.

      There is simply no way of producing a representative random sample of global surface temperatures from the existing surface temperature record. Therefore, the fact that AGW alarmists use data sets which average millions of data points does not mean that they can legitimately claim to know the global average temperature with any high precision. The only system that has the potential to produce a valid measurement of a global temperature is by satellites designed for the purpose.

      It can be argued that the data sets used and the annual averages can still be a reasonable indicator of long term trends. However, this would require that same instruments with the same accuracy in the same locations with the same surrounding conditions be maintained over the entire time span. The fact that this is not the case has been well documented. Thus we see the ongoing “adjustments” of the historical data to try and compensate for “known” biases (UHI, TOB, screen type, etc.).

      I think Mr. Spencer has made a very sound argument and temperature data presented to more than one decimal is nonsense. In fact I think all such data should be considered no better than +/- 1 C. Oh, and the paleo data (ice cores, tree rings, etc.) has to be much worse.
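The sampling arithmetic in the comment above (a random sample of 100 from a population of 10,000, mean 50, SD 10, giving 50 ± 2 at ~95% confidence) can be reproduced in a few lines. The population here is synthetic, which is exactly the commenter's caveat: the interval is only valid because the sample really is drawn at random:

```python
import random

random.seed(1)

# Hypothetical population of 10,000 items with mean ~50 and SD ~10.
population = [random.gauss(50, 10) for _ in range(10_000)]

# A truly random sample of 100: every member equally likely to be picked.
sample = random.sample(population, 100)

n = len(sample)
mean = sum(sample) / n
sd = (sum((x - mean) ** 2 for x in sample) / (n - 1)) ** 0.5
sem = sd / n ** 0.5                               # standard error of the mean
ci_low, ci_high = mean - 2 * sem, mean + 2 * sem  # ~95% confidence interval

print(f"sample mean = {mean:.1f}, SD = {sd:.1f}")
print(f"~95% CI for the population mean: {ci_low:.1f} to {ci_high:.1f}")
```

Replace `random.sample` with any non-random selection rule and the interval no longer means what it claims to mean, however many data points are averaged.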

      • Rick,
        I am arguing that if one measures a parameter with a fixed value, then they are justified in quoting the Standard Error of the Mean (i.e. dividing the standard deviation by the SQRT of the number of samples). However, if what is being measured is something that is varying, and is used to construct an estimate of the central tendency of the measurements, then one must use the Standard Deviation as the estimate of the uncertainty, and it makes no sense to use significant figures much smaller than the SD.

      • However, if what is being measured is something that is varying, and is used to construct an estimate of the central tendency of the measurements, then one must use the Standard Deviation as the estimate of the uncertainty, and it makes no sense to use significant figures much smaller than the SD.

        Right – but what we’re doing here is taking individual measurements and using them to create a spatial and temporal average. The goal isn’t the individual measurements; the goal is the average.

        The actual average temperature of the Earth’s surface at any given moment in time is a fixed number. But you have to take a lot of measurements from around the globe to calculate it. You could calculate it using only one sample, but you’d have very large uncertainty. More measurements == less uncertainty for the average.

        The same also applies to the temporal average. If you were trying to calculate the monthly average of temperature where you are, and you only took temperatures one day, you’d have a high uncertainty. Take measurements once a week, and your uncertainty will be lower. Every minute, and your uncertainty will have come down quite a bit.

      • Clyde: Yes, I do understand your point. However, there are always at least three components to instrument measurement uncertainty – the standard error as you describe, 1/2 of the instrument resolution, and the stated uncertainty of the calibration reference. There are typically other sources of uncertainty such as drift, environmental effects, etc. The actual MU is the SQRT of the sum of the squares of these uncertainties.

        I’ve used many types of temperature measurement systems for over 35 years in a laboratory setting and can say that temperature measurement in most cases with an MU of less than +/- 1 F (0.5 C) is challenging and expensive.
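The root-sum-square combination described here is easy to make concrete. The component values below are purely illustrative, not Rick's actual figures:

```python
# Root-sum-square (RSS) combination of independent uncertainty components.
# All component values below are illustrative, in degrees C.
components = {
    "standard error of repeated readings": 0.05,
    "half the instrument resolution":      0.05,  # e.g. a 0.1 C readout
    "calibration reference uncertainty":   0.10,
    "drift and environmental effects":     0.20,
}

mu = sum(u ** 2 for u in components.values()) ** 0.5
print(f"combined measurement uncertainty ~ +/-{mu:.2f} C")
```

Note how the combined uncertainty is dominated by the largest component: shrinking the standard error by taking more readings does little once drift or calibration uncertainty is the bigger term.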

      • Windchaser, by taking lots of samples scattered over the whole earth, you are only narrowing one of the three possible sources of error. That being the sampling error.
        Regardless, to get down to an accuracy of 0.01 C, you would first need sensors accurate to 0.01 C, and then you would need to put one sensor every 10 to 15 feet over the entire surface of the earth, including the oceans.
        And that would be one measurement.
        If you took another measurement a couple of minutes later, you could not average the first reading with the second, because you aren’t measuring the same thing. You are measuring the temperature a couple of minutes after the first reading and any rational person would expect the two numbers to be different.
        For the LLN to apply, each measurement must be of the same thing.

      • MarkW:

        I agree with most of your statement.

        Each temperature station is one component of a multistage process resulting in a number posted into a database.

        Each stage has a very real possibility of error that, as far as I (we) have ever heard, has never been measured.

        Then there is the NOAA habit of manipulating recorded temperatures. Each manipulation is another stage that introduces errors.

        e.g., courtesy of “Quality Control and Improvement for Multistage Systems: A Survey,” Jianjun Shi and Shiyu Zhou, April 2009:
        Multistage process illustration

    • Well your law of large numbers presupposes that each of your trials is an observation of exactly the same “event”.

      If you use a meter stick to measure the width of a ditch you just dug with a spade and shovel, you are unlikely to ever repeat the measurement you just made, so the precision does not necessarily increase indefinitely.

      On the other hand, if your experiment consists of reading the number written down on a piece of paper, you can reasonably expect that, given enough time, you will read it and get the correct number.

      G

    • Windchasers: Your chemistry professor would rightfully lambaste you for spouting drivel about the law of large numbers. It does NOT apply in this case. These are measured values. The degree of precision of a measured value is determined solely by the quality of the instrument used to take the measurement. When dealing with measured values, the only law that applies is the law of significant figures. The law of large numbers is completely irrelevant.

      • Louis you are correct about precision in an individual measurement. But when you begin to deal with large sets of measurements, specifically averages and the confidence intervals associated with large sets of numbers….. standard error applies as a function of standard deviation.

      • Most of the data sets contain temperature measurements, so they in fact are measuring the same thing.

      • Even you should be smart enough to realize that the temperature in LA is not the same thing as the temperature in Boston. They are two separate measurements of two separate points.

    • Windchasers:
      This is a teachable moment on why climastrologists are full of crap:

      They don’t follow the scientific method.
      They don’t follow basic rules of mathematics regarding measured values.
      They follow Michael Mann’s mantra: “If you can’t dazzle them with brilliance, baffle them with bullshit statistics”.
      Then they tell you you’re too stupid to comprehend and follow their logic.

  4. I have read somewhere that the average temperature on the moon is minus 60ºC.
    In our atmosphere, that uniform temperature occurs at about 12 km altitude.
    12 km × 9.8ºC/km (the dry gravitational lapse rate) gives -60 + 117.6 = 57.6ºC at the surface.
    12 km × 6.5ºC/km (the average lapse rate, reduced by water vapour) gives -60 + 78 = 18ºC at the surface.

  5. there are numerous data handling practices……running an algorithm that retroactively changes past temperature data

    In effect you will never know what the current temperature slope is….

  6. I believe this was implied in your article, but I would like to expand on it.
    If I use the same instrument to take a measurement of a thing three times, it is possible that the average of the three readings can be more accurate than any individual reading.
    The proviso here is that I must be measuring the same thing, using the same instrument.

    If I take a temperature measurement, reset the thermometer, take another reading; lather, rinse, repeat a couple of times. Assuming the temperature is varying slowly enough that there is no measurable change between my readings, then, provided all the other things mentioned in the article are true, it can be argued that the average is more accurate than any individual measurement.

    However if I take one reading today, another reading tomorrow, and yet another on the day after, even if I am using the same instrument, then averaging these readings does not improve accuracy because I’m not measuring the same thing anymore.
    This is even more true if I’m taking temperature measurements of places that are hundreds of miles apart, each measurement using a different instrument.

    • However if I take one reading today, another reading tomorrow, and yet another on the day after, even if I am using the same instrument, then averaging these readings does not improve accuracy because I’m not measuring the same thing anymore.

      The accuracy of a given temperature reading at a given time doesn’t improve, sure.

      But taking more readings at more places and more times does improve the global temperature anomaly over a given time period.

      • That’s not correct. Taking more readings of different things does not improve the accuracy of any individual reading.
        You can repeat that lie all you want, but it still remains a lie.

      • Taking more readings of different things does not improve the accuracy of any individual reading.

        I agree. Read my post again.

        The accuracy of the individual readings does not improve. The accuracy of the average does.

      • To rephrase, one is taking multiple measurements of different things, not multiple measurements of the same thing. Accuracy does not increase in that case.

      • That’s only true when certain conditions, as spelled out in the article are true.
        None of those things are true when you attempt to average temperature readings taken hours to days apart and hundreds of miles from each other. It is also never true when not using the same instrument for each reading.

      • Windchasers,
        You said, “But taking more readings at more places and more times does improve the global temperature anomaly over a given time period.” That is the assumption many make, but is exactly the point I’m contesting!

      • But taking more readings at more places and more times does improve the global temperature anomaly over a given time period.

        NO. Absolutely NOT. Your “method” is a fundamental violation of the prerequisites for applying the Law of Large Numbers. You clearly did not read the article, much less any of the references.

      • NO. Absolutely NOT. Your “method” is a fundamental violation of the prerequisites for applying the Law of Large Numbers. You clearly did not read the article, much less any of the references.

        I read the article, why else would I be quoting from it to argue against it? O.o And I’ve got a couple of probability and statistics books on my bookshelf, so I don’t know why I’d need to look up the author’s textbooks. I mean, if he’s wrong, he’s wrong. The fact that he didn’t actually lay out any of his equations or show how the assumptions are mathematically violated should really give you pause.

        The only prerequisite for the LLN is that there’s not a systematic bias. For temperature anomaly, this would mean a changing systematic bias (e.g., switching between bucket or engine intake measurements for sea surface), as non-changing systematic biases are discarded by the fact that you’re considering the anomaly, not the absolute temperature.

        Any uncertainty in systematic bias will get carried through to the final result, and it’s appropriate that uncertainties around things like the change in sea surface methodology do get carried through to the final result.

        But this post wasn’t even about that. The author specifically says so. He’s taking issue with something that can be resolved through the LLN; how the precision of individual instruments is carried through to the global average.

        Meaning, even if there were no systematic biases, the author’s argument is still that you can’t take individual measurements and get a global average with a higher precision. And this is straight-up, 100% mathematically incorrect.
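The two claims being traded in this exchange can be separated with a small simulation (all figures hypothetical). Averaging many readings beats down independent random error at roughly sigma/sqrt(N), but a bias shared across the whole network passes straight through to the average:

```python
import random

random.seed(7)

N = 10_000
TRUE_ANOMALY = 0.30   # hypothetical "true" anomaly, C
RANDOM_SIGMA = 0.50   # independent per-reading random error, C
SHARED_BIAS = 0.20    # hypothetical bias common to the whole network, C

readings = [TRUE_ANOMALY + random.gauss(0, RANDOM_SIGMA) for _ in range(N)]
biased = [r + SHARED_BIAS for r in readings]

mean_random = sum(readings) / N
mean_biased = sum(biased) / N

# The random component averages down toward sigma/sqrt(N) ~ 0.005 C,
# but the shared bias survives the averaging completely intact.
print(round(mean_random - TRUE_ANOMALY, 3))
print(round(mean_biased - TRUE_ANOMALY, 3))
```

Which situation the real station network is in — independent random errors, or shared and changing biases — is precisely what the two sides here disagree about.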

      • if there were no systematic biases

        Yes, if there are systematic biases or if the defined variable is artificial or badly defined to some extent, the accuracy is questionable.

        It’s like measuring my height in micrometers. It can’t be done sensibly. Mathematicians too often fall into the trap of thinking reality is simply reducible to numbers. Real numbers seldom represent reality well. The reals are a one-dimensional model of reality, but reality is not one-dimensional.

      • “””””….. The accuracy of the individual readings does not improve. The accuracy of the average does. …..”””””

        Well sorry. The average is ALWAYS exact.

        You cannot take the average, or any other statistical-mathematical-algorithmic result, of variable quantities.

        Statistics only works on exactly known finite real numbers; and then only on a finite number of such numbers. It is a purely mathematical result and it is always exact.

        Where the discrepancies come in is in the assignment of a real finite exact number to an observation. That may be difficult to get exact.

        Nothing in the universe pays any attention to an average; it has NO meaning other than to the statisticians, who originally defined what “average” is. You can not even observe an average if one should occur; you simply would never know it was average.

        G

      • Go back and read the article again. Concentrate on the difference between accuracy and precision. I would advise you to think in terms of shooting a rifle at a target at 100 yards. Some sniper stuff on the internet will help explain. Come back and tell us what minutes of angle means in terms of precision. Then tell us what point of aim means in terms of accuracy.

        Any given instrument has a certain accuracy and precision. Trying to average multiple instruments with their own accuracy/precision just doesn’t give you anything worthwhile. You have no way to judge if their inaccuracies add together or offset each other.

      • “Windchasers April 12, 2017 at 9:55 am

        “Taking more readings of different things does not improve the accuracy of any individual reading.”

        I agree. Read my post again.

        The accuracy of the individual readings does not improve. The accuracy of the average does.”

        Not the way NOAA and many other alleged world authorities compute averages.

        From my post above:

        “Each temperature station is one component of a multistage process resulting in a number posted into a database.

        Each stage has a very real possibility of error that, as far as I (we) have ever heard, has never been measured.

        Then there is the NOAA habit of manipulating recorded temperatures. Each manipulation is another stage that introduces errors.

        e.g., courtesy of “Quality Control and Improvement for Multistage Systems: A Survey,” Jianjun Shi and Shiyu Zhou, April 2009:
        https://www.dropbox.com/sc/4jyeo2pzq6yjtfh/AACzlHaJfK1cLT5j8vqlx1xla

        Unless every error rate is meticulously determined for every temperature station, for every stage that measurement undergoes; there is no accurate or precise average.

        This is before pointing out that temperatures are allegedly collected at the same exact time of day around the world. I seriously doubt that stations worldwide are accurately adjusted to Greenwich time before posting to databases.

        The Law of Large Numbers is useless when faced with errors introduced on an essentially unlimited basis, thanks to irregular temperature adjustments during every temperature aggregation run.

    • MarkW,
      That is essentially what I was saying. We are not measuring the temperature of the Earth’s atmosphere, but instead, a varying parameter that is used to characterize a hypothetical construct that tells us something about what kind of temperatures were experienced historically. The average tells us the point about which measurements cluster, and the standard deviation tells us how much variability was experienced.

  7. Like so much of climate science, “hottest year on record” is a political construct reverberating in an echo chamber. Lay persons like myself, given reasonable information by those with more expertise, can understand the limitations of the science. Popular media outlets never write stories about average temperature years or fewer dramatic weather events. AGW plays right into the need for sensationalized news to grab an audience. Unfortunately it is coming from folks wearing the mantle of the scientific method.

    Combining sensationalized science with government is what we are hopefully disassembling in the US now.

    • “Like so much of climate science “hottest year on record” is a political construct”

      Central Texas experienced the “hottest March” on record this year. In the newspaper article, the tone suggested “Climate Change is Coming to Get Us Sooner Than We Thought!” Funny thing, according to my Central Texas utility, whose bill has a graphic showing the average temp for the period, March was TWO degrees cooler than the same period of 2015. So, one station’s “hottest month ever” is not even true on a regional basis?

  8. An excellent article Clyde. I agree with you about different instrumentation and different techniques. I remember from my physics at school that mercury is far more accurate than alcohol and has a wider useful range. When temperature ranges are being discussed, though, absolute measurement from different sites with different types of thermometer cannot be relevant. For instance, if one thermometer is reading 2 degrees Celsius higher than the actual temperature, then the range of temperature over several years is the only factor of any relevance. I assume that the satellite measurements, though, are correct and have shown no warming at all apart from during the last El Nino, which was widely predicted before the event due to its expected strength.

  9. Another issue is that historical data only measured daily low and daily high temperatures.
    Trying to calculate a daily average temperature from that is a fool’s errand.

  10. Way back when, there was a saying going around that there was nobody more likely to be wrong about reality than a businessman who’d just discovered spreadsheets.

    Same thing happened to science; “the computer must be right” mindset took over.

    I don’t know how many computer “science” graduates I have dealt with where their coursework never touched on the complete difference between the words “precise” and “accurate.”

    • Writing Observer,
      It is likely that most scientists, who are practicing today, were educated during a time when calculators and computers were used routinely, and they never learned about significant figures. I know that when I was teaching in the ’70s, the early-adopters of hand calculators typically reported answers in excess of 6 significant figures, when the problem I tasked them with only had three significant figures.

      • I was one of the last of my engineering class (late 70’s) to take junior level exams with a slide rule. I couldn’t afford a calculator. I would always note at the top of my test “SRA” for “slide rule accuracy”. It forced on me a consciousness of significant digits that I have never forgotten. I feel sorry for people like Windchaser who just don’t get it. Also annoyed, as their low-information vote cancels mine on election day.

      • This. I’ve been thinking lately that any STEM major should be required to possess three credits on “Use of the Slide Rule.”

        (I still have my big steel rule – one contemplated woodworking project is to mount it in a nice glass front case someday. With a little hammer on a chain attached…)

      • I feel sorry for people like Windchaser who just don’t get it.

        My undergraduate background is in math & computational science, so I find that I generally have an easier time ‘getting it’ if people can put their claims in mathematical terms, with equations.

        Similarly, I have difficulty helping people ‘get it’ if they have trouble with equations. Math isn’t meant to be done in common language; equations have a precision that words do not.

        But there’s so little math and so much bad math logic in the OP that… yeah, I’m never going to “get it”. There’s nothing to get. It’s mathematically incorrect. I mean, look at the way he tries to ascertain the global annual average temperature anomaly from the range at a single point in the desert. You can’t do that unless you know something about the spatial correlations of temperatures, and he doesn’t address that missing premise. The logic is sloppy and the math unsound.

        In contrast, if you pull up the methodology papers from any of the major organizations – NOAA, NASA, BEST – you can actually walk through the math yourself and check it. And I have.

        Why would I trust hand-wavey explanations with bad logic behind them over the actual equations that I’ve checked?

      • Nice, an appeal to authority.
        You must be right because those you view as authorities are using the same faulty logic.

      • Windy,

        Yup, Mann, HadCRU, NOAA, NASA and BEST can lie with fake statistics, but the fr@uds they perpetrate show either that they don’t care or actually don’t know how to conduct valid statistical analysis. Or just do what they know they can get away with, since before Trump no one in government or the media would call them on their conspiracy, except Canadian statisticians whom they denigrated.

      • Windchasers,

        You said, ” I mean, look at the way he tries to ascertain the global annual average temperature anomaly from the range at a single point in the desert.” I’m afraid that your reading comprehension is deficient! What I said was, “That is, the range in global temperatures should be approximately four times the standard deviation (Range ≈ ±4s). For Summer desert temperatures reaching about 130° F and Winter Antarctic temperatures reaching -120° F, that gives Earth an annual range in temperature of at least 250° F;…” I was using both desert temperatures and Antarctic temperatures (not just desert) to estimate the range, not anomalies. Therefore, you are disagreeing with something you clearly don’t comprehend.

        There is an equation there, but I guess it is too simple for you to get involved with it.

      • Thank you for this comment windchasers.
        It saves everyone the hassle of contesting the things you assert point by point.
        You have looked over the maths of the Warmista High Priesthood, and declared it to be Good.

        I want to point out for the benefit of anyone reading this who may be on the fence about what to believe, that people who “adjust” the data points they use in their calculations are no longer engaging in science or in statistical analysis. There are other names for such a practice.
        Let’s look at the example of using a single die: what is the average of 1000 rolls of a die which, prior to being averaged, are “adjusted” to bring the individual rolls into compliance with what each roll ought to have been? Never mind the particulars of the reasons for these adjustments, or whether they are based on a preconceived bias or political agenda. Never mind that… just tell us: what is the average of the rolls when all of the early rolls are adjusted in a single direction, such that all of the rolls form a more or less smoothly ascending curve?
        Sophistry sounds good, it sounds persuasive, although only to the uninformed and the gullible and those who are True Believers.
        Of lies, damn lies, and statistics… who could think that the statistical sophistry of a bunch of charlatans is anything a serious person would want to waste their time considering?

      • Michael darby,
        Mendenhall (1975) disagrees with you. The text offers the approach as a check on the calculation of SD. In fact, in the discussion of Tchebysheff’s Theorem, it says, “Tchebysheff’s Theorem applies to ANY set of measurements,…” Indeed, the theorem predicts that AT LEAST 93.8% of the samples will lie within +/- 4SD of the mean for ANY distribution. Note that I was being VERY conservative in the estimation of the SD because I don’t know what the distribution looks like. Had I used the more conventional R~4s instead of the R~8s my estimate for SD would have been 62 instead of 31! Either way, the point is that the SD by either approach is 4 or 5 orders of magnitude larger than the commonly cited precision of the mean.

        Mendenhall’s Introduction to Probability and Statistics continues to be used as a standard text even today. I’m inclined to believe Mendenhall over you because I don’t know what book(s) you have had published and widely adopted as a standard statistics text. I also don’t recognize the author of the link you provided. I prefer not to appeal to authority, but when a lone “voice in the wilderness” contradicts a widely adopted text, I need a little more than just an ‘anonymous’ complaint (and link) that you disagree.
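The figures being disputed here reduce to simple arithmetic, shown below with the numbers quoted in the thread (the 250° F range, and the Range ≈ 8s versus Range ≈ 4s forms of the range rule):

```python
# Arithmetic behind the range-rule estimates quoted in the thread.
temp_range = 130 - (-120)         # deg F: desert high minus Antarctic low = 250

sd_conservative = temp_range / 8  # Range ~ 8s  ->  s ~ 31
sd_conventional = temp_range / 4  # Range ~ 4s  ->  s ~ 62

# Tchebysheff's (Chebyshev's) theorem: for ANY distribution, at least
# 1 - 1/k^2 of the values lie within k standard deviations of the mean.
k = 4
chebyshev_fraction = 1 - 1 / k ** 2  # 0.9375, the "at least 93.8%" figure

print(sd_conservative, sd_conventional, chebyshev_fraction)
```

Whether the range rule is applicable to latitude-dependent surface temperatures is contested in the reply below; the arithmetic itself is not.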

      • PS Clyde, Chebyshev's theorem applies to probability distributions. You ignored the fact that surface temperatures of the earth are not random; they are highly dependent on (read: a function of) LATITUDE. The temps may vary within a range, but that range is constrained by latitude. So you cannot apply the range rule in this situation.

    • Writing Observer April 12, 2017 at 9:28 am

      Way back when, there was a saying going around that there was nobody more likely to be wrong about reality than a businessman who’d just discovered spreadsheets.

      Same thing happened to science; “the computer must be right” mindset took over.

      I don’t know how many computer “science” graduates I have dealt with where their coursework never touched on the complete difference between the words “precise” and “accurate.”

      Oh Gd! You’ve gone and reminded me! Those days were horrible, but fun.

      I was pulled from the line because I had an education in computers and languages. That was immediately assumed to include spreadsheets and plotters; Lotus 1A and soon after Symphony plus some excursions into IBM’s TSO version of Calc were my bread and butter along with computer setup and repair.

      All using the original IBM PC with metal woven wire chain attached optional auxiliary drive enclosure and 10MB hard drive. (I later connected the PC to the mainframe via dial up line).

      Long before real IT positions existed with titles; part of my assignment was “teaching”.

      So, I started some Lotus, Symphony and BASIC classes at work.
      BASIC died on the vine. Such deer in headlights look everyone had. Especially since the teacher, me, was drilled in the never use ‘goto’ ever school of languages!

      Symphony went fairly well since the Industrial Engineering department was officially working with Symphony. Only worker bees attended that class.

      Lotus 1A was loved by all attendees. Only most of them didn’t understand. A concept that attending managers really didn’t get.
      The safety manager came after me with a huge safety manual. He told me he wanted me to help him put the entire book into a spread sheet. In Lotus 1A? Yeah, good luck.
      Next the Human Resources manager talked to me, about putting everything HR into a spreadsheet.
      Then the maintenance manager.

      I avoided managers for a few weeks.

      These tales go right into the newbie files along with disk-drive coffee holders and newbie serial disk destroyers, e.g. "I accidentally formatted my C drive again, can you restore it for me?".
      I had one co-worker who “lost” floppy discs because he kept putting them into a narrow vent slot an inch above the disk drive. Every now and then, I had to dismantle his IBM XT to get out his special boot discs. I also tried inserting a floppy into that vent; it wasn’t easy.

      Then there was the Finance department where I was actually assigned.
      After I wrote a macro to split annual employee work hours into work weeks, I learned from Finance folks doing input to the mainframe that rounded numbers in a spreadsheet don’t add up properly in a mainframe.
      It was a shame they didn’t mention that before doing all of the input; I could’ve written a formula to eliminate generic roundings. A formula I ended up using for years in other Finance departments.

      While I loved working with spreadsheets, it did not take long before discovering there is no truly random function in either BASIC or spreadsheets. I started using the average of a series of random numbers; still not a random number, but better than the simple random function.

      Then there were the published errors that early math chips incurred when working with double-precision numbers. I was amused: one allocates double precision to increase accuracy to the right of the decimal, yet mathematically the chip introduces errors.
      Yes, I replicated some of PC Magazine's published test matrices to verify chip abilities, then assigned those PCs to database or word processing only (hello dBase and WordStar!).
      Another memory; Few things are as irritating as a Wordstar expert being introduced to current Word Processing programs…
      Should I mention where I arrived hours early, collected every calculator and locked them up? I returned one calculator because the user proved her one calculation only was easier and quicker with the calculator.
      I’d never confiscate slide rules!

      Writing Observer and D.J. Hawkins:
      Any slide rule user was far faster and more accurate than any computer or computer user.

      The computer’s value is for repetitive calculations. GE bought the first commercial computer to run payroll numbers.

      My Father worked as a technician on Eniac; not as one of the main members. Dad was one of the electricians sent in to find and replace vacuum tubes after every run.

      Till he passed, my Father preferred a slide rule to computers. With good reason too! Just prior to my getting reassigned to Finance, HP calculators sold for $350.
      The IBM PC was $5000 without the auxiliary 10MB hard drive and enclosure.

  11. To make matters even murkier: temperature is itself not a very good indicator of the energy of the point being measured. Averaging without consideration of the effects of moisture just makes the analysis of anomalies worse. There is an unstated uncertainty swallowed by the averaging.

    When I do an energy balance around a steam plant, I absolutely do not average temperatures across a device. I will average the temperature readings of one point. I have to keep a running average of the actual instrument, our state equations have a tendency to blow up if I use the instantaneous readings. Thermodynamics is never violated, but latency can cause readings to not quite match reality. The calculations always pass through enthalpy before I get to an answer.

    We aren't dealing with superheated steam in the atmosphere, but that just makes it worse. The most challenging part of the analysis doesn't occur in the superheated region; it occurs in the saturation region, and our atmosphere is almost entirely in the saturation region. Well, except for those vast stretches of country with winter temperatures dropping way below zero…

    • Congratulations. I searched all the comments for the magic word, enthalpy, and your comment was the only one to include it. Temperature is an intensive property and cannot be averaged. Extensive properties like enthalpy can be averaged. A thorough analysis of the records would require including moisture content of the air associated with each temperature measurement so that enthalpies could be averaged.
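      To make the point concrete, here is a small Python sketch using the standard psychrometric approximation for moist-air enthalpy. The two parcels (their temperatures and humidity ratios) are hypothetical, chosen only to illustrate the effect.

```python
def moist_air_enthalpy(t_c, w):
    """Approximate specific enthalpy of moist air, kJ per kg of dry air.
    t_c: dry-bulb temperature (deg C); w: humidity ratio (kg vapor / kg dry air).
    Standard psychrometric approximation: h = 1.006*T + w*(2501 + 1.86*T)."""
    return 1.006 * t_c + w * (2501.0 + 1.86 * t_c)

# Two hypothetical parcels with different temperatures and moisture:
dry_desert  = moist_air_enthalpy(35.0, 0.005)   # hot and dry:   ~48 kJ/kg
humid_coast = moist_air_enthalpy(25.0, 0.018)   # cooler, humid: ~71 kJ/kg

# Averaging temperatures ranks the desert parcel as more energetic;
# the enthalpies say the opposite.
print(dry_desert, humid_coast)
```

      A hotter-but-drier parcel can carry less energy per kilogram of air than a cooler, more humid one, which is exactly why averaging temperatures instead of enthalpies discards information.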

      • Give us a break, Neil. To go over all of the things wrong with the current state of “consensus climate science,” (the total ignorance of enthalpy being just one of them) requires not a comment – not a blog post – but a book – better yet, a whole series of books. (Not marketing for anyone in particular, but the regular contributors here can fill a comfortable two bookshelves with their more extensive writings.)

  12. Loved this! Great explanation of the absurdity in declaring a global temperature at all…let alone with the accuracy/precision they claim to have.

  13. The article touched on the fact that the ground based temperature network only covers a few non-representative areas of the planet. The vast majority of the planet, close to 90% of it, is either totally uncovered, or covered inadequately.

    This will also increase the error bars tremendously.

    (Non-representative because most of the planet is covered by water, and most of the sensors are on land, beyond that, urban areas aren’t representative of land either.)

      • Also note that the Mercator projection used for this map is deceiving. The land area of Africa is, in the word of the Donald, HUGE. Brazil and central S. America as well. The map actually under-emphasizes the lack of coverage. Also check out the UHI effect in the USA, and the Siberian under-reporting of winter temps in USSR days to get more government money for heating oil. Not myths, but well-documented biases (see Wattsup and Heartland).

    • MarkW on April 12, 2017 at 9:36 am

      Non-representative because most of the planet is covered by water, and most of the sensors are on land, beyond that, urban areas aren’t representative of land either.

      This is the same mistake as Richard G. made.

      You are talking about land-only measurements that you guess are used to estimate the rest. That is wrong.

      The world average is built on a mix of land stations and sea surface measurements.

      • When we talk about sea surface measurements, are you talking air or water temperature? As for coverage, how many of the grids have a stable place from which the measurement is made? Argo does not do it; the floats drift and are never in the same place. Surface ships measuring water? Talk about variables: not the same place, not the same depth, not the same ship. How do you get a number that makes any sense from that? Other than satellites, we do not truly have a consistent measurement system for the earth's atmosphere. The variables are enormous, and they do not "average" out.

      • Bindidon
        I simply posted a map from NOAA that speaks for itself about the ‘Land-Only’ data. Was that a mistake?

        Please tell me, does the map show that Africa has uniform coverage? Canada? Brazil? Amazon basin? What is the average of a lot of nothing? How about Nothing plus Something? A Global Land-Only temperature average is ridiculous at best.

      • For reasons I covered elsewhere, trying to combine air temperature measurements with water temp measurements is an exercise in dishonesty.

      • MarkW on April 13, 2017 at 7:16 am

        For reasons I covered elsewhere, trying to combine air temperature measurements with water temp measurements is an exercise in dishonesty.

        Why don’t you give us a résumé of your “covering” ? So we could manage to see what probably is wrong in it.

  14. Well done – this article goes to the heart of why I not only don’t believe the Alarmists, I don’t trust them – improper data claims. I learned this stuff as a chemistry student in high school and again as a first year chemistry student at university. The fact NASA and other supposedly scientific operations are willing to claim accuracy and precision that is laughable is damning.

    • I took every undergraduate-level chemistry class offered, and I agree with your sentiment 100%
      How many nights in chem labs did we toil for hour after tedious hour, to get a single result and report it to the appropriate confidence interval?
      The statistical flim-flammery used by warmistas would have gotten them a failing grade in every science class I ever took.

  15. Ok, but what about the more important tiny data set argument?

    http://www.maddogslair.com/blog/ok-but-what-about-the-more-important-tiny-data-set-argument

    This is a good, if "inside baseball", argument, but the killer argument is that the data set is too small to be of use in determining the earth's climate over time. This data set only goes back to about 1880, precisely the time the earth was exiting the Little Ice Age. A good comparison would be to take the average temperature of Detroit, but begin the data set at the end of September and only take data through the end of February. The data might be skewed a bit cold, no? Of course it would, and only a charlatan or fool would take this tactic, but this is the tactic the global alarmist cabal has taken. They truncate the evidence to sometime around 1880, so the data shows a sharp climb. But this is expected, since the Little Ice Age was the longest, and one of the three coldest, temperature depressions during the Holocene. It is the Holocene Winter.
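    The Detroit analogy is easy to demonstrate numerically. Here is a sketch using a purely hypothetical sinusoidal annual cycle (mean 10 C, amplitude 15 C, coldest in mid-January); the numbers are invented, only the shape matters.

```python
import math

# Hypothetical annual cycle: mean 10 C, amplitude 15 C, coldest around day 15.
def daily_temp(day):
    return 10.0 - 15.0 * math.cos(2 * math.pi * (day - 15) / 365)

full_year = [daily_temp(d) for d in range(365)]
# A "Detroit winter" record: roughly 1 September (day 244) to end of February (day 59).
winter_only = [daily_temp(d) for d in list(range(244, 365)) + list(range(60))]

avg = lambda xs: sum(xs) / len(xs)
print(round(avg(full_year), 1), round(avg(winter_only), 1))
```

    Averaging only the September-through-February window yields a mean several degrees below the true annual mean; start a record at the cold end and any recovery looks like a sharp climb.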

    The average temperature over the totality of the Holocene is higher than it is today. The Earth is essentially still trying to climb back to average from this Holocene Winter.

    http://www.maddogslair.com/blog/the-holocene-and-how-it-undermines-climate-fear-mongers-arguments

    Mark Sherman

    • And the average of the Eemian, the previous interglacial, was even higher, if not indeed warmer than now for practically its entire 16,000 years.

  16. I wonder. Why has there been no attention paid to the fact that we are in an interglacial period, during which it is expected that the climate will continue warming until it doesn’t; then back into another ice age.

  18. Mr. Spencer thanks for your article. I’ve looked at information on meteorological thermometers and they typically say they are accurate to +/- 0.2 degree F. It is unreasonable to then say that the globe has warmed by 0.01 degrees if these thermometers are the ones being used. That means the thermometers would have to be accurate to +/- 0.005 degree F. Additionally, there can be reading and other errors.

  19. Per the graphic in Richard G.'s comment, it is perfectly obvious that you can be as accurate and precise as you want, but until you are measuring the entire globe instead of pieces of it, accuracy/precision is a moot point.

    With no recording instruments in the arctic/antarctic regions or out in the vast oceans or in the jungles of Brazil and Africa or Siberia or the Sahara (and on and on), all you can really do is just guess. Maybe guess to the hundredth of a degree.. but that is about it. Even Satellite temps are indirect readings.

    Excellent article on explaining accuracy and precision, but garbage in/ garbage out and it just doesn’t matter.

  20. Good article and good reasons for everybody to go back to school to refresh the basics. References are too old for this millennium, but ask (I have) a teacher of statistics how many using it could do it (or understand it) without punching a button on a computer. Compared to what some are averaging, T is accurate, oops, precise.

    • HDH,
      I was relying on references that were readily available in my library — my old college texts. However, some facts don’t change with time. Unfortunately, new technology (i.e. computers) sometimes requires that some subjects be skipped to make time for new things. Therein lies the problem! The students don’t learn the fundamentals, and they don’t know what they don’t know.

      • When I went to college, EE’s had to take two classes in antenna design. I had chosen to concentrate on the digital side of things, and the odds of my ever having to design an antenna were low enough that I was willing to take the risk. 40 years later, I still haven’t designed an antenna and couldn’t even if I wanted to.
        Perhaps the way to make more time is to put more thought into what truly is required and what should be optional.

      • The students don’t learn the fundamentals,…

        When I went to Stevens it was notable for the very late declaration of your major if you were on the engineering track. I could be an EE, CE, ME, or ChE by declaring or shifting as late as second semester junior year. The admin and professors explained: “We have no idea what you might be doing 30 or 40 years into your career and what kind of technologies you might be dealing with. We want you to think of yourselves as engineers. Just engineers. With a little more training in one branch than the others, but above all, engineers, and capable of tackling any problem that might reasonably be assigned to the engineering sphere.” It’s been a very long time since I’ve been a chemical engineer, but I’m still an engineer.

      • D. J. Hawkins April 12, 2017 at 11:06 am

        Stevens?

        Stevens Hoboken Institute of Technology?

        The place up the street from Michelob importers?

        Yes, I know; Stevens Institute of Technology. I had a close friend, Barry Lange, who attended Stevens, and he used the Stevens Hoboken description.
        We visited him a few times. Great place!

      • MarkW if you ever had to design digital circuit layouts for very high speed electronics that antenna design course would have been very helpful. 1ns rise and fall times make for many noisy antennae on the board if you don’t pay attention.

      • I switched over to the dark side, programming back when clock frequencies were still about 6 MHz.
        Even then, when we laid out circuits, we had to be careful to make sure that no two lines ran for long lengths parallel to each other, to prevent what we at the time called crosstalk.

    • “The students don’t learn the fundamentals, and they don’t know what they don’t know.”
      I agree completely.
      I always thought that if one knew the fundamentals one could solve practically any problem. Of course there are exceptions, but today specialization has left many graduates without these skills. I find Mechanical Engineers that can do a Finite Element Analysis on a structure, but make errors on boundary conditions or properly interpreting the results since they were never taught the fundamentals.

  21. “In general, modern temperature-measuring devices are required to be able to provide a temperature accurate to about ±1.0° F (0.56° C) at its reference temperature, and not be in error by more than ±2.0° F (1.1° C) over their operational range.”
    ______________

    Do you have a reference for this please?

  22. I have read claims that "global temperatures are increasing", whereas what I thought was actually being claimed was that mean temperatures are increasing, which is different. After all, if night-times are less cold than before, then this will result in an increase in the mean temperature. Or, by way of example: of the two temperature ranges 10-20 and 12-19, the former has the higher maximum temperature and the latter the higher mean.

    Further, I have serious doubts that temps in some places can be properly area-weighted… take 'my place' for example, New Zealand, in which I hazard that homogeneity of temperature cannot be guaranteed for more than a few kilometres from the point of measurement. I thought of one way you could derive an area-weighted mean (for any point in time), and it came out as

    (T1*A1 + T2*A2 + … + Tn*An) / (A1 + A2 + … + An)

    where T1 etc. were the temperatures at the n measuring stations, and A1, A2 etc. were the areas "represented by" each station. It doesn't work in NZ, because we simply don't know the areas in question (without a lot of pre-judgements and arbitrary decisions).
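    The formula itself is trivial to code; the trouble, as the commenter says, is that the areas are guesses. A sketch with made-up station numbers:

```python
def area_weighted_mean(temps, areas):
    """Weighted mean: (T1*A1 + ... + Tn*An) / (A1 + ... + An)."""
    assert len(temps) == len(areas)
    return sum(t * a for t, a in zip(temps, areas)) / sum(areas)

# Hypothetical stations: the same temperatures under different guessed areas
# produce different "national" means.
temps = [14.2, 9.8, 21.5]
print(area_weighted_mean(temps, [1.0, 1.0, 1.0]))  # equal weights = simple mean
print(area_weighted_mean(temps, [5.0, 1.0, 0.5]))  # another, equally arbitrary, guess
```

    Every choice of areas is defensible and every choice yields a different mean, which is the arbitrariness being objected to.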

    As for: mean = (min + max) / 2

    Forget it. This is risible, because it assumes an even progression from one extreme to the other, an assumption that is not warranted if all you know is the Min and the Max.
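    A quick sketch of why the midrange is not the mean, using a hypothetical but plausibly skewed set of 24 hourly readings (short cold minimum at night, long warm afternoon plateau):

```python
# Hypothetical, asymmetric 24-hour temperature profile (deg C).
hourly = [5, 4, 4, 3, 3, 3, 4, 6, 9, 12, 14, 16,
          17, 18, 18, 17, 15, 13, 11, 9, 8, 7, 6, 5]

true_mean = sum(hourly) / len(hourly)
midrange = (min(hourly) + max(hourly)) / 2
print(true_mean, midrange)  # they differ by about a degree for this profile
```

    The two agree only for a symmetric diurnal curve; any asymmetry, and (min + max)/2 is simply a different statistic from the mean.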

    • Rex,
      Perhaps you didn’t see my previous article on examining the BEST data set: https://wattsupwiththat.com/2015/08/11/an-analysis-of-best-data-for-the-question-is-earth-warming-or-cooling/
      It is clear that the major early warming,as captured in the mean,was a result of the low temperatures increasing more rapidly than the high temperatures. I suspect that there are different drivers for the changes in the low and high temperatures, and only reporting an average of the two hides that.

      What doesn’t get addressed in stating an “average” surface temperature is the elevation that the temperature is appropriate for. Does it make sense to say that the average temperature is known to the hundredths or thousandths of a degree if the elevation isn’t specified? That is another reason I believe that the standard deviation has to be taken as the uncertainty.

  23. These 5-minute averages are rounded to the nearest degree Fahrenheit, converted to the nearest 0.1 degree Celsius

    Preposterous. This procedure alone introduces an almost two-thirds-of-a-degree centigrade wiggle into the data stream.
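    The wiggle is easy to reproduce. A sketch of the archiving procedure described in the quote (the function name is mine):

```python
def archive_value(t_f):
    """Sketch of the quoted procedure: round to the nearest whole degree F,
    then convert and round to the nearest 0.1 C."""
    t_f_rounded = round(t_f)
    t_c = (t_f_rounded - 32) * 5.0 / 9.0
    return round(t_c, 1)

# The whole-degree-F step is 5/9 ~ 0.56 C, so a 0.2 F change in the true
# reading can jump the archived value by 0.6 C:
print(archive_value(59.4), archive_value(59.6))  # 15.0 vs 15.6
```

    Two readings 0.2 F apart land 0.6 C apart in the archive, which is the quantization wiggle being complained about.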

      • Rounding is interesting. There are systems where the 5 is rounded up or down depending on the preceding digit. From memory under those systems 23.5 would be rounded to 24 and 22.5 would be rounded to 22.
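        The scheme being remembered is "round half to even" (banker's rounding), which avoids the bias of always pushing a trailing 5 upward. Python's built-in round() happens to use exactly this rule, so the memory checks out:

```python
# Round-half-to-even: a trailing 5 goes to the nearest EVEN digit, so the
# rounding errors don't all accumulate in one direction. Python's built-in
# round() implements it:
print(round(23.5))  # 24 (rounds up to the even neighbor)
print(round(22.5))  # 22 (rounds down to the even neighbor)
```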

      • Latitude,
        I don't agree with you. However, the point is that averages rounded to 1 deg F should be converted to the equivalent C temp with 1 significant figure.

  24. Re accuracy and precision: As far as trends go, it can be said that for periods exceeding 15 years the upper 95% error margins of the LTS satellite data trends all fall within the ‘best estimate’ range of the surface data.

    The opposite is not the case. That is, the lower 95% error margins of the surface data all still point to warming, making the warming ‘statistically significant’.

    Folks who question the reliability of global temperature data should be aware that by far the biggest uncertainty levels are associated with the satellite data, not the surface data.

    • DWR54,
      I’m not arguing that the temperatures aren’t increasing. I’m arguing that the claims for the magnitude and precision of the increases are not supportable.

    • “DWR54 April 12, 2017 at 10:45 am
      Re accuracy and precision: As far as trends go, it can be said that for periods exceeding 15 years the upper 95% error margins of the LTS satellite data trends all fall within the ‘best estimate’ range of the surface data.

      The opposite is not the case. That is, the lower 95% error margins of the surface data all still point to warming, making the warming ‘statistically significant’.

      Folks who question the reliability of global temperature data should be aware that by far the biggest uncertainty levels are associated with the satellite data, not the surface data.”

      After a truly excellent article, and many comments dissecting the global average temperature's horrifying methods, terrible accuracy, bad precision and laughable error rates?

      That makes it obvious that you did not read most of the comments.
      Why did you pass so many over? Was it the logic? Perhaps it is the numbers?

      Then you question the satellite data?

      Truly absurd and quite sad!

  25. Classic misdirection. Alarmists claim “hottest year on record” and go crazy over the fact. Skeptics claim that the alarmist claim is stupid (which it is) because of the lack of precision of the determination in the first place. So skeptics “prove” the point. Who wins?

    Its the alarmists who win. Because while the skeptics are bending over backwards to prove that the alarmists are “wrong”, the majority of the public can clearly see that 2016 is darn near the hottest on record (even if not precisely) and certainly since 2000 has been much hotter than in the 1950s regardless of the lack of precision. In the meantime – nothing is being mentioned about the significant issue that there is no reliable attribution that the general increase in temperatures (precise or not) is related to human-caused CO2 emissions.

        Truth is mighty and will prevail. There is nothing the matter with this, except that it ain't so.
        —Mark Twain, Notebooks, 1898

  26. Good post. You describe the proper treatment of accuracy and precision error where there is data.
    The bigger uncertainty problem is that there are large swaths of land and ocean where there is no data at all prior to satellites commencing in Dec 1978. So the global surface anomaly is largely an invented construct, not fit for purpose. And as shown by a simple comparison of previous GAST 'official' estimates, both NOAA and NASA have significantly cooled the past and sometimes warmed the present.

    • Thank you Rud,
      I could have gone into the issues about sampling protocol, but at over 2500 words I was already concerned about people complaining about falling asleep while reading it. I was just taking umbrage at NASA and NOAA reporting anomalies with two, three, and even four significant figures beyond what the instrumentation reports. More egregious is that even if precision were to increase with sampling size, then it implies that there is a minimum number of samples that have to be taken before the subtraction step won’t truncate anomalies. That is, I couldn’t report today’s temperatures as an anomaly because there hadn’t been enough additional precision accumulated. The bottom line is that I don’t think that those analyzing the data have carefully examined their assumptions and justified their methodology.

      • My question with all this has been “How do you come up with the significant digits to use?”

        If you average a million samples measured to the tenths of a degree and get a “calculator figure” of 10.643, your mean is still given in tenths: 10.6 +/- 0.05. If you use those million samples to improve on the accuracy of the mean, and get an uncertainty of +/- 0.0005, does the mean become 10.600 +/- 0.0005, or do you get to use the “calculator figure” of 10.643 +/- 0.0005?
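        A sketch of when averaging can and cannot add resolution to quantized readings (synthetic data; the noise levels are hypothetical). Averaging recovers digits beyond the 0.1 step only when natural variation "dithers" the readings across several steps, an assumption the article argues is unjustified for temperature records:

```python
import random

random.seed(42)
true_value = 10.643  # hypothetical fixed quantity being measured
n = 100_000

# Noise spanning several 0.1-degree quantization steps ("dither"):
# the average of the rounded readings CAN recover finer digits.
noisy = [round(true_value + random.gauss(0, 0.3), 1) for _ in range(n)]
print(sum(noisy) / n)  # close to 10.643

# Noise far smaller than the 0.1 step: every reading rounds identically,
# and no amount of averaging adds resolution.
quiet = [round(true_value + random.gauss(0, 0.0005), 1) for _ in range(n)]
print(sum(quiet) / n)  # stuck at 10.6
```

        So whether the "calculator figure" of 10.643 is meaningful depends entirely on whether the dithering assumption holds; the instrument's resolution alone does not settle it.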

  27. Understand frustration that the public debate often seems misdirected, misinformed, and pointless. But it is not. Challenging and staying in the fight has brought us to a healthy inflection point. Far from being settled the science will now be debated without the heavy hand of government on the scales.

    It ain’t over. It has just started in earnest.

    • Forrest,
      I suspect that you are right that they don’t know what they don’t know. That was one of the motivations for me to write the article. Most of the practicing scientists are young enough to have been my former students, or their children.

  28. I haven't heard anyone cover it "properly". Probably my biggest concern about the way the temperature data is handled is the way they try to stitch the record into continuous records. The problem is that this gives an illusion of precision where there is none. It is all done with good intentions, or at least, I think that's why they started it. The break-point alignment hides the very real station-move uncertainty without ever actually dealing with it. Creation of virtual stations based on these methods creates additional "certainty", ironically, by introducing something that doesn't even exist.

    And stitching together the record the way they do does something rather interesting. Note, these steps might not all be in the right order but their impact is the same in the end.

    STEP 1: Normal maintenance (and most of this already happened long ago), the stations are moved, usually because of encroachment of urban influences, which of course leads to pronounced Urban Heat Island impacts. Sometimes this happens more than once in a region.

    STEP 2: Processing, in the attempt to make the now broken record continuous, they perform break-point alignment. The assumption is that the data is accurate, but it is skewed. Because it is normally to adjust for urbanization, the past temperature alignment results in cooling of the past, bringing the hot UHI end of the record up to the cooler UHI free temperature. Often urbanization begins anew tainting the record more.

    STEP 3: They now officially adjust for UHI all at once. They pat themselves on the back because they have good correlation with the raw data. But in reality the only thing the UHI adjustment has done is remove most (but not all) of the accumulating error from break point alignment. The UHI is still there, hidden by overly complicated processes.

    The reality is that the urbanization history is too difficult to factor in. The closest thing we could do to reality is calculate daily temperatures from whatever thermometers are available and perform the same spatial processing to account for all the holes. When we were done, we'd have a much less precise "product" with a known warming bias. And we likely could not say with ANY certainty whether it was warmer than the last warming period ending in the mid 1940s.
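    Steps 1-3 above can be caricatured in a few lines of Python. This is a toy simulation with made-up numbers; real homogenization algorithms are far more elaborate, but the sign of the effect is the point:

```python
# Toy simulation of the break-point effect described above (all numbers hypothetical).
flat_climate = [10.0] * 40  # no real trend for 40 "years"

# Years 0-19: an urbanizing site, UHI adds 0.05 C/yr. Year 20: station moved
# to a clean location, so readings drop back to the true climate.
raw = [t + 0.05 * yr if yr < 20 else t for yr, t in enumerate(flat_climate)]

# Break-point alignment: cool the whole pre-move segment so it joins the
# post-move segment smoothly.
jump = raw[19] - raw[20]
adjusted = [t - jump if yr < 20 else t for yr, t in enumerate(raw)]

# A flat climate now shows ~1 C of apparent warming.
print(adjusted[0], adjusted[-1])
```

    One station move plus one break-point adjustment turns a flat climate into roughly a degree of spurious warming, with the early record "cooled" exactly as described in step 2.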

    • It’s worse than that poitsplace. Modern data fiddling techniques not only do adjustments at actual station location changes. They also look at the historical record and determine where observations don’t match their theory. They then introduce artificial break points and adjust the observations around them.

      Then the data fiddlers announce that what they have done is world’s best practice. I call it arse up backwards but then I never did get with the program.

      • They also assume that urban stations are better quality because there are fewer gaps in the urban record. So they adjust rural stations to better match the “good” stations.

      • Then, because unadjusted SSTs don’t match the phony land “data”, they boost the sea surface “data”.

        The whole book-cooking, criminal enterprise is corrupt and corrupted.

      • Don’t get me started on the idiocy of trying to combine air temperature readings with the so called Sea Surface Temperature measurements.

        First off, that’s a real apples and oranges comparison.
        Beyond that, the very definition of "sea surface" has changed over time, as it was first measured with canvas and then metal buckets. Later it was measured from sea-water intakes at a completely unrecorded and constantly changing depth.

    • Well, Forrest already mentioned the data diddling and fiddling and, well you get the idea.

      Then there is this:
      “What’s in that MMTS Beehive Anyway?”

      And they want us to trust false pretense of reliable numbers!?
      No calibration after install.
      No validation after install.
      Zero regular checks for temperature error.
      No certification for accuracy.

      Professional measurement equipment is calibrated and certified regularly.
      Gas pumps are calibrated and certified regularly.
      Weight scales in grocery stores right up to truck weigh stations are calibrated and certified annually.
      Elevators are inspected and certified annually.
      Escalators are inspected and certified annually.
      Ad infinitum

      Yet none of these devices are elevated to global attention and advocacy!

      Why is anyone listening to the freaks in the alarmist government cells!?
      Let’s put them in some other cells and see how long before they turn state’s witness.

  29. Reblogged this on Climate Collections and commented:
    Summary

    In summary, there are numerous data handling practices, which climatologists generally ignore, that seriously compromise the veracity of the claims of record average-temperatures, and are reflective of poor science. The statistical significance of temperature differences with 3 or even 2 significant figures to the right of the decimal point is highly questionable. One is not justified in using the approach of calculating the Standard Error of the Mean to improve precision, by removing random errors, because there is no fixed, single value that random errors cluster about. The global average is a hypothetical construct that doesn’t exist in Nature. Instead, temperatures are changing, creating variable, systematic-like errors. Real scientists are concerned about the magnitude and origin of the inevitable errors in their measurements.

  30. Perhaps someone is familiar enough with the datasets to answer a couple of questions I have had about the average of global surface temperatures. One of the first things I do with some new data is look at the raw data, before any manipulation occurs and get a sense for how it is distributed.

    1. If we wanted the average of global temperatures at a specific time, presumably half of the globe would be in darkness, the other half in daytime. It seems that if one wants to get at something thermodynamic (which we know an average temperature is not) we should at least try to get simultaneous measurements of surface temperature, which would at least be representative of the globe at a particular time. It seems taking readings at a particular local daytime over the globe, and even extrapolating them to some standard local time, is designed to maximize the average of the numbers. Subtle changes in this process over time could further insert trends in anomalies simply due to the averaging changes. Perhaps the local differences max-min should be tracked instead and the midpoint or some time average used to be consistent.

    2. I have often wondered what the distribution of global temperature readings looks like. By this I mean simply rank-ordering them and plotting the cumulative distribution, or something fancier like a q-q plot. Using anomalies would not work, since they are (I think) constructed from averages themselves and are not raw temperatures. The issue of the time of measurement, and of how that time is chosen or extrapolated, would arise again, but if the data were available, the possible effects of such differences on the distribution could be examined. It would be interesting to compare the distribution of measurements at the same universal time, at the same local time, and as constructed from the min-maxes. Looking at that distribution, it would also be interesting to examine the difference between a simple average and a median, which is more robust to changes in extreme values, or the time behavior of the quantiles, which can say something a little deeper about the underlying distribution. Such quantities could be evaluated consistently over time without having to change any averaging techniques. If the distribution turns out to be bimodal or multimodal, that would be of interest as well, and would suggest some things about possible biases in computed averages.

    Does anyone know if this has been done somewhere?
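    For what it’s worth, the mean-versus-median comparison suggested above is easy to sketch with synthetic numbers. The station values below are invented, purely to illustrate how a long cold tail separates the two statistics:

```python
import random
import statistics

random.seed(42)

# Hypothetical station readings (deg C): mostly temperate values, plus a
# minority of much colder high-latitude readings forming a long cold tail.
# These numbers are illustrative only, not real data.
readings = [random.gauss(15, 8) for _ in range(900)] + \
           [random.gauss(-25, 10) for _ in range(100)]

mean = statistics.fmean(readings)
median = statistics.median(readings)
q1, q2, q3 = statistics.quantiles(readings, n=4)  # quartiles

print(f"mean      = {mean:.1f} C")
print(f"median    = {median:.1f} C")  # pulled less toward the cold tail
print(f"quartiles = {q1:.1f}, {q2:.1f}, {q3:.1f}")
```

    With a skewed distribution like this, the median sits noticeably above the mean, which is exactly the kind of diagnostic the comment proposes.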

  31. In a perverse way I delight in graphs such as the one at the top of the article. I go all misty-eyed when I remember the former sharp peak in temperatures in 1998. Now look at it. You’d never know from the graph that there was anything but a monotonic increase taking place (apart from the figures bouncing about all over the place).

    Poor 1998. Crushed under a tidal wave of data fiddling.

  32. Clyde, an excellent essay on the problem with using the temperature record to draw conclusions on the ranking of “warmest years”. I’ve remarked on other threads that, if our purpose is to detect warming or cooling over time, we would be better off with a dozen or two high-quality thermometer sites, each with two or three closely placed thermometers in pristine, clear locations away from volcanoes etc. Collect data and be patient. Had we set out on this project in the 1970s, when concerns were broadly expressed that we were headed for an ice age, we’d be over 40 years into the plan.

    To improve precision, we could have located half the thermometers north of 70 Lat where we’ve learned that a three times amplification of temperatures occurs in a warming world.

    I think that our best bet now is to go with satellites designed for purpose. It doesn’t matter what the global avg temperature really is if we are looking for an early warning set up. Moreover, given your issues re precision, the unadjusted records for decently located sites with more than 75yrs records would serve to do the job. An analogy concerning this issue is that if sea level is going to rise a couple or more meters in a century as worriers believe, it makes no sense to be measuring it with a micrometer – a foot rule will do.

    • Gary,
      I think that rather than fewer, we need more thermometers. From what I can tell, the Earth isn’t warming uniformly. Any year there are high and low anomalies. The only way we can be sure that we are capturing the trend is to have a proper sampling protocol. Trying to use airport thermometers, which were designed to let a pilot know if he was going to be able to become airborne or if he was going to encounter icing, doesn’t cut it for climatology.

      • Absolutely. A million standard thermometers at the same height AGL, or one for every ~195 square miles of Earth’s surface. Those moored in the ocean might be expensive hazards to navigation, but invaluable in recording the regions of the planet for which the gatekeeping book-cookers make up their most imaginative flights of fancy.

      • I’d prefer to have one for every square mile. Even that is probably too few to get decent spatial accuracy.

      • Mark,

        Of course more are better, with continuous readings, but one per sq mi IMO might pose a hazard to navigation at sea, or at least risk the destruction of valuable scientific apparatus by merchant vessels.

        One per sq mi would mean almost 200 million stations.

      • To get a true reading of the atmosphere, the sensor network needs to extend vertically as well as horizontally.

  33. A very good article Clyde. I am pleased that there are so many who remember these concepts.

    Then there is your comment that “The global average is a hypothetical construct that doesn’t exist in Nature”. Not only is it a hypothetical construct, but the calculations involved throw away so much information in the quest for a single representative number.

    Heck, even the NOAA globe diagram Richard G helpfully shows is more informative. At least that diagram might cause some people to ask what is going on with the dark red patches in the USA and the patches of blue at various locations. Global warming? Yeah right!

  34. What evidence do you have Clyde, when you say, “The global average is a hypothetical construct that doesn’t exist in Nature.” Is this an axiom you take on faith?

    • To come up with an accurate “average temperature” you would have to measure the energy contained in every molecule of the atmosphere at the same instant in time.
      This can’t be done. The best you can do is take samples distributed in space and as close to the same time as you can manage.
      In reality, you aren’t measuring the “temperature” of the atmosphere; instead you are measuring the temperature of discrete points and making the assumption that the temperature of the points not being measured is close enough to the points that are being measured that the difference won’t matter.

      This is another reason why the claims of 0.01 or even 0.001 C accuracy are absurd. The differences between your sensor and any spot within 100 feet, much less 100 miles is going to be orders of magnitude greater than that. There is no way to plot what those differences are. So you have to include a reasonable variance to account for what you can’t know.

      • MarkW, you are confusing the measurement of a single temperature with the estimation of a value of a population (i.e. sampling)

      • MarkW on April 12, 2017 at 12:16 pm

        In reality, you aren’t measuring the “temperature” of the atmosphere; instead you are measuring the temperature of discrete points and making the assumption that the temperature of the points not being measured is close enough to the points that are being measured that the difference won’t matter.

        Sounds correct!

        But my (very limited) experience in processing temperature time series has taught me that averages based on far fewer points than I thought necessary come much closer to the average of all available points than I had ever imagined.

        Let us please consider UAH’s satellite temperature measurement record, which is available as a 2.5° grid over the planet (three southernmost and northernmost latitude zones excluded).

        If instead of averaging all 66 x 144 = 9,504 cells you select only 512 evenly distributed ones, you obtain a temperature series which here and there differs from the full average, sometimes quite heavily.

        But if you now build 60-month running means over the two time series (which in fact are of far greater interest to us than single monthly anomalies), you see that they fit each other remarkably well:

        Conversely, there is little hope that, in moving to e.g. a 1° grid, you will obtain running means for the global average any more accurate than those obtained with the 9,504 cells.

        The differences will rather show up in small latitude bands or regional zones.
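        The subsampling observation above is easy to reproduce with a synthetic grid. The field below is invented noise plus a small common trend, not UAH data; only the 66 x 144 grid shape mirrors the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly anomaly field on a 66 x 144 grid (as in the UAH
# example): a slow common signal plus large cell-level noise.
# All values are invented for illustration.
months = 480
trend = 0.01 * np.arange(months)                 # common slow signal
field = trend[:, None, None] + rng.normal(0, 2, (months, 66, 144))

full = field.mean(axis=(1, 2))                   # all 9,504 cells
sub = field[:, ::4, ::6].mean(axis=(1, 2))       # a few hundred cells

def running_mean(x, w=60):
    return np.convolve(x, np.ones(w) / w, mode="valid")

# Monthly series differ noticeably; 60-month running means agree closely.
print("max monthly difference :", np.abs(full - sub).max())
print("max smoothed difference:",
      np.abs(running_mean(full) - running_mean(sub)).max())
```

        The smoothing averages away most of the sampling noise, so the sparse-grid running mean tracks the full-grid one far more tightly than the monthly series do.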

    • Michael,
      If I were to ask you to determine some intrinsic property of a steel ball, such as its conductivity or coefficient of elasticity, would it make any difference where or when you measured the property? On the other hand, temperature varies with time, location, and elevation. It is, at best, something that is used to estimate the amount of heat energy, although that is rarely acknowledged.

      • Clyde, you correctly identified the global average as a “hypothetical construct.” You have failed to prove it does not exist. The number “7” is a hypothetical construct also, but you can neither prove it exists, nor can you prove it does not.

    • Michael,
      Our system of counting and mathematical manipulation requires that the number we have chosen to call “7” exist. It is fundamental to mathematics, as are all numbers. On the other hand, if I speculate that there is some approximation to the “speed of dark,” that number doesn’t necessarily exist, except in my imagination. Implicit in accepting the idea that there is an average global temperature for a specified period of time is the assumption that there is a standard elevation for that temperature, and that it can be measured to infinite precision if we make an infinite number of measurements. There is no accepted standard elevation for temperature measurements (except with respect to the ground), and we can’t make an infinite number of measurements. What we are left with is an approximation to a hypothetical value. The question becomes, “To what purpose do we calculate said average, and what precision can we justify?” I maintain that we can know the defined average global temperature only very imprecisely, and with unknown accuracy.

    • Clyde, you blew it…. “7” does not exist. Here’s a simple test: point it out to me. You can’t. You can’t point to “7.” The symbol on a piece of paper or on your screen is not “7”; it’s a representation of the construct. You can line up a bunch of things and then attempt to associate a verbal sound (like “one” or “too” or “twee”) with collections of the things, but again, the verbal sound is a representation of a construct.
      The point I was making is that “hypothetical constructs” do not exist; they are figments of our minds. The procedure for measuring the “average global temperature” is just that: a set of tasks one completes to arrive at some result that, BY DEFINITION OF THE PROCEDURE, is the average global temperature.
      Now, as to your question of “purpose,” it’s pretty simple. Once you have the procedure defined, you repeat said procedure, and lo and behold, you discover that as time goes on the measurement you get is slowly rising.

  35. Both imprecision and inaccuracy are sources of error in measurements, but it is nigh impossible to determine from error analysis of a set of measurements (‘deconvolution’) which problem contributes how much to the total error. The situation is similar when one tries to consider the relative contributions of various factors to allegedly observed changes in global temperature.
    The temperature problem is far more complicated because it requires ‘meta-analysis’ – the aggregation of data from various sources which are not homogeneous. The use of a non-comprehensive set of data aggregated from different types of measurements performed by different methods and protocols under different and often non-compatible circumstances totally fails the requirement for data homogeneity.
    For example, the variances from each individual instrumental record, properly handled, should be added to obtain an overall variance for the aggregate.
    Any attempt to synthesize a common result from a combination of disparate data sets such as urban, rural, marine, aerial, and satellite data sets becomes simply an exercise in arithmetic, and any attempt to assign significance to the composite is a self-deluding fantasy.
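    A minimal sketch of the variance bookkeeping described above. The three record types and their 1-sigma uncertainties are hypothetical, chosen only to show how the variances combine:

```python
import math

# Hypothetical 1-sigma uncertainties (deg C) for three kinds of records;
# the numbers are illustrative, not from any published dataset.
sigmas = {"land station": 0.5, "ship bucket": 1.0, "buoy": 0.3}

# For an unweighted mean of n *independent* readings the variances add,
# and the sum is divided by n^2:
n = len(sigmas)
var_of_mean = sum(s ** 2 for s in sigmas.values()) / n ** 2
print(f"1-sigma of the 3-record mean = {math.sqrt(var_of_mean):.2f} C")

# If the errors are correlated (systematic), this reduction does not
# occur: in the fully correlated limit the sigmas simply average.
```

    The point of the thread stands: the familiar shrinking of uncertainty in an average is an independence assumption, not a free lunch, and aggregating heterogeneous records violates it.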

  36. Clyde. Thanks for posting this. It seems well done and nothing is obviously wrong. I have decided that I’m lousy at statistics and that an awful lot of folks are even worse. Moreover, I’m not sure I care how many standard errors can dance on the head of a pin, even if I truly understood how to do the math properly. So I’ll forego commenting on the comments.

    But I would point out that global surface temperature, as we currently define it, looks to be a truly unfortunate metric no matter what the precision/accuracy. It is very sensitive to ENSO excursions of warm water into the Eastern Pacific and also to poorly known past sea surface temperatures. IMHO, “Climate Science” really should consider replacing the current metric with something/anything that meaningfully tracks long-term warming/cooling of the planet.

  37. Don K,
    I don’t claim to be good at statistics either. I had to go back to my text books and review material I had studied decades ago. Basically, I’m claiming that the Standard Deviation, and not the Standard Error of the Mean, is the appropriate metric for the uncertainty in global temperature measurements, and the anomalies derived from them.

    • “Basically, I’m claiming that the Standard Deviation, and not the Standard Error of the Mean, is the appropriate metric for the uncertainty in global temperature measurements”

      Yeah … maybe. As I understand it (and I’m probably wrong), Standard Deviation is a measure of the dispersion of the data whereas Standard Error is the dispersion in the estimate of the arithmetic mean of the observations — How likely is it that a given observation is valid vs how likely would it be for an independent set of observations of the same system over the same timespan to yield the same result?
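      The distinction can be sketched numerically with synthetic readings (invented values). The SD describes the spread of the data themselves, while the SEM describes how tightly the mean of this particular collection is pinned down, a gain that is only meaningful if the readings are independent measurements of the same quantity:

```python
import random
import statistics

random.seed(1)

# 1,000 simulated readings spread over different places/times,
# with a dispersion (SD) of about 10 -- purely illustrative numbers.
data = [random.gauss(14, 10) for _ in range(1000)]

sd = statistics.stdev(data)          # dispersion of the data itself
sem = sd / len(data) ** 0.5          # uncertainty of the *mean* estimate

print(f"SD  = {sd:.2f}")   # stays near 10 no matter how many readings
print(f"SEM = {sem:.2f}")  # shrinks as 1/sqrt(N)
```

      Collecting more readings never shrinks the SD; it only shrinks the SEM, which is why quoting the SEM as the uncertainty of a global temperature presumes the readings are all samples of one fixed quantity.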

  38. All of these issues are why my interest is in the explicit, as-calculable-as-pi, physical audit trail between all the parameters we measure. As I most recently put it at http://cosy.com/#PlanetaryPhysics , I’m

    Seeking an executable understanding of the differential in a voxel because mapping
    it over a sphere is rather trivial in an APL like CoSy

    I only got thru the implementation after a handful of APL expressions computing the mean temperature of a gray sphere surrounded by a sphere with an arbitrary radiant temperature map, which, given the parameters of the Sun and our orbit, gives a temperature of about 278.6 +/- 2.3 K from perihelion to aphelion.

    Even that non-optional computation is extremely poorly understood.

    I have yet to have anyone either say “yes, of course”, offer an alternative algorithm, or propose an experimental test of the extension of the computation to an arbitrary object absorption=emission spectrum presented at http://cosy.com/Science/warm.htm#EqTempEq .

    This field desperately needs to return to the classical, experimentally quantitative abstractions of its basis in applied physics.

    We need YouTubes reasserting these quantitative realities with the simple brilliance of Ritchie’s experiment a hundred and eighty-some years ago.
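    For reference, the ~278.6 K figure quoted above matches the textbook gray-body calculation, sketched here with standard astronomical constants; this is not the APL/CoSy implementation, just the radiative-balance formula it describes:

```python
import math

# Equilibrium ("gray-body") temperature of a sphere at Earth's orbit:
#   T = T_sun * sqrt(R_sun / (2 d))
# For a gray body (absorptivity = emissivity), the result is independent
# of the gray factor itself.
T_SUN = 5778.0     # K,  solar effective temperature
R_SUN = 6.957e8    # m,  solar radius
D_MEAN = 1.496e11  # m,  semi-major axis of Earth's orbit
D_PERI = 1.471e11  # m,  perihelion distance
D_AP = 1.521e11    # m,  aphelion distance

def gray_T(d):
    return T_SUN * math.sqrt(R_SUN / (2.0 * d))

print(f"mean distance: {gray_T(D_MEAN):.1f} K")  # ~278.6 K
print(f"perihelion:    {gray_T(D_PERI):.1f} K")
print(f"aphelion:      {gray_T(D_AP):.1f} K")    # spread ~ +/-2.3 K
```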

  39. A good and interesting article. However, I am not sure where your definition of precision comes from.

    From http://www.itl.nist.gov/div898/handbook/glossary.htm#precision
    We have:-

    precision:

    in metrology, the variability of a measurement process around its average value. Precision is usually distinguished from accuracy, the variability of a measurement process around the true value. Precision, in turn, can be decomposed further into short term variation or repeatability, and long term variation, or reproducibility.

    It has nothing to do with Resolution and significant figures.

    http://www.itl.nist.gov/div898/handbook//mpc/section4/mpc451.htm

    NIST says:

    Resolution:

    is the ability of the measurement system to detect and faithfully indicate small changes in the characteristic of the measurement result.

    In my language, precision is defined as how closely you achieve the same measured value if you keep repeating the measurement, e.g. take a thermometer from the fridge to a cup of tea, measure, record, repeat.
    If your recorded tea temps are very close, you have a precise thermometer.

    In my language, resolution is defined as – what is the smallest change in value my measurement system can respond to eg if I add drops of boiling water to my cup of tea, it will slowly increase in temperature. A higher resolution thermometer will respond and indicate to, say, 0.1 degree change, whereas a lower resolution device will respond and indicate, to say, 0.5 degree change. Nothing to do with the number of digits.

    Your second diagram, the 4 cross-hairs, is spot on: you can have a very precise sensor that is very inaccurate. Many folks don’t get that at first.
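    The point that displayed digits are not precision can be sketched with two simulated sensors; both are hypothetical, one noisy with a three-decimal display and one quiet with a one-decimal display:

```python
import random
import statistics

random.seed(7)
TRUE_TEMP = 20.0  # deg C -- a hypothetical reference temperature

def reading(display_decimals, noise_sd):
    """One reading: instrument noise, then a fixed-decimal display."""
    return round(random.gauss(TRUE_TEMP, noise_sd), display_decimals)

# Sensor A: three displayed decimals, but noisy electronics.
a = [reading(3, 0.5) for _ in range(200)]
# Sensor B: one displayed decimal, but quiet electronics.
b = [reading(1, 0.05) for _ in range(200)]

print(f"A repeatability (SD) = {statistics.stdev(a):.3f}")  # ~0.5
print(f"B repeatability (SD) = {statistics.stdev(b):.3f}")  # ~0.06
```

    Sensor A carries more digits but repeats far worse, which is the NIST warning above in miniature: the number of digits displayed does not indicate the resolution, or the precision, of the instrument.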

    • Steve1984,
      The formal definition of precision has been changed in recent years. Unfortunately, in my opinion, it is a defective definition because, unlike with the use of significant figures, it is difficult to know what the actual or implied precision is. Note, however, in the definition that you have provided, that as the precision increases, it will be necessary to increase the number of significant figures to convey the increase in precision. That is why my diagram showed a finer scale on the top row than on the bottom row.

      Your definition of precision sounds to me like repeatability. I equate resolution with precision.

    • “steverichards1984 April 12, 2017 at 1:45 pm
      A good and interesting article. However, I am not sure where your definition of precision comes from.”

      Indeed?
      How very odd.

      Why did you not start at the beginning with what is metrology?

      “metrology study
      Sometimes called a gauge capability study, or measurement capability assessment.
      Such a study quantifies the capabilities and limitations of a measurement instrument, often estimating its repeatability, reproducibility, and sometimes its sensitivity.”

      That doesn’t sound like meteorology, now does it?
      From Merriam-Webster
      “Definition of metrology
      1: the science of weights and measures or of measurement
      2: a system of weights and measures”

      Next at the link you provided:

      “precision 1. in metrology, the variability of a measurement process around its average value.
      Precision is usually distinguished from accuracy, the variability of a measurement process around the true value.
      Precision, in turn, can be decomposed further into short term variation or repeatability, and long term variation, or reproducibility.

      Precision 2. A fuzzy concept term for the general notion that one knows more or has shorter confidence intervals if one has more data; that is, more data gives greater precision in answers and decisions.”

      “fuzzy concepts Concepts that, by their greater abstraction, admit both generalization and alternative approaches.
      For example, the average is usually calculated to estimate the typical value of a set of numbers.
      The average is a specific concept, whereas “typical value” is a fuzzy one.”

      It looks like NIST understands precision, measurement and fuzzy sloppy concepts.

      Amazingly, this also from the same link, just a different chapter.

      “2.1.1.3. Bias and Accuracy
      Definition of Accuracy and Bias:
      Accuracy is a qualitative term referring to whether there is agreement between a measurement made on an object and its true (target or reference) value.
      Bias is a quantitative term describing the difference between the average of measurements made on the same object and its true value.
      In particular, for a measurement laboratory, bias is the difference (generally unknown) between a laboratory’s average value (over time) for a test item and the average that would be achieved by the reference laboratory if it undertook the same measurements on the same test item.

      Depiction of bias and unbiased measurements
      depiction of unbiased measurements Unbiased measurements relative to the target

      depiction of biased measurements Biased measurements relative to the target

      Identification of bias:
      Bias in a measurement process can be identified by:
      • 1.Calibration of standards and/or instruments by a reference laboratory, where a value is assigned to the client’s standard based on comparisons with the reference laboratory’s standards.
      • 2.Check standards , where violations of the control limits on a control chart for the check standard suggest that re-calibration of standards or instruments is needed.
      • 3.Measurement assurance programs, where artifacts from a reference laboratory or other qualified agency are sent to a client and measured in the client’s environment as a ‘blind’ sample.
      • 4.Interlaboratory comparisons, where reference standards or materials are circulated among several laboratories.

      Reduction of bias:
      Bias can be eliminated or reduced by calibration of standards and/or instruments.
      Because of costs and time constraints, the majority of calibrations are performed by secondary or tertiary laboratories and are related to the reference base via a chain of intercomparisons that start at the reference laboratory. ”

      And from your link:

      Definition from (MSA) manual:
      The resolution of the instrument is δ if there is an equal probability that the indicated value of any artifact, which differs from a reference standard by less than δ, will be the same as the indicated value of the reference.

      Good versus poor:
      • A small δ implies good resolution — the measurement system can discriminate between artifacts that are close together in value.
      • A large δ implies poor resolution — the measurement system can only discriminate between artifacts that are far apart in value.

      Warning The number of digits displayed does not indicate the resolution of the instrument.

      Are you sure you read those links?

  40. “One cannot take a single measurement, add it to itself a hundred times, and then divide by 100 to claim an order of magnitude increase in precision.”

    But one can take 100 different independent measurements of the same thing and divide the S.D. by the square root of 100 to get the standard error. If you make 1000 measurements, you get to divide by about 30. And the standard error can be smaller than the graduations on the measuring device. It’s just elementary statistics, in the early chapters actually.

    And the S.E. is what is actually being used in Student’s t-test to come up with probabilities, not the S.D.

    • 1000 sensors at different points of the globe are not measuring the same thing, so you can’t average them to get a more accurate reading.

      • RS, no they aren’t. They are taking measurements while on the globe. Not the same thing at all.

      • If I took one measurement on Earth, and another measurement on Venus, could I average them? After all, it’s just one solar system.

      • MarkW on April 12, 2017 at 2:59 pm

        10 sensors at different points in the area around Berlin, Germany are not measuring the same thing either. Different places with different character (oooooh, UHI here ’n there, horrible), different elevations, etc.

        But the weather-prediction packages on the Internet average and interpolate them very well, and get accurate results.

        That is the reason why temperature, rainfall and wind are predicted so well for the place where I live, though there is no weather station available there.

        MarkW, you are perfect in sophism.

    • You must mean Smirnov? I thought he only worked with non-parametric statistics, which does indeed have limitations.

      • ReallySkeptical,
        Smirnov and Smirnoff come out of different bottles. Did you look at the references?

    • Yes, indeed.
      OT I imagine but,
      A system in which water is piped into iron radiators works very well for distributing heat from a furnace to each room of a house.
      If those pipes were just bars of solid iron, not so much.
      If the radiators were just pools of water with no iron skin, again, not so much.
      Together…wondrously good system.
      I grew up in a big old house in which the system was originally gravity driven, with huge pipes near the furnace that got smaller as they branched out the various zones and individual rooms.
      So logical, and so efficient…and a hundred and fifty years later still works like a charm, although a small pump now pushes the water along since the 6″ iron pipes that converged into the original boiler have been replaced with smaller copper ones that fit the new furnace.

      • I spent part of my life in an old 1700’s house that had a coal converted badly to oil boiler.

        All my radiator did was make noise. Never a change in temperature. I considered buying that house from my Father, but that old boiler cost too much to run.

  41. Since the temperature distribution is not a normal bell curve, the Gaussian method should be used to calculate the standard deviation.

  42. Gavin claims that only a very small sample of stations are needed for a valid GASTA result.

    https://realclimatescience.com/2017/01/gavin-schmidt-explains-why-noaa-data-tampering-is-illegitimate/

    Since he makes up most of the “data” anyway, sure, why not? If one station can represent a radius of 1200 km, why not have just one station for every 4.5 million square kilometers, ie 113 stations? IIRC, which I might not, he once suggested that 50 stations would suffice.

  43. Another factor, which probably doesn’t belong in this discussion, is possible confirmation bias in the CAGW-activist compilers and curators of the data that make up the supposed historical instrumental record.

    • I think that discussion of that bias belongs in every discussion of global temperature data.
      Tainted data makes any conclusion based on it worthless.

  44. There are serious issues with virtually all GAST estimates that have little to do with the accuracy of thermometers or the precision of station averages. The most salient question is: how representative are the available stations, overwhelmingly urban world-wide, of the areas in which they are located? And, since temporal variability of climate is the key issue, how is that variability affected by constantly changing the set of stations? In other words, what is the ultimate reliability of UHI-corrupted data when used in piecemeal fashion to manufacture “global” time-series? Nobody, least of all the index-makers, pays serious attention to these pivotal issues.

  45. You forgot to mention that until recent years RECORDING accuracy in USA was +/-0.5 deg.
    CRU’s Dr Jones et al’s calculation of +/-0.001 deg accuracies for HADCRUT data only works for HOMOGENEOUS data – which global temperature is not.

    • dradb,
      I did mention that formerly temperatures were only reported to the nearest degree, which implies an uncertainty of +/-0.5 deg.!

  46. Back in a previous millennium, when I took chemistry, we measured temperatures with large mercury-filled thermometers. We could calibrate them in baths of ice water. We could use magnifiers and verniers to read them. Were our readings as good as 0.1°?

    Now imagine that you are an observer at a remote base in the 19th century. Are your instruments calibrated? Do you have a magnifier or a vernier with which to read them? How do you see them at night? Do you hold a candle next to the thermometer? If you decided to stay indoors and make up the numbers on a really bitter night, would anybody have known?

    • If you weren’t feeling good and asked your wife or one of the kids (none of whom have had training in how to take a reading) to go take the reading for you, would anyone have known?

    • How often did thermometers get taken somewhere for another purpose and laid down flat afterwards?

      Then when picked up, the mercury was separated. What to do!?

      Why bang the thermometer bottom with something to break the bubble and drain all the mercury down.

      It doesn’t take many hits before thermometers start shifting in their metal band mountings.

  47. In building up the global average temperature anomaly curve, accuracy and precision are not the main issues; the critical issue is the extrapolation and interpolation of data over space and time, given the climate system and general circulation patterns. In other words, the data network.

    Dr. S. Jeevananda Reddy

    • Not to mention blatant “adjustments” designed to cook the books. Now NOAA’s minions and imps put their thumbs on the scales of the raw data. The whole process from top to bottom is corrupted. The ringleaders and senior perps need to go to jail and the willing accomplices be fired.

  48. “It says that probably a little more than 2/3rds of the recorded surface temperatures are between -26 and +36°”
    Perhaps I am missing something here, or maybe these numbers should be in degrees C, but this seems unlikely, given that 40% of the globe is between the latitudes of the tropics of Capricorn and Cancer, where, unless one is high up on a mountain, it never gets as cold as 36 F.
    Is this quoted range and percentage verified? Over 67% of global surface temperature readings are 36 F or lower?
    Again, huge parts of the globe never get that cold…ever. Over half, conservatively. And another large part rarely gets that cold for more than a brief and occasional interval.

    • Menicholas,
      What you missed is that I said that the median value, which would equal the mean if the distribution were symmetrical, was far enough below the commonly accepted mean to strongly suggest a long tail on the cold side. That is, you are correct that high temperatures are going to be more common in the real world.

      • Clyde,
        Thank you for your reply.
        I understand the long tail, and agree.
        My only question is regarding that one single sentence…it just does not seem possible to me, if it refers to all measurements taken at all places and at all times of year.
        Outside of polar regions, and in the places where most all of us live, temps are higher than that for at least half of the year…even at night.
        Are they not?
        I do not wish to be disagreeable, as I very much enjoyed your article.

    • I think the apparent discrepancy lies in the word “recorded.” The majority of recorded temperatures are from locations in the temperate zone of the northern hemisphere.

      • Right.
        Temperate zones are known for being temperate, for much of each year. Temps near freezing are rare for many months of the year outside of the polar regions.

      • Menicholas,
        Then let me rephrase my statement. IF the mean were 5.0 deg F, then one would expect 68% of the readings to lie between -26 and +36 deg F, with an SD of 31. Knowing that the actual mean is closer to 59 deg F, this is evidence that the distribution is strongly skewed.

  49. Clyde,
    As you know, I have been blogging for years that one of the main blunders of climate workers has been their failure to report proper, formal, comprehensive estimates of error. Some authors have, but too many fall into the class Pat Frank describes: he says he has never met a climate scientist who could explain the meaningful difference between accuracy and precision.
    A formal error analysis that leads to proper error bounds is a boon in hard science. It allows the reader a snapshot of the significance of the planks in the hypothesis under test. As an example, if the first Hockey Stick graphs had carried correct error bounds, they would likely not have passed peer review.

    Please allow 2 comments, one from theory and one from practice to support your case.
    Theoretical.
    Reference –
    https://en.wikipedia.org/wiki/Sampling_(statistics)
    “sampling is concerned with the selection of a subset of individuals from within a statistical population to estimate characteristics of the whole population”
    It involves several stages.
    • Defining the population of concern
    • Specifying a sampling frame, a set of items or events possible to measure
    • Specifying a sampling method for selecting items or events from the frame
    • Determining the sample size
    • Implementing the sampling plan
    • Sampling and data collecting

    Most bloggers here have objected to the conventional path to a global temperature average because it fails to properly and completely comply with the stages above. If the population of concern is the surface temperature at each point on the globe, it fails because, for example, there is insufficient and uneven coverage of the globe. There are other blogger objections, such as temperature being an intensive property that cannot be averaged, and difficulties arising from constructing time series where there is daily movement from max to min on a rotating, precessing globe with many complications related to time of observation, let alone moisture complications.
    Until the several stages above are completed in proper detail, and shown to have been satisfied, one must regard the current estimates of global mean surface temperature as unreliable.
    Practical.
    As you note from an NOAA manual,
    “Once each minute the ACU calculates the 5-minute average ambient temperature and dew point temperature from the 1-minute average observations… These 5-minute averages are rounded to the nearest degree Fahrenheit, converted to the nearest 0.1 degree Celsius, and reported once each minute as the 5-minute average ambient and dew point temperatures…”
    NOAA receives data from various countries. From Australia’s BOM, readings are taken differently.
    Firstly, we receive AWS data every minute. There are 3 temperature values:
    1. Most recent one second measurement
    2. Highest one second measurement (for the previous 60 secs)
    3. Lowest one second measurement (for the previous 60 secs)
    Relating this to the 30 minute observations page: For an observation taken at 0600, the values are for the one minute 0559-0600.
    As Ken Stewart notes further,
    https://kenskingdom.wordpress.com/2017/03/21/how-temperature-is-measured-in-australia-part-2/
    “An AWS temperature probe collects temperature data every second; there are 60 datapoints per minute. The values given each half hour (and occasionally at times in between) for each station’s Latest Weather Observations are spot temperatures for the last second of the last minute of that half hour, and the Low Temp or High Temp values are the lowest and highest one second readings within each minute. There is NO quality assurance to flag rogue values. There is no averaging to find the mean over say one minute.”
    One can examine whether this Australian CLIMAT data is compatible with the in-house data of NOAA. It is not. There are biases. Currently, I believe that these are uncorrected, if indeed correction is possible.
    ………………………..
    What is more, for many years the only relevant data collected each day were Tmax and Tmin from Liquid-in-Glass thermometers designed to record only those parameters. To obtain a mean, it was common to use (Tmax+Tmin)/2. However, many do not realise that in November 1994 (IIRC) the BOM changed from this mean, to report “average” daily temperatures as the mean of 4 temperatures taken 6 hours apart. The BOM has provided comparisons and the differences range to nearly 1 deg C over a long term comparison.
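    To see how the choice of daily-mean convention alone can shift results, here is a minimal sketch with an invented, deliberately asymmetric diurnal curve (illustrative numbers only, not BOM data):

```python
import math

# Toy asymmetric diurnal cycle (illustrative only, not real station data):
# base 20 deg C, an 8 deg C first harmonic, plus a 2 deg C second harmonic
# that skews the curve's shape without shifting its true daily mean.
def temp_at(hour):
    x = 2.0 * math.pi * (hour - 9.0) / 24.0
    return 20.0 + 8.0 * math.sin(x) + 2.0 * math.cos(2.0 * x)

temps = [temp_at(h / 4.0) for h in range(96)]                   # every 15 minutes
true_mean = sum(temps) / len(temps)                             # dense integration
mean_minmax = (max(temps) + min(temps)) / 2.0                   # (Tmax + Tmin) / 2
mean_synoptic = sum(temp_at(h) for h in (0, 6, 12, 18)) / 4.0   # 4 six-hourly readings

print(round(mean_minmax, 2), round(mean_synoptic, 2), round(true_mean, 2))
```

    On this toy curve the two conventions, each computed from error-free readings, disagree by 2 deg C; the size and sign of the difference depend entirely on the shape of the diurnal cycle, which is the point of the BOM comparisons mentioned above.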
    …………………………..
    What is more more, the BOM reported years ago that metrication has an effect, one that is ignored –
    http://www.waclimate.net/round/
    This is a bias error of up to 0.1 deg C.
    ……………..
    More, more, more.
    I could go on to show the effects of rounding from deg F to deg C and back, but you can do this yourself.
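    As a sketch of that exercise (assuming the rounding chain in the NOAA quote above; the function name is mine):

```python
def noaa_style_report(true_c):
    # Chain from the quoted NOAA manual: round to the nearest whole deg F,
    # then convert and round to the nearest 0.1 deg C.
    whole_f = round(true_c * 9.0 / 5.0 + 32.0)
    return round((whole_f - 32.0) * 5.0 / 9.0, 1)

# Sweep true temperatures from 10.00 to 11.00 deg C in 0.01 deg steps and
# collect the distinct values the station could actually report.
reported = sorted({noaa_style_report(t / 100.0) for t in range(1000, 1101)})
print(reported)
```

    Despite being stated to 0.1 deg C, the reported values can only land on a grid roughly 5/9 deg C apart, so the last digit carries false precision.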

    ……………………………….
    The point is, when one examines the accuracy of the historic temperature record in detail, depending on its origins, one can find many inaccuracies, some well known, other unknown to many, that have 2 important aspects:
    1. The magnitude of the accuracy error overall is of the order of 0.1 deg C.
    2. The error is a bias. The negative excursions do not wipe out the positives.
    One can show that in the comparison of Australian data with NOAA global data, there are errors of this type that cumulatively probably exceed 0.2 deg C. Given that the official BOM figure for warming of Australia is 0.8 deg for the 20th century, this is a large error and one that cannot go away with more sampling, or with more “valid” adjustment.
    …………………………….
    The global surface temperature average of which you write, Clyde, does not stand close inspection.
    As you noted, and I support.
    When people like my old mate Steven Mosher write that the historic reconstruction of it is accurate, they are dealing with data that, perhaps unknown to them, contains errors of the type I have described above. So you can take such claims with a grain of salt.
    Geoff.

    • I have a Tmax – Tmin mercury thermometer right outside of my front door. It is wonderful.

      I tend to only shake the markers down when I need to know how cold it really got the night before.
      I don’t care about how hot it got during the day. If it’s hot stay out of the sun!

      Mad dogs and Englishmen…

      • There are Australian BOM records of an observer who read the wrong end of the peg for a time. Hope you followed the instruction manual yourself?
        Cheers. Geoff

  50. This kind of arcane, nitpicking discussion doesn’t do a thing to help the General Public understand that the Natural Climate Change Deniers have hoodwinked them into believing that the Earth is Warming when it is actually Cooling.

    • 1) Is this discussion aimed at the general public?
      2) Getting the general public to recognize the lunacy of trying to declare temperature records when the differences are being measured in hundredths or thousandths of a degree is part of getting them to recognize that global warming ain’t what it’s cracked up to be.

    • mickeldoo,
      I’m sorry that you feel that the discussion is “nitpicking.” I was attempting to avoid the arcane mathematics that Windchasers so badly wanted, and make the case that the General Public should not blindly accept the claims made about the claimed warming. I might suggest that you prepare a guest editorial making the case for cooling and submit it to Anthony.

    • mickeldoo,
      This is not nitpicking.
      If you accumulate the various errors that are not yet included in the official analyses, you can plausibly account for a quarter of the officially estimated global warming over the last 100 years. Some analysts put the figure as high as half of the warming.
      That is of fundamental importance for policy.
      That is why some of us who care continue to invest precious time in attempts to clean up the records, despite the abuse from those less competent.
      Geoff

  51. (Short version of a longer essay that disappeared). Clyde, essentially we are in agreement. Here are 2 posts in succession to keep them shorter.
    Part one. Formal matters.
    From wiki, https://en.wikipedia.org/wiki/Sampling_(statistics)
    “…concerned with the selection of a subset of individuals from within a statistical population to estimate characteristics of the whole population.” And
    “The sampling process comprises several stages:
    • Defining the population of concern
    • Specifying a sampling frame, a set of items or events possible to measure
    • Specifying a sampling method for selecting items or events from the frame
    • Determining the sample size
    • Implementing the sampling plan
    • Sampling and data collecting”

    The first stage, to define the population of concern, requires great care. In shorthand form here, in terms of Clyde’s essay, the population might be the temperature at ground level of all points near the surface of the globe. This then requires an immediate qualification, for there are infinitely many of these places. And so, the stages have to be worked through. Other bloggers here have already pointed out many of the complications in the way of a precise and accurate framework to compute a “global mean surface temperature”. Others have raised additional theoretical problems, such as temperature being an intensive property that cannot be meaningfully averaged; and trying to make a time series for a globe that is rotating and precessing while the temperature at a surface point ranges from Tmax to Tmin each day, raising a host of requirements to deal with just time of observation.
    In summary, while a global mean surface temperature is an easy concept on paper, it is devilishly complicated in practice. The best conclusion is that such an entity is not yet validly constructed and ought to receive a great deal more thought in its specification and measurement. Specifically, proper and accepted error estimates should be made at each stage of its construction. The visual use of error envelopes on a graphic can be a boon to readers. Used properly, it allows a rapid initial assessment of the significance of the hypothesis.
    It is quite rare to read a climate paper whose treatment of accuracy, precision and error, including error propagation, is satisfactory and of the standard expected for hard science. This is one of the main reasons why some of us are sceptical of the climate field. There needs to be an enforcement of the proper treatment of errors in every paper, along the lines of that laid out for example by the Paris based Bureau of Weights and Measures, BIPM. The BIPM material requires detailed study. Here is a site where the measurement of one parameter, the standard metre, is covered.
    https://www.google.com.au/webhp?sourceid=chrome-instant&rlz=1C1CHBF_en-GBAU734AU734&ion=1&espv=2&ie=UTF-8#q=international+bureau+of+weights+and+measures+meter
    If the first of the hockey stick graphs had shown proper error estimates, it might have failed at the peer review stage. If proper error estimates were applied to it now, one would find it hard to take seriously.
    Geoff.

    • Any paper, and I do mean any, that uses a temperature data series should start with the raw data and include the adjustments that have been made and why. This may mean using NASA, NOAA, or whoever’s algorithms, but they should be explicitly shown so that everyone can see and judge them. As you say, the treatment of accuracy, precision, and error should also be a requirement.

      • Jim,
        Have you seen any papers from the last 20 years retracted by their authors because they relied on temperature data that by now has been officially altered?
        I have not.
        Geoff

      • That doesn’t mean that papers shouldn’t include that algorithm AND explain the errors it introduces AND detail the possible outcomes those errors cause.

  52. (Short version of a longer essay that disappeared). Clyde, essentially we are in agreement. Here are 2 posts in succession to keep them shorter.
    Part Two. Practical matters.
    NOAA in USA collects temperature data from many sources for aggregation into a global composite. Australia sends its data in a CLIMAT file. There is a question of whether Australian data is immediately compatible with NOAA data. It is not. There are several examples why.
    1. Reading electronic temperature devices. Clyde provides a quote from an NOAA brochure.
    “Once each minute the ACU calculates the 5-minute average ambient temperature and dew point temperature from the 1-minute average observations… These 5-minute averages are rounded to the nearest degree Fahrenheit, converted to the nearest 0.1 degree Celsius, and reported once each minute as the 5-minute average ambient and dew point temperatures…”
    Compare this with the Australian method, source, an email from a BOM officer (“we”) a few weeks ago, repeated on a Ken Stewart blog with some of his analysis of it.
    “Firstly, we receive AWS data every minute. There are 3 temperature values:
    1. Most recent one second measurement
    2. Highest one second measurement (for the previous 60 secs)
    3. Lowest one second measurement (for the previous 60 secs). Relating this to the 30 minute observations page: For an observation taken at 0600, the values are for the one minute 0559-0600.”
    Some implications follow from Ken’s site, https://kenskingdom.wordpress.com/2017/03/21/how-temperature-is-measured-in-australia-part-2/
    It is apparent that the NOAA method will not give the same answers as the Australian method. There is a bias and it is of a type where the positives and the negatives have no reason to cancel each other.
    2. Converting automatic readings to a daily mean.
    In Australia, the Liquid-in-Glass thermometer was largely replaced with the electronic temperature device about 1993. Before then, mean daily temperatures were calculated from the purpose built Max and Min thermometers by (Tmax + Tmin)/2. Since Nov 1994 the convention has changed. The mean is now that of 4 temperatures taken 6 hours apart each day. This introduces a bias in the Australian data, examples of which are given by Australia’s official BOM in a publication that shows the difference for a number of sites over several years each. There is a bias and again there is no good reason why the positives match the negatives. Indeed, where tested, they do not. The magnitude of the error is about 0.2 deg C tops.
    3. Metrication.
    Australia converted from deg F to deg C in September 1972. The BOM published here http://cawcr.gov.au/technical-reports/CTR_049.pdf that “The broad conclusion is that a breakpoint in the order of 0.1°C in Australian mean temperatures appears to exist in 1972, but that it cannot be determined with any certainty the extent to which this is attributable to metrication, as opposed to broader anomalies in the climate system in the years following the change. As a result no adjustment was carried out for this change.”
    So here is another bias that makes it hard to incorporate Australian data seamlessly into a global average. It is a bias in one direction, so the positives and negatives do not cancel.
    There are several of these bias components in the Australian data. It is not known to me if the NOAA procedures cope with this bias when blending a global average. (I doubt if they know all of them).
    The conclusion has to be that the accumulated effect of such bias allows a rough estimate of up to about 0.25 deg C that is carried forward in the CLIMAT data. This has direct relevance to the topic of Clyde’s post. It means that it is often pointless to use a figure after the decimal for relevant temperatures. Those, like my old mate Steven Mosher who argue that exercises like comparing NOAA and BEST to each other and to satellite data like UAH and RSS are showing high precision and/or accuracy are not in full possession of the relevant data. In cases such as I have given here briefly, you can take their claims of goodness with a grain of salt.
    Geoff.

    • Geoff,
      Thank you for the additional information. I found it interesting. As I said in my introduction, “The point of this article is that one should not ascribe more accuracy and precision to available global temperature data than is warranted, after examination of the limitations of the data set(s).” It seems that most of the commenters understand that the temperature data set is a can of worms and that in attempting to use meteorological data for climatological purposes, a lot of problems have been overlooked. For example (ignoring the complaint that it is not even valid to average temperatures, which I need to spend some time thinking about), I think that the only way a better approximation to the average temperature could be obtained would be to determine the lapse rate at every weather station for the time a temperature is recorded, convert every ground temperature to a standard elevation, say 10,000 feet, and then average all the temperatures for the standard elevation. Air density and humidity will also have an impact on temperature, though, so I’m not sure it is really a tractable problem. Probably the only way that this could be done is with satellite observations.

      • Thanks, Clyde.
        While my comments deal mainly with Australia’s flawed contribution to guesses about a global average temperature, you might imagine many other countries whose contributions contain even greater errors.
        Thank you for your essay, a platform for more comments by bloggers.
        Geoff

      • I believe this issue has been overthought, given its purpose. If we want a figure for the average temperature at the Earth’s surface for a given day/month/year, then use the raw temperature data as it is. If it’s 110 at Death Valley at noon UTC and 55 at Fisherman’s Wharf in SF, then those are the temperatures at the Earth’s surface at those times. Why should they be adjusted for elevation, humidity, etc. to achieve some “standard”?

        This is from the GISTEMP FAQ:
        Q. Why can’t we use just raw data?
        A. Just averaging the raw data would give results that are highly dependent on the particular locations (latitude and elevation) and reporting periods of the actual weather stations; such results would mostly reflect those accidental circumstances rather than yield meaningful information about our climate.

        This seems questionable to me. The raw temperature is “highly dependent on the particular location”? Well, of course it is. It’s colder at the poles and warmer at the Equator — and that is because of their “particular location.” The temperature at a given point and time is the temperature at that place and at that time. Whether or not that point is at -20′ or +5000′ ASL makes no difference — that is the temperature there at that time. What more can be asked?

        I’m not sure what is meant by the “reporting period” of the weather stations, and the FAQ answer helpfully does not elaborate. If they mean the reporting schedule, what does it matter, since the entire record is being used? If station A records a temperature every five minutes and station B records a temperature four times a day, then that’s the data one has for the day. It can be averaged, and the resulting error can be calculated, and that’s your record for the day.

        With all of the homogenization and adjustments made to the raw data, one would think they’re trying to arrive at the average temperature of some hypothetical geoid at sea level rather than the actual Earth’s surface. Is that the actual goal?

  53. “Of the 17 hottest years ever recorded, 16 have now occurred since 2000.”

    This irksome claim is just more of the lame alarmist narrative. From before 2000 to the present global temperatures have been on a PLATEAU. To say that 16 of 17 numbers on a plateau are the highest is near meaningless. And it completely ignores the situation before “recorded” history – eg. the Minoan, Roman and Medieval Warm periods.

    • Richard,
      Thank you for the link. It was interesting. While reading it, I had a thought: Because temperature will change with the specific heat of a parcel of air, which is a function of density and absolute humidity (à la the universal gas law), can we expect weather station anomalies to change with elevation? That is, will anomalies tend to be different for stations at low elevations versus stations in mountains?

      • Clyde Spencer:

        You ask me

        can we expect weather station anomalies to change with elevation? That is, will anomalies tend to be different for stations at low elevations versus stations in mountains?

        Probably yes.
        That is one of the many imperfections in the measurement equipment requiring a compensation model if a global temperature is considered as being a physical parameter with a unique value for each year.

        Richard

    • Thank you Richard. Your letter to the investigation board is an excellent summation.

      I also find it intriguing that the climate team blustered, denied, lied, and obfuscated in their responses to your facts. They effectively displayed their anti-social, anti-science characteristics, which implies they had already been told that nothing would happen to the team.

  54. Thanks for a great article on the vicissitudes of measurement. You are spot on in a subject where people lose their way frequently.

    The discussion here was much more fruitful than I originally thought. For example, Windchasers had me flummoxed for a while, and at first I regarded his comments as diversions. They aren’t, and his comments actually opened my eyes to a couple of things. What he’s getting at is that if the quantity being measured is “the global average temperature,” then thousands of measurements averaged together would be more “accurate” (i.e. closer to the “global average temperature”) than would tens of measurements. He’s actually correct, but only if the spread of measurements across the globe were uniform spatially and temporally. Even then, it is a weak improvement until every square foot is averaged.

    But his comment did bring me back to the question: what does a global average temperature even mean? (Yes, I know we are talking about “anomaly,” which makes the problem of significant digits much worse than averaging of measured temperatures.) For the life of me, I can’t see any value in the figure whatsoever until the entire globe is measured daily. And even then, it is only of use in determining (along with a complete humidity map) the energy of the atmosphere. The variations from weather station to weather station within the same locality are amazingly large. To think that the coverage we have today (30% of the earth’s surface) could be extrapolated with any meaning is probably not correct.

    I welcome this kind of discussion, but without the snark – funny as a lot of it is.

    • “Michael S. Kelly April 13, 2017 at 4:12 pm

      To think that the coverage we have today (30% of the earth’s surface) could be extrapolated with any meaning is probably not correct.”

      The accuracy and precision approaches are rebutted above, repeatedly and quite effectively by many commenters.

      The coverage we have today is grossly spotty; not even 30% of the Earth’s surface is covered.
      Yes, the USA has allegedly reasonable coverage, and Western Europe’s coverage is similar.

      Coverage in Alaska, Canada, Russia, Africa, South America, and New Zealand is sparse, and even then it is concentrated around certain cities.

      Australia’s coverage is somewhere in-between, but closer to sparse coverage since large areas are without cities or towns.

      Antarctica’s coverage is lousy. The Arctic’s coverage is worse.

      There used to be a cam shot of a temperature station in Antarctica that for a long period showed a temperature station bent over and covered in snow. When the team finally returned to right the station, the cam disappeared. Now we can’t see if the pole is bent and the station is covered.

      Not to worry! NOAA has a program that spots anomalous readings and they’ll happily fill those anomalous datums with historical averages or preferably, with some other station’s temperature smudged up to 1200 km.

  55. Michael,

    Please see the comment I made to Windchasers earlier today as a followup to his first comment. He does not make a case for the Law of Large Numbers justifying increased precision.

    Thousands of temperature measurements may be more representative of the average temperature, i.e. more accurate, but I still contest that they improve the precision. The accuracy improves for the same reason that the average of runs of die tosses improves, which is related to sampling. I maintain that the appropriate metric for uncertainty is the standard deviation and not the standard error of the mean. There are a number of caveats that have to be met. And, as more samples are taken in non-representative areas, there is the potential for increasing the standard deviation. Also, no one has proposed how to handle differences in elevation. The reason Death Valley sees such high temperatures is that it is below sea level. So, it is difficult to make a blanket statement. What is needed is a thorough analysis of the propagation of error when combining many data sets.

    One can’t calculate an anomaly until a multi-year average is calculated and it is critical how many significant figures are justified in that average because it bears on how the subtractions are handled.

    To give the Devil his due, I think that the intent of calculating a global temperature average is to determine to what extent the biosphere is warming and try to anticipate the impact on living things. Yet, the focus should be on the daily highs, not the average. Also, the many caveats are being ignored in trying to press into service weather observations to answer climatological questions.
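    To make the SD-versus-SEM distinction concrete, here is a toy simulation (invented numbers, not station data): for a quantity that itself varies, the sample standard deviation settles near the population spread as n grows, while the standard error of the mean keeps shrinking. Which metric is appropriate depends on whether the question is about the spread of the quantity or the uncertainty of its average.

```python
import random
import statistics

random.seed(1)  # reproducible toy draw

# Pretend "temperatures" drawn from a varying population: mean 15, spread 10.
for n in (10, 100, 10000):
    sample = [random.gauss(15.0, 10.0) for _ in range(n)]
    sd = statistics.stdev(sample)   # estimates the spread of the quantity itself
    sem = sd / n ** 0.5             # uncertainty of the sample mean
    print(n, round(sd, 2), round(sem, 2))
```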

    • Clyde, you are absolutely correct when you say, “I maintain that the appropriate metric for uncertainty is the standard deviation.” But that only applies to a given instrument measuring a specific item. If we had a device that could measure the global temperature, then “standard deviation” would apply. The crux of the problem is that no such instrument exists.

      So, absent a global thermometer, we do the next best thing: we use regular thermometers to sample the temperature of the earth at various geographical locations, and we use all of those measurements as an estimate of global temperature. As a result of this procedure, we invoke all the math associated with the statistics of sampling. The “global temperature” is now subject to standard error, and not standard deviation. Because we are using our sample to estimate the population mean, more observations are the rule.

      If you understand the concept of a closed thermodynamic system (which the earth is: http://www.bluffton.edu/homepages/facstaff/bergerd/nsc_111/thermo2.html ), then you’ll realize that the best measure of the temperature of the earth is with satellites observing outbound radiation from our planet. However, even if you use satellites to measure this outbound radiation, you are still constrained by the limitations of using a small sample to estimate the population mean. So, in the case of using microwave brightness as a proxy for surface temperatures, the “standard deviation” lies in the characteristics of the bolometer in orbit, but the resultant measurement is subject to standard error in statistical sampling.

      The issue you have with the differences between standard deviation and standard error can be summarized as follows: I can measure the height of my son or daughter by making a mark on the door jamb woodwork. I may not know how many centimeters up the mark is, but if I make the measurement every three months, it’s plainly obvious that my son/daughter is growing.

      • ‘I maintain that the appropriate metric for uncertainty is the standard deviation”

        Only if you are measuring the same thing,…for instance an object in a laboratory maintained at a specific temperature. Then measuring the temperature multiple times will allow you to calculate the mean, standard deviation etc for measuring at that temperature with that instrument.

        When temperature is recorded with a thermometer in a stevenson screen it is a variable that is being measured. Temperature changes through the day, from hour to hour, from second to second. The temperature measured at this moment will not be the same as the temperature at that moment.

        Normal/parametric statistical methods don’t apply.

  56. Michael darby,
    You said, “Now, if we had a device that could measure the global temperature, then ‘standard deviation’ would apply.” Once again, I’m afraid that you are confused. If we had such an instrument, and took one reading to determine the temperature, there would be no standard deviation because the standard deviation is a measure of dispersion of samples around a mean. If we took several readings over time, we would get the changes and trend and a standard deviation would be unnecessary for the series of readings because the series gives us what we are looking for without the necessity of calculating an average at any point. (Unless you want to calculate the average of the series because you like to calculate averages.)

    Yes, if determining the average global temperature was as easy as putting a mark on the woodwork, then determining trends and rates would be trivial. Unfortunately, it isn’t at all that simple and for that reason it is a poor analogy.

    I’m signing off. Don’t bother to reply unless you insist on having the last word.

    • Clyde, you say: ” If we had such an instrument, and took one reading to determine the temperature, there would be no standard deviation” Do you have a smidgen of comprehension of measurement? Please clue me in on an existing thermometer that measures temperature perfectly without error bounds. All measurement instrumentation I’ve ever encountered will provide you with a +/- limitation. I suggest you refresh your understanding of how instrumentation error is determined before you reply to this.

      Additionally, the analogy I’ve provided is perfect with regards to what is currently going on in climatology. We might not have the exact readings in degrees Kelvin/Celsius of Earth, but the marks on our thermometers are telling us that Earth is warming.

  57. Clyde Spencer,
    I am sure you are correct when you suggest that error variance is underassessed and underestimated in temperature series. However, I think that you yourself are overassessing the importance of measurement precision when temperature averages are taken over space or time. Measurement precision is totally swamped by the problem of accuracy of resolution. Your reference to Smirnoff – “… at a low order of precision no increase in accuracy will result from repeated measurements” – is relevant to repeat measurements targeted on estimation of a fixed parameter value, but is not highly relevant to a problem of computing an average index, where the precision of any one measurement is small relative to the total range of variation of the individual measures.
    To illustrate the point, consider a hypothetical example where you set up 101 thermometers in a region and you aim to track daily movement in an average temperature index for the region. For each thermometer, you take a single daily temperature reading at the same time each day. You determine by calibration experiments that each individual thermometer reading has a standard deviation of 2 deg C relative to the true integrated average daily temperature in the immediate locale. Your daily temperature reading however is recorded only to the nearest degree C.
    Each individual reading then carries two error terms, the first with sd of 2 deg C (a variance of 4) and the second – due to rounding error – with a sd of 0.2887 (a variance of 0.083). (These latter values associated with rounding error are exactly derived from a uniform distribution with a range of one deg C.)
    The error variance in your final average of the 101 temperature readings is then given by the sum of the contributions of the accuracy error and the precision error, divided by the number of thermometers. This is equal to (4 + .083)/101 = 0.0404. This corresponds to a sd of 0.201. For comparison, if you had recorded your single thermometer readings to a precision of 3 decimal places, instead of to the nearest degree, the sd of the average temperature would be reduced to 0.199 – a tiny change.
    Under these circumstances, you are justified in citing your average temperature index to at least one decimal place, despite the initial rounding error, but can make no claim to significant change in temperature between days until you see a difference of around 0.4 deg C. This, you note, is controlled by the accuracy problem, not the precision problem.
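    The arithmetic in the example above can be checked directly (same assumed numbers: per-reading sd of 2 deg C, uniform rounding over a 1 deg C interval, 101 thermometers):

```python
import math

n = 101                            # thermometers in the hypothetical network
var_accuracy = 2.0 ** 2            # per-reading sd of 2 deg C -> variance 4
var_round_1deg = 1.0 ** 2 / 12.0   # uniform rounding over width w: var = w^2 / 12
var_round_3dp = 0.001 ** 2 / 12.0  # same formula if recording to 3 decimal places

sd_avg_1deg = math.sqrt((var_accuracy + var_round_1deg) / n)
sd_avg_3dp = math.sqrt((var_accuracy + var_round_3dp) / n)
print(round(sd_avg_1deg, 3), round(sd_avg_3dp, 3))  # 0.201 vs 0.199
```

    The per-reading accuracy variance dominates the rounding variance, so sharpening the recording precision barely moves the uncertainty of the average, which is the point of the example.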

      • James,
        If you are suggesting what I think you are, then you quadruple the variance associated with the rounding problem.
        The rounding problem is identical to the binning problem when you are constructing a histogram. If you are using normal rules of rounding then the interval of each bin is identical. For example, if you are rounding to the nearest tenth of a unit, then the intervals are all equal to exactly one tenth. In the examples given here, where temperature is rounded to the nearest degree, the intervals are all exactly equal to one. For example, all of the values between 3.500001 and 4.4999999 will go into one bin, corresponding to the rounded value of 4.0. With your proposed scheme, the bin labelled “4.0” captures everything from 3.0 to 5.0 and has therefore twice the interval size, double the sd and four times the variance. This definitely makes the error from the rounding problem larger.
        Did you perhaps mean rounding to the nearest whole number rather than to the nearest even number?
        If so, then that is exactly what my example calculation above implies. As I have illustrated, the rounding problem does not go away, but the error it introduces into the overall averaging problem is very small – always provided that the range of variability of the input values is much larger than the precision of the measurement, as it is in my example and in real-world temperature measurements.

      • kribaez,

        I must not have expressed myself very well. You said:

        In the examples given here, where temperature is rounded to the nearest degree, the intervals are all exactly equal to one. For example, all of the values between 3.500001 and 4.4999999 will go into one bin, corresponding to the rounded value of 4.0. With your proposed scheme, the bin labelled “4.0” captures everything from 3.0 to 5.0 and has therefore twice the interval size, double the sd and four times the variance.

        What I meant to say — and I saw this method suggested specifically to balance rounding errors — was that everything from 3.50000 to 4.50000 would round to 4. A value of 3.49999 is less than 3.5, and so would round to 3. A value of 4.50001 is greater than 4.5, and so would round to 5. It was only the exact x.500000 values that rounded to the nearest even number.

        In the example, where would 3.50000 go? In the method I suggested, it would go into the 4 bin. In the example, where would 4.50000 go? In my method, it would also go into the 4 bin.

        If one always rounds up, the error always is positive; if one always rounds down, the error is always negative. By rounding the x.50000 values to the nearest even number (I suppose one could use odd numbers just as well), the rounding error gets more evenly distributed, up and down.
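        For what it is worth, this round-half-to-even rule (“banker’s rounding”) is common enough that Python’s built-in round() implements it, which makes the balancing effect easy to demonstrate:

```python
# Python's round() uses round-half-to-even: exact .5 values go to the
# nearest even integer, so the half-way errors cancel on average instead
# of always biasing upward. (Values like 3.5 are exactly representable
# in binary floating point, so the rule applies cleanly here.)
print(round(3.5))   # 4
print(round(4.5))   # 4 (also goes to the even neighbour)
print(round(2.5))   # 2
print(round(3.49))  # 3 (not exactly .5, so ordinary rounding applies)
print(round(4.51))  # 5

# The half-way rounding errors alternate in sign and sum to zero:
halves = [x + 0.5 for x in range(10)]          # 0.5, 1.5, ..., 9.5
print(sum(round(v) - v for v in halves))        # 0.0
```

        By contrast, always rounding halves up would give a net error of +5.0 over the same ten values.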

    • You make an assumption here that may or may not be true: “(These latter values associated with rounding error are exactly derived from a uniform distribution with a range of one deg C.)” What if the manual readings are skewed? Say, people rounded high temps up and low readings down? This could easily have happened with land and ship readings.

    • Kribaez,

      You come across as knowing more about the problem than I do. I don’t claim to be an expert, unlike some others here who hold themselves up as experts on the problem. I only know enough to get myself into trouble! What I set out to do is to call into question the practices in climatology in the hopes that someone like you might point the way to a more rigorous assessment of data handling and error propagation. It seems that there is a lot of hand waving in climatology and neither the Media nor the laymen they write for are in a position to question what they are told. Perhaps you could expand on your remarks above to provide Anthony with a piece to address my concerns.

  58. Excellent article. It should be a sticky or a reference article for newcomers to this site.

    I wonder whether it might be possible to use a spacecraft sufficiently far from earth, with a bolometer or other temperature sensor, which would read the “smeared” temperature of the planet. Not one focused on any particular spot of water or land or clouds, or the areas at night or during the day. I would imagine that it would have to have a polar position, probably one for each pole, and then average the two readings to get a “global” average. But then it would miss out on the tropics, so maybe the two satellites should be located right above the terminator on each side of the globe. The idea would be to measure the actual radiative temperature of the entire globe. I wonder whether anyone has done this for other planets in the solar system.

    Just a wild thought.

  59. Thank you for an excellent article Clyde Spencer!

    Also a big Thank You to all the well informed commenters who helped improve the discussion tremendously!

    Of course, not you binned!

    Thank you, Brad Tittle, Neil Jordan and Writing Observer for teaching me about enthalpy! Now I have a bunch of links to further my education.

    The accuracy of real-time temperature data since 1850, and of average temperature calculations, are both worthy of debate.

    However, the real time data we have excludes 99.9999% of Earth’s climate history.

    Even worse, the real time data we have is ONLY for one warming trend, probably still in progress, so new “records” are TO BE EXPECTED until that warming trend ends, and a cooling trend begins.

    More important than the data quality, or extrapolating short-term temperature trends into infinity, are the following two questions:
    (1) Is the current climate healthy for humans, plants and animals?

    My answer is “yes”.

    (2) Have climate changes in the past 150 years been unusual?

    My answer is “no”.

    The answers to these two questions strongly suggest there is no current climate problem, and that recent climate changes are not unusual.

    Therefore, people claiming a climate change catastrophe IS IN PROGRESS TODAY, are ignoring reality in an effort to scare people.

    What the warmunists want is attention, money and power.

    The false coming climate change catastrophe, allegedly in progress since 1975, is nothing more than a boogeyman used by leftists seeking political power … using the clever excuse they need more power to save the Earth.

    It seems that leftists recognize that most voters are stupid, and gullible — so gullible they believe in an invisible climate crisis in spite of the fact that the climate is wonderful in 2017.

    Anyone living in one place for several decades might have noticed slightly warmer nights … if they worked outdoors at night … and there is nothing else to notice!

    Where’s the bad climate news?

    Ice core studies tell us there were hundreds of mild climate cycles in the past one million years.

    Each cycle consists of several hundred years of warming followed by several hundred years of cooling.

    The length of each full cycle is estimated to average 1,000 to 2,000 years.

    For real time measurements, and compilations of average temperature, we have data for only about 150 years — probably less than half of a full cycle, based on climate history.

    Claiming that 2016 is the hottest year on record means very little because so few years of actuals are available.

    If 2016 is really the hottest year since 1850, based on our very rough measurements, that just tells us the warming trend we believe started in 1850 is still in progress, and the cooling trend expected to follow has not yet started.

    So what ?

    Our planet is always in a warming trend or a cooling trend.

    The measurements are very rough, so we can’t be sure the 1850 warming is still in progress — there has been a flat temperature trend between the temporary 1998 and 2015 El Nino temperature peaks — perhaps that flat trend was a transition between multi-hundred-year warming and cooling trends?

    Only time will tell.

    • Richard Greene on April 15, 2017 at 12:28 pm

      …there has been a flat temperature trend between the temporary 1998 and 2015 El Nino temperature peaks.

      RG, instead of complaining uselessly that you have been writing nearly the same stuff on WUWT for a long time, I will instead give you a hint the warmunistas would certainly prefer to hide: between 1850 and today, there were more flat temperature trends!

      For example:
      – between 1880 and 1910;
      – between 1940 and 1975.

      And those two were quite a bit harsher than the one between 1998 and 2015.

      • 1880 to 1910 have too little data, especially from the Southern Hemisphere.

        1940 to 1975 is considered a downtrend, not a flat trend.

        You call my comments “useless complaining”, yet it is obvious you read my comment, and replied to it.

        If you think my comments are “useless complaining”, and yet you read them, that would make you seem dumb.

        First Hint:
        When you see any comment by “Richard Greene”, which is my real name, stop reading and move on to the next comment.

        I suspect English is not your first language, and hope you understand my hint.

        If not, please reply here, and next time I will type my hint slower, so even you can understand !

        Second Hint:
        Do not go to my climate website for non-scientists, at
        http://www.elOnionBloggle.Blogspot.com

  61. I did not read every comment, I am aging too fast as it is, but the point seems to be missed that it is not even of much interest if 2016 was the warmest on record.
    Everyone seems to agree that it has warmed, is warming and will likely continue to warm, for at least the short term, at some arguable rate. As long as we agree that some warming is happening then it would not be possible not to have record warm periods at multiple points until cooling ensues.
    In other words, record warm years are totally normal and totally within the scope of the arguments of both warmists and luke warmists.
    Since the degree of warming indicated by the record is insignificant and probably an outlier in the data, it has no validity as an argument for anyone. Media and alarmists are simply using it, as a factoid (of dubious value), in inappropriate ways.

    • I don’t know your age, but you display great wisdom that usually comes with age.

      It does not matter if 2016 is slightly warmer or cooler than 1880.

      Pick any two dates in the past 4.5 billion years and the average temperature will almost certainly be different (assuming we had accurate data to make a comparison).

      The only part of your comment I’d like to correct, is one sentence:
      “Everyone seems to agree that it has warmed, is warming and will likely continue to warm”

      My comments:
      I don’t agree that “it has warmed” — 1 degree C. of claimed warming since 1880, with a reasonable measurement margin of error of +/- 1 degree C., could mean the claimed warming is nothing more than measurement error — maybe not likely, but possible.

      I don’t agree that it “is warming” — there has been a flat trend between the temporary 1998 and 2015 El Nino peaks and I have no idea if pre-1998 warming will continue after the “pause”, nor does anyone else know for sure.

      I don’t agree it “will likely continue to warm” because no one can predict the future climate, therefore any “prediction” would be nothing more than a meaningless wild guess.

  62. Clyde Spencer on April 13, 2017 at 6:17 pm

    I maintain that the appropriate metric for uncertainty is the standard deviation and not the standard error of the mean.

    This contradicts a number of sources you may download.

    One of them is this:
    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1255808/

    where you read

    So, if we want to say how widely scattered some measurements are, we use the standard deviation.

    If we want to indicate the uncertainty around the estimate of the mean measurement, we quote the standard error of the mean.

    The standard error is most useful as a means of calculating a confidence interval. For a large sample, a 95% confidence interval is obtained as the values 1.96×SE either side of the mean. We will discuss confidence intervals in more detail in a subsequent Statistics Note.

    The standard error is also used to calculate P values in many circumstances.

    It is easy to understand even for the layman in statistics: the uncertainty decreases when the sample size increases. And the standard error is nothing else than the standard deviation divided by the square root of the sample size.
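    The distinction the article draws between the two statistics is easy to demonstrate with a small simulation (a Python sketch; the sample size of 400 and the spread of 2 deg C are arbitrary placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(20.0, 2.0, size=400)   # 400 readings, true sd = 2 deg C

sd = sample.std(ddof=1)                    # scatter of individual readings
se = sd / np.sqrt(len(sample))             # uncertainty of the estimated mean

print(f"standard deviation of readings: {sd:.2f}")   # ~2.0
print(f"standard error of the mean:     {se:.2f}")   # ~0.1 = sd / sqrt(400)
print(f"95% CI for the mean: {sample.mean() - 1.96*se:.2f}"
      f" .. {sample.mean() + 1.96*se:.2f}")
```

    Quadrupling the sample size halves the standard error, while the standard deviation stays near 2 — which is exactly the quoted point of contention: which of the two is the right uncertainty to report.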

    • But there is no standard error for global temperature estimates because they are using area averaging, also called grid averaging. The temperature data is averaged separately for each grid cell, of which there are over a thousand. Each grid cell average has a standard error, but then these grid cell averages are averaged to get the global average. There is no known way to combine the various cell standard errors to get a global standard error. One can treat the grid cell averages as sampling data but that is completely misleading, because it ignores the errors in the cell averages.
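      For what it is worth, if one were willing to assume that the cell errors are independent — which, as this comment argues, is precisely what is in question — the standard propagation-of-error rule for a weighted average of cell averages would look like this (a Python sketch with made-up cell values, not real grid data):

```python
import numpy as np

# Hypothetical per-cell averages and their standard errors (placeholders):
cell_means = np.array([14.2, 15.1, 13.8, 16.0])
cell_ses   = np.array([0.30, 0.25, 0.40, 0.35])
weights    = np.array([0.20, 0.30, 0.25, 0.25])  # area weights, sum to 1

global_mean = np.sum(weights * cell_means)
# Only if the cell errors are independent do variances add in quadrature;
# correlated cells (e.g. shared instruments or adjustments) would need
# covariance terms and would give a larger combined error.
global_se = np.sqrt(np.sum((weights * cell_ses) ** 2))

print(f"global mean: {global_mean:.3f} +/- {global_se:.3f}")
```

      The independence assumption is doing all the work here, which is why the comment’s objection — that no one knows how to combine the cell errors for real grids — is not answered merely by writing down this formula.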

      • David Wojick on April 18, 2017 at 9:23 am

        I would like to thank you for your polite response which, as opposed to Greene’s aggressive and unscientific blah blah, is well worth a reply.

        You write:

        But there is no standard error for global temperature estimates because they are using area averaging, also called grid averaging.

        1. That sounds indeed very nice! But do you really not know that
        – satellite temperature readings are collected exactly the same way, namely in grids (e.g. for UAH, of 2.5° per cell, i.e. for the Globe 144 cells per latitude stripe x 72 latitudes – of which however the northernmost and southernmost three contain no valid data);
        – UAH’s temperature record DOES NOT CONTAIN even a simple treatment of standard errors visible to anybody on the Internet?

        Please compare the temperature data published by e.g. Hadley
        http://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/time_series/HadCRUT.4.5.0.0.monthly_ns_avg.txt
        described here
        http://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/series_format.html
        with that of UAH
        http://www.nsstc.uah.edu/data/msu/v6.0/tlt/uahncdc_lt_6.0.txt

        While UAH publishes simple averages without any standard error information, the surface temperature record additionally contains 95% confidence intervals. Even Bob Tisdale mentioned that, at least for the HadCRUT4 data.

        That is the reason why I load UAH data into Excel, which allows me to compute, for all nine zones and regions, a standard error using Excel’s linear estimate function; even though it ignores matters like autocorrelation, it gives excellent results.
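        For readers who want to reproduce this kind of trend-plus-standard-error figure without Excel, here is a minimal OLS sketch in Python (the anomaly series is synthetic, not real UAH data, and — like the Excel approach — it ignores autocorrelation):

```python
import numpy as np

# Synthetic monthly anomaly series: a small trend plus noise (placeholder
# values, not real satellite data)
rng = np.random.default_rng(1)
months = np.arange(456)                      # 38 years of monthly data
anoms = 0.0015 * months + rng.normal(0, 0.15, months.size)

# OLS slope and its standard error, computed from first principles
x = months - months.mean()
slope = np.sum(x * anoms) / np.sum(x**2)
resid = anoms - anoms.mean() - slope * x
se_slope = np.sqrt(np.sum(resid**2) / (len(x) - 2) / np.sum(x**2))

# Convert from per-month to per-decade (x 120 months)
print(f"trend: {slope*120:.3f} +/- {2*se_slope*120:.3f} deg C / decade")
```

        Because autocorrelation is ignored, the quoted ± 2 σ interval will generally be too narrow for real temperature series; it matches what a spreadsheet regression reports, not the full uncertainty.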

        2. David, I am a layman, as you certainly are too. I lack knowledge of complex statistical methods, so I can only answer your question of how to integrate individual standard error estimates into a global one with a hint at what I discovered using Google.

        The very first link was interesting enough:
        https://stats.stackexchange.com/questions/55999/is-it-possible-to-find-the-combined-standard-deviation?noredirect=1&lq=1

        This is of course not the reference I would like to give you here (a complete article would be far better). But at least it gives a good approximation of what the mathematical solution should look like.

        3. Everybody may process UAH’s grid data out of the files

        http://www.nsstc.uah.edu/data/msu/v6.0beta/tlt/tltmonamg.1978_6.0beta5
        through
        http://www.nsstc.uah.edu/data/msu/v6.0/tlt/tltmonamg.2017_6.0

        in order to obtain lots of additional info, e.g. the linear OLS trend per single grid cell, or of a latitude stripe, or of specific regions not provided in UAH’s file, e.g. NINO3+4 (5S:5N, 170W:120W), etc.

        It would be interesting to compute, following the description above, the combined standard deviation (or error) for arbitrary subsets of the 9,504 UAH cells.

        But that’s a lot of work, because GNU’s Scientific Library probably does not contain the appropriate function like it offers for single estimates, so one has to do all the bloody work by hand.

      • I am not referring to combining sample sets, but rather to the standard error of an average that is made by averaging the averages of over a thousand temperature sample sets, one for each grid cell. If what you have pointed to does that then great, but I do not see how.

        I am not exactly a layman. I know little about advanced statistical methods but a great deal about what are called the “foundations” of statistical sampling theory. My Ph.D. is in analytical philosophy and the foundations of mathematics are a part of that which I do work in. It has to do with looking at the postulates that underlie a body of math. In particular, statistical sampling theory is based on probability theory, which imposes some very strict requirements. It appears that grid cell averaging violates several of these, but that is a research question.

    • You are clueless Bindiddidion — the temperature data can’t be trusted because they are collected and compiled by people who can’t be trusted.

      Statistics can not make faulty, biased, made up, and wild guessed temperature data more accurate — although most people can be fooled by false precision claims.

      The warmunists claim CO2 controls the climate, which is false, and expect global warming because CO2 levels are rising, based only on laboratory experiments, unproven in real life.

      Every month at least half the planet has no temperature data, so government bureaucrats use their own wild guesses.

      At least half of the claimed warming since 1880 is from government bureaucrat “adjustments” to raw data, which often “disappears”.

      Much of the temperature data are from instruments with +/- 1 degree C. accuracy, and yet government bureaucrats present global temperatures with two decimal places, and claim one year was hotter than another by +0.02 degrees C., which is grossly false precision.

      Most of the warmunists claim to KNOW the future average temperature, which is false.

      Some claim to have 95% confidence in their prediction of the future temperature, which is a nonsense number with no scientific meaning, and certainly false after three decades of very wrong global climate model predictions.

      You are clueless Bindiddidion, because you don’t realize the coming global warming catastrophe is not science at all — it is a political tool used to scare people into accepting more and more powerful central governments.

      Wild guess predictions of the future average temperature are not science.

      Very rough estimates of the average temperature, compiled by people with a huge global warming bias, ARE DATA NOT WORTHY OF STATISTICAL ANALYSES.

      The use of statistics on faulty data is a form of propaganda when used by the warmunists — the average temperature of Earth presented in hundredths of a degree C., for one example, impresses people a lot more than being rounded to the nearest one-half degree.

      The use of statistics on faulty data is a form of stupidity when used by the warmunists — I have read articles at this website, for one example, examining monthly temperature anomalies in thousandths of a degree C. — three decimal places !

      Statistics are nearly worthless when people collecting the raw data have an agenda, a near-religious belief in a coming global warming disaster, and a propensity to make repeated “adjustments” to the raw data that usually result in more ‘global warming’.

      Statistics are nearly worthless when 99.9999% of historical climate data are not available for analyses, and no one has any idea what a “normal” average temperature is.

      Statistical analyses of very rough temperature data, with not even close to global coverage, available only for a very tiny percentage of Earth’s history, and collected by bureaucrats hired only if they believe in CAGW, is mathematical mass-turbation.

      • I wrote:
        “The use of statistics on faulty data is a form of stupidity when used by the WARMUNISTS — I have read articles at this website, for one example, examining monthly temperature anomalies in thousandths of a degree C. — three decimal places !”

        My intended sentence was:
        The use of statistics on faulty data is a form of stupidity when used by the SKEPTICS — I have read articles at this website, for one example, examining monthly temperature anomalies in thousandths of a degree C. — three decimal places !

    • Note that the only uncertainty that decreases with increased sample size is the pure (that is, probabilistic) error of sampling. There are many other forms of uncertainty, some of which can increase with sample size, such as systematic bias.

  63. What we really need is a serious NOAA research program on all of the uncertainties in these temperature estimates. Here is my outline. Comments welcome.

    A needed NOAA temperature research program

    NOAA’s global and US temperature estimates have become highly controversial. The core issue is accuracy. These estimates are sensitive to a number of factors, but the magnitude of sensitivity for each factor is unknown. NOAA’s present practice of stating temperatures to a hundredth of a degree is clearly untenable, because it ignores these significant uncertainties.

    Thus we need a focused research program to try to determine the accuracy range of NOAA temperature estimates. Here is a brief outline of the factors to be explored. The goal is to attempt to estimate the uncertainty each contributes to the temperature estimates.

    Research question: How much uncertainty does each of the following factors contribute to specific global and regional temperature estimates?

    1. The urban heat island effect (UHI).
    2. Local heat contamination or cooling.
    3. Other station factors, to be identified and explored.
    4. Adjustments, to be identified and explored.
    5. Homogenization.
    6. The use of SST proxies.
    7. The use of an availability sample rather than a random sample.
    8. Interpolation or in-fill.
    9. Area averaging.
    10. Other factors, to be identified and explored.

    To the extent that the uncertainty range contributed by each factor can be quantified, these ranges can then be combined and added into the statistical temperature model. How to do this is itself a research need.

    The resulting temperature estimates will probably then be in the form of a likely range, or some such, not a specific value, as is now done. The nature of these estimates remains to be seen. Note that most of this research will also be applicable to other surface temperature estimation models.

    David

    • Why waste more money?

      Let NOAA provide ONLY useful information on weather today, and for the coming week.

      The world has wasted far too much money compiling average temperature data, and then making policy decisions using that very rough data … in spite of no evidence the current climate is bad, or even much different than 150 years ago.

      • Because the research results would render these claims permanently useless, including in future Administrations. As Walt Warnick says, when you capture the enemy’s cannon, don’t throw it away, turn it on them.

      • Reply to Mr. Wojick comment on April 18, 2017 at 12:24 pm

        Do you trust government bureaucrats to honestly compile temperature actuals?

        Do you think that if we knew the average temperature in the PAST 150 years with 100% accuracy, those data could refute claims of a COMING climate catastrophe that’s off in the FUTURE ?

        Bureaucrats can “cook the books” so global cooling can’t happen and every year is a few hundredths of a degree warmer than the prior year — wait a minute, it seems they are already doing that !

        And please remember — runaway global warming is ALWAYS going to happen in the future — it’s been coming for 30 years, and must have gotten lost in New Jersey, because it has never arrived … and it will never arrive.

        The perfect political boogeyman is one that is invisible, and always off in the future — that’s exactly what CAGW is.

      • My goal is to discredit the surface statistical models by exposing their many uncertainties and error sources. In addition to NOAA there is GISS, HadCRUT and BEST, maybe more. Only NOAA and GISS are federally funded so zeroing them will not stop the flow of alarmist numbers.

        The statistical community has long said informally that a lot of the climate statistics are unreliable. I want them to say it formally via a research program. Turn the statistical cannon around and fire it on the alarmists.

    • David Wojick on April 18, 2017 at 9:13 am

      1. NOAA’s global and US temperature estimates have become highly controversial.

      Like GISTEMP’s, NOAA’s data is, as far as land surfaces are concerned, based on the GHCN V3 station network, whose three daily temperature datasets (minimum, maximum, average) exist in two variants: unadjusted and adjusted.

      To start, here is a comparison of GHCN unadjusted average data with GISTEMP’s land-only and land+ocean (I never managed to obtain NOAA’s land-only data, but it wouldn’t differ enough from GISTEMP’s to matter):

      You can clearly observe that GISTEMP’s data (the result of huge homogenisation and infilling) shows:
      – smaller deviations from the mean than the raw, harsh GHCN record, and thus smaller uncertainties;
      – a lower linear trend than GHCN.

      The linear trends ± 2 σ for the Globe, 1880-2016 in °C / decade:
      – GHCN unadj: 0.214 ± 0.058
      – GISTEMP land: 0.101 ± 0.001
      – GISTEMP land+ocean: 0.071 ± 0.001

      { Please excuse the exaggerated precision: when you build averages of averages all the time and shift lots of data from a 1951-1980 baseline to one at 1981-2010, you run into rounding errors when using only one digit after the decimal point. }

      So if the GISTEMP people ever had the intention to cool the past in order to warm the present, as is so often pretended: why did they not keep their data exactly at the GHCN level?

      2. The urban heat island effect (UHI)

      It has long been known that if we draw
      – a plot of GHCN containing only data from stations with population type ‘R’ (rural) and nightlight level ‘A’ (least)
      and
      – a plot of all the data
      the latter’s trend for the Globe inevitably will be somewhat higher than that of the rural part:

      The linear trends ± 2 σ for the Globe, 1880-2016 in °C / decade:
      – GHCN unadj rural: 0.185 ± 0.079
      – GHCN unadj all: 0.214 ± 0.058

      Why should there be no UHI traces in temperature records? The Globe is what Mankind has made of it, huge towns and power plants included — each 1 GW of electric output coming with about 3.3 GW of heat. That is part of the warming process (yes yes yes: not ‘due to’ CO2).

      3. Last but not least: a similar comparison for the CONUS

      CONUS is, in comparison with the Globe, an incredibly stable piece of land, and so are its temperatures!

      And that you see best when comparing there, during 1979-2016, GHCN surface data with UAH’s satellite record:

      • I am calling for a research program and it sounds like you might apply for a grant. I take it that you do not disagree with any of my topics.

  64. It has been a week or 2 since I took (or taught) a digital math class, but as I recall, we had some customers who required that after numerous geometrical manipulations (rotate, translate, intersect, zoom/resize…) we had to retain at least 8 decimal digits. Every multiply or divide potentially lost a bit.

    The traditional way to minimize losses was a double-sized register for intermediate values. But there have long been software packages which kept designated numbers of bits.

    Question: What effect does missing data (temperature readings not recordable due to a broken thermometer…) have on precision and/or accuracy? Can interpolation techniques avoid distortions in trends/series? And just for completeness (and because a warmist hysteric mathematician friend made a claim): Can accuracy and/or precision be “recovered” or “improved” by interpolating to fill in missing data? :B-)
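    On that last question, a small simulation suggests the answer is no: interpolation cannot add information, and treating infilled values as real readings makes the apparent precision better while leaving the actual error of the average unchanged. A minimal sketch, assuming independent Gaussian measurement noise and a regular pattern of outages (both simplifying assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)
n, sigma, trials = 100, 1.0, 10_000
means, naive_ses = [], []

for _ in range(trials):
    obs = rng.normal(0.0, sigma, n)     # noisy readings of a true value of 0
    idx = np.arange(n)
    mask = np.zeros(n, dtype=bool)
    mask[1:-1:2] = True                 # "break" every other interior sensor
    filled = obs.copy()                 # fill gaps by linear interpolation
    filled[mask] = np.interp(idx[mask], idx[~mask], obs[~mask])
    means.append(filled.mean())
    naive_ses.append(filled.std(ddof=1) / np.sqrt(n))  # pretends all n are real

print(f"apparent SE, infilled points treated as data: {np.mean(naive_ses):.3f}")
print(f"actual sd of the mean across trials:          {np.std(means):.3f}")
# The honest SE from the ~51 surviving readings alone is sigma/sqrt(51) ~ 0.14
```

    The apparent standard error (about 0.09 here) is markedly smaller than the actual scatter of the mean (about 0.14), because the interpolated points are not independent of their neighbours — so in-fill “improves” the precision claim without improving the accuracy.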
