The Met Office UK: Our Heros

Guest Essay by Kip Hansen

 

[Figure: Met Office graphic of HADCRUT4 temperature anomalies, hemispheric and global]

Those following the various versions of the “2014 was the warmest year on record” story may have missed what I consider to be the most important point.

The UK’s Met Office (officially the Meteorological Office until 2000) is the national weather service for the United Kingdom. Its Hadley Centre in conjunction with Climatic Research Unit (University of East Anglia) created and maintains one of the world’s major climatic databases, currently known as HADCRUT4 which is described by the Met Office as “Combined land [CRUTEM4] and marine [sea surface] temperature anomalies on a 5° by 5° grid-box basis”.

The first image here is their current graphic representing the HADCRUT4 with hemispheric and global values.

The Met Office, in their announcement of the new 2014 results, made this [rather remarkable] statement:

“The HadCRUT4 dataset (compiled by the Met Office and the University of East Anglia’s Climatic Research Unit) shows last year was 0.56C (±0.1C*) above the long-term (1961-1990) average.”

The asterisk (*) beside (+/-0.1°C) is shown at the bottom of the page as:

“*0.1° C is the 95% uncertainty range.”

 

So, taking just the 1996 -> 2014 portion of the HADCRUT4 anomalies, adding in the Uncertainty Range as “error bars”, we get:

[Figure: HADCRUT4 anomalies, 1996 to 2014, with the +/- 0.1°C Uncertainty Range drawn as error bars]

The journal Nature has a policy that any graphic with “error bars” – with quotes because these types of bars can be many different things – must include an explanation as to exactly what those bars represent. Good idea!

Here is what the Met Office means when it says Uncertainty Range in regard to HADCRUT4, from their FAQ:

“It is not possible to calculate the global average temperature anomaly with perfect accuracy because the underlying data contain measurement errors and because the measurements do not cover the whole globe. However, it is possible to quantify the accuracy with which we can measure the global temperature and that forms an important part of the creation of the HadCRUT4 data set. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius. The difference between the median estimates for 1998 and 2010 is around one hundredth of a degree, which is much less than the accuracy with which either value can be calculated. This means that we can’t know for certain – based on this information alone – which was warmer. However, the difference between 2010 and 1989 is around four tenths of a degree, so we can say with a good deal of confidence that 2010 was warmer than 1989, or indeed any year prior to 1996.” (emphasis mine)

This is a marvelously frank and straightforward statement. Let’s parse it a bit:

• “It is not possible to calculate the global average temperature anomaly with perfect accuracy ….”

Announcements of temperature anomalies given as very precise numbers must be viewed in light of this general statement.

• “…. because the underlying data contain measurement errors and because the measurements do not cover the whole globe.”

The reason for the first point is that the original data themselves, right down to the daily and hourly temperatures recorded in humongous data sets, contain actual measurement errors – part of this includes such issues as accuracy of equipment and units of measurement – and errors introduced by methods to attempt to account for “measurements do not cover the whole globe” – various methods of in-filling.

• “However, it is possible to quantify the accuracy with which we can measure the global temperature and that forms an important part of the creation of the HadCRUT4 data set. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”

Note well that the Met Office is not talking here of statistical confidence intervals but “the accuracy with which we can measure” – measurement accuracy and its obverse, measurement error. What is that measurement accuracy? “…around one tenth of a degree Celsius” or, in common notation +/- 0.1 °C. Note also that this is the Uncertainty Range given for the HADCRUT4 anomalies around 2010 – this uncertainty range does not apply, for instance, to anomalies in the 1890s or the 1960s.

• “The difference between the median estimates for 1998 and 2010 is around one hundredth of a degree, which is much less than the accuracy with which either value can be calculated. This means that we can’t know for certain – based on this information alone – which was warmer.”

We can’t know (for certain or otherwise) whether either year differs from any of the other 21st century data points that are reported as within hundredths of a degree of one another. The values can only be calculated to an accuracy of +/- 0.1˚C.

And finally,

• “However, the difference between 2010 and 1989 is around four tenths of a degree, so we can say with a good deal of confidence that 2010 was warmer than 1989, or indeed any year prior to 1996.”

It is nice to see them say “we can say with a good deal of confidence” instead of using a categorical “without a doubt”. If two data are 4/10ths of a degree different, they are confident of a difference and the sign, + or -.
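The comparison the Met Office describes can be sketched in a few lines. This is only an illustration under a stated assumption: each annual value is treated as a range of +/- the quoted accuracy, and two years are called indistinguishable when those ranges overlap. The overlap test and the function name `ranges_overlap` are hypothetical, not the Met Office's procedure; the differences (0.01 and 0.4) and the accuracy (0.1) are the figures from the FAQ quote.

```python
def ranges_overlap(difference, uncertainty):
    """Return True when two values, each known only to +/- uncertainty,
    have overlapping ranges, given the difference between their central values.

    Illustrative assumption: "cannot tell which is warmer" is modelled as
    "the two ranges overlap", i.e. |difference| <= 2 * uncertainty.
    """
    return abs(difference) <= 2 * uncertainty


ACCURACY = 0.1  # deg C, the figure quoted for years around 2010

# 1998 vs 2010: medians differ by about one hundredth of a degree
print("1998 vs 2010 overlap:", ranges_overlap(0.01, ACCURACY))

# 1989 vs 2010: medians differ by about four tenths of a degree
print("1989 vs 2010 overlap:", ranges_overlap(0.4, ACCURACY))
```

Under that assumption the first pair overlaps and the second does not, which matches the wording of the quote: no ranking for 1998 and 2010, a good deal of confidence for 1989 and 2010.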

Importantly, Met Office states clearly that the Uncertainty Range derives from the accuracy of measurement and thus represents the Original Measurement Error (OME). Their Uncertainty Range is not a statistical 95% Confidence Interval. While they may have had to rely on statistics to help calculate it, it is not itself a statistical animal. It is really and simply the Original Measurement Error (OME) — the combined measurement errors and lack of accuracies of all the parts and pieces, rounded off to a simple +/- 0.1˚C, which they feel is 95% reliable – but has a one in twenty chance of being larger or smaller. (I give links for the two supporting papers for HADCRUT4 uncertainty at the end of the essay.****)

 

UK Met Office is my “Hero of the Day” for announcing their result with its OME attached – 0.56C (±0.1˚C) – and publicly explaining what it means and where it came from.

[ PLEASE – I know that many, maybe even almost everyone reading here, think that the Met Office’s OME is too narrow. But the Met Office gets credit from me for the above – especially given that the effect is to validate The Pause publicly and scientifically. They give their two papers**** supporting their OME number which readers should read out of collegial courtesy before weighing in with lots of objections to the number itself. ]

Notice carefully that the Met Office calculates the OME for the metric and then assigns that whole OME to the final Global Average. They do not divide the error range by the number of data points, they do not reduce it, they do not minimize it, they do not pretend that averaging eliminates it because it is “random”, they do not simply ignore it as if it was not there at all. They just tack it on to the final mean value – Global_Mean( +/- 0.1°C ).

In my previous essay on Uncertainty Ranges… there was quite a bit of discussion of this very interesting, and apparently controversial, point:

Does deriving a mean* of a data set reduce the measurement error?

Short Answer: No, it does not.

I am sure some of you will not agree with this.

So, let’s start with a couple of kindergarten examples:

Example 1:

Here’s our data set: 1.7(+/-0.1)

Pretty small data set, but let’s work with it.

Here are the possible values: 1.8, 1.7, 1.6 (and all values in between)

We state the mean = 1.7. Obviously, with one datum, it itself is the mean.

What are the other values, the whole range represented by 1.7(+/-0.1)?:

1.8 and every other value to and including 1.6

What is the uncertainty range?: + or – 0.1 or in total, 0.2

How do we write this?: 1.7(+/-0.1)

Example 2:

Here is our new data set: 1.7(+/-0.1) and 1.8(+/-0.1)

Here are the possible values:

1.7 (and its +/-s) 1.8, 1.6

1.8 (and its +/-s) 1.9, 1.7

What’s the mean of the data points? 1.75

What are the other possible values for the mean?

If both data are raised to their highest value +0.1:

1.7 + 0.1 = 1.8

1.8 + 0.1 = 1.9

If both are lowered to their lowest -0.1:

1.7 – 0.1 = 1.6

1.8 – 0.1 = 1.7

What is the mean of the widest spread?

(1.9 + 1.6) / 2 = 1.75

What is the mean of the lowest two data?

(1.6 + 1.7) / 2 = 1.65

What is the mean of the highest two data:

(1.8 + 1.9) / 2 = 1.85

The above give us the range of possible means: 1.65 to 1.85

0.1 above the mean and 0.1 below the mean, a range of 0.2

Of which the mean of the range is: 1.75

Thus, the mean is accurately expressed as 1.75(+/-0.1)

Notice: The Uncertainty Range, +/-0.1, remains after the mean has been determined. It has not been reduced at all, despite doubling the “n” (number of data). This is not a statistical trick, it is elementary arithmetic.
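Example 2 can be checked mechanically. The sketch below is a minimal illustration: the helper name `mean_range` is hypothetical, and it computes only the lowest and highest possible means when every datum may sit anywhere inside its +/- range. It says nothing about how likely any value inside that range is.

```python
from itertools import product


def mean_range(values, u):
    """Lowest and highest possible mean when each value may lie anywhere
    within +/- u of its recorded figure.

    The extremes of the mean occur at the endpoints of the individual
    ranges, so it is enough to try every combination of -u and +u.
    """
    means = [
        sum(v + shift for v, shift in zip(values, shifts)) / len(values)
        for shifts in product((-u, +u), repeat=len(values))
    ]
    return min(means), max(means)


low, high = mean_range([1.7, 1.8], 0.1)
print(round(low, 2), round(high, 2))  # 1.65 1.85
```

The two printed figures are 0.1 below and 0.1 above the plain mean of 1.75, the same range found by hand above.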

We could do this same example for data sets of three data, then four data, then five data, then five hundred data, and the result would be the same. I have actually done this for up to five data, using a matrix of data, all the pluses and minuses, all the means of the different combinations – and I assure you, it always comes out the same. The uncertainty range, the original measurement accuracy or error, does not reduce or disappear when finding the mean of a set of data.

I invite you to do this experiment yourself. Try the simpler 3-data example using data like 1.6, 1.7 and 1.8 ~~ all +/- 0.1s. Make a matrix of the nine +/- values: 1.6, 1.6 + 0.1, 1.6 – 0.1, etc. Figure all the means. You will find a range of means with the highest possible mean 1.8 and the lowest possible mean 1.6 and a median of 1.7, or, in other notation, 1.7(+/-0.1).

Really, do it yourself.
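For readers who would rather let a computer fill in the matrix, here is one way the 3-data experiment might be sketched. The variable names are illustrative; each datum takes its recorded value, its value + 0.1, or its value – 0.1, giving 27 combinations in all.

```python
from itertools import product
from statistics import median

data = [1.6, 1.7, 1.8]
u = 0.1
offsets = (-u, 0.0, +u)  # the three entries per datum: minus, as recorded, plus

# Mean of every one of the 3 x 3 x 3 = 27 combinations of offsets.
means = sorted(
    sum(d + o for d, o in zip(data, combo)) / len(data)
    for combo in product(offsets, repeat=len(data))
)

lowest, highest = means[0], means[-1]
middle = median(means)

print(len(means), round(lowest, 2), round(middle, 2), round(highest, 2))
# 27 1.6 1.7 1.8
```

Changing `data` or `u` and rerunning is a quick way to try the four-datum and five-datum cases.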

This has nothing to do with the precision of the mean. You can figure a mean to whatever precision you like from as many data points as you like. If your data share a common uncertainty range (original measurement error, a calculated ensemble uncertainty range such as found in HADCRUT4, or determined by whatever method) it will appear in your results exactly the same as the original – in this case, exactly +/- 0.1.

The reason for this is clearly demonstrated in our kindergarten example of 1, 2 and 3-data data sets – it is a result of the actual arithmetical process one must use in finding the mean of data each of which represent a range of values with a common range width*****. No amount of throwing statistical theory at this will change it – it is not a statistical idea, but rather an application of common grade-school arithmetic. The results are a range of possible means, the mean of which we use as “the mean” – it will be the same as the mean of the data points when not taking into account the fact that they are ranges. This range of means is commonly represented with the notation:

Mean_of_the_Data Points(+/- one half of the range)

– in one of our examples, the mean found by averaging the data points is 1.75, the mean of the range of possible means is 1.75, the range is 0.2, one-half of which is 0.1 — thus our mean is represented 1.75(+/-0.1).

If this notation X(+/-y) represents a value with its original measurement error (OME), maximum accuracy of measurement, or any of the other ways of saying that the (+/-y) bit results from the measurement of the metric, then X(+/-y) is a range of values and must be treated as such.

Original Measurement Error of the data points in a data set, by whatever name**, is not reduced or diminished by finding the mean of the set – it must be attached to the resulting mean***.

 

# # # # #

 

* – To prevent quibbling, I use this definition of “Mean”: Mean (or arithmetic mean) is a type of average. It is computed by adding the values and dividing by the number of values. Average is a synonym for arithmetic mean – which is the value obtained by dividing the sum of a set of quantities by the number of quantities in the set. An example is (3 + 4 + 5) ÷ 3 = 4. The average or mean is 4. http://dictionary.reference.com/help/faq/language/d72.html

** – For example, HADCRUT4 uses the language “the accuracy with which we can measure” the data points.

*** – Also note that any use of the mean in further calculations must acknowledge and account for – both logically and mathematically – that the mean written as “1.7(+/-0.1)” is in reality a range and not a single data point.

**** – The two supporting papers for the Met Office measurement error calculation are:

Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: the HadCRUT4 data set

Colin P. Morice, John J. Kennedy, Nick A. Rayner, and Phil D. Jones

and

Reassessing biases and other uncertainties in sea-surface temperature observations measured in situ since 1850, part 2: biases and homogenisation

J. J. Kennedy , N. A. Rayner, R. O. Smith, D. E. Parker, and M. Saunby

***** – There are more complicated methods for calculating the mean and the range when the ranges of the data (OME ranges) are different from datum to datum. This essay does not cover that case. Note that the HADCRUT4 papers do discuss this somewhat as the OMEs for Land and Sea temps are themselves different.

# # # # #

Author’s Comment Policies: I already know that “everybody” thinks the UK Met Office’s OME is [pick one or more]: way too small, ridiculous, delusional, an intentional fraud, just made up or the result of too many 1960s libations. Repeating that opinion (with endless reasons why) or any of its many incarnations will not further enlighten me nor the other readers here. I have clearly stated that it is the fact that they give it at all and admit to its consequences that I applaud. Also, this is not the place to continue your One Man War for Truth in Climate Science (no matter which ‘side’ you are on) – please take that elsewhere.

Please try to keep comments to the main points of this essay –

Met Office’s remarkable admission of “accuracy with which we can measure the global average temperature” and that statement’s implications.

and/or

“Finding the Mean does not Reduce Original Measurement Error”.

I expect a lot of disagreement – this simple fact runs against the tide of “Everybody-Knows Folk Science” and I expect that if admitted to be true it would “invalidate my PhD”, “deny all of science”, or represent some other existential threat to some of our readers.

Basic truths are important – they keep us sane.

I warn commenters against the most common errors: substituting definitions from specialized fields (like “statistics”) for the simple arithmetical concepts used in the essay and/or quoting The Learned as if their words were proofs. I will not respond to comments that appear to be intentionally misunderstanding the essay.

# # # # #


284 thoughts on “The Met Office UK: Our Heros”

      • Later this year when the Homogenization errors start making some headway into the mainstream press, there will be a horrible realization…that without the adjustments, there is almost no statistically significant warming since the early 1940s.. At that point the facade of certainty will crumble, except for the faithful who may well continue the green religion well into the next glacial period. At that point the hindcasting of the models will be invalidated.
        I expect a soul-crushing loss of “faith” in science when this falls apart. People never should have had faith in science to begin with…it truly has no place there. People will once again realize that academics aren’t the best people to consult on real-world problems. In an academic world you can twist the world to your will with thought experiments and ignore the consequences. But when you f*ck with the real world…the real world f*cks you back.

      • Poitsplace said “I expect a soul-crushing loss of “faith” in science when this falls apart.”
        That’s already happened to some of us, thank God. But politicians have their sights on gaining a vast amount of power and wealth, with the Green movement as their justification. And so they’re dumping vast amounts of money into science to get the results they need. Even when more people awaken to the hoax, how can anyone stop that momentum?

      • What bothers me is that none of these global temperature anomalies derive their 90% confidence interval directly from the distribution of the underlying means- they appear to be applying the Central Limit Theorem to variables that do not have identical probability distributions. The CLT has rules and the method used to determine the uncertainty of the mean temperature anomalies appears to break those rules (in a number of ways).
        How can a variable be independent when it is subject to homogenisation, for instance?
        How can we assume that the distribution of monthly mean temperature changes (from an historical baseline) is identical for every calendrical month or at every location on the earth?
        How can we accept that any change in monthly mean anomaly distribution occurs simultaneously at every place on the planet?
        Surely local climate changes often do occur in isolation.

      • giving a combined uncertainty of +/- 0.16 K

        With that being the case, then any anomaly that is 0.32 lower than the 2014 anomaly average could be statistically tied with 2014. We therefore have a 21 way tie for first place according to Hadcrut4.3 with 1990 the earliest year.

    • Reply to Comment Thread started by Git ==> All good points. Monckton correctly points out that it is quite possible to find a greater overall interval for GAST per HADCRUT4 than the stated 0.1C delving deep into their papers.
      However, I give them credit for setting the record even approximately straight in their warmest year announcement — a heart-warming occurrence.

      • Agreed – it is really important to science that results are [regularly] ( ideally always) quoted with the error bars in the press. This has got to be so much better than the current norm – the papal edict like quotation in the media giving a false sense of absolute certainty to the public.
        I also think the simplest method, based on the measurement error, as in this post, should be used. It is more easily verified and to my mind is the only valid method. I am not even sure I believe in the veracity of using anomalies rather than deriving trends for an individual station’s raw data.

  1. “Does deriving a mean* of a data set reduce the measurement error? Short Answer: No, it does not.”
    I have read the essay and I’m going to eventually look at it again. If I understand correctly, there is a difference between measurement error and the signal:noise ratio.
    My understand of S:N is that if you want to reduce noise, you must take multiple measurements of the same thing. So in the case of temperature stations, the way to reduce noise would be to have several thermometers (e.g. ten of them) working in parallel, in the same spot. To get the purest possible signal, you would then derive the mean from these ten thermometers. But the measurement error would still be +/-0.1.
    So, am I understanding correctly?
    P.S. I am a bit obsessive about spelling and grammar, so can someone check that ‘heros’ is a legitimate plural of ‘hero’?
    [‘heroes’ ~mod.]

    • Thank you. It may be basic arithmetic (which I agree with) but basic spelling is also important.
      This is an original post. Not just a throwaway comment. This annoys me quite a lot.

      • reply re Heroes/heros ==> For those unable to get over the heroes/heros question. In English, the correct plural is HEROES, unless you are referring to the sandwich by the same name, in which case it is HEROS. In Greek, the original word is commonly believed to have been “heros” (with Greek letters).
        Thanks to all those defending standardized English spelling.

    • Taking the average of a large number of readings does in fact give you a more accurate mean IF the readings are truly distributed along a bell shaped probability curve.
      The faux science in the article is to ensure that in the specific examples given, they are not.
      Years ago for O level physics we did this with everybody calculating the value of ‘g’ using pendulums. The distribution of answers was very bell curved. The mean answer was within 1% of the official value. None of the actual readings and individual calculations was, however.

      • If those “…large number of readings…” have been altered/adjusted/manipulated beyond recognition (Which is proven they have), calculating an “average” from that data is meaningless!

      • but one has to first establish that the errors are normally distributed, which may not be the case.
        Think, for example, of self measurements of penis size. Do you seriously maintain that the errors in the repeated measurements of one man, or the errors of many men’s results, would be normally distributed?

      • Hi Leo, yes if you measure the same thing dozens of times or more you will get a bell shaped curve. But if the chronometer you are using is accurate to +/- 1 millisecond, then you will be stuck with this underlying error as per Kip’s arithmetic. Each data point will be a bit fuzzy.

      • Gary Pearse says:
        February 8, 2015 at 5:04 am
        But if the chronometer you are using is accurate to +/- 1 millisecond, then you will be stuck with this underlying error as per Kip’s arithmetic.

        It all depends on what you’re doing. A common signals experiment involves recovering a sine wave buried deep under noise (20 dB for instance). The detector is a comparator which tells you only if the signal is greater or less than zero. I’m not sure how you would express the precision of the comparator. You could argue that it is 50% of the measurement range.
        The point is that a crude not-very-precise detector can give you any precision you want as long as you can average over enough cycles.
        Kip says:

        Does deriving a mean* of a data set reduce the measurement error?

        Leo Smith and I would both say, from our direct experience, that the statement is true for the right kind of data. I can also tell you (from painful experience) that you can’t just assume that you can improve accuracy by taking an average.
        Matt Briggs points out that most folks who use statistical techniques don’t actually understand what they are doing. Amen Brother.
        If you want to know how the math works in the signals case here’s a URL: http://www.rand.org/content/dam/rand/pubs/papers/2005/P305.pdf

      • Matt Briggs points out that most folks who use statistical techniques don’t actually understand what they are doing. Amen Brother.

        Sampling works for the sine wave because it has an underlying mean (zero). The same can be said for calculating a value for ‘g’ using pendulums.
        It can be argued however that the earth does not have an underlying average temperature as measured by the MET or other services, because different average surface temperatures are possible for the same amount of energy (because energy radiates as the 4th power, while averages use the first power). Depending on how the energy is distributed, 2 identical earths, receiving and radiating the exact same amount of energy, could thus have two different average surface temperatures.
        Thus, statistical inferences based on average surface temperature without regard for distribution could be meaningless. Worse, they could be misleading. And yet we have an entire scientific discipline using average surface temperature as a metric. Hardly the stuff of science.
        Here is a simple test that one can apply to check that what I’m saying is true. Divide the earth into two parts. In one example, both parts have a temperature of 10. In the second example, one part has temperature of 1, the other 11.892.
        (10+10)/2 = 10 average
        (10^4+10^4) = 20000 energy out
        (1^4+11.892^4) = 20000 energy out
        (1+11.892)/2 = 6.4 average
        Same energy out, different average temperatures!!!
        Since energy in must equal energy out, depending on how you divide the earth, you can arrive at two different average surface temperatures for the exact same amount of energy in and out.
        What this means is that techniques such as pairwise adjustment are invalid, because they adjust the temperature linearly, while the physical world is determined by radiative balance which is a 4th power calculation.
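The arithmetic in the comment above can be reproduced with a short sketch. The function names are illustrative, physical constants are dropped, and the two parts are assumed to have equal area, exactly as in the commenter's toy example; temperatures are in the same arbitrary units the comment uses.

```python
def mean_temperature(temps):
    """Plain arithmetic mean of the part temperatures."""
    return sum(temps) / len(temps)


def relative_energy_out(temps):
    """Sum of T**4 over equal-area parts, in arbitrary units.

    Assumption: radiated energy is taken as proportional to T**4 with
    the constant of proportionality dropped, as in the comment.
    """
    return sum(t ** 4 for t in temps)


uniform = [10.0, 10.0]   # both parts at 10
uneven = [1.0, 11.892]   # one part at 1, the other at 11.892

for temps in (uniform, uneven):
    print(temps, "mean:", round(mean_temperature(temps), 3),
          "energy out:", round(relative_energy_out(temps), 1))
```

The two cases give energy totals that agree to within about one part in 20,000, while the mean drops from 10 to roughly 6.4.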

      • Since energy in must equal energy out, depending on how you divide the earth, you can arrive at two different average surface temperatures for the exact same amount of energy in and out.

        What it also means is that average surface temperature is a meaningless metric, because an infinite number of different average surface temperatures are possible for the EXACT SAME forcings.
        Leave the forcings unchanged, and the earth can vary its average surface temperature simply by changing the distribution of energy. Make one place very cold and another place a little bit hotter; and because radiated energy varies as the 4th power the energy balance remains the same, while the average changes drastically.
        So for example, on Venus where CO2 concentrations are high, temperatures are almost the same between day and night, even though day and night each last 116 earth days, because CO2 radiates heat sideways around the planet.
        The same thing is happening on earth. While the energy remains unchanged, the average appears to be increasing because the difference between min and max is changing. The max is not increasing, if anything it will be decreasing. What we are seeing is an arithmetic illusion. A linear measure is being used to describe a 4th power event, which delivers a nonsense result.
        An entire branch of Science, Climate Science, has been misled because they are using a linear measure as a metric for a 4th power physical phenomenon.

      • Which would also explain why the climate models have gone off the rails. They are trained using surface temperatures, but this is only valid if the energy distribution remains unchanged as you add CO2 to the atmosphere.
        But adding CO2 to the atmosphere doesn’t just affect radiation back to the surface. It also affects sideways radiation between night and day and the equator and the poles. Thus you would need to train the models using energy not temperature.
        Otherwise, as CO2 increases, your models would see the change in temperature distribution as a change in energy, while in the physical world there would be no change. This would make the models run hot.

      • Reply to the comment thread started by Leo Smith ==> Leo’s comment is a good opportunity for me to clear up this little point: we are talking about measurements – accuracy of. Although this applies to data generated with modern electronic instrumentation measurements as well, it is easier to visualize with the glass thermometer record.
        Thermometer records look like this: 71, 72, 74, 65, etc. They are all to the nearest degree. This is written more correctly as 71(+/-0.5). It is written this way, not signifying error in measurement but simply stating the truth in measurement accuracy. All measurements between 70.5 and 71.5 are recorded as 71. [There is a rounding rule, so strictly, either the upper .5 or the lower .5 will go to the next value.] The actual datum is “70.5-to-71.4999999999” which is more conveniently written 71(+/-0.5). Now that is a range, not a single number.
        This is not “random measurement error” — it is measurement accuracy. The majority of the world’s temperature records are of this sort.
        The chance of the actual temperature at the time of measurement being exactly “71” is almost infinitesimal. So [almost] every temperature recorded as “71” is actually something else, equally likely to be any of the possible values in the range. The actual values which have been recorded as “71” will not form a bell curve — there is no aspect of physics that says “temperatures are more likely to be close to a numbered thermometer degree mark than between marks” — rather, they are spread evenly across a seamless range .
        Please, those wishing to refute the arithmetic and its implications: Perform for yourself the arithmetic for the three-data data set described in the text.
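The point about whole-degree records can be illustrated with a small sketch. It assumes a round-half-up rule (so the lower .5 belongs to the recorded value and the upper .5 goes to the next one); the actual rule used by any given observer or instrument is not stated in the text, and the function names are hypothetical.

```python
import math


def recorded_value(true_temp):
    """Whole-degree reading for a given true temperature.

    Assumption: round half up, implemented with floor(x + 0.5) because
    Python's built-in round() rounds halves to the nearest even number.
    """
    return math.floor(true_temp + 0.5)


def possible_range(recorded):
    """Interval of true temperatures that would all be written down as
    `recorded`: lower bound included, upper bound excluded."""
    return (recorded - 0.5, recorded + 0.5)


for true_temp in (70.5, 70.9, 71.0, 71.3, 71.49):
    print(true_temp, "->", recorded_value(true_temp))

print("71 stands for the range", possible_range(71))
```

Every one of the five sample temperatures is written down as 71, which is why the record is more faithfully read as 71(+/-0.5) than as a single exact number.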

      • Kip, while what you say is true, it’s also irrelevant to the accuracy of either the measured temperatures, or the average of them. The problem with averaging temperatures isn’t just that there’s a rounding process, but that the devices cannot be perfectly accurate, and their inaccuracies do not deviate from the true temperature by equal measures on both sides. If the device says the temp is 70.501 degrees, and we round that up to 71, we risk having rounded up when the temp is actually 70.499 degrees, and should have rounded down to 70. The rounding process merely compounds the error, therefore. And the real uncertainty is that we just don’t know how accurate our temperature device is. Nothing in physics requires these devices to be equally wrong in all directions. Most often, they will show a bias, but one that could only be detected by a much more accurate thermometer. And so there is an inherent uncertainty in any measuring device that must be well-known. Rounding procedures, as well as homogenization, averaging, and so on, only compound that original uncertainty, and can actually make it larger rather than smaller. So applying more and more statistical fixes to the data can easily make it less accurate and more uncertain, because these processes introduce even more biases into the process, rather than eliminating them. The more one tampers with the data, the more one creates further problems, until the original relationship to what is being measured can be almost entirely lost, and only a sticky, statistical goo remains.

    • Karim
      Good for you to be obsessive about grammar and spelling. I try to get it all correctly myself, often without success. I see and hear poor examples all the time, exactly where you shouldn’t, on TV and in newspapers.
      Your sentence starting “My understand..” might have been autocorrect. It should have been “My understanding…”
      I just hand to correct my autocorrect 3 times just trying to type all that. 🙂

    • Reply to “hereos?” ==> Indeed….many thanks. I will correct my MS Word dictionary as well. ~ kh

    • As the system is never in equilibrium, there is no reason that ten identical thermometers with no error at all, not situated at exactly the same place, will give you the same numbers.
      the “temperature” is a function of the exact location, the temperature of the air, the wind, the humidity and the device…

    • This is my understanding: to reduce the S:N error you must take multiple readings of the “same thing.” But I think the climate scientists substitute multiple readings of temperature on different days from the same station. That is not the same thing. That is multiple readings of different “things” because each day’s temperature reading is a different “thing.” And multiple readings from different stations are obviously not the “same thing.” I would agree that if you had an array of thermometers at the same location that would increase the accuracy – but any other manipulation of the data can not increase its accuracy. Tell me where I’m wrong.

      • Reply to Scott Scarborough ==> You are not wrong. But this is a hard pill to swallow for those bred and raised on statistics – which generally substitutes ethereal probabilities for an engineer’s favored realities.

      • Leo, all you’ve done with multiple readings in one place with similar instruments is minimized the probability that one or more of the readings were wonky. You have not increased the accuracy, only the confidence that the readings are inside the stated range of your sensor. And it’s only valid for a single space, it doesn’t work when you are comparing two different locations.
        Take 100 thermometers with +/-0.1°C accuracy and place them in a room as close to one another as you can in a homogenous volume of air. Read all of them simultaneously. You will most likely get a bell curve out of the readings but it’s not guaranteed. You will see a difference as each device is in a different location. (Unless it’s digital with one digit right of the decimal point.) The center of your bell curve is at 20.0°C and your mean/average is also at 20.0°C. You have assumed that all these devices increase your accuracy, but you’d be wrong. They only increase your confidence that the reading is inside of the range 19.9°C to 20.1°C. Now have all of your thermometers checked against a standard that is +/-0.001°C. All of the following are valid for your original data:
        1 – All 100 thermometers were calibrated at the factory by the same guy and the same standard one right after another and are all inside the stated tolerance, but 0.1°C too high making your true temperature 19.9°C.
        2 – All 100 thermometers were calibrated at the factory by the same guy and the same standard one after another and are all inside the stated tolerance, but 0.1°C too low making your true temperature 20.1°C.
        3 – Any combination of the above giving you a spread between the low end of the tolerance and the high end of the tolerance.
        Your 100 thermometers did not increase your accuracy one whit. If one or two thermometers in the mix are NOT inside the stated tolerance, taking multiple readings from multiple devices (in the same location) dilutes their inaccuracies. The problem here is that they don’t have 10 thermometers at each location. That means you have no confidence check against instrument inaccuracy in any of these databases. Each reading stands alone accuracy-wise and no amount of mathematical or statistical gyrations can change that.
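
        The point about a shared calibration offset can be sketched numerically. This is only an illustration under assumed numbers: the true temperature, the common offset, and the device-to-device scatter are all invented here, not measured values. Averaging 100 readings shrinks the random scatter, but the mean stays off by roughly the shared offset.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# All three values below are assumptions chosen for illustration only.
TRUE_TEMP = 19.9       # true air temperature, deg C
SHARED_BIAS = 0.1      # calibration offset common to every device, deg C
RANDOM_SPREAD = 0.03   # device-to-device scatter (standard deviation), deg C

# Each reading = truth + the shared offset + that device's random scatter.
readings = [TRUE_TEMP + SHARED_BIAS + random.gauss(0.0, RANDOM_SPREAD)
            for _ in range(100)]

mean_reading = statistics.mean(readings)
error_of_mean = mean_reading - TRUE_TEMP

print(f"mean of 100 readings: {mean_reading:.3f} deg C")
print(f"error of the mean:    {error_of_mean:+.3f} deg C")
```

        In this sketch the mean lands near 20.0°C while the assumed truth is 19.9°C: the average is tightly determined, yet still wrong by about the common offset.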

        • Yes but, when a single instrument isn’t giving the ‘expected’ readings, don’t they just adjust to correct for that. Who needs multiple thermometers at a location when you might have another one only 10km away ?

      • Your 100 thermometers did not increase your accuracy one whit.

        In data quality, having 100 different addresses on file for the same person does not increase the accuracy of the result. In fact it reduces the accuracy as compared to having only 1 address on file.

  2. “The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
    They do not say that this accuracy is applicable to 1880 or even 1990. Measurement techniques have improved by Argos and satellites, which may be one of the reasons that the temperature is so flat.

    • Most of the surface temperature recording stations in the U.S. have been shown by Anthony Watts and his crew to have errors greater than 1 degree C. I don’t see any reason to believe that the recording stations in the rest of the world are any better. It is not possible to take that data with those kinds of errors in it and produce a world temperature that is accurate to 0.1 in any measurement system. You can play with the data mathematically for the rest of eternity and you still can’t claim you have data that is accurate to within 0.1 degree. Making the adjustments just biases that error in another direction. My degree is not in math, but you don’t need a degree to figure this out.
      Due to bad siting there is no reason to assume modern surface readings are more accurate than those of the past. The error bar has to be huge when graphed for any period in the history of temperature recording in the United States. After all, Michael Mann used that error to draw his tricky hockey stick graph.
      The argument for how warm it has gotten since 1890 used to be about whether we were talking about 0.3 or 0.7, or whatever. Now Warmists have inserted another zero into the mix to continue their claims that warming continues unabated. At least the Met Office papers will give skeptics ammunition from the Warmist side to throw into the mix.

      • How much better are the satellite records, from an accuracy, consistency of siting and lack of infilling point of view ?

    • Reply to Global Cooling ==> This point is important to the overall discussion of “What is the true Original Measurement Error in the entire global temperature record?” (We note that it will not be a single number, but different numbers, hopefully narrowing as the time approaches the present.)

  3. It may not be perfect, but matters would improve if GISS and NOAA followed the same path of being up front with their uncertainty ranges when publishing.

    • GISS and NOAA are too busy “adjusting” the data to match a progressivist agenda to bother calculating an uncertainty range. IF government agencies were honest about reporting, the range of uncertainty would probably be surprisingly large.

      • NOAA & GISS are probably quite uncertain about their uncertainties and as the public don’t want uncertainties they are only used to explain failed predictions and exploited to accentuate (or reverse) trends.

      • If the uncertainties were firstly acknowledged and secondly calculated/estimated honestly, my guess is that they would swamp any temperature change over the last 40+ years. That’s not in GISS and NOAA’s own interests so they don’t.

  4. I agree fully with Kip Hansen. Whether you use 1 thermometer, or a billion evenly distributed thermometers around the globe, the OEM error is still as stated.

  5. Intensive qualities don’t average. A 1 kg pot of water at 100° and a 1 kg block of iron at 0° will not average to 50° when placed in contact, because the water contains more heat (the specific heat is higher). So the average will be something higher. So arithmetically averaging different areas or objects is un-real and uninformative.
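
    The water-and-iron example can be worked through with a simple heat balance. This is a sketch under assumptions not stated in the comment: temperatures are taken to be in °C, no heat is lost to the surroundings, no boiling or freezing occurs, and the specific heats are approximate textbook values.

```python
def mixing_temperature(m1, c1, t1, m2, c2, t2):
    """Equilibrium temperature of two bodies in thermal contact.

    Assumes no heat loss and no phase change; m in kg, c in J/(kg*K), t in deg C.
    """
    return (m1 * c1 * t1 + m2 * c2 * t2) / (m1 * c1 + m2 * c2)

# Approximate textbook specific heats (assumed values).
C_WATER = 4186.0  # J/(kg*K)
C_IRON = 449.0    # J/(kg*K)

final_temp = mixing_temperature(1.0, C_WATER, 100.0, 1.0, C_IRON, 0.0)
arithmetic_mean = (100.0 + 0.0) / 2

print(f"heat-balance result: {final_temp:.1f} deg C")
print(f"arithmetic mean:     {arithmetic_mean:.1f} deg C")
```

    With these assumed values the heat balance gives roughly 90°C, well above the 50°C arithmetic mean, because the water holds far more heat per degree than the iron.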

    • Except, the annual average daily high temperature at Hobart, Tasmania is 16.9°C. Svalbard is 4.7°C. I’d say Hobart is warmer than Svalbard. Apparently you wish to disagree. Why?

      • Where is he disagreeing with that? What he is saying is correct. The energy content of the air depends on humidity and pressure, so the resultant temperature of mixing Hobart and Svalbard air samples together isn’t simply the arithmetic mean.

      • A couple of questions. Wouldn’t this assume that atmospheric composition is uniform? Do the new CO2 maps coming out from satellites obviate this assumption? Is the atmosphere really “well-mixed” enough to allow this assumption? Are the differences between water and air accounted for in data sets? Humidity is mentioned above. Are the differences between salt water and fresh water accounted for – they would matter in the Great Lakes region. If these questions are stupid or irrelevant you can ignore me. We are, however, talking about hundredths and tenths of degrees.

      • @ DaveS
        I didn’t average the Hobart and Svalbard averages. That wouldn’t provide me with any useful information. However, the averages taken at Hobart and Svalbard do provide me with additional information, viz Hobart is warmer than Svalbard. Thus averaging intensive properties can convey useful information if you understand the limitations.

      • TPG, you are bringing a different argument to the table. And it is silly, since everyone will agree that Hobart is warmer than Longyearbyen (which is what I guess you mean when you say Svalbard).
        The point here is that energywise, the situation between Hobart and Svalbard is even more different due to the difference in composition and thermodynamic properties of the air sampled at both sites. AND, taking an average of the two locations, not only the average of each, is silly, meaningless and in no way substantiates a reduction in the original measurement error.
        Brian H’s argument, as I read it, is that temperature is a proxy for energy – THUS you need to know the thermodynamic properties of what you are measuring the temperature of before you can compare temperatures of different things. Different things being, in this case, different masses of air.

    • Well, yes, a lot of the above, but I see this as a walk back. When a group like Met Office and UEA see, almost every day, analyses of their sources and methods (Homewood on South America recently where cooling was switched to warming, jo nova on Ozzie Met switcheroos….) that they have tried to keep secret (FOI requests ignored), they realize the jig is up.
      I was first struck by the naivete of all the climateers with climategate. These folks of course didn’t expect the emails to be released, but the lack of understanding that the internet, cell phone (recording such as the Harvard adviser on Obamacare) had fundamentally and (hopefully forever – I’m nervous about that!) changed the landscape of the playing field. If your output is shoddy, mistaken, fraudulent….you are uncloaked in minutes to days! You are performing in front of the world and missteps in method or integrity are detected and under analysis almost in real time!
      Scary, but a boon to humankind. This move toward virtual perfection in quality control of human endeavor will give a quantum lift to excellence in science in particular and will also weed out those who can’t perform at this heady level. It also has proven to be the eyes and ears that governments have to work under (Please, don’t let them regulate this away…..There is a lot of thought being given to just that and centrally planned countries already have given the world a blueprint for doing it worldwide!)

    • Brian H: “So arithmetically averaging different areas or objects is un-real and uninformative.”
      TPG: “Except, the annual average daily high temperature at Hobart, Tasmania is 16.9°C. Svalbard is 4.7°C. I’d say Hobart is warmer than Svalbard. Apparently you wish to disagree. Why?”
      The second statement has nothing to do with the first statement. Pompous Git has set up a strawman and wrestled with it.

      • Too true J.A.
        He is also sloppy with his statement “I’d say Hobart is warmer than Svalbard.”
        That statement is not always true. Perhaps it is true more often than not. Probably.

  6. You give an example calculation
    1.9 + 1.6 / 2 = 1.75
    However, you need to place an opening bracket before the 1.9 and a closing bracket after the 1.6 to make the calculation correct. For example:
    (1.9 + 1.6) / 2 = 1.75
    The other examples also need to be adjusted accordingly.
    Regards
    Climate Heretic
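
    The bracket point is easy to check in any language that follows the usual operator precedence (division before addition). A minimal Python check, using only the numbers from the example above:

```python
# Division binds tighter than addition, so only 1.6 is halved here.
without_brackets = 1.9 + 1.6 / 2

# Brackets force the sum to be formed first, giving the intended mean.
with_brackets = (1.9 + 1.6) / 2

print(f"1.9 + 1.6 / 2   = {without_brackets:.2f}")
print(f"(1.9 + 1.6) / 2 = {with_brackets:.2f}")
```

    Only the bracketed form gives the mean of the two values.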

      • Well there is no equal either. In the Forth language “.” means print to the screen what is on the stack
        So it is really: 1.9 1.6 + 2 / . 1.75 OK – “OK” being the standard “done” in Forth.
        And of course Forth mainly uses integers unless you have a Floating Point Routine. But no point in getting overly complicated. For now.

    • Reply to Heretic ==> You are correct on notation ... I am doing simple 6th Grade addition and division … written as if speaking it from a podium.

  7. The Northern Hemisphere estimate is an anomaly of about 0.75 degrees and the SH about 0.38, so the OME must be greater than their difference of about 0.4 degrees? What would happen if one compared hemispheres divided by a meridian like 0 degrees Greenwich?
    Remember that much data was converted from Fahrenheit to Celsius. That conversion, to and fro, introduces a rounding error of the magnitude of 0.1 degree that mathematicians better than I can calculate precisely. I’ve played with real data and it does make a difference if you know from the original record that many values were expressed in degrees F with nothing after the decimal.
    Here is a quote from Australia’s BOM on rounding –
    “To address this problem, a rounding-corrected version of the ACORN-SAT data set is to be prepared. Following the method of Zhang et al. (2009), it involves adding a random increment of between −0.5°C and +0.5°C to all data that have been rounded to the nearest degree (defined as location-months in which all values are in whole degrees).”
    http://www.cawcr.gov.au/publications/technicalreports/CTR_049.pdf
    There is more if you wish to dig enough.
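
    The size of the Fahrenheit-to-Celsius rounding effect can be estimated with a short sketch. The assumptions here are mine: readings recorded in whole °F over an arbitrary range of -40 to 120, then converted and stored to the nearest 0.1°C. It separates the error added by the conversion step from the half-degree-F uncertainty already present in a whole-degree reading.

```python
def f_to_c(deg_f):
    """Exact Fahrenheit-to-Celsius conversion."""
    return (deg_f - 32.0) * 5.0 / 9.0

# Assumed range of whole-degree Fahrenheit readings (illustrative only).
whole_f_readings = range(-40, 121)

# Error introduced by storing the converted value to one decimal place.
conversion_errors = [round(f_to_c(f), 1) - f_to_c(f) for f in whole_f_readings]
max_conversion_error = max(abs(e) for e in conversion_errors)

# A reading rounded to the nearest whole deg F is only known to +/-0.5 deg F,
# which corresponds to this many deg C.
half_degree_f_in_c = 0.5 * 5.0 / 9.0

print(f"largest error from rounding to 0.1 C: {max_conversion_error:.3f} C")
print(f"half a degree F expressed in C:       {half_degree_f_in_c:.3f} C")
```

    Under these assumptions the conversion step alone adds less than 0.05°C, while the original whole-degree-F rounding already corresponds to more than a quarter of a degree C.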

    • Reply to Geoff Sherrington and The Git ==> It is incredible to me that Zhang (2009) recommends (not read it thoroughly, so I don’t claim it does say so) that the solution to the problem clearly described in my blockquote comment above is to add yet more random errors.
      In my 6th Grade opinion, it would be better to accept the fact and limitations of the original measurements being ranges — Temp(+/-0.5).

  8. Kip Hansen
    You quote the UKMO as saying

    “The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”

    And you say

    Note also that this is the Uncertainty Range given for the HADCRUT4 anomalies around 2010 – this uncertainty range does not apply, for instance, to anomalies in the 1890s or the 1960s.

    On the basis of that you assert

    UK Met Office is my “Hero of the Day” for announcing their result with its OME attached – 0.56C (±0.1˚C) – and publicly explaining what it means and where it came from.

    Sorry, but providing a partial error estimate for one datum in a data set of more than 100 data points is NOT heroic: it is inadequate.
    Furthermore, it is not only inadequate but is also misleading; as Monckton of Brenchley says above

    The HadCRUT4 dataset publishes not only the data but also the measurement, coverage, and bias uncertainties, giving a combined uncertainty of +/- 0.16 K.

    An inadequate and misleading presentation is not scientific information. It is political spin which merits disdain and does not deserve respect bordering on hero worship.
    Please note that I am providing a rebuttal of the main argument in your essay and I am not posting a “quibble” and/or a “one man crusade for truth”
    Richard

    • You beat me to it Richard. You are correct and I am sorry Kip, you are mistaken. Met Office is sliding a false error range to their customers and the world overall.
      Richard nails it directly; a measurement error of (+/- 0.1) is the maximum accuracy for most modern physical thermometers.
      – That is, if read by a competent thorough and meticulous scientist.
      – That is, if the thermometer is officially calibrated correctly on a regular basis!
      – Which gives you a measurement error for one thermometer, one reading, one employee
      Two thermometers give you the individual thermometer reading accuracy, cumulative.
      When Met Office gives you an error range for the global temperature average, their error range must include all possible error ranges:
      — For every temperature reading (every thermometer should have a sticker listing the ‘officially certified’ error range.)
      — For every station (every station should have a history of ‘verified’ accuracy and an error range.)
      — For every employee (every employee has a history of experience and education who becomes known for certain standards of work and an error range.)
      All of the ‘modern’ networked stations are not immune to error readings, station installation, station certification, connectivity, etc… all of which establishes the error range for that particular node station
      When Met Office gives an anomaly comparison they should be adding in the error ranges for the base years plus the error range for the current year.
      One thing you can be absolutely sure of; when Met Office claims a measurement accuracy of (+/- 0.1) they are referring to one individual finest individual temperature reading alone, not their true accuracy.

    • Reply to Richard S and ATheoK ==> I think we all (nearly all) agree that the Met Office’s admitted / announced “+/-0.1°C” is in actuality too low, way too low, ridiculously too low, a fraud, a lie, or some other complaint.
      They have made a giant leap towards truth and transparency, in my opinion. They are the first major governmental agency producing global temperature numbers to do so. Their doing so validates The Pause and invalidates all the “warmest year” nonsense. They knew it would and published anyway.
      Give them credit for that alone — if you can find it in your hearts and minds. I do.

      • Kip:
        You’re allowed your personal opinions and peccadillos; even if you do write excellent articles and put them up here with big bull’s eyes.
        My heart and mind are somewhat open, (in my narrow opinion); but when a major player in the CAGW scam announces a different lie without explanation, apology or even regret, that lie is still fraudulent.
        This is not the result of good intentions or better science. Credit should be given when those hacks actually earn credit, until then their F– might be raised to F-

  9. Met Office has some of the most powerful computers in the world, but they calculate what they are told to do. I wasn’t entirely happy with their CET annual numbers and voiced my view last July. Now they accepted the method suggested.
    Vukcevic July 2014: Since monthly data is made of the daily numbers and months are of different length, I recalculated the annual numbers, using weighting for each month’s data, within the annual composite, according to number of days in the month concerned. This method gives annual data which is fractionally higher, mainly due to short February (28 or 29 days). Differences are minor, but still important, maximum difference is ~ 0.07 and minimum 0.01 degrees C. The month-weighted data calculation is the correct method.
    MetOffice 2015: Because February is a shorter month than all the others, and is usually colder than most all other months, our previous method, which was giving February equal weight alongside all other months, caused the annual temperature values to be pulled down, i.e. giving estimated annual values which were very slightly too cold (the difference varying between 0.01 and 0.07 degC).
    They also said that all data files, including graphs (seasonal temperatures, etc.), are now recalculated according to the weighting method. Of course this applies not only to the CET temperatures, but also to other UK composites and to annual averages for a number of other climate-related data with monthly averages, etc.
    It is a positive development that the large institution is willing to undertake a major task to alter all their records.
    It should be interesting to find out how GISS and NOAA have calculated the US and other annual averaged indices.
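
    The two ways of forming an annual mean from monthly means can be compared in a few lines. The monthly values below are invented for illustration (a generic cool-winter, warm-summer pattern for a non-leap year) and are not CET data.

```python
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # non-leap year

# Hypothetical monthly mean temperatures in deg C, January to December.
monthly_means = [3.8, 3.9, 5.9, 8.1, 11.3, 14.2, 16.2, 15.9, 13.5, 10.2, 6.6, 4.4]

# Old approach: every month counts equally.
equal_weight_mean = sum(monthly_means) / len(monthly_means)

# Weighted approach: each month counts in proportion to its number of days.
day_weighted_mean = (
    sum(m * d for m, d in zip(monthly_means, DAYS_IN_MONTH)) / sum(DAYS_IN_MONTH)
)

difference = day_weighted_mean - equal_weight_mean

print(f"equal-weight annual mean: {equal_weight_mean:.3f} C")
print(f"day-weighted annual mean: {day_weighted_mean:.3f} C")
print(f"difference:               {difference:+.3f} C")
```

    With these made-up values the day-weighted mean comes out a few hundredths of a degree higher, because the short, cold February counts for less.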

    • Why calculate to an arbitrary concept such as a ‘month’ which you say requires a length of month weighting?
      Why not use the daily measurements to create a year-to-date figure then compare completed year or YTD figure to the corresponding previous periods?

      • In the early parts of the records no complete daily values were available, so monthly values are computed from what was available and annual average calculated. I assume that the force of habit persisted until the end of 2014. The first data published at beginning of 2015 were based on the improved method.
        Good idea to check with your MO (is it BOM ?) what they do.

      • The Australian BoM is the “match” for the UK MetOffice. However, the BoM are on record for serious manipulation of temperature data. They use (Used to use?) only 112 devices to calculate a “national average”, that’s 1 device for every ~68,500 square kilometers of land, largely at airports. And then they changed how that average was calculated in 2013, making 2013 hot and 2014 hotter still.
        We now are faced with Perth, Western Australia, suffering severe fire danger. So far, it looks like almost all of the fires were started by arsonists! Where do these people get off?

      • RE: vukcevic February 8, 2015 at 1:46 am
        You are also advocating a monthly figure which requires a length-of-month adjustment/correction/weighting, which must add another complication to determining accuracy.
        Why do we need a number for every day to get an annual average? That seems as arbitrary as a measurement every second, minute, hour, month, etc.
        In-filling cannot be accurate to any degree as it is merely a guess, albeit with a mathematical basis.

    • February is not “usually colder than most other months” in Australia, just the opposite. This looks to be just another “warming” adjustment.

      • Hi Bob
        This above refers to the CET (I wasn’t entirely happy with their CET annual numbers), UK and possibly N. Hemisphere but not global temperatures.
        However, if the adjustments are applied to the S. Hemisphere, then the adjustment would be downward rather than upward.
        regards

    • Reply to V ==> Yes, that and a few thousand other odd bits….with today’s computing abilities there certainly is no reason to bother with yet another arbitrary non-existent value, another interim calculation, called “monthly average”.

      • Mr. Hansen, thank you for the comment.
        However, I do see some value (but only of local interest) in the monthly averages. Months of June-July (each side of the summer solstice, when insolation is greatest,) for all of 350 years show no rising trend worth mentioning ( less than 0.1C/century suggesting ‘flat TSI’) while most of the warming trend was generated in December/January, around winter solstice, time of minimum insolation; implying that orientation of the polar jet-stream may be responsible for the upward trend in the CET warming since 1660.
        This was an invitation to look at possibility of a separate CET winter & summer forecasts , based on the natural variability.
        with best regards

        • vukcevic

          Months of June-July (each side of the summer solstice, when insolation is greatest,) for all of 350 years show no rising trend worth mentioning ( less than 0.1C/century suggesting ‘flat TSI’) while most of the warming trend was generated in December/January, around winter solstice, time of minimum insolation; implying that orientation of the polar jet-stream may be responsible for the upward trend in the CET warming since 1660.

          Vuk! I’m surprised you missed two important things.
          1. It is only the Northern Hemisphere that is insolated in May-June-July-August. Not the whole planet.
          Further, when the Northern Hemisphere land area is being irradiated in May-June-Jul, that land area is being hit with the yearly LOWEST solar radiation (about 1315 maximum watts/m^2).
          2. When the Southern Hemisphere is being hit with its highest solar insolation in Dec-Jan-Feb each year, its sole bit of maximum-exposed 17 Mkm^2 land area is “all white”. Thus, the Antarctic continent is hit with 1410 watts/m^2, but ALL of it is reflected immediately back into space. There is no other land area between 40 south and the pole! (We will ignore New Zealand for a few minutes, and give a passing salute to the little corner of Argentina and Chile.) That’s it. So, no “dark land area” between 40 south and the south pole when it is being hit with the yearly high radiation.
          Up north? Look again at the globe. Look at how much land in North America, China, Siberia, Russia, all of Europe, and Asia are “dark” and growing darker with ever-more greenery and vegetation! during the northern summer months. It is, what, 50% of the land area being hit by northern hemisphere summer radiation? And all of it darker and hotter due to extra CO2.

      • Reply to V ==> Yes, seasonal, monthly, regional calculations all have their places and are valuable information, just not a necessary step in calculating Global annual values — in which case they may add more error than they remove.
        Cheers, thanks for checking in and talking to me.

      • RACookPE1978 says
        Vuk! I’m surprised you missed two important things……
        Hi RAC, thanks for your comment.
        I was referring only to the CET records, small part but extremely well documented, of the Northern Hemisphere and even smaller in the global terms. I found as one moves around globe even in the same longitude band, the natural variability drifts in and out of phase. This led Dr. J. Curry to formulate ‘stadium hypothesis’. I am sceptical about global averaging of data

  10. Good to see a more rational discussion of errors than that of Phil Jones. He even seems to have omitted a 1/(n-1) in his eq 12. Phil’s approach cannot be used for heterogeneous data such as global temperatures. If it was true, all we’d have to do is to get enough people holding their middle finger up in the air (to estimate temperature) to get whatever accuracy we wanted.

  11. More than one recent missive from the Met has caused me to think, “Can this be? I’m not sure I can trust it after all that’s gone before…”

    • You can’t Alan. Witness Betts’ tweets. They are still hell bent on maintaining their fund flows and that distorts their honesty.

  12. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius. in the same way New York is ‘around ‘ Boston ?
    ‘Author’s Comment Policies: I already know that “everybody” thinks the UK Met Office’s OME is [pick one or more]: way too small’ Funny, I live in the UK and I have never heard anyone claim it’s ‘way too small’. And in reality you cannot ignore the manner in which it’s managed, for the head of any organisation, even a science-based one, sets the tone for the whole organisation, and with Slingo you have a very fixed approach that puts one view above all others on AGW.

  13. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius. The difference between the median estimates for 1998 and 2010 is around one hundredth of a degree, which is much less than the accuracy with which either value can be calculated.

    Am I alone in being confused as to their use of ‘measure’ and ‘calculate’?
    Surely the global average temperature cannot be measured, as they claim, but can only be calculated from many measurements, all with varying accuracies. Also, cannot any calculation be as accurate to as many significant figures you like, as long as one accepts that the resultant figure does not necessarily reflect reality?
    The flexibility with terms and their definitions seems to be a major problem when debating the CAGW idea:
    "When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean — neither more nor less."
    "The question is," said Alice, "whether you can make words mean so many different things."
    "The question is," said Humpty Dumpty, "which is to be master – – that's all."
    PS – be gentle with us non-mathematicians/statisticians
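
    For non-statisticians, the Met Office statement quoted at the top of this comment boils down to a simple overlap check. The sketch below uses invented anomaly values (the year labels and numbers are hypothetical, chosen only to mirror the "one hundredth" and "four tenths" differences in the quote), each with a ±0.1°C range.

```python
def ranges_overlap(a, b, half_width=0.1):
    """True if [a - half_width, a + half_width] and [b - half_width, b + half_width] overlap."""
    return abs(a - b) <= 2 * half_width

# Hypothetical anomalies in deg C, for illustration only.
year_a = 0.55   # stands in for a year like 2010
year_b = 0.54   # about one hundredth of a degree lower
year_c = 0.15   # about four tenths of a degree lower

close_pair_overlaps = ranges_overlap(year_a, year_b)
distant_pair_overlaps = ranges_overlap(year_a, year_c)

print(f"0.55 vs 0.54: ranges overlap? {close_pair_overlaps}")
print(f"0.55 vs 0.15: ranges overlap? {distant_pair_overlaps}")
```

    When the ranges overlap, the values alone cannot say which year was warmer; when they do not, the ordering can be stated with some confidence.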

    • Reply to John in O ==> An interesting point — and if we still lived in the glass hi/low thermometer days it would be easier to answer.
      To the credit of the Met Office UK, they have finally come around to admitting in a public way that global temperature measurements themselves come with “accuracy bars” — “Original Measurement Error” or Measurement Accuracy Ranges — by whatever names, it means what the Met Office says “the accuracy with which we can measure”. They use lots and lots of measurements — most rounded to the nearest degree F or C (part of the problem), some from instruments that report to 0.1°C but have other built-in errors; untold data are not actually measured at all but are calculated by in-filling methods, all with known inaccuracies, and with past sea surface temperatures — it is a guess.
      They use all that data to calculate a result. The calculation itself has an error range or a CI, the measurements have their own measurement accuracies (obversely, measurement inaccuracies or error).
      There are two very in-depth, complex scientific research papers at the bottom of the essay that are the basis for their obviously “good round estimate” of 0.1°C.
      In the end it is considered by many to be improper to give as a result a number that is “more precise” than the precision of the measurements it depends on.

  14. “The UK’s Met Office (officially the Meteorological Office until 2000) is the national weather service for the United Kingdom. Its Hadley Centre in conjunction with Climatic Research Unit (University of East Anglia) created and maintains one of the world’s major climatic databases, currently known as HADCRUT4 which is described by the Met Office as “Combined land [CRUTEM4] and marine [sea surface] temperature anomalies on a 5° by 5° grid-box basis”.” With the passing of time the met network has changed. In urban areas the network has grown, while in rural areas it has not changed that much (though rainfall measurement has increased), even though more than two-thirds of the area is rural. This is an important issue for the representativeness of the global average temperature, and it varies from region to region and country to country. The data have observational errors as well as averaging errors, plus a change in the unit of measurement: prior to 1957 it was in °F and later in °C. With all these errors, inaccurate grids, etc., how can we say the temperature increase is caused by anthropogenic greenhouse gases, and then extrapolate to 2100 and beyond using models and sensationalize the impact on nature? This sensationalization is steering governments in the wrong direction.
    Dr. S. Jeevananda Reddy

  15. Kip: “I warn commenters against the most common errors: substituting definitions from specialized fields (like “statistics”) for the simple arithmetical concepts used in the essay and/or quoting The Learned as if their words were proofs. I will not respond to comments that appear to be intentionally misunderstanding the essay.”
    Stand up for what you believe, Kip. It is your right to interpret the Met Office’s report in arithmetic terms, even though statistics is the conventional method for extracting “meaning from data” and used by all practicing scientists. Conventional statistics would be appropriate at a science blog, but aren’t necessary here, where everyone’s personal perspective is valued.
    As for “original measurement error”, it is a systematic error on your part, but not the Met Office’s.

    • Reply to ItsMyTurn ==> Maybe you have some other mathematics that produce “means” by some other method. I’d love to see it. If you have performed the maths with the three-data data set described in the essay and come up with a different answer, I’d love to see the detailed work.
      If you read not only my essay, do the homework, and read in turn the two papers on which the Met Office bases its statement: “The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.” and still retain your original viewpoint, please check in again and we’ll see if we can work it out.

  16. It is not possible to calculate the global average temperature anomaly with perfect accuracy because the underlying data contain measurement errors and because the measurements do not cover the whole globe.

    That’s right. Claiming perfect accuracy would be pure nonsense even if the results were reported in absolute figures and not against some relative, moving and thus arbitrary ‘normals’.
    Let alone in a gigantic and chaotic system like the Earth, where a blink of an eye can make a bigger difference than the claimed accuracy (0.1 °C) even in a fixed location. How many square kilometers is one measurement point supposedly representing anyway?
    In addition, the earlier the underlying data ‘corrections’ are stopped, the earlier hysteria about weather (including the semantic climate) can be stopped and real world problems can be identified with style.

    • How many square kilometers is one measurement point supposedly representing anyway?
      311,933.4 km² at the equator. (A box with a side of 558.5 km)
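For readers who want to check the figure: the number above is what you get by treating the box as a flat square about 558.5 km on a side. A minimal sketch of the spherical calculation follows. It assumes a perfectly spherical Earth with a mean radius of 6371 km (my assumption, not a value from the thread), and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def grid_box_area_km2(lat_south_deg, lat_north_deg, width_deg=5.0,
                      radius_km=EARTH_RADIUS_KM):
    """Area of a latitude-longitude box on a sphere, in square kilometres."""
    lat_s = math.radians(lat_south_deg)
    lat_n = math.radians(lat_north_deg)
    width = math.radians(width_deg)
    return radius_km ** 2 * width * (math.sin(lat_n) - math.sin(lat_s))


# Area of one 5 x 5 degree box at several latitudes.
for south in (0, 30, 60, 85):
    area = grid_box_area_km2(south, south + 5)
    print(f"{south:>2}-{south + 5:<2} deg: {area:>10,.0f} km^2")
```

The boxes shrink toward the poles, so one 5° by 5° grid box stands for a very different amount of surface depending on its latitude.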

      • What ? Just one, randomly chosen measurement point , to represent 312,000 sq. Km .
        Think of the scope for adjustment in that !
        Is globally averaged temperature just a tool for social control, that would have been the envy of the Catholic Church of the Middle Ages.

  17. Continuation: in the last hundred years there has been vast variation in rural and urban ecology [land use and land cover], and this will play a major role in temperature. You can see the sea change in the USA, the UK or any other developed country. Are we able to represent these variations when averaging the temperature, as they vary from region to region and country to country? Also, so far nobody has been able to explain why the temperature changes in a staircase [step-wise] pattern; year-to-year variation or natural cyclic variation is o.k.
    Dr. S. Jeevananda Reddy

  18. I like it , Kip. Great explanation and a nice way of separating measuring instrument error from statistical error.
    You could do another post on measurement error through the last 150yrs or 50yrs, etc. That would also be extremely interesting.
    Many thanks

    • Reply to Stephen Richards ==> Thank you. That is one of my major projects — probably never to be realized. I have collected data for such a project and might post an essay giving highlights but not in-depth analysis.
      The two Met Office papers cover a lot of that ground.

  19. You could use proof by induction to prove the main point in the post; no need to repeat oneself with bigger numbers.
    As far as spread of measurements, I regularly drive about half an hour away, and, according to my car, the temperature between the two end points is vastly different to the temperatures along the way. There are weather stations at both those end points, and therefore any attempt to calculate the region’s temperature based on those two end points will be wrong.

    • Reply to Jarryd Beck ==> Absolutely correct — and there is a simple arithmetic proof as well.
      However, actually forcing one’s self to do these simple data set experiments forces the point home in one’s mind much more effectively.
      I will admit that I had to be forced to do this thru the five-data data set (increasingly boring and painstaking to do by hand) before I really really really believed it.

  20. “Importantly, Met Office states clearly that the Uncertainty Range derives from the accuracy of measurement and thus represents the Original Measurement Error (OME). “
    No, they didn’t. Nothing of the kind. They said:
    “The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
    The Met paper of Morice et al 2012 sets out what they mean by the measurement error of the global average:
    “A detailed measurement error and bias model was constructed for HadCRUT3 [Brohan et al., 2006]. This included descriptions of: land station homogenization uncertainty; bias related uncertainties arising from urbanization, sensor exposure and SST measurement methods; sampling errors arising from incomplete measurement sampling within grid–boxes; and uncertainties arising from limited global coverage. “
    It didn’t suddenly come to mean OME for HAD 4.
    “Here are the possible values: 1.8, 1.7, 1.6 (and all values in between)”
    No, it isn’t. 1.8 and 1.6 are the 95% limits. They are not the ends of the range. And they certainly aren’t equally likely to the center value.
    Calculating the standard error of the mean is elementary statistics. And this is not how to do it.
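The point about the 95% limits can be illustrated with a small sketch. It assumes, purely for illustration, that the error in the annual value is normally distributed and that ±0.1 °C is the two-sided 95% interval; the Met Office text quoted in this thread does not state the shape of the distribution. The function names are hypothetical.

```python
import math

HALF_WIDTH = 0.1  # the quoted 95% half-width, in degrees C
Z_95 = 1.96       # two-sided 95% point of a standard normal (assumed shape)

sigma = HALF_WIDTH / Z_95  # implied standard deviation under that assumption


def relative_density(offset, sigma):
    """Normal density at `offset` from the centre, relative to the centre."""
    return math.exp(-0.5 * (offset / sigma) ** 2)


def prob_within(offset, sigma):
    """Probability that a normal error lies within +/- offset of the centre."""
    return math.erf(offset / (sigma * math.sqrt(2)))


print(f"implied sigma: {sigma:.3f}")
print(f"P(|error| <= 0.1): {prob_within(HALF_WIDTH, sigma):.3f}")
print(f"density at the limit vs centre: {relative_density(HALF_WIDTH, sigma):.3f}")
```

Under the normal assumption a value at either limit is only about 0.15 times as likely as the central value, which is the sense in which the limits "aren’t equally likely". With a uniform distribution instead, every value in the range would be equally likely, and that is the reading the essay takes.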

      • I don’t believe that is measurement accuracy, it is likely the reading you get. The link says in big, bold face not to record the temperatures in tenths of degrees but in degrees. So now we get (or got) temperature records to the nearest degree and we are talking about differences to hundredths of degrees being significant.

    • “The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
      What this means is surely that it is just as likely for any reading that the correct value is 1.6 or 1.8 or any value in between. Before assuming that 1.7 the central value is the mean, one should investigate the accuracy of the measuring agent. Perhaps there is a bias and each reading of 1.7 should be 1.8 or each reading of 1.7 should be 1.6.
      If the accuracy of the instruments used can only be ascertained to 0.1, then baldly stating that the mean is 1.7 is wrong. The Met Office is correct in its statement and Mr. Stokes’s interpretation of that statement
      “No, it isn’t. 1.8 and 1.6 are the 95% limits. They are not the ends of the range. And they certainly aren’t equally likely to the center value”
      is wishful thinking, based on the erroneous assumption that +0.1 is as likely as -0.1 which is totally unproven.

      • “What this means is surely that it is just as likely for any reading that the correct value is 1.6 or 1.8 or any value in between.”
        So what do you think the 95% means?

    • Reply to Nick Stokes ==> Linguistic agreement aside — read my whole essay and see that Met Office uses Uncertainty Range (not CI) and defines it in their FAQ, which I quote precisely, as “The accuracy with which we can measure the global average temperature”. If you wish to think this is a statistical confidence interval, you will have to do so on your own.
      You end your quote of Morice et al (2012) way too soon. Carrying on immediately from where you end off, it goes on to say:
      “The uncertainty model of Brohan et al. [2006] allowed conservative bounds on monthly and annual temperature averages to be formed. However, it did not provide the means to easily place bounds on uncertainty in statistics that are sensitive to low frequency uncertainties, such as those arising from step changes in land station records or changes in the makeup of the SST observation network. This limitation arose because the uncertainty model did not describe biases that persist over finite periods of time, nor complex spatial patterns of interdependent errors.
      To allow sensitivity analyses of the effect of possible pervasive low frequency biases in the observational near-surface temperature record, the method used to present these uncertainties has been revised. HadCRUT4 is presented as an ensemble data set in which the 100 constituent ensemble members sample the distribution of likely surface temperature anomalies given our current understanding of these uncertainties. This approach follows the use of the ensemble method to represent observational uncertainty in the HadSST3 [Kennedy et al., 2011a, 2011b] ensemble data set.”
      and then, in the end, refers to this whole process, as well as that resulting from the sea surface paper, as “The accuracy with which we can measure the global average temperature”.
      If you feel so moved, you should write them a public letter explaining your contrary viewpoint. Maybe they will issue a correction.

      • Kip,
        ” Uncertainty Range (not CI) and defines it in their FAQ, which I quote precisely, as “The accuracy with which we can measure the global average temperature”. If you wish to think this is a statistical confidence interval, you will have to do so on your own.”
        They actually said:
        “+-0.1° C is the 95% uncertainty range.”
        If that isn’t a confidence interval, what do you think the 95% means?
        From your quote:
        ” HadCRUT4 is presented as an ensemble data set in which the 100 constituent ensemble members sample the distribution of likely surface temperature anomalies given our current understanding of these uncertainties.”
        What distribution could they be referring to?

      • Reply to Nick Stokes ==> I guess you’ll have to take it up with the Met Office….they repeatedly [said] the following:
        …which is much less than the accuracy with which either value can be calculated.
        It is not possible to calculate the global average temperature anomaly with perfect accuracy because the underlying data contain measurement errors and because the measurements do not cover the whole globe.
        The accuracy with which we can measure…
        ….which is much less than the accuracy with which either value can be calculated….
        They try to calculate the accuracy (which is another way to say the inaccuracy) and though it takes two learned papers to do so, they finally seem to agree, with a high degree of certainty, that 0.1°C represents that accuracy/inaccuracy.
        Everyone is entitled to an opinion about the matter.

      • Met Office language may leave something to be desired, but 95% ranges for the true values of temperature pretty well have to be confidence intervals. Or Bayesian, which clearly does not apply here. Section 3.2 of the Morice paper describes the distribution assumptions used to generate these CI’s using Monte Carlo methods – the 100-member ‘ensemble’. It seems heavily ‘statistical’.

      • Reply to basicstats ==> It seems perfectly clear to me that Met Office UK bends over backwards to communicate that they are talking of measurement accuracy — not the confidence interval of the results.
        They may well be coming up with an “accuracy of measurement” value which they hold to a certain degree of confidence…. which seems to be confusing many here.
        The Met Office is very statistically sophisticated — if they had meant CI, they simply would have said so, rather than using various forms of the words “measurement accuracy” over and over. If you read their two papers, it is strikingly clear that they are trying to find a way to quantify “the accuracy with which they can measure global temperature” — not to simply throw CI at their result.
        Honest opinions may vary….

  21. What is the probability using the choices on your graph of 2014 being 0.1C colder, correct or higher by 0.1C? Answer: 1/3
    How about 1998 higher and 2014 lower by 0.1C? 1/9
    The chance that the data set shown is totally accurate each and every year plotted? 1/1,162,261,467!
    Take any specific combination you like, that’s the chance of it being totally accurate, because that’s the total number of different combinations available within the error margin given by the Met.
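The counting in this comment can be reproduced with a short sketch. It follows the commenter's own assumptions, which are not something the Met Office states: each year has exactly three equally likely outcomes (0.1 °C low, correct, 0.1 °C high), and the years are independent.

```python
from fractions import Fraction

YEARS = range(1996, 2015)  # 1996 through 2014 inclusive, as on the graph
CHOICES_PER_YEAR = 3       # assumed outcomes: -0.1, 0, +0.1

n_years = len(YEARS)
total_combinations = CHOICES_PER_YEAR ** n_years

p_one_year = Fraction(1, CHOICES_PER_YEAR)  # one specific outcome, one year
p_two_years = p_one_year ** 2               # specific outcomes for two years
p_all_years = p_one_year ** n_years         # one specific outcome every year

print(n_years, total_combinations)
print(p_one_year, p_two_years, p_all_years)
```

Nineteen years with three choices each gives 3^19 combinations, which matches the 1,162,261,467 quoted above.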

  22. I spent 17 years as a MN Deck/ Navigating Officer, doing weather observations every 6 hours as one of the many VOs for the Met Office . We provided most of the data used by the UEA. CRU for their HADCRUT World Graphs shown at the top of this essay . When I read the notes on the derivation of the graphs and the use of SSTs, I realised they were a waste of space, complete rubbish.
    The statement :
    “It is not possible to calculate the global average temperature anomaly with perfect accuracy because the underlying data contain measurement errors and because the measurements do not cover the whole globe.”
    is , to say the least, laughable.
    Recently, I attempted a survey of fellow VOs on a MN web site. I received 42 replies, all confirming that the assumptions for “corrections” to SSTs were wrong…..

    • Reply to The Ol’ Seadog ==> I have spent 1/3 of my adult life at sea. Yes, SSTs before the satellite era [are] at very best random guesses.

  23. This is the error in measurements of temperature at a single point? Then taken as representing the temperature over how much of the planet’s surface, before being aggregated with how many approximations from other points, to produce a figure for the whole surface?
    How likely is that to be within even 1degree of the actual average of global temperatures ?
    Perhaps it is better for comparison only , year to year , but only if the selection of actual measurement point doesn’t keep changing. Doesn’t error due to selection & re-selection of measurement points have rather more potential than it is given credit for ?

    • And, only if the conditions in the area surrounding the measurement station don’t keep changing.

    • Reply to eddiesharpe ==> In the great big real world? Of course the calculated GAST is dubious at best.
      This essay is about a brave move by Met Office UK and the arithmetical realities of finding the mean of a data set made up of numbers that are really ranges, such as 71(+/-0.5).

  24. Why is it that the 1961 – 1990 “global average” (I know meaningless) is used in comparisons? If we have “global measurements” (LOL) from 1880 to today, why not average those?

    • Perhaps it has been determined that this time period reflects the perfect temperature that must be maintained to sustain life on Earth.
      Or maybe it simply gives them the graphs they desire.

    • Reply to Patrick ==> I refer to this as the “My Golden Childhood bias”: it’s that wonderful time when things were perfect and the sun always shone, the birds always sang, people were kind and could be trusted, etc.

  25. I must say that I have a simple question here to address the ‘lack of total global coverage of data’:
    ‘Can you say with accuracy whether the temperature of the USA (excluding Hawaii and Alaska) was the hottest ever in 2014? Ditto Europe?’
    By highlighting what the regional temperatures are, you can present the results as: ‘definitive in the following regions, less definitive in the following regions and lacking in sufficient data to make judgements in the following regions’.
    Not only can that be done supremely easily, it removes all this nonsense and pointless argument due to ‘lack of data’. It removes the point of ‘in filling’ and data manipulation and focusses the global climate community on the need to establish and maintain weather measurements in the regions where insufficient data currently exists.
    Every farmer needs accurate trends for his/her local region. What’s going on in Australia really isn’t important for growing corn in Iowa, unless there are some pretty clear correlations between the two data sets (that sort of research is also interesting if it generates leading indicators of value).
    Why not just say what we DO know, rather than waste time, money and scientific credibility trying to put sticking plaster on to achieve something which really isn’t that important in practical terms??

  26. Too bad a satellite record didn’t start in 1679, 1779 or 1879 instead of 1979. Or even 1929. It’s likely that 1934 was warmer than 2014 & Arctic sea ice extent less in 1944 than 2014

  27. Surely this is an error?
    Thus, the mean is accurately expressed as 1.75(+/-0.1)
    if the 2 sets are for the same data point then the new “Range” becomes 1.6 to 1.9, therefore that should now be expressed as
    1.75(+/-0.15)

    • Reply to Osborn ==> The range is for the mean, not the data. You are right for the range of data but the range of the mean is as given.
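The distinction between the two ranges can be worked through in a few lines. This is a minimal sketch of the essay's arithmetic, in which each value is treated as a range of fixed half-width; the function names are hypothetical and the two values are the ones from the example being discussed.

```python
def range_of_mean(values, half_width):
    """Bounds of the mean when every value is really value +/- half_width."""
    n = len(values)
    low = sum(v - half_width for v in values) / n
    high = sum(v + half_width for v in values) / n
    return low, high


def range_of_data(values, half_width):
    """Lowest and highest values any single datum could take."""
    return min(values) - half_width, max(values) + half_width


values = [1.7, 1.8]
half_width = 0.1

print("range of the mean:", range_of_mean(values, half_width))
print("range of the data:", range_of_data(values, half_width))
```

The mean runs from about 1.65 to 1.85, that is 1.75(+/-0.1), while the individual data run from about 1.6 to 1.9. Both comments are describing real ranges, just of different things.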

    • Good for Booker. And thank you, Paul for being the inspiration for his writing. But I see that his piece has garnered nearly 1000 comments at time of writing – many from the usual suspects who, no matter what, abuse the man and his writings when many of them (the trolls) are not fit to sharpen his pen nibs.

    • Some scandal. The Arctic temperature has increased after 1987.
      One big part of that scandal is of course that stations have been added after 1987. For 1930-40 the number of stations has gone from about 500 to more than 2000.
      Scandal, scandal. They increase the number of stations! They reduce the infilling!

      • And the number of land-based stations has been reduced from what to what? Hint: A factor of ten?

      • Harry Passfield says:
        “And the number of land-based stations has been reduced from what to what? Hint: A factor of ten?”
        Harry Passfield managed to get a fourfold increase of stations 1930-1940 in 1987 vs 2015 to be a reduction. By a factor of ten.

  28. The axiom “Averaging never improves accuracy but may increase precision” can be found in any statistics textbook, and is nicely illustrated in wikipedia..
    http://en.wikipedia.org/wiki/Accuracy_and_precision#mediaviewer/File:Accuracy_and_precision.svg
    The neatest explanation I have heard is..
    The chance of rolling a six with a fair die is 1/6. Rolling the die a thousand times might give an average of 3.5, but this doesn’t alter the chance of rolling a six to 1/3.5. It is still 1/6.
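The die example is easy to try out. A minimal simulation sketch, assuming a fair six-sided die and a fixed seed so the run is repeatable (both are my choices for illustration):

```python
import random

rng = random.Random(0)  # fixed seed, chosen arbitrarily
rolls = [rng.randint(1, 6) for _ in range(100_000)]

average = sum(rolls) / len(rolls)
share_of_sixes = rolls.count(6) / len(rolls)

print(f"average roll:   {average:.3f}")
print(f"share of sixes: {share_of_sixes:.3f}")
```

The average settles close to 3.5 as the number of rolls grows, while the share of sixes stays close to 1/6. Knowing the average very precisely says nothing new about any individual roll.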

    • Reply to Max S. ==> Thanks, Max. I wanted to include the precision issue in the original essay (actually had it in and removed it) because my “kindergarten examples” don’t demonstrate that aspect very well.

  29. The British Met office data is absolute #### (see climategate, Climateaudit), as is all of the above as usual. Refer to Steven Goddard, Paul Homewood, Marohasy, Climatedepot, Climateaudit, etc. etc. etc. You are giving far, far, far too much attention to fraudsters who should be in jail by now. Same goes for nearly all the current WMO/UN, “Science” publication agents promoting this scam. When will you wake up?

  30. “The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.” ~ UK Met Office

    This is utter bull droppings. Utter lunacy. Utter deception.
    There is no way at present to find an average temperature of the entire planet to one tenth of a degree Celsius. No way at all. (and that is beside the point that the average temperature is mostly meaningless anyway)

    • Mark: Well said. The way I look at it is that the GAT (whatever that is) is a propagandist’s wet dream expression that he needs to get accepted into mainstream thought. An example of this is the 70 mph speed limit: it is only a number. When it came out in the ’60s it was far more dangerous to drive at that speed in the cars of the day (mine topped at 65!) than in modern cars. However, safety campaigners will tell you that driving at 80 mph is dangerous because the speed limit is 70 mph. It is meaningless. Like GAT.

      • Actually, according to my uncle a former professional race car driver and past Safety Officer at Indianapolis Speedway, it is the response time of the average driver that is the issue. He tells me that once you are at 80 mph closing time is faster than human reaction time.

    • What’s all this averaging of temperatures anyhow ?
      How does it relate to anything in the real world. Average propensity to melt perhaps ?
      It certainly has less to do with heat content when changes of state become involved.

      • They don’t. They average temperature anomalies off a common time frame baseline. That is mathematically legitimate for determining average trends. It does not give an average absolute temperature. Only an average rate of change.
        Which has, BTW, been fiddled first by homogenization. See essay When Data Isn’t.

  31. Whilst I appreciate the Met Office using a + or – 0.1 deg C error, I have a problem because whilst the underlying data remains as suspect and modified as it does, all this serves to do is to try and add scientific integrity to their pronouncements:
    We are referred to “Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: the HadCRUT4 data set.”
    wherein it states:
    “The land record has now been updated to include many additional station records and (b)re-homogenized station data. This new land air temperature data set is known as CRUTEM4.”
    We know that GISS has reduced temperature records for Patagonia, changing a long term (60 year) cooling trend into a warming and it has now just come out that they have done the same in Iceland to get rid of the ‘sea-ice years’ of the 1970s. They have done the same elsewhere removing 1930s and 1940s temperatures which exceeded current ones.
    Producing claims based not just on homogenised data but on data which has then again been Re-Homogenised, without all changes being detailed and valid argument (and I mean valid as opposed to specious ‘argument’) provided to justify this, is nonsensical and cannot be relied upon.

  32. Slightly OT, but dealing with efforts to understand what data are telling you. I spend a lot of time reading census and economic data. When doing so, it is important to understand what the reports are about.
    You may be reading about financial data in nominal or real terms. Incomes could be expressed as median or mean. That could be skewed significantly when there is income inequality. Also, reports can cover family incomes, household incomes and wages. They all have specific meanings.
    When data are presented, I try to understand the subtext of who is writing the report and why is it being written. Climate data and economic data are both subject to this admonition……. everyone has his hustle.

  33. “Does deriving a mean* of a data set reduce the measurement error?
    Short Answer: No, it does not.
    I am sure some of you will not agree with this.”

    We’re not only concerned with ‘measurement error’, i.e. was the temperature at each location correctly recorded, but (assuming the temperatures were accurately recorded) does the average of these measurements represent a true estimate of this abstract called ‘global surface temperature’?
    Now it’s obvious that the actual temperature at the “surface” will vary greatly depending on the thermometer’s location, elevation, time of day, season, weather conditions etc. So any set of recorded measurements will tend to vary according to these environmental parameters.
    For each thermometer X, the measurements can be viewed as samples of a random variable. So the question should be “what is the expected value of X”? If probability distribution of X is strictly uniform, then the expected value would be the arithmetic mean as you demonstrated.
    If some of the error components are randomly distributed (i.e. uniform), then the expected value of such a component is zero.
    But other error components represent bias and are not randomly distributed. Their expected value is the sum of n samples, each weighted by its probability (which for a uniform distribution would be 1/n), which in general will not be zero.
    So “averaging” will reduce errors caused by random variance, but will not eliminate errors due to bias.
    Also, it is a common fallacy to view a set of unbiased thermometer readings as ‘true’ temperatures. But the readings depend on their environment. If we took another set of “unbiased” readings at different locations and times would we get the same expected values? Of course not, because the total squared error of our estimates has two components: bias² + variance (ignoring non-deterministic noise).
    So even if we were guaranteed that our thermometers were unbiased, we would still use many random samples to estimate regional and global temperatures because averaging reduces the variance component of the estimation error.
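The claim that averaging reduces random error but leaves bias untouched can be tried with a small simulation. This is a sketch under stated assumptions: the true temperature, the size of the bias and the spread of the random error are all hypothetical numbers picked for illustration, and the random error is drawn from a normal distribution.

```python
import random
import statistics

TRUE_TEMP = 15.0  # hypothetical true value
BIAS = 0.3        # hypothetical constant offset shared by every reading
NOISE_SD = 0.5    # hypothetical spread of the random error


def mean_error(n_readings, rng):
    """Error of the mean of n biased, noisy readings of TRUE_TEMP."""
    readings = [TRUE_TEMP + BIAS + rng.gauss(0.0, NOISE_SD)
                for _ in range(n_readings)]
    return statistics.fmean(readings) - TRUE_TEMP


rng = random.Random(42)  # fixed seed, chosen arbitrarily
for n in (1, 10, 100, 10_000):
    print(f"n = {n:>6}: error of the mean = {mean_error(n, rng):+.3f}")
```

As n grows the error of the mean stops wandering and settles near the bias rather than near zero. More readings buy precision, not the removal of a shared offset.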

    • Johanus
      There cannot be a determinable “error” for a parameter which is not defined. And there is no agreed definition of global temperature.
      Each of the teams which determines global temperature uses its own definition which it changes almost every month so its values of global temperature change almost every month. If it were possible to estimate the errors of global temperature reported last month then those estimated errors would change for the values of global temperature reported this month.
      Richard

      • Excellent point Richard. I’m still waiting for one of these organizations to publish the “specifications” page from the Earth’s Owner/Operator Manual that says what the “average” temperature of the planet is supposed to be at this specific point in its 3,500,000,000 year history or how one’s supposed to measure it. For some reason none of them ever answer when I ask “What’s the temperature supposed to be?” I can’t imagine why…

      • I’ll disagree slightly Richard.
        There ‘is’ an error range for ‘global temperature change’. The fact that the definition is changed willy nilly by the alleged data caretakers does not eliminate the error; but it does make it near impossible to correctly identify and assign the error.
        Still, that error range exists, but remains unknown. Since the temperature caretakers believe they can calculate global temperature change, they should be able to calculate the error; all they need is real expertise in deriving that error (an engineer with a very strong stomach).
        A database and/or a data method that changes past results is abuse.
        Corrections do not eliminate error ranges; they become a factor for calculating a representative, usually larger, error range.
        e.g. Temperatures were transmitted late and missed entry
        – Temperature were transcribed or entered incorrectly
        – Which means that someone is verifying some individual temperatures
        Homogenization adjustments are opinions without absolute data verification. Yes, NOAA, MetO, BOM are guessing when they do mass ‘homogenizations’. The result is garbage for accuracy or science.
        Infilling is a spatial fantasy guess with as much validity of a carnival freak sideshow. I believe the main purpose is for ‘make work’.

      • ATheoK
        I dispute that we have a slight disagreement.
        I wrote

        There cannot be a determinable “error” for a parameter which is not defined. And there is no agreed definition of global temperature.

        And you say

        The fact that the definition is changed willy nilly by the alleged data caretakers does not eliminate the error; but it does make it near impossible to correctly identify and assign the error.

        It seems to me that we agree.
        Richard

      • Reply to the richardscourtney // ATheoK comment thread ==> You two are, unfortunately, probably entirely correct. The other issue is, even if someone was to formulate a scientifically defensible definition of “Global Surface Air Temperature at 2 meters above ground level” AND a scientifically defensible definition of “Global Sea Surface Temperature” — what justification could be offered for reducing both to anomalies, then averaging the anomalies together weighted for surface area? One is a “skin temperature” of a liquid, one is the ambient temperature of a gas at a particular altitude near the ground.
        Somewhat like determining Average Room Temperature by some hodge-podge combination of air and floor surface or air and wall surface temperatures.

    • Reply to Johanus ==> Let me address this point only –> “For each thermometer X, the measurements can be viewed as samples of a random variable.”
      We are not taking random variables, but measurements of actual temperatures. The range +/-0.1°C presented by the Met Office derives (through a lot of complicated steps) from situations like this: The actual real world temperature being measured in Outpost, OK is 70.8°F. Our weather co-op station volunteer accurately records this as 71°F. The obvious 0.2°F difference between actuality and the report is not random variance. It is simply the difference caused by the recording rules, which are “report whole degrees, round up or down as appropriate” — actual temperatures are evenly spread but do not follow a normal distribution centered on thermometer degree marks.
      I’m sure you see where this leads.

      • Kip, you said:
        “We are not taking random variables, but measurements of actual temperatures….The actual real world temperature being measured in Outpost, OK is 70.8°F.”
        You seem to believe in the “thermometers-show-‘true’-temperature” fallacy. I’m surprised, because you clearly don’t believe the “one-true-global-temperature” fallacy.
        ‘temperature’ is a mathematical abstraction which cannot be precisely measured without applying mathematical models, such as expansion of mercury in a calibrated tube, or current flowing in a calibrated thermocouple, or thickness of calibrated ice layers etc. These are all ‘proxies’ in the sense that there is no device that measures arbitrary ‘temperature’ directly. And all of these proxies tend to be wrong, more or less. Some are more ‘accurate’ than others, and some are actually useful for the purpose of making us believe we can measure temperatures ‘directly’.
        So how can we measure the ‘error’ of some mathematical abstraction that doesn’t really exist, except in our minds?
        Good question. It is possible because there are natural ‘calibration points’, such as freezing and boiling points, which are uniquely determined by kinetic energy, which in turn have a precise relationship (more or less) to ‘temperature’, which is defined abstractly as the average kinetic energy at thermodynamic equilibrium. So arbitrary temperatures can be modeled as linear interpolations/extrapolations around these fixed points, and also allows us to estimate temperatures as mathematically continuous values. Temperature ‘errors’ occur whenever a thermometer does not match the value expected from the calibration points (or values interpolated from these points).
        So back to your example, let’s say we need to know the answer to the question “What is the current temperature in Outpost, OK?” Some might say it is 70.8F because someone observed that temperature on a thermometer there. So is that really the ‘actual real world’ temperature in Outpost, as you stated?
        No. Imagine, for the sake of argument, that your thermometer is surrounded by other thermometers, independently engineered and operated, and spaced at intervals, say on the order of 1 kilometer, but all located within Outpost OK. Will they all produce exactly the same simultaneous values?
        Most likely they will not (with this likelihood increasing with the size of Outpost). So, without knowing the ‘true’ temperature (if such exists), how should we interpret these differences?
        A statistical approach makes the most sense here, viewing each thermometer’s reading as a kind of mathematical value which is subject to variations due to physical processes and/or random chance. So, some of the thermometer readings may actually represent true variances in the actual kinetic energy of air molecules (“variance”), while others may have been improperly calibrated or interpolated and thus have a permanent offset (“bias”). Or some thermometers may have simply been misread, which could either be bias (deliberate “finger on the scale”) or variance (purely random transcription error).

        In probability and statistics, a random variable (aleatory variable or stochastic variable) is a variable whose value is subject to variations due to chance (i.e. randomness, in a mathematical sense). A random variable can take on a set of possible different values (similarly to other mathematical variables), each with an associated probability, in contrast to other mathematical variables.
        https://en.wikipedia.org/wiki/Random_variable

        So here’s my answer to the question “What is the current temperature in XYZ?”. Assume XYZ is an arbitrary surface location with finite extent in surface area and time, so XYZ’s “temperature” is the expected value estimated from measurements (“samples”) taken at one instant of time (more or less) from one or more thermometers in XYZ which claim to represent the ‘true’ average atmospheric kinetic energy in XYZ (more or less).
        We can’t know the ‘true’ kinetic energy temperature Tk at each point, but we can still reason about these values mathematically and represent them as sampled random variables, because the Tk are well posed in physics (mechanical thermodynamics). Let SE be the ‘squared error’ i.e. the sum of the squared differences between our thermometer readings and the ‘true’ Tk. Then using variance-bias theory we can decompose it into three components: SE = bias²+variance+noise
        ‘bias’ is defined mathematically as the expected difference between the Tk and the samples (thermometer readings). When bias=0 we say the estimator is unbiased. ‘variance’ is the sum of the squared deviations of the samples from the mean sample value, and ‘noise’ is, in effect, the natural uncertainty of Tk itself, which is unknown. (noise is usually modeled as a random Gaussian, because it is convenient to do so. That does not mean it is Gaussian in nature).
        So we should be somewhat skeptical about thermometers because their readings are estimates of the ‘true’ kinetic temperature, based on more or less reliable proxies/models. But we can, with some confidence obtained from statistical theory, make useful explanations and predictions about the world around us, using these readings.
        … and using the mean of a set of nearby recorded temperatures is a better estimator of Tk than an arbitrary reading within that set.
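
        A minimal numerical sketch of the decomposition above, in Python. The ‘true’ value, the calibration offset and the size of the random reading error are all made-up assumptions chosen for illustration. Tk is held fixed, so the separate ‘noise’ term is zero here, and the per-reading (mean) squared error splits exactly into bias² + variance.

```python
import random
import statistics

TRUE_TK = 15.0   # hypothetical 'true' temperature (assumption)
OFFSET = 0.3     # hypothetical calibration offset shared by every reading
SPREAD = 0.5     # hypothetical standard deviation of the random reading error


def decompose(readings, true_value):
    """Return (mean squared error, bias, variance) of readings about true_value."""
    mean_reading = statistics.fmean(readings)
    bias = mean_reading - true_value
    variance = statistics.pvariance(readings, mu=mean_reading)
    mse = statistics.fmean([(r - true_value) ** 2 for r in readings])
    return mse, bias, variance


rng = random.Random(0)  # fixed seed so the run is repeatable
readings = [TRUE_TK + OFFSET + rng.gauss(0.0, SPREAD) for _ in range(10_000)]
mse, bias, variance = decompose(readings, TRUE_TK)

print(f"mse             = {mse:.4f}")
print(f"bias^2 + var    = {bias ** 2 + variance:.4f}")
print(f"bias (estimate) = {bias:.4f}")
```

        In this toy setup, taking more readings narrows the scatter of their mean but leaves the bias term where it was.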

      • Reply to Johanus ==> Well, that’s thorough, alright! Thanks for the in-depth analysis.
        I’m not sure it helps our co-op weather station volunteer with his doubts about the accuracy of his recorded temperatures when he knows he has to round either up or down. Luckily, he is not charged with the responsibility of determining the real actual temperature of his city or even of his backyard — only with writing in his log the readings he finds on his thermometers.
        Here, we acknowledge that when he writes “71” in the log that the thermometer showed some reading between 70.5 and 71.5, but what that reading was is lost forever. Thus in the future when some climatologist uses this datum, it must be used as the range it represented when written, which is 71(+/-0.5).
        Thanks for taking the time to share with us.

      • I doubt that any current co-op reports are generated from manual observations of glass-bulb thermometers. There are many thousands of co-ops using relatively inexpensive (~$250 and up) stations with automated Internet reporting, such as the ones made by Davis Instruments, which are surprisingly robust and reliable, and mostly operated by dedicated amateur observers:
        http://wxqa.com/stations.html
        https://www.google.com/search?q=amateur+weather+station+equipment
        As for the historical co-op data (Form 1009’s etc) their temperatures were rounded to the nearest degree F, probably because that was the effective intrinsic accuracy of measurements made by eyeball from the old glass-bulb instruments.
        http://wattsupwiththat.com/2011/01/22/the-metrology-of-thermometers/

      • Reply to Johanus ==> The illustrative Co-op Weather Station volunteer is just that – a cartoon used to represent the vast majority of past temperature records, and the one that most easily illustrates the point.
        All temperature reports are similar in nature….there is the instrument reading which is rounded off to some agreed upon precision (maybe twice — once inside the instrument itself and once more in the data repository) and represents, in reality, a range of values “71.23(+/-0.005)” or some such.
        My kindergarten examples apply to these exactly the same.
        Thanks for the links.

  34. An OME error of +/-0.1C becomes rather meaningless when the measurement instruments are located at sites which qualify as CRN4+ on average.

  35. we can say with a good deal of confidence that 2010 was warmer than 1989, or indeed any year prior to 1996
    That’s a complete unknown on a human history scale, and a flat-out falsehood on a geological time scale.
    The meme that it’s hotter now than “ever before” is deliberate deception. The worst sort of political propaganda.

    • +1
      The geologic record shows so much variation, so much change, at times quite rapid change, the mechanisms are so poorly and incompletely understood, that a claim of “any year prior to 1996” is transparently false on its face.
      I might be tempted to congratulate the Met Office for its candor if instead of admitting to some tiny uncertainty in a mostly meaningless statistics exercise, they would STOP intentionally conflating uncertainty in their subjective statistical map of “global temperature” with true climatic uncertainty.

  36. I know I’m just a dumb old construction worker, but I have a question.
    How can one justify an error bar of +/- 0.1˚C for 2014 when the error bar for 1961 may have been +/- 1.0˚C or +/- 0.50˚C without a lot of finagling with the 1961 data and introducing more error?

    • Reply to old construction worker ==> The measurement error (accuracy of measurement) is surely better in 2014 than in 1961, thus we would expect the Uncertainty Range to be smaller in 2014. Remember, they still used human-read glass hi/low thermometers for measurement in 1961 and, in the US, only reported whole degrees F.
      Likewise for your field, we expect that your floors or ceilings or foundations are more accurately level now that you can use laser-light levels to draw nearly perfectly level lines at the desired ceiling line (for instance) — in my experience, they beat the heck out of bubble string levels or water tube levels.

  37. The uncertainty of ±0.1°C sounds a great deal like the margin of error you can sometimes dig out of NCDC. So, is this the measurement uncertainty or the use of MOE to estimate the standard deviation?
    Does climate science ever compare the difference of means using any statistical test? Or, is just eyeballing the rarely included “error bars” good enough?
    At least the Met discussed uncertainty instead of proudly announcing that a 0.02°C difference was significant with a margin of error between 0.05°C and 0.1°C.

    • Reply to Bob Greene ==> I am of the opinion that the Met Office means what it says with their “The accuracy with which we can measure the global average temperature” statement — right or wrong quantitatively, I think they try to portray “measurement accuracy” and not the various statistical constructs.
      The journal Nature requires that charts and graphs depicting “error bars” have an explanation of what those marks mean — as the same marks are used for many different concepts — some of them less useful and more confusing than others.

      • “Measurement accuracy” or the ability to read the temperature in 0.1° increments? Measurement accuracy would be a component of the overall variability in most measurements. Climate science seems to take the most rudimentary error estimates and may or may not use even that. Pretty haphazard and sloppy way to be in the forefront of plotting the global economy.

      • Reply to Bob Greene ==> Well, of course, you are right when you say “Measurement accuracy would be a component of the overall variability in most measurements.”
        Met Office UK has based their rather rule-of-thumb-like 0.1°C on two complex papers, links for which are given at the end of the original essay above. These papers attempt to find a quantitative answer to the very very complex question.

  38. I’m really surprised your example [1.7(+/-0.1)] really isn’t expressed as 1.8(-0.2) in keeping with today’s way of expressing global temperature

  39. So – it’s back to bold political statements and weather reactionism for 2015. “I Got You, Babe” is still playing on the global temperature radio station this morning. OK, Campers. Rise and shine. It’s cold out there. But yes, thanks to Met for going all “publical” with their “scientifical” honesty. Strange days when we have to give attaboys to government scientists for doing what they’re rudimentarily expected to do.

  40. I am not so impressed with the idea that they know the accuracy to 0.1 C at all. They may have the measure of a particular station even more accurate than the 0.1C at any time. The question though is whether that translates into a valid meaning and error level for a 5×5 deg grid used for the grand average.
    I commented (Dr Spencer’s site) to a Cowtan and Way reference (5 Jan, 2015) and repeat here:
    “Apropos to your question, why not take a look at two T’s I have experienced today. I live in Perth, Australia. One Perth station hit just over 43C today max. Another in a suburb called Swanbourne peaked at just over 42 but has been 10C lower than the Perth station.
    They are about 10kms apart.!! It is 19:30 WST now and the difference is still 7C. So what is the temp for this small area????? Yet you insist on incestuous 1200km infilling as being acceptable methodology. Nonsense!
    Check for yourself; over the whole day
    http://www.weatherzone.com.au/station.jsp?lt=site&lc=9225&list=ob
    http://www.weatherzone.com.au/station.jsp?lt=site&lc=9215&list=ob
    NB:
    1. these two references will lead you to the latest day’s hourly readings for 24 hours so will not show the Jan 5, 2015 readings. You may need to go to the original BOM data.
    2. Perth has a number of stations within a 15 km radius. Although not as dramatic as that Jan day there are significant variations which would swamp any 0.1C idea of error level. I’m bemused that anyone could suggest a T for any day for Perth which has some real meaning and could be considered accurate to 0.1C.

    • Reply to tonyM ==> In-filling and homogenization are very problematic….
      So are simple things such as finding a daily mean …. use the Hi/Low? Use the hourly readings? Use the minute-by-minute readings? All different answers by huge amounts.
      Watch for my post on this topic — but please don’t hold your breath.

  41. One trick is to admit a small crime to hide a bigger one.
    Has not the big problem always been the adjustments to the raw data?
    OK, give them credit for admitting to the existence of the mouse but when will they admit to the existence of the elephant?
    Eugene WR Gallun

  42. Kip,
    I liked the article but can see that some have decided that data from thermometers located near the surface will never be trusted………..unless, maybe if they showed a cooling trend.
    Satellite temperatures were cooler and did not show 2014 as the hottest year, which may be part of this. There are numerous issues for thermometer data and complicating factors (for satellites too) but I think this article addresses one of those issues well.
    Regardless of whether you agree or not, even the warmest data is coming in cooler than global climate model projections and shows no dangerous warming.

  43. As soon as the hysterical New York Times front page article came out a few weeks ago, I posted the following on my blog — https://luysii.wordpress.com which essentially said the same thing
    The New York Times and NOAA flunk Chem 101
    As soon as budding freshman chemists get into their first lab they are taught about significant figures. Thus 3/7 = .4 (not .428571 which is true numerically but not experimentally) Data should never be numerically reported with more significant figures than given by the actual measurement.
    This brings us to yesterday’s front page story (with the map colored in red) “2014 Breaks Heat Record, Challenging Global Warming Skeptics“. Well it did if you believe that a .02 degree centigrade difference in global mean temperature is significant. The inconvenient fact that the change was this small was not mentioned until the 10th paragraph. It was also noted there that .02 C is within experimental error. Do you have a thermometer that measures temperatures that exactly? Most don’t, and I doubt that NOAA does either. Amusingly, the eastern USA was the one area which didn’t show the rise. Do you think that measurements here are less accurate than in Africa, South America Eurasia? Could it be the other way around?
    It is far more correct to say that Global warming has essentially stopped for the past 14 years, as mean global temperature has been basically the same during that time. This is not to say that we aren’t in a warm spell. Global warming skeptics (myself included) are not saying that CO2 isn’t a greenhouse gas, and they are not denying that it has been warm. However, I am extremely skeptical of models predicting a steady rise in temperature that have failed to predict the current decade and a half stasis in global mean temperature. Why should such models be trusted to predict the future when they haven’t successfully predicted the present.
    It reminds me of the central dogma of molecular biology years ago “DNA makes RNA makes Protein”, and the statements that man and chimpanzee would be regarded as the same species given the similarity of their proteins. We were far from knowing all the players in the cell and the organism back then, and we may be equally far from knowing all the climate players and how they interact now.

  44. I fully appreciate Kip Hansen’s point that the Met is trying to craft a meaningful statement about their statistics. I still feel that they have missed the mark: The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius. One must realize that “accuracy” relates solely to the ability to find the actual mean temperature, in the same way an accurate archer hits the bull’s-eye. The sparse global sampling of temperature makes it impossible to know the global mean temperature to +/-0.1deg C. Though the Met acknowledges this sampling problem, they should instead have said: “The accuracy with which we measure temperature at [however many] temperature stations distributed around the globe is ….” This is important because that single average temperature they publish can change as new stations are added or as existing stations are removed, regardless of the actual change in global temperature. I suppose someone has the data of all individual stations, and could show how the changes in sampling density over time have altered the mean. Of course then the heat-island effects and urbanization effects would quite possibly exacerbate the global mean temperature rise, and complicate such analysis.
    Precision of measurement is quite different from accuracy, and rather speaks to the ability to reliably reproduce a measurement (or process, such as an archer making additional shots in sequence). In a global sampling scheme, each of the measurement stations has its own characteristic precision. Climatologists are wont to treat this as a random variable, a risky proposition with their scientific credibility at stake. Of course scientific thermometers have a quoted precision, but they are being used in a measurement process, not in a controlled standards laboratory.
    Here is a more mechanical example: Consider a wind gauge of the type with spinning hemisphere cups. In a lab, the device would be tested with a known flow rate of dry air at several different flow points. Then the precision would be calculated and reported with the instrument. But in use, the instrument is measuring a continuous variable. The inertia of the spinning component may cause the instrument to indicate faster wind speeds during periods of rapid wind subsidence. This means that there will be a bias in the measurements that is not reflected in the advertised precision, and treating the measurements as a random variable is not correct. This might be a very small effect, but would make it impossible to quantify minute differences in daily mean wind speed.
    Thanks to Mr. Hansen, we now have an excellent explanation of precision of the means, and its significance on “global” temperature records.
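
    The cup-anemometer point can be illustrated with a toy model. The Python sketch below is an illustration under invented assumptions, not a model of any real instrument: the response constants `RISE` and `FALL` and the square-wave gust pattern are made up, and the only feature that matters is that the rotor is assumed to speed up faster than it slows down.

```python
import statistics

RISE = 0.5   # hypothetical fraction of the gap closed per step when wind increases
FALL = 0.1   # hypothetical (slower) fraction closed per step when wind drops


def indicated_speeds(true_speeds, start):
    """Simulate a rotor that follows the wind with an asymmetric lag."""
    shown = start
    out = []
    for wind in true_speeds:
        rate = RISE if wind > shown else FALL
        shown += rate * (wind - shown)
        out.append(shown)
    return out


# Square-wave gusts: 10 steps at 8 m/s, then 10 steps at 2 m/s, repeated.
true_wind = ([8.0] * 10 + [2.0] * 10) * 50
shown_wind = indicated_speeds(true_wind, start=5.0)

print("true mean     :", statistics.fmean(true_wind))
print("indicated mean:", round(statistics.fmean(shown_wind), 2))
```

    Under these assumptions the indicated mean sits above the true mean however long the record runs, so the error behaves as a bias rather than as scatter that averages away.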

    • Excellent point, I constantly tell my Project Managers and Sales types that they cannot count on the specifications listed in the brochure for a piece of gear to be realized in the field. They are merely starting points measured from a known set of conditions… i.e. an “ideal” lab.

    • Reply to Matt ==> Read the HADCRUT4 paper — interestingly, they run a 100-ensemble error/accuracy model in their determination of the 0.1°C uncertainty range estimate.

  45. While there have been several comments that mention it, none have been very explicit with why the supposed law of large numbers is not necessarily applicable to climate temperature measurement. The fundamental problem is that most folks do not see the temperature data from the perspective of an instrumentation technician or engineer.
    No instrument calibration technician will claim that instrument errors may be assumed to be randomly distributed around some true value. All that will be claimed is that the instrument was adjusted and verified to read within its specified accuracy limits in his calibration lab setup. Fresh from the factory, similar instruments often show similar error profiles. Many years ago when working as an instrumentation technician, the practice was to adjust an instrument to match the calibration reference to within about one half the specified accuracy tolerance to allow a little slack for instrument drift over time. Whether instruments had similar or different error profiles was not a consideration.
    OK. So the instrument errors from true may not be assumed to be random, what else is there to consider? Well the issue is that a thermometer is actually reading its sensor temperature, not the world around it. It is the job of the instrumentation engineer to find a way to accurately and consistently couple that sensor to the process to be observed. As an example, that is the purpose of the Stevenson Screen enclosure used for our historic temperature data collection.
    Does a thermometer mounted in a Stevenson Screen accurately and precisely provide air temperature observations? Only relatively speaking. Each thermometer and each installation of Stevenson Screen will have its own variations from perfect. The best accuracy claimed for these installations was plus or minus one degree Fahrenheit of true air temperature. Remember, we are dealing with the real world and even that level of accuracy was likely seldom attained.
    What about modern electronic temperature measuring devices? A platinum resistance temperature detector (RTD) can easily provide a sensor accuracy of +/- 0.01 degrees Celsius. Allowing for errors and drift in the electronics used to read the RTD’s resistance, long term accuracy levels of better than +/- 0.05 degrees Celsius are fairly easily achieved. Remember though that is the sensor accuracy, not field true accuracy. With high accuracy instruments, how that sensor is installed in the field is the primary limitation to its overall observational accuracy.
    What should you conclude from this? An instrument’s accuracy should never be assumed to be better than its specified calibration accuracy. Instrument errors cannot be assumed to be random. Averaging multiple observations from the same instrument or from multiple instruments may allow some noise reduction but it will not improve the accuracy of those observations. As described above, the law of large numbers does not apply to instrument observations.
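
    A small Python sketch of the distinction drawn above between noise and calibration error. All numbers are hypothetical. The key assumption is that every reading shares the same offset, which is the worst case for averaging.

```python
import random
import statistics

TRUE_TEMP = 20.0  # hypothetical true air temperature


def error_of_mean(n_readings, shared_offset, noise_sd, seed=1):
    """Error of the averaged reading when all readings share one offset."""
    rng = random.Random(seed)
    readings = [
        TRUE_TEMP + shared_offset + rng.gauss(0.0, noise_sd)
        for _ in range(n_readings)
    ]
    return statistics.fmean(readings) - TRUE_TEMP


for n in (10, 1_000, 100_000):
    err = error_of_mean(n, shared_offset=0.3, noise_sd=0.5)
    print(f"n = {n:>7}: error of the mean = {err:+.3f}")
```

    The random part shrinks roughly as 1/sqrt(n), while the shared offset does not shrink at all. If each instrument instead had its own independent offset, averaging across instruments would reduce that part too, and that independence is the assumption being questioned here.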

    • Gary
      Excellent comments. As someone who has never been involved in any of the instrumentation issues, my focus is more on the human element. I think back over the last hundred(s) of years and visualize the tens of thousands (?) of individuals with innumerable different types of thermometers and different methods and different work habits and the issue of what time was the temperature read, etc etc and shake my head at anyone thinking that over this very long time span we can make valid comparisons. In our small organization with very tight controls and procedures, we had errors every single day. The aggregation of those involved in compiling temperatures is really a longitudinal organization with no tight controls and the opportunity for errors on a daily basis worldwide boggles my mind. If we could wave a magic wand and have the same individual record every single reading over the globe for the last several hundred years, there would still be an incalculable number of errors.

    • GaryW
      You say

      The fundamental problem is that most folks do not see the temperature data from the perspective of an instrumentation technician or engineer.

      That strongly suggests you have not read Appendix B of this.
      I think you will want to read that Appendix and probably the rest of the link, too.
      Richard

      • Richard,
        I do not see how anything in Appendix B invalidates what I wrote. It seems that what it says is that some folks think that instrument measurements cannot be reduced to a mean global temperature value because all instrument errors cannot be known so those folks think it is OK to make a best guess and make another guess about the 95% error range of their guess.
        Overall, as I read that appendix, the point it made was similar to what I wrote. Maybe I incorrectly took your comment and recommendation as a criticism.

      • GaryW
        I made no criticism. I said I thought you would want to read an account by 18 scientists from around the world which made your point more than a decade ago.
        Richard

    • Reply to GaryW ==> Thanks for that interesting and insightful Instrumentation Engineer’s Viewpoint — I think that helps some to clarify the issue.

    • Thank you GaryW for injecting valued experience to this discussion.
      While the purpose and intent of the Stevenson Screen was to reduce error, hindsight observations reveal it as a prima donna device, since relocation by almost any distance is very often quoted as a reason to homogenise. Paint or whitewash old or new, height above ground or growing grass or bitumen or bare soil and many other variables have been noted.
      You are entirely correct to note the responsibility of the user to provide reproducible surroundings for a high accuracy (in the lab) device.
      To the extent that the actual record shows laxity of this purpose, claims of +/- 0.1 deg are doomed to drown in wordsmithing while the real world marches on.

  46. The idea that we can measure global average temperature to 0.1degC is a ridiculous assertion. One only has to look at a weather map of the UK to know that we cannot measure the average temperature of the UK to 0.1degC despite the fact that the UK has far better spatial coverage than the globe as a whole. Mountain and coastal areas have their own micro-climates and they are often under sampled. That makes a big difference in a country like the UK.
    Today, the weather report suggested that there was going to be a divide of fog broadly over the Pennines. Under the fog the temperature was forecast to be 1 degC, where there was no fog it was forecast to be 8 degC. So which thermometers are being used to compile CET; the ones showing 1 degC, or the ones showing 8 degC?
    The same is so with the difference between urban and rural temps which is frequently stated to be 4 to 6 degC difference, but can, of course, be more than that.
    On a global basis, with all the station drop outs since the 1960s/70s, it cannot possibly be the case that the margin of error is only 0.1degC, especially given that the distribution of the stations that have dropped out does not appear to be random and equally distributed.
    In fact, I would suggest that the minimum margin of error is double the difference between what GAT would show on raw unadjusted data, and what GAT shows in the compilation homogenised/adjusted data set.
    Of course, one would hope that the homogenisation/adjustment is an improvement, but that might not be the case. The homogenisation/adjustment may even make the data worse. For example, one frequently sees examples where adjustments made to take account of UHI are counter-intuitive.
    Every institution who compiles a data set should be forced to show what the unadjusted raw data set shows, in addition to their homogenised data set.

  47. I believe Kip is wrong here since we’re talking about the 95% uncertainty range. The probability that multiple readings are all at their extreme value gets lower as the number of readings increases, therefore the 95% uncertainty range should get narrower with more readings assuming unbiased and independent error distribution. The kindergarten example quietly assumes we’re talking about the 100% uncertainty range, but we’re not.

    • You can actually try this out with a die. With one die the probability of getting a given extreme value (1 or 6) is 1/6. With two dice the probability that the average is at that extreme drops to 1/36. No matter how many dice are in play the 100% uncertainty range is still 3.5 +/- 2.5, yes, but the 95% uncertainty range narrows with more dice.

      • I agree with Pax on this. I was going to use the dice example but chose to use Kip’s examples in a post which appears further down. As the number of measurements increases the probability that the true value of the mean is near the extremes is vanishingly small.
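
        The dice experiment can be checked exactly rather than by rolling. The Python sketch below enumerates the distribution of the total of n fair dice with exact fractions. The helper names and the choice of a range centred on 3.5 are illustrative assumptions, and independent fair dice are assumed throughout.

```python
from fractions import Fraction


def sum_distribution(n_dice):
    """Exact probability of every possible total of n_dice fair dice."""
    dist = {0: Fraction(1)}
    for _ in range(n_dice):
        nxt = {}
        for total, p in dist.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, Fraction(0)) + p / 6
        dist = nxt
    return dist


def half_width(n_dice, level=Fraction(95, 100)):
    """Smallest half-width about 3.5 that holds at least `level` of the averages."""
    dist = sum_distribution(n_dice)
    centre = Fraction(7 * n_dice, 2)
    for k in sorted({abs(total - centre) for total in dist}):
        covered = sum(p for total, p in dist.items() if abs(total - centre) <= k)
        if covered >= level:
            return k / n_dice
    raise AssertionError("unreachable: the full range always covers everything")


for n in (1, 2, 5, 10):
    p_all_ones = sum_distribution(n)[n]
    print(f"{n:>2} dice: P(average = 1) = {p_all_ones}, "
          f"95% half-width = {float(half_width(n)):.2f}, 100% half-width = 2.50")
```

        With so few possible outcomes, one or two dice still need the full +/- 2.5 to reach 95%. The narrowing only shows up once more dice are averaged.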

    • Reply to pax ==> That would apply if we were talking about a statistical uncertainty range (of any percentage). We are not talking about statistical uncertainty at all. We are talking about measurement accuracy/measurement error.
      The simplest way to look at it is: Mr. Jones, co-op weather station volunteer, goes out to his Stevenson-screened thermometer and looks at it. The thermometer reading is between 71 and 72 but closer, he feels, from his viewing angle today, to 71. He records 71. Now, 71 is not the actual temperature of his station at that time. It is only what he records on the official records. The truer, more accurate statement, is that the temperature recorded (when the record is viewed at some time in the future) was 71(+/-0.5) because we know that he was required to round to the nearest whole degree– thus all we know is that (if the thermometer was correct, in spec, calibrated, etc etc) that the temperature for that station at that moment was between 70.5 and 71.5. We write that knowledge as 71(+/-0.5). The likelihood that the temperature was actually factually exactly 71 is vanishingly small. We only know it was somewhere in that range, but not where. There is no scientific or mathematical reason to believe that any point in the range is more likely than any other point.

      • I understand the uncertainty involved in one reading. But I thought that you were making the argument that averaging a number of measurements does not decrease the uncertainty, I say that it does. If we say that the true temperature were 71.6 and you had 100 people read the thermometer and then took the average, then you would get an answer that was closer to the true value of 71.6. The probability that you would get exactly 71 is vanishingly small since this would require all 100 people to “feel” that it read 71. Therefore averaging draws the result closer to the true value. This seems obvious to me.
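
        The 100-readers thought experiment can be simulated. In the Python sketch below each reader perceives the thermometer with some random scatter and then logs the nearest whole degree. The scatter sizes are hypothetical and chosen only to show the two limiting cases.

```python
import math
import random
import statistics

TRUE_TEMP = 71.6  # the value used in the comment above


def average_of_logged(n_readers, eye_error_sd, seed=0):
    """Average of whole-degree log entries made by n_readers of one thermometer."""
    rng = random.Random(seed)
    logged = []
    for _ in range(n_readers):
        perceived = TRUE_TEMP + rng.gauss(0.0, eye_error_sd)
        logged.append(math.floor(perceived + 0.5))  # round to nearest whole degree
    return statistics.fmean(logged)


print("no reading scatter :", average_of_logged(10_000, eye_error_sd=0.0))
print("0.5 degree scatter :", round(average_of_logged(10_000, eye_error_sd=0.5), 2))
```

        With no scatter every reader logs 72 and the average stays at 72. With scatter comparable to the rounding step, the average moves toward 71.6. Which case applies to an archived record, where there is only one log entry per reading, is the point in dispute here.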

      • Reply to Pax ==> You are talking about a statistical ideal. In my example to you just above, I showed that the recorded value “71” represents a value range from 70.5 to 71.5. Looking at the record, we can know nothing closer to the “actual temperature” than that.
        Read the original essay above, just the maths part. See that finding the mean of values that are ranges — these look like this 71(+/-0.1) — produces a mean that must be expressed as a range.
        The unfortunate fact is that a range of measurement is notated exactly the same as statisticians notate things like Confidence Intervals — which are animals of a different stripe altogether.
        Actually do the three-data data set experiment described in the essay and see if it doesn’t change your viewpoint.
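
        A short Python sketch of averaging values that are really ranges. The three log entries are hypothetical stand-ins, not the data set from the essay, and the bounds are worst-case: they assume nothing about how the individual rounding errors are distributed.

```python
def mean_of_ranges(values, half_width):
    """Mean of recorded values, with the worst-case bounds implied by rounding."""
    lows = [v - half_width for v in values]
    highs = [v + half_width for v in values]
    n = len(values)
    return sum(values) / n, sum(lows) / n, sum(highs) / n


recorded = [71, 73, 70]  # hypothetical whole-degree log entries
centre, low, high = mean_of_ranges(recorded, half_width=0.5)
print(f"mean = {centre:.2f}, possible range {low:.2f} to {high:.2f}")
```

        The mean carries the same +/-0.5 worst-case range as each entry. A statistical interval such as the 95% range discussed in this thread needs extra assumptions (independent errors, no shared bias) and is a different quantity.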

  48. In my lounge, I have an old (about 30 years) spirit thermometer. Over the past few years, I have suspected that it under-records temperature. I was not of that view say 12 to 15 years ago.
    Today, I checked it against a modern electronic (thermocouple type) thermometer. The difference between the two was 1.5deg C; the spirit thermometer reading cooler.
    Of course, it is likely that both thermometers are wrong. But it is likely that the spirit thermometer has been degrading over time, particularly over the last few years.
    Equipment changes and degradation alone are likely to give an error of not less than 0.1degC (I would guess more like 0.25degC). Indeed, I recall reading a paper on Stevenson Screen degradation that suggested that wearing/degradation of the paint/wood could be in the order of 0.4 to 0.5degC, and Anthony has also done experimentation on the impact of modern day latex paints.
    The impact of this type of degradation is not normally distributed.
    The scientists are kidding themselves when they claim that their data sets have an error margin of only 0.1degC, and we know why that is the case; the easiest person to fool is oneself especially when you have ‘a Cause’ to promote and/or to fund your relevance and pay check.

    • Yes Richard, liquid in glass thermometers degrade over time. Glass ‘creeps’. That is just one of the reasons that past temperatures are retrospectively ‘corrected’ ( homogenised).

      • ‘correction ‘ is not the problem , its how you ‘correct’ that matters or rather your motivation for your ‘corrections’

      • Mercury/alcohol in glass thermometers read higher over time as the glass shrinks reducing the diameter of the liquid column.

        Thus, temperature readings are the most accurate the further back in time one looks, and no adjustment should be made to the past. The present should be adjusted downwards, to allow for increasing readings with age.

  49. (1.9 + 1.6) / 2 = 1.75 is mathematically correct but scientifically incorrect. If the measurements are accurate to one tenth, and the measurements are presented with a precision of one tenth, then the average cannot be presented with a precision of one hundredth. Measurements do not “gain precision” through averaging.

    • Oh yes they do. The average is not itself a measurement. This was the subject of a discussion we had many years ago on Dave Wojic’s Climate Change Debate. Even when presented with a spreadsheet illustrating this fact, some still clung to their mistaken belief. So it goes…

      • Example:
        2+10+3+4=19
        19/4=4.75 (average of the four single digit precision measurements)
        The argument that this should be rounded to 4.8, or 5 makes no sense since:
        4*4.8=19.2 and 4*5=20, neither of which are equal to 19.

    • Reply to Sal and The Git ==> There are lots of opinions on the “averages reported at precisions more precise than the original measurement accuracy.” I sometimes call this the “Average Number of Children problem”. The Average Number of Children under 18 at home in an American family last year is given by Census.gov as 1.9 children. A figure precise to tenths of a child, yet there is not a single family in the US that has 1.9 children, so while the precision is high, the actual precise average is nonsensical and has an “accuracy” of zero. We simply do not measure children in decimal portions. One child or Zero children or Two children, yes. 1.9 children no.
      1.5 pairs of shoes? 0.5 wives? 17.23498 yardsticks in stock?
      Some things and measurement data are not suited to expression as averages more precise than the measurement units.

      • 1.9 children/couple is a meaningful number because the number of children is an integer. One is written as 1 but it is really 1.000 . . . , so 1.9 is scientifically accurate, but writing 3/7 as .428571 is not if 3 is just 3 not 3.000 … etc. etc.

      • Reply to luysii ==> I’m having trouble understanding your point….give the explanation another try will you?
        I know that 1.9 is “scientifically numerically accurate” but remains nonsensical in the realm of real American families and is accurate to a percentage of ZERO% — as not a single family out of the ~ 122.5 million families in the USA, has 1.9 children.
        Precision and accuracy are dependent on field of endeavor and application.

      • In truth, averages are convenient fictions that allow our minds to grapple with large populations. Exemplary for me was the Spitalfields exhumation conducted some years ago when a crypt was to be demolished. The human remains were given to forensic specialists who were asked to estimate the age at death. Their estimates were a decade or more younger than the actual age at death. The forensic experts knew that the average age at death in that part of London in the 19thC was very low. That average was badly skewed by the very high infant mortality with most dying during the first five years of life. Past the hump, there were plenty living into their 80s and 90s. Average age at death was 29 in the 1840s, but that didn’t mean there were an extraordinary number popping their clogs in their 20s and 30s.

  50. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.

    Then all the climate models are wrong because they deviate from the real temperature by more than 0.1 degrees Celsius.

  51. Thanks to all for being patient — Sunday mornings for me are taken up with religious observance. I am just getting to address your comments seriously now, 1:30 pm Sunday Eastern Time. I do intend to answer and reply to those that seem to need attention. NB: If I fail to address your comment and you were hoping I would — please post it again, leading with a little bit like “Please respond to this point” and I will try to do so.

    • “. The result in short is, that one might be able only but under best conditions to reach an uncertainty of ± 1 to 2 K respectively. ”
      If this were true of the Global Averages, Michael, surely we would not be able to discern the surface-measured series following the perturbations in the independent satellite series so clearly, and yet confined within a +/-0.5 K range, as can be seen illustrated here, for example: http://jennifermarohasy.com//wp-content/uploads/2009/05/tom-quirk-global-temp-grp-blog.jpg
      though perhaps that’s not what you were suggesting.

  52. Hi Kip,
    I can only support your reasons to question. Especially their claim to be able to “measure” anomalies as well as absolute global mean temperature with an uncertainty of only ± 0.1 K is beyond any practical and theoretical reasoning.
    I have worked on this topic for many years. For details see my paper in E&E http://www.eike-klima-energie.eu/uploads/media/E___E_algorithm_error_07-Limburg.pdf. The result in short is, that one might be able only but under best conditions to reach an uncertainty of ± 1 to 2 K respectively. That would of course finish all discussions about the question which year was the “hottest”! And with it plenty of others. I would like to discuss this result with you further, please contact me by email at limburg@grafik-system.de
    regards
    Michael Limburg

    • Reply to Michael Limburg ==> Thank you for checking in from Europe. As of yet, I have no opinion whatever on the real size of the original measurement error that ought to be applied to something as vast as the GAST (HADCRUT4 Land and Sea or otherwise). However, I’ll send you an email so we can touch base.
      Thank you for taking the time to read my essay here at WUWT.

  53. This Telegraph article just made the DRUDGE report, which assures millions of extra readers.
    It calls fiddling with the temperature record “the biggest science scandal ever”. It also has a survey, which shows that [currently] 92% of readers agree.
    Maybe the tide is turning…

    • If the fiddling is as great as stated by some (which is doubtful, but there are several legit issues) then it is steepening the slope of the temperature increase with time.
      If the end of the data collection period was fixed permanently, then the fiddling (defined as decreasing temperatures from decades ago and increasing recent temperatures) could be maximized.
      However, here is the problem. If the fiddling temperature slope has been increased, let’s say from 1950 to 2010, it means that new observational data, must be fiddled with even more to maintain that slope or it backfires……………if you use the unfiddled, most recent data and compare it to the fiddled, just prior data and increased slope.
      With time, it becomes harder and harder, then impossible to maintain the temperature fiddling slope unless actual temperatures do actually increase.
      If 2014 temperatures were fiddled with to barely nudge us into the “hottest year ever” category, then, without temperatures actually rising, the 2015, 2016, 2017 temperatures must be fiddled with even more to get even hotter.
      Fiddling during the first decade of this century, just makes it harder to show the next decade is even “hotter”.
      Regardless of how clever those who fiddle with temperature data are, it would be harder and harder to continue to do this to accomplish the task with time…………in fact, if there was fiddling, it eventually backfires and requires a bigger increase in actual temperatures just to catch up to the recent temperatures + the fiddled higher amount.
      Not claiming that the temperature records are completely reliable and definitely not all the instruments or their locations, just that any benefits to increasing the slope of the temperature uptrend are completely maxed out here.
      Future temperatures will only look cooler compared to recent ones if this was the case. I found it hard to believe that this could be maintained or increased in the future to the level needed without it being more and more obvious.

      • Not really, Mike. When they run the homogenization program, ALL prior temperatures are adjusted based on “current” inputs. Is anyone in the general public going to remember the temperature stated in 2014, 2013, 2012, 1970, 1945, 1935 …
        Not likely. The media will publish what is put in front of them and create a new scary headline.
        We already have lots of examples of this and there is no hue and cry from the public. It is now easy to just keep producing computerized output to fit the agenda.
        At least until the weather turns ugly and wakes everyone up.
        But for now, let it stay warm. Much better than cold.

  54. It is the treatment of errors in measurement and data sets in climate science that has always amazed me. Amazed in the sense that, for too long, the alarmist scientists have been allowed to quote values to more decimal places than the errors quoted. When I read Physics, the calculation of errors and their application to experimental results was hammered into us little proto-physicists. If you handed in lab work and quoted results to a higher accuracy than your errors you failed. Not only that, you were laughed at for making such a basic and easily avoided mistake.
    Now, everybody who has studied science knows this but it is rife in climate science. These people don’t even blush when they do it but must know they are being, to put it politely, less than rigorous.

    • Why change a habit that has been so rewarding ?
      This is an area where the value of your ‘research’ lies not in the quality of your data but in the ‘impact’, especially in the press, that the release of your paper has. Look at the behaviour of climate ‘science’s’ leading voices and you will see why this approach is not seen as a problem, but as a normal way to work.

  55. I think you’re making too much of sloppy English, viz: “The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
    The sloppiest word is “measure”. It should read “construct” since it’s nothing more than a statistical inference and has little to do with reality. After all, what is it supposed to convey? Is it related to energy retained by the entire atmosphere and/or oceans or is that merely what we’re supposed to perceive it to mean? If so it’s hopelessly inadequate. For starters it totally disregards latent heat in various places.
    The word “accuracy” is therefore not just sloppy but wholly inappropriate. Perhaps the entire thing would be better written as:
    “The statistical confidence with which we can construct a hypothetical global average surface temperature for 2010 is around one tenth of a degree Celsius but this tells us nothing about how much energy was gained or lost by the planet since 2009.”
    Or you could just append “+/- several Hiroshima bombs, give or take” for effect.

    • Reply to AJB ==> Alas, it is not my language, but the language of the Met Office UK. However, it is an important point in this issue that I (the author of this essay) am talking about — and believe that Met Office UK is talking about, “the accuracy with which we can measure the global average temperature”. It matters not whether they give a quantity that we agree with, they are talking about accuracy of measurement and NOT NOT NOT “statistical confidence” — that is what is so great about the way they stated what their Uncertainty Range was….accuracy with which we can measure….

      • Yes Kip, I’m aware of that and take your point. My point is the Met Office’s language is “sloppy”, not yours. Words mean different things to different folk. What they’ve written is as much farce as tragedy; it’s all in the eyes of the beholder. How is the man on the Clapham omnibus (maybe on his way to a media job somewhere) likely to interpret that? The Met Office’s target audience is surely not science specifically but the public at large.

      • Reply to AJB ==> Even the man on the Clapham omnibus can understand their statement “The difference between the median estimates for 1998 and 2010 is around one hundredth of a degree, which is much less than the accuracy with which either value can be calculated. This means that we can’t know for certain – based on this information alone – which was warmer.”
        When at last at Wimbledon, I missed riding the omnibus, but enjoyed the down-and-dirty fish and chips!

      • LOL! I hope you had a portion of mushy peas with that. Constructing a global average temperature is about as useful as estimating the average size of pea and projecting the radius of carton the mush will eventually be served in. We already know the mush will be green though 🙂

  56. Kip Hansen said in opening his guest post at WUWT,

    Those following the various versions of the “2014 was the warmest year on record” story may have missed what I consider to be the most important point.

    then concluding his guest post at WUWT Kip Hansen said that most important point was,

    UK Met Office is my “Hero of the Day” for announcing their result with its OME attached – 0.56C (±0.1˚C) – and publicly explaining what it means and where it came from.

    Kip Hansen,
    A very stimulating post. Thanks.
    Why would a reasonable person involved in the climate science discourse consider the UK Met Office a hero (of the day) when it does something that (arguably) explains clearly in a professional manner what their logic, context and process is?
    Your guest post helps us to understand better how there is open distrust of climate focused science if what you said the UK Met Office did is considered heroic instead of just (arguably) being basic professional conduct.
    John

    • Reply to John ==> Of course, you are right that it is extraordinary that something as simple as stating their results with a frank and transparent guess-timate of the accuracy of measurement should elicit kudos, congratulations and a declaration of heroism. However, in Climate Science, this is the case today.
      I do not offer my congratulations to them as sarcasm or irony. Their statement is somewhat of a Game Changer for governmental climate organizations. In one little statement, they validated the skeptical viewpoint of The Pause…..they eliminate all senseless arguing about data that are a mere few hundredths of a degree different, even those a tenth of a degree different. Of course, these are degrees C…the numeric values are thus larger digits when speaking in degrees F (degrees C being larger than degrees F — 0.1°C is about 0.18°F).
      Thanks for checking in.
      [Rather, “.. in Climate Science, this is not the case today” ? .mod]

      • Kip Hansen,
        I really really want to share your level of optimism . . . I am optimistic . . . . but less so than you seem to be . . . I hope I am wrong in being less optimistic.
        John

      • Reply to .mod ==> It is the case today that it is extraordinary that something [ syntactically, we could have anything here, like: as simple as stating their results with a frank and transparent guess-timate of the accuracy of measurement ] should elicit kudos, congratulations and a declaration of heroism.
        [And thus: no edit requested, no edit granted. 8<) .mod]

    • Enough bitching against the Met Office.
      Read what they published, particularly, at their site given below their 2010-2013 update.
      http://www.trebrown.com/hrdzone.html
      The door in front of the office was changed from “No Smoking permitted inside” to “No Statistician permitted inside”

  57. Kip
    I’m not entirely sure I agree with the general conclusion of your post. Looking at your examples you have example 1, i.e.

    Here’s our data set: 1.7(+/-0.1)

    Yep – I agree with you here. The measured value is 1.7 but the true value could lie anywhere between 1.6 and 1.8 – with each value in the range having equal probability of being the true value.
    Now we move on to Example 2

    Here is our new data set: 1.7(+/-0.1) and 1.8(+/-0.1)

    And this

    If both data are raised to their highest value +0.1:
    1.7 + 0.1 = 1.8
    1.8 + 0.1 = 1.9

    Now from this we can conclude that 1.85 could be the true mean of the 2 values (1.7 & 1.8) but there is only one way this could happen and that is if BOTH measurements were low by the “maximum” measurement error. Similarly 1.65 could be the true mean only if BOTH measurements were high by the “maximum” measurement error.
    In fact, the mean of the actual measured values (1.75) has more chance of being the true mean because there are more combinations of measurements which will produce that value e.g. +0.1 & -0.1, +0.09 & -0.09, +0.08 & -0.08 ……. Etc.
    As the number of measurements increases, the probability that the true mean is at the extremes of the measurement error decreases.
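
    The counting argument above can be sketched as a short enumeration. This is only an illustration under assumptions that are not in the comment itself: each reading's error is taken to be one of 21 equally likely values from -0.10 to +0.10 in steps of 0.01, and the helper name `error_sum_counts` is hypothetical.

```python
from collections import Counter
from itertools import product

# Assumption for illustration: each reading's error is one of the 21 values
# -0.10, -0.09, ..., +0.10, held as integer hundredths to avoid float noise.
ERRORS = range(-10, 11)


def error_sum_counts(n_readings):
    """Count how many error combinations produce each total error (in hundredths)."""
    return Counter(sum(combo) for combo in product(ERRORS, repeat=n_readings))


counts = error_sum_counts(2)
total = len(ERRORS) ** 2

# A mean error of 0.00 needs the two errors to cancel; a mean error of +0.10
# needs both errors to be +0.10, i.e. a total of +0.20 (20 hundredths).
print("combinations giving mean error  0.00:", counts[0], "of", total)
print("combinations giving mean error +0.10:", counts[20], "of", total)
```

    Whether real measurement errors are equally likely across the range, and independent of one another, is exactly what the replies that follow dispute; the sketch only counts combinations.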

  58. Reply to Finn ==> You have caught me short-forming the explanation. Of course, my examples are just the extremes, because we are talking about a range. The means of any of the interim points will be within the means of the extremes…I thought that would be so obvious to the science-oriented crowd here that I did not explicitly state it. In the end, it is the two far ends of the range of the means that we are interested in because they will give us the Range of the Mean. I try to clearly state that the actual value of the mean is a RANGE more properly expressed as:
    What is the mean of the widest spread?
    (1.9 + 1.6) / 2 = 1.75
    What is the mean of the lowest two data?
    (1.6 + 1.7) / 2 = 1.65
    What is the mean of the highest two data?
    (1.8 + 1.9) / 2 = 1.85
    The above give us the range of possible means: 1.65 to 1.85
    with a central value of 1.75
    Thus written 1.75(+/-0.1)
    We have no interest whatever in any probabilities of anything. I can state an exactly correct range — in the physical world, every point within the range is just as probable as any other point. Means are only probabilities in statistics.
    Your solution would be correct only if we thought that the world magically declared that actual temperatures (for instance) must be normally distributed in respect to their distances from the arbitrary markings on both F and C graduated thermometers.
    In the real world however, our actual temperatures, that data for which we have only represented as ranges, do not behave that way. They could be anywhere equally, but there is no reason that they couldn’t all be crowding the .0999999s.
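
    The range-of-means arithmetic in this comment can be written out as a small sketch. The function name `mean_range` is hypothetical, and the code computes only the bounds; it takes no position on how likely any value inside the range is.

```python
def mean_range(readings, half_width):
    """Return (lowest, central, highest) possible mean when every reading
    is only known to lie within reading +/- half_width."""
    n = len(readings)
    lowest = sum(r - half_width for r in readings) / n
    central = sum(readings) / n
    highest = sum(r + half_width for r in readings) / n
    return lowest, central, highest


# The two-value data set from the essay: 1.7(+/-0.1) and 1.8(+/-0.1)
low, mid, high = mean_range([1.7, 1.8], 0.1)
print(f"range of possible means: {low:.2f} to {high:.2f}, central value {mid:.2f}")
```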

    • We have no interest whatever in any probabilities of anything. I can state an exactly correct range — in the physical world, every point within the range is just as probable as any other point

      No, Kip, every point within the range of a MEAN is not “just as probable”. On a single measurement – yes – but not when the mean of several measurements is calculated.
      The dice example given by PAX (above) is a good analogy. The outcomes of a single throw of a die are all equally likely (i.e. 1/6 each), so the mean value after one throw has equal probability of being 1, 2, 3, 4, 5 or 6.
      The MEAN value after 2 throws, however, does not have EQUAL probability. E.G
      The probability that the mean is 6 (i.e. 2 sixes) is 1/6 x 1/6 = 1/36 whereas the probability that the mean is 3.5 is 6/36 or 1/6. In other words it’s 6 times more likely that the mean is 3.5 than it is that the mean is 6 (or 1).

      • Perhaps I should just add that the probability that the mean is at either end of the measurement error becomes so small that it is effectively ZERO if about 5 or 6 measurements are taken.
        Even when we use discrete values, as in the case of dice, the probability that the true mean is 6 (or 1) is tiny. After 5 throws of a die the probability that the mean score is 6 (or 1) is about 0.013%.
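
        The dice probabilities quoted in this sub-thread can be checked by brute-force enumeration. This is a minimal sketch assuming fair six-sided dice; the function name `mean_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def mean_probability(n_throws, target_mean):
    """Exact probability that n_throws of a fair die average to target_mean."""
    target_sum = target_mean * n_throws
    outcomes = list(product(range(1, 7), repeat=n_throws))
    hits = sum(1 for outcome in outcomes if sum(outcome) == target_sum)
    return Fraction(hits, len(outcomes))


p_two_sixes = mean_probability(2, 6)              # both throws must be 6
p_mean_3_5 = mean_probability(2, Fraction(7, 2))  # any pair summing to 7
p_five_sixes = mean_probability(5, 6)             # all five throws must be 6

print(p_two_sixes, p_mean_3_5, p_five_sixes)
print(f"five sixes in a row: {float(p_five_sixes):.4%}")
```

        Using `Fraction` keeps the results exact, so the ratio between the two-throw cases can be read off directly.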

      • Reply to John Finn ==> Give me an example with values that are themselves ranges. That is what I (and most others here) have been talking about.
        The range of die throw values is properly represented as 3.5(+/-2.5), with the limiting case that only whole integers are allowed. Every throw of the die will fall within that range — the range is 100% correct. This mean as a true range needs no probability — it is exactly arithmetically correct for all cases always.
        Your mean, while six times more likely than some other number, is only 1 in six.
        The actual arithmetical mean, expressed as a range, is always correct — 100% of the time.
        Of course, I have cheated, haven’t I? I include all the possible values in the range.
        As for my range of means for actual measurements, using temperatures as our real world, since the range is created by a rounding rule, and the whole integers are entirely arbitrary, there is no reason for the numeric median of the range of means to be any more likely than any other value in the range, since the range is NOT CREATED by the math, but by the original measurement.

  59. Reality vs. Statisticians:
    Welcome to the
    Plant Hardiness Zone Map of the British Isles.
    The plant hardiness zone map of the British Isles is the most detailed ever to be created for this region, and is the product of many months work studying the average winter climate statistics for the periods 1961 to 2000 recorded by the Irish and UK Met Offices.
    The USA first undertook climatic studies to provide a guide map for plant hardiness of the North American continent. These were undertaken by two independent groups: The Arnold Arboretum of Harvard University in Cambridge, Massachusetts, and the United States Department of Agriculture (USDA) in Washington, D.C. See the USDA map.
    When we took it upon ourselves to create the Plant Hardiness Zone Map of the British Isles we replicated the same zones, based on the same equivalent temperature scale as the USA to form a basic standardisation. However, we have changed the colour coding for the 7 split zones occurring in the British Isles to colours which are more meaningful to the average user. See the map using the USDA colour scheme. And click to hide it. You may also switch the map’s colour zone information off and on to display a physical map of the British Isles including the warmer major towns and cities, which appear as white patches, and also the various cooler elevations, which appear as darker patches.
    Both our map and the USDA maps are inadequate due to factors such as the frequency and duration of cold outbreaks. The main problem is the fact that the British Isles lies so far north of the equator, where both winter and winter nights are long. No other place on the planet, which shares similar winter temperatures is situated so far from the equator, and therefore our problems are quite unique. The noticeable differences are to be seen in our zones 10a & 9b. Whereas, plants from other countries’ zones 10a & 9b can withstand short, colder outbreaks than us and survive, in the British Isles they must endure a winter which is several months long with low light levels and wet weather. Consequently, there are very few plants, labelled as zone 9 or 10 that can be grown here.
    It is important to understand that average minimum air temperatures (protected from wind & direct sun) are used in defining these zones. Ambient temperatures will be lowered by local frost pockets and by wind-chill. Although, plants do not suffer from the effects of wind-chill like we do, they can get dehydrated and suffer from windburn in cold easterly winds. Therefore, you must read these zones as the maximum potential temperature for an area once windbreaks have been put in place. For example – If you see from the map that you are in a potential zone 9a then you can over winter plants outdoors, which are suitable for zones 8b, 8a, 7b etc. You will only be able to over winter plants suitable for zone 9a once protection has been put in place and a favourable microclimate created.
    Essential reading: Predicting Cold Hardiness in Palms Trebrown Blog.
    2010-2013 Update
    We had plans for an updated Plant Hardiness Zone Map of the British Isles, and many months work had already gone into this by 2011. The general trend had been a period of increased warming over the decade since we first published the map. However, the British Isles were then hit by two successively cold winters and a third not quite so cold but nevertheless colder than we had been used to. The results of these cold spells has been to bring the pattern of the map back similar to what it was. There are only slight pattern changes from the original map, and for this reason we have decided to save work and retain the original map without an update, as this is the best match for the British Isles without any temporary warming patterns which may be misleading.

  60. I just want to make sure that the ……
    Please read again instead of arguing about +or- 0.00001C

  61. Reply to rd50 ==> Ah, nice point. I had to defend this type of viewpoint in my Baked Alaska essay. Many climate concepts are only important locally and are — how do you say? — ‘break points’ — like your Plant Hardiness Zone Maps. Fairbanks, Alaska has had average temperatures level or slightly falling for 30 years, but growing season increasing — as growing season depends on contiguous frost-free days.
    Likewise, precipitation as a yearly average is often meaningless. Farmers need rain in the right amounts and at the right times….far more important than annual averages. This is different for California snow pack — they need high annual numbers, the more the better. And so on….

      • Do you see the point?
        You go ahead about statistical analysis: accuracy, precision, average, corrections and on and on, but then you come back to reality: growing season, frost-free days, rain in the right amounts…..far more important than annual “global” averages…. Yes, I like this, getting back to reality. I like the Met Office, contrary to what I most often read here. I most often read here that the Met Office is terrible (or something like this).
        As a farmer, the Met Office is great information as given by them above. (Just as an aside, this info to farmers from the Met Office is available in USA, Canada, Australia, China, India and probably other countries I will admit I did not look for).
        The statisticians and their “global average” is stupid. Not the statisticians are stupid, the global average is stupid. There is no meaning to “global average” and I don’t care if their “global averages” are +or- 0.1 or 0.0001.
        This is not to denigrate your effort in the statistical information area, far from this, I assure you. Thank you.

      • Reply to rd50 ==> Well, good. I see we agree after all!
        Thanks for your input. Yes, and some here in comments object to the idea that anyone would say anything nice about the Met Office, just on [ misguided ] principle.
        Hopefully, their forthrightness will impact other governmental agencies that produce metrics or information about climate. Did you read my essay on MCID ? It touches on this aspect of climate science metrics.

  62. Yes I did read all the posts from the first to the last before I posted above.
    So, what is the answer from all you posters +or- statistical analysis.
    You gave no answer. Just complaints.

    • Reply to rd50 ==> I am not sure whom you are addressing with this.
      I thought I was agreeing with you. Who is disagreeing?
      I have found it useful here (and elsewhere on the blogs) to indicate who and what I am replying to. Sometimes comment nesting is confusing or puts your comment out of thread order or in the wrong thread altogether.

      • Sorry Kip Hansen. Not very familiar with the system here. I did answer above.
        I appreciate your questions/answers to statistical analysis and following discussion. No problem with such.
        I simply wanted to point out that the Met Office often under criticism here is very realistic and does not rely on +or- 0.1 C to give advice to farmers. And you also agreed with such, local is important.

  63. Reply to luysii ==> I’m having trouble understanding your point….give the explanation another try will you?
    I know that 1.9 is “scientifically numerically accurate” but remains nonsensical in the realm of real American families and is accurate to a percentage of ZERO% — as not a single family out of the ~ 122.5 million families in the USA, has 1.9 children.
    Precision and accuracy are dependent on field of endeavor and application.
    Kip: Thanks for responding. When things are measured in integers (like hits in baseball or the number of children in a family) the precision is (nearly) infinite. A batter either gets a hit or he doesn’t. A good hitter never gets .323 hits in a given bat. It’s an average, just like 1.9 children. To a population geneticist or someone trying to figure out if Social Security will be solvent in 30 years, 1.9 children is a very useful and (presumably) accurate number.
    What experimentalists are taught (or should be) is that any observation is inherently error prone, and that no more significant figures should ever be reported as an average of a series of measurements than the number of significant figures in an individual measurement.
    That’s what so great about the work you cite. They give the accuracy of the measurement at .1 C, meaning that the increment of .02 C trumpeted by the NYT has no scientific meaning.
    I don’t know how many observations went into the final number for global temperature in 2014. Let’s say 10^9. Can we say that the average global temperature was xxx.123456 C? Mathematically we could, experimentally we can’t.
    I hope this helps

  64. To Kip Hansen.
    Yes I read your previous post MCID. OK with it. How to relate different issues/differences? Certainly worth thinking about.

    • To Kip Hansen
      Come to think about it, the Met Office is giving to farmers the kind of advice you discussed in your MCID post!
      I love the Met Office. Forget the statisticians +/-0.1C.

      • BoM here in the Land of Under is justly criticised for its climate fantasies, but invaluable for their excellent weather forecasts. Perhaps they should stick to what they are good at 🙂 Speaking as a farmer (Ret.).

  65. Going all the way back to the top, the figure that is actually being quoted is not the temperature but the temperature anomaly. This is the difference between two temperatures and thus suffers from error propagation effects. The quoted error should therefore be sqrt(2) times greater than the error in the temperature measurement. If we take the raw +/-0.1 value, the correct error in the anomaly is +/-0.14. If we take the larger figure of +/-0.16 as quoted by Lord Monckton, the correct error in the anomaly becomes +/-0.23. Indeed, if we then compare two anomalies, the error in their difference becomes +/-0.2 (or, with Monckton’s figure, +/-0.32). How long does this make the pause (sorry, hiatus)?
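
    The sqrt(2) figures in this comment follow from combining two equal uncertainties in quadrature. The sketch below reproduces them under two assumptions taken from the comment's reasoning rather than from the Met Office: the baseline carries the same uncertainty as the yearly value, and the two errors are independent. The function name is hypothetical.

```python
import math


def difference_uncertainty(sigma_a, sigma_b):
    """Uncertainty of (a - b) when the errors in a and b are independent."""
    return math.hypot(sigma_a, sigma_b)  # sqrt(sigma_a**2 + sigma_b**2)


results = {}
for sigma in (0.10, 0.16):
    # Anomaly = temperature minus baseline, each assumed to carry +/-sigma.
    anomaly = difference_uncertainty(sigma, sigma)
    # Comparing two years means subtracting one anomaly from another.
    comparison = difference_uncertainty(anomaly, anomaly)
    results[sigma] = (anomaly, comparison)
    print(f"raw +/-{sigma:.2f} -> anomaly +/-{anomaly:.2f} "
          f"-> difference of two anomalies +/-{comparison:.2f}")
```

    If the errors were correlated, for example through a shared baseline, the quadrature rule would not apply in this simple form.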

    • Reply to JMcM ==> Ain’t maths wonderful ?!?
      You are right — for the reason stated and about a thousand other reasons — the stated 0.1°C is probably far too low.
      Some of this discussion is silly, as many believe that the very attempt to “measure” or “calculate” the “Global Average Surface Temperature over Land and Sea” is doomed before it gets off the ground.
      But, nonetheless, I appreciate the Met Office’s step in the right direction of scientific honesty and transparency.

  66. There cannot be more significant digits in an answer than there are in the least accurate measurement in the data. Therefore, 1.2cm + 1.58cm does not equal 2.78cm, it should be 2.8cm…only as many significant digits as in the least accurate data point. I really can’t trust anything from a ‘scientist’ who doesn’t know this. So, when HADCRUT says the temperature was 0.56 degrees plus/minus 0.1 degrees, I know they are [] pseudo-scientists.

    • Not sure why you would want to round in this case, it depends upon what these values represent.
      If I were to make 2 metal objects to fit into a slot, object A = 12mm object B = 15.8mm then the slot would need to be 27.8mm wide.
      If I wanted a sane life in manufacturing, I would allow for manufacturing and measurement tolerances:
      Say A and B are +/- 0.1mm so
      1) 12.1 + 15.9 = 28.0mm
      2) 12.0 + 15.8 = 27.8mm
      3) 11.9 + 15.7 = 27.6mm
      So the uncertainty of +/- .1mm has ‘propagated’ to the final tolerance of +/-.2mm
      The nominal measurement is 27.8mm and loses meaning if rounded.
      Also, the slot for A and B needs to be toleranced, as it is produced by a distinct manufacturing process:
      It needs to be large enough to hold A + B at their widest (28.0mm) so the slot needs to be, as a minimum, 28.1mm wide.
      Real tolerances need to be kept, added, and the result will truly represent your ‘worst case’.
      If you’re ‘doing maths’ then follow mathematical rules and conventions, but when working with ‘readings’ or ‘measurements’ of ‘real things’ do not throw away data.
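
      The worst-case tolerance stack described above can be sketched as follows. The function name `stack_worst_case` is hypothetical, and the sketch assumes simple worst-case (additive) stacking, as in the comment, rather than a statistical method.

```python
def stack_worst_case(parts):
    """parts: list of (nominal_mm, tolerance_mm) pairs placed side by side.
    Returns (smallest, nominal, largest) total width under worst-case stacking."""
    nominal = sum(size for size, _ in parts)
    tolerance = sum(tol for _, tol in parts)
    return nominal - tolerance, nominal, nominal + tolerance


# Object A = 12.0 mm and object B = 15.8 mm, each +/- 0.1 mm
smallest, nominal, largest = stack_worst_case([(12.0, 0.1), (15.8, 0.1)])
print(f"stack width: {nominal:.1f} mm, from {smallest:.1f} to {largest:.1f} mm")
```

      The nominal 27.8 mm is kept unrounded, and the individual tolerances add to +/- 0.2 mm, which is why the slot has to clear 28.0 mm.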

  67. I’ve been reading this recent book rubbishing climate change Climate Change The Facts ed Alan Moran – there’s a dense chapter on Forecasting which sets out the rules and studies the inaccuracies of the past. Does the Met Office obey any of that? Can we ask them? These five year jobboes they do? One should watch the trend in the language as the truth emerges – could be good for a laugh if I’m spared.

  68. The BEST data set includes uncertainty values (margin of error) for each month. The uncertainty in the early years is huge. So large, in fact, that it is easy to construct a set of values, well within the uncertainty bars, which shows that the earth has been cooling since 1752.

    • Reply to Keith A. Nonemaker ==> Yes — that is an important point. The BEST project data set does have acknowledged “error bars” but these are expressly defined as “(0.05 C with 95% confidence)” for the year 2014 — and defines this term, in their latest missive as “the margin of uncertainty (0.05 C).”. So I expect that this is a calculated statistical Confidence Interval.
      The “margin of uncertainty” for the 1890s looks to be, from their anomaly chart, a mere 0.35 and by the early 1900s, they seem to have scrubbed it out nearly altogether.

  69. EPILOGUE: Many thanks to all who read my essay and took time to participate in the discussion. All of your input is appreciated.
    Lots of interesting experience and anecdotes from varying fields of study. I truly am impressed by the general lack of sniping and trolling, which I hope represents an improvement in the hearts and minds of the readers, and not just better moderation (h/t the mods, whose hard work deserves our thanks and respect).
    I am on to another project for a while and will not be checking comments here regularly. If you have an issue related to this essay and must have an answer, you may email me at my first name at the domain i4 decimal net.
    Thanks again — Kip Hansen

    • Hi Kip, you should really read up on the theory behind averaging multiple inaccurate measurements and how that IMPROVES accuracy. Regardless of what you think, it is a very common practice which WORKS in the real physical world. As an example you could read the application note AN118 from Silicon Laboratories (oversampling = multiple measurements). Quote: “Oversampling and averaging can be used to increase measurement resolution, eliminating the need to resort to expensive, off-chip ADC’s”. Or, mapped to the discussion at hand: multiple inaccurate temperature readings can be averaged to a result which has greater accuracy than any individual reading. The AN118 even contains this exact example!
      There are probably many reasons (such as various forms of bias and adjustments) why we should regard the temperature record with skepticism, the process of getting a more reliable answer by averaging many readings is however not one of them.
      Sorry, but the main conclusion of your post is just plain wrong.
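The averaging claim in this comment can be checked with a small simulation. The sketch below is not taken from AN118; it assumes readings of one fixed quantity whose errors are random, independent and zero-mean (Gaussian here), which is exactly the assumption the replies go on to dispute. `TRUE_VALUE`, `NOISE_SD` and the helper names are hypothetical.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

TRUE_VALUE = 20.0  # hypothetical quantity being measured
NOISE_SD = 0.5     # hypothetical random error of a single reading


def mean_of_readings(n):
    """Average n independent noisy readings of the same quantity."""
    readings = [TRUE_VALUE + rng.gauss(0.0, NOISE_SD) for _ in range(n)]
    return statistics.fmean(readings)


def spread_of_means(n, trials=500):
    """Standard deviation of the averaged result over many repeated trials."""
    means = [mean_of_readings(n) for _ in range(trials)]
    return statistics.pstdev(means)


spread_1 = spread_of_means(1)      # a single reading
spread_100 = spread_of_means(100)  # the mean of 100 readings

print(f"spread of single readings:     {spread_1:.3f}")
print(f"spread of 100-reading averages: {spread_100:.3f}")
```

Under these assumptions the spread shrinks roughly as one over the square root of the number of readings. A bias shared by all readings would not shrink at all, and the sketch says nothing about that case.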

      • Pax: the very well known and used data sheet AN118 is using ‘ONE’ A2D and increases the effective number of A2D bits very well.
        This thread has been about averaging readings from thousands of sensors (A2Ds if you like) and being unable to eradicate the original uncertainty.
        Not quite comparing like with like I think.
        See my response at February 10, 2015 at 12:31 pm, an uncertainty is for life not just for averaging.
        Would you suggest that if you made 1000’s of products A and B, then due to the massive number, when averaged, that they would now fit into a slot 27.8mm wide?
        I suspect a wise person would agree that if you do not know what the size/value of something is precisely, AND you need to use that something later, then you have to retain that lack of precision otherwise only bad fortune will follow. In my example above, worst case using your averaging methods, sometimes the products A and B will not fit into the slot. With my method, they always will fit.
        In the temperature case, you cannot know which of your thermometers are reading high or low at any one time (hence this discussion) so the only legitimate action is to keep the original ‘wide’ error bars (or uncertainty).

      • Reply to pax ==> I’ll leave the technical point to steverichards1984, who speaks specifically about AN118.
        You might consider why, with so many math and science people commenting here, that they don’t all weigh in in defense of your opinion. I think it is that they see the difference between the application of my simple grade school math to the question of finding the mean of a series of different measurements of different things, measurements themselves that must be considered ranges and the question of “a thousand thermometers in my backyard, read all at once” (or, a thousand thermometers in my never-changing backyard read over and over again.) I do appreciate your devil’s-advocate challenges.
        Nonetheless, it would be informative for you to do the basic three-data data set experiment suggested in my essay, and if you find a different answer, or can state with simple logic why the results, which will be as I state, are somehow “wrong” — I would be happy to read it.
        As Randy Newman says in “It’s a Jungle Out There”: I could be wrong now, but I don’t think so

      • Hi Steve, I do not see how using one measuring device making many readings compared to using many measuring devices each making one reading changes the central principle in any fundamental way.
        “Would you suggest that if you made 1000’s of products A and B, then due to the massive number, when averaged, that they would now fit into a slot 27.8mm wide?” No, I would not suggest that. Further I do not understand your analogy since the individual product samples are not an approximation (inexact measurement) of a real physical property – the products are the physical property themselves.
        Regarding the rest of your comment: You do not need to know which of your thermometers are reading high or low at any one time so long as you are only measuring differences between means (temperature anomaly) – this is another reason why your product A/B analogy fails.
        Of course, the thermometers 20 years ago are not the same as today so this could indeed make this whole exercise moot due to unknown bias. But Kip is making a much stronger statement here, he claims that the mathematics of deriving a mean does not give a more accurate value. Well, it verifiably does.

      • Hi Kip, I understand your simple example and I have explained why I think it does not apply – namely that you operate with a 100% uncertainty interval. I agree that the full (100%) uncertainty range remains after the mean has been derived.

      • Reply to pax ==> Now we see that we have been talking past one another.
        I have been making the simple example that applies exactly to the case of our Co-Op Weather Station volunteer who reads the glass thermometer and regardless of actual reading, must write a value to the nearest whole degree — historically, the entire temperature record is made up of such readings until very recently. In the more recent past, a sensor does exactly the same thing, rounding to some decimal point — and then some software may do it again to some other less precise decimal point, so the example still applies. This produces a value that is, as you say, 100% uncertain across its entire range. With temperature records, this is the actuality — it is not a matter of probabilities or possibilities. ALL the temperature recordings are of this type — something “rounded” to some arbitrary accuracy.
        We see that all surface temperature records are in actuality ranges — expressed in this manner: 71(+/-0.5) and although they look like statistical uncertainty ranges, they are as you describe, 100% uncertainty ranges/intervals by their very nature, and must be treated as such. Thus “the full (100%) uncertainty range remains after the mean has been derived.”
        I am glad that we have worked through the disagreement and have arrived at a common understanding, even if you do not think it applies. Thanks for sticking it out with me.
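The point both sides accept here, that the full (100%) range survives the averaging, can be shown with simple interval arithmetic. This is a sketch only: the three readings are hypothetical, not the data set from the essay, and `mean_interval` is a name made up for the illustration. It treats every recorded value as the range value +/- 0.5 and makes no statement about how likely any point inside that range is.

```python
def mean_interval(readings, half_width=0.5):
    """Mean of readings that are each known only to +/- half_width.

    Returns the lowest and highest possible mean, i.e. the mean of all
    the lower bounds and the mean of all the upper bounds.
    """
    n = len(readings)
    lowest = sum(r - half_width for r in readings) / n
    highest = sum(r + half_width for r in readings) / n
    return lowest, highest


readings = [71, 72, 73]  # hypothetical whole-degree thermometer records

lowest, highest = mean_interval(readings)
centre = sum(readings) / len(readings)

print(f"mean = {centre:.1f}, possible range {lowest:.1f} to {highest:.1f}")
```

The width of the range of possible means is the same 1.0 degree as the width of each individual reading, however many readings are averaged.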

  70. The temperature graph that starts the essay above is expressed in “anomaly” units of degrees C.
    The text quotes the Met Office as “The HadCRUT4 dataset (compiled by the Met Office and the University of East Anglia’s Climatic Research Unit) shows last year was 0.56C (±0.1C*) above the long-term (1961-1990) average.”
    That is, the average value of the climate between 2 nominated points in time, such as 1961 to 1990, is subtracted from the values displayed. Therefore, the errors calculated from the data in 1961-90 propagate into the year 2010 anomaly data.
    The errors from 30 years of old data are very probably greater than those of one year of more modern data such as the quoted year of 2010 – “The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
    This is the first and the most simple way to question the 0.1 deg claim.
    Further, in parts of the World like Australia, between about 1990 and 2010, most weather stations changed from using liquid in glass thermometers in Stevenson screens to MMTS thermistor-type detectors in various housing. An error in this changeover would propagate to the 2010 and 2014 calculations. There seems to be little data comparing the 2 instrument types. Anyone know where to look for detailed, controlled overlap data?
    The formal approach to the propagation of error is given usefully at
    http://www.bipm.org/metrology/thermometry/
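To make the baseline point concrete: an anomaly is a difference, so the uncertainty of the baseline mean adds to that of the year in question. The sketch below uses the standard root-sum-of-squares rule for the difference of two quantities with independent errors. That independence is an assumption, the two input figures are hypothetical placeholders, and nothing here is claimed to be how the Met Office derives its 0.1 figure.

```python
import math


def anomaly_uncertainty(sigma_year, sigma_baseline):
    """Uncertainty of (year - baseline), assuming independent errors."""
    return math.hypot(sigma_year, sigma_baseline)


sigma_year = 0.10      # hypothetical uncertainty of one year's mean, deg C
sigma_baseline = 0.05  # hypothetical uncertainty of the 1961-1990 mean, deg C

combined = anomaly_uncertainty(sigma_year, sigma_baseline)

print(f"combined anomaly uncertainty: +/- {combined:.3f} deg C")
```

Whatever the real figures are, the combined value can never be smaller than the larger of the two inputs, which is the substance of the objection above.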

  71. “One consequence of working only with temperature change is that our analysis does not produce estimates of absolute temperature. For the sake of users who require an absolute global mean temperature, we have estimated the 1951–1980 global mean surface air temperature as 14°C with uncertainty several tenths of a degree Celsius. That value was obtained by using a global climate model [Hansen et al., 2007] to fill in temperatures at grid points without observations, but it is consistent with results of Jones et al. [1999] based on observational data. The review paper of Jones et al. [1999] includes maps of absolute temperature as well as extensive background information on studies of both absolute temperature and surface temperature change.”
    Error for global data between 1951-1980 was several tenths of a degree Celsius. The measurements are no more accurate now, and the distribution of stations used has shrunk since then. This suggests that the recent error is nowhere near the claimed 0.1 C and may, if anything, be a little worse than in the stated period. Areas of less than 50 square miles can vary by several degrees centigrade. Changing some of the locations and reducing the coverage only increases the errors seen in local regions. These errors are much larger than the ridiculous claim of 0.1 C.

    • Reply to Matt G ==> You quote, I assume, GLOBAL SURFACE TEMPERATURE CHANGE — Hansen et al. (2010) doi: 10.1029/2010RG000345.
      Surface temperatures were still recorded from glass thermometers throughout the 1951-1980 period (almost exclusively). Thermometer temps must be recorded as ranges +/- 0.5 degrees (F or C). Since the original measurement error can not be excluded from the resulting mean, their uncertainty is minimally 0.5 degrees — and that is before all the uncertainty of in-filling etc is added onto that uncertainty range.
      Thanks for highlighting this Hansen quote.
