Guest Essay by Kip Hansen
Those following the various versions of the “2014 was the warmest year on record” story may have missed what I consider to be the most important point.
The UK’s Met Office (officially the Meteorological Office until 2000) is the national weather service for the United Kingdom. Its Hadley Centre in conjunction with Climatic Research Unit (University of East Anglia) created and maintains one of the world’s major climatic databases, currently known as HADCRUT4 which is described by the Met Office as “Combined land [CRUTEM4] and marine [sea surface] temperature anomalies on a 5° by 5° grid-box basis”.
The first image here is their current graphic representing the HADCRUT4 with hemispheric and global values.
The Met Office, in their announcement of the new 2014 results, made this [rather remarkable] statement:
“The HadCRUT4 dataset (compiled by the Met Office and the University of East Anglia’s Climatic Research Unit) shows last year was 0.56C (±0.1C*) above the long-term (1961-1990) average.”
The asterisk (*) beside (+/-0.1°C) is shown at the bottom of the page as:
“*0.1° C is the 95% uncertainty range.”
So, taking just the 1996 -> 2014 portion of the HADCRUT4 anomalies, adding in the Uncertainty Range as “error bars”, we get:
The journal Nature has a policy that any graphic with “error bars” – with quotes because these types of bars can be many different things – must include an explanation as to exactly what those bars represent. Good idea!
Here is what the Met Office means when it says Uncertainty Range in regard to HADCRUT4, from their FAQ:
“It is not possible to calculate the global average temperature anomaly with perfect accuracy because the underlying data contain measurement errors and because the measurements do not cover the whole globe. However, it is possible to quantify the accuracy with which we can measure the global temperature and that forms an important part of the creation of the HadCRUT4 data set. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius. The difference between the median estimates for 1998 and 2010 is around one hundredth of a degree, which is much less than the accuracy with which either value can be calculated. This means that we can’t know for certain – based on this information alone – which was warmer. However, the difference between 2010 and 1989 is around four tenths of a degree, so we can say with a good deal of confidence that 2010 was warmer than 1989, or indeed any year prior to 1996.” (emphasis mine)
This is a marvelously frank and straightforward statement. Let’s parse it a bit:
• “It is not possible to calculate the global average temperature anomaly with perfect accuracy …. “
Announcements of temperature anomalies given as very precise numbers must be viewed in light of this general statement.
• “…. because the underlying data contain measurement errors and because the measurements do not cover the whole globe.”
The reason for the first point is twofold: the original data themselves, right down to the daily and hourly temperatures recorded in humongous data sets, contain actual measurement errors (including such issues as the accuracy of the equipment and the units of measurement), and further errors are introduced by the methods used to account for the fact that the “measurements do not cover the whole globe”, that is, the various methods of in-filling.
• “However, it is possible to quantify the accuracy with which we can measure the global temperature and that forms an important part of the creation of the HadCRUT4 data set. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
Note well that the Met Office is not talking here of statistical confidence intervals but “the accuracy with which we can measure” – measurement accuracy and its obverse, measurement error. What is that measurement accuracy? “…around one tenth of a degree Celsius” or, in common notation +/- 0.1 °C. Note also that this is the Uncertainty Range given for the HADCRUT4 anomalies around 2010 – this uncertainty range does not apply, for instance, to anomalies in the 1890s or the 1960s.
• “The difference between the median estimates for 1998 and 2010 is around one hundredth of a degree, which is much less than the accuracy with which either value can be calculated. This means that we can’t know for certain – based on this information alone – which was warmer.”
We can’t know (for certain or otherwise) which was warmer, and the same is true of any of the other 21st-century data points that are reported as being within hundredths of a degree of one another. The values can only be calculated to an accuracy of +/- 0.1˚C.
And finally,
• “However, the difference between 2010 and 1989 is around four tenths of a degree, so we can say with a good deal of confidence that 2010 was warmer than 1989, or indeed any year prior to 1996.”
It is nice to see them say “we can say with a good deal of confidence” instead of using a categorical “without a doubt”. If two data points differ by four tenths of a degree, they are confident both that there is a difference and of its sign, + or -.
Importantly, the Met Office states clearly that the Uncertainty Range derives from the accuracy of measurement and thus represents the Original Measurement Error (OME). Their Uncertainty Range is not a statistical 95% Confidence Interval. While they may have had to rely on statistics to help calculate it, it is not itself a statistical animal. It is really and simply the Original Measurement Error (OME): the combined measurement errors and inaccuracies of all the parts and pieces, rounded off to a simple +/- 0.1˚C, which they feel is 95% reliable, but which has a one-in-twenty chance of being larger or smaller. (I give links for the two supporting papers for HADCRUT4 uncertainty at the end of the essay.****)
UK Met Office is my “Hero of the Day” for announcing their result with its OME attached – 0.56C (±0.1˚C) – and publicly explaining what it means and where it came from.
[ PLEASE – I know that many, maybe even almost everyone reading here, think that the Met Office’s OME is too narrow. But the Met Office gets credit from me for the above – especially given that the effect is to validate The Pause publicly and scientifically. They give their two papers**** supporting their OME number, which readers should read out of collegial courtesy before weighing in with lots of objections to the number itself. ]
Notice carefully that the Met Office calculates the OME for the metric and then assigns that whole OME to the final Global Average. They do not divide the error range by the number of data points, they do not reduce it, they do not minimize it, they do not pretend that averaging eliminates it because it is “random”, they do not simply ignore it as if it were not there at all. They just tack it on to the final mean value – Global_Mean( +/- 0.1°C ).
In my previous essay on Uncertainty Ranges… there was quite a bit of discussion of this very interesting, and apparently controversial, point:
Does deriving a mean* of a data set reduce the measurement error?
Short Answer: No, it does not.
I am sure some of you will not agree with this.
So, let’s start with a couple of kindergarten examples:
Example 1:
Here’s our data set: 1.7(+/-0.1)
Pretty small data set, but let’s work with it.
Here are the possible values: 1.8, 1.7, 1.6 (and all values in between)
We state the mean = 1.7. Obviously, with one datum, it is itself the mean.
What are the other values, the whole range represented by 1.7(+/-0.1)?:
1.8 and every other value to and including 1.6
What is the uncertainty range?: + or – 0.1 or in total, 0.2
How do we write this?: 1.7(+/-0.1)
Example 2:
Here is our new data set: 1.7(+/-0.1) and 1.8(+/-0.1)
Here are the possible values:
1.7 (and its +/-s) 1.8, 1.6
1.8 (and its +/-s) 1.9, 1.7
What’s the mean of the data points? 1.75
What are the other possible values for the mean?
If both data are raised to their highest value +0.1:
1.7 + 0.1 = 1.8
1.8 + 0.1 = 1.9
If both are lowered to their lowest -0.1:
1.7 – 0.1 = 1.6
1.8 – 0.1 = 1.7
What is the mean of the widest spread?
1.9 + 1.6 / 2 = 1.75
What is the mean of the lowest two data?
1.6 + 1.7 / 2 = 1.65
What is the mean of the highest two data:
1.8 + 1.9 / 2 = 1.85
The above give us the range of possible means: 1.65 to 1.85
0.1 above the mean and 0.1 below the mean, a range of 0.2
Of which the mean of the range is: 1.75
Thus, the mean is accurately expressed as 1.75(+/-0.1)
Notice: The Uncertainty Range, +/-0.1, remains after the mean has been determined. It has not been reduced at all, despite doubling the “n” (number of data). This is not a statistical trick, it is elementary arithmetic.
We could do this same example for data sets of three data, then four data, then five data, then five hundred data, and the result would be the same. I have actually done this for up to five data, using a matrix of data, all the pluses and minuses, all the means of the different combinations – and I assure you, it always comes out the same. The uncertainty range, the original measurement accuracy or error, does not reduce or disappear when finding the mean of a set of data.
I invite you to do this experiment yourself. Try the simpler 3-data example using data like 1.6, 1.7 and 1.8, all +/- 0.1. Make a matrix of the nine +/- values: 1.6, 1.6 + 0.1, 1.6 – 0.1, etc. Figure all the means. You will find a range of means with the highest possible mean 1.8 and the lowest possible mean 1.6 and a midpoint of 1.7, or, in other notation, 1.7(+/-0.1).
Really, do it yourself.
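For anyone who would rather let a computer grind through the matrix, here is a minimal Python sketch of the 3-datum experiment just described. The data values and the +/-0.1 range come from the example above; the code itself is only an illustration.

```python
# A minimal sketch of the 3-datum experiment described above.
from itertools import product

data = [1.6, 1.7, 1.8]   # the recorded values
u = 0.1                  # the shared uncertainty, +/-0.1

# every combination of "low / as-recorded / high" for the three data
means = []
for offsets in product((-u, 0.0, +u), repeat=len(data)):
    values = [d + o for d, o in zip(data, offsets)]
    means.append(sum(values) / len(values))

lowest, highest = min(means), max(means)
midpoint = (lowest + highest) / 2
half_range = (highest - lowest) / 2

print(f"lowest possible mean : {lowest:.2f}")          # 1.60
print(f"highest possible mean: {highest:.2f}")         # 1.80
print(f"result: {midpoint:.2f}(+/-{half_range:.2f})")  # 1.70(+/-0.10)
```

The extremes of the range are set by the all-low and all-high cases, so checking only those three offsets per datum bounds every value in between.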
This has nothing to do with the precision of the mean. You can figure a mean to whatever precision you like from as many data points as you like. If your data share a common uncertainty range (original measurement error, a calculated ensemble uncertainty range such as found in HADCRUT4, or determined by whatever method) it will appear in your results exactly the same as the original – in this case, exactly +/- 0.1.
The reason for this is clearly demonstrated in our kindergarten examples of 1-, 2- and 3-data data sets: it is a result of the actual arithmetical process one must use in finding the mean of data each of which represents a range of values with a common range width*****. No amount of throwing statistical theory at this will change it – it is not a statistical idea, but rather an application of common grade-school arithmetic. The result is a range of possible means, the midpoint of which we use as “the mean” – it will be the same as the mean of the data points calculated without taking into account the fact that they are ranges. This range of means is commonly represented with the notation:
Mean_of_the_Data_Points(+/- one half of the range)
– in one of our examples, the mean found by averaging the data points is 1.75, the mean of the range of possible means is 1.75, the range is 0.2, one-half of which is 0.1 — thus our mean is represented 1.75(+/-0.1).
If this notation X(+/-y) represents a value with its original measurement error (OME), maximum accuracy of measurement, or any of the other ways of saying that the (+/-y) bit results from the measurement of the metric then X(+/-y) is a range of values and must be treated as such.
Original Measurement Error of the data points in a data set, by whatever name**, is not reduced or diminished by finding the mean of the set – it must be attached to the resulting mean***.
# # # # #
* – To prevent quibbling, I use this definition of “Mean”: Mean (or arithmetic mean) is a type of average. It is computed by adding the values and dividing by the number of values. Average is a synonym for arithmetic mean – which is the value obtained by dividing the sum of a set of quantities by the number of quantities in the set. An example is (3 + 4 + 5) ÷ 3 = 4. The average or mean is 4. http://dictionary.reference.com/help/faq/language/d72.html
** – For example, HADCRUT4 uses the language “the accuracy with which we can measure” the data points.
*** – Also note that any use of the mean in further calculations must acknowledge and account for – both logically and mathematically – that the mean written as “1.7(+/-0.1)” is in reality a range and not a single data point.
**** – The two supporting papers for the Met Office measurement error calculation are:
Colin P. Morice, John J. Kennedy, Nick A. Rayner, and Phil D. Jones
and
J. J. Kennedy , N. A. Rayner, R. O. Smith, D. E. Parker, and M. Saunby
***** – There are more complicated methods for calculating the mean and the range when the ranges of the data (OME ranges) are different from datum to datum. This essay does not cover that case. Note that the HADCRUT4 papers do discuss this somewhat as the OMEs for Land and Sea temps are themselves different.
# # # # #
Author’s Comment Policies: I already know that “everybody” thinks the UK Met Office’s OME is [pick one or more]: way too small, ridiculous, delusional, an intentional fraud, just made up or the result of too many 1960s libations. Repeating that opinion (with endless reasons why) or any of its many incarnations will not further enlighten me or the other readers here. I have clearly stated that it is the fact that they give it at all and admit to its consequences that I applaud. Also, this is not the place to continue your One Man War for Truth in Climate Science (no matter which ‘side’ you are on) – please take that elsewhere.
Please try to keep comments to the main points of this essay –
Met Office’s remarkable admission of “accuracy with which we can measure the global average temperature” and that statement’s implications.
and/or
“Finding the Mean does not Reduce Original Measurement Error”.
I expect a lot of disagreement – this simple fact runs against the tide of “Everybody-Knows Folk Science” and I expect that if admitted to be true it would “invalidate my PhD”, “deny all of science”, or represent some other existential threat to some of our readers.
Basic truths are important – they keep us sane.
I warn commenters against the most common errors: substituting definitions from specialized fields (like “statistics”) for the simple arithmetical concepts used in the essay and/or quoting The Learned as if their words were proofs. I will not respond to comments that appear to be intentionally misunderstanding the essay.
# # # # #
Or the opposite; I’ve never been able to decide. Superbly explained, Kip. There should be more of it 🙂
The HadCRUT4 dataset publishes not only the data but also the measurement, coverage, and bias uncertainties, giving a combined uncertainty of +/- 0.16 K.
Later this year, when the Homogenization errors start making some headway into the mainstream press, there will be a horrible realization… that without the adjustments, there is almost no statistically significant warming since the early 1940s. At that point the facade of certainty will crumble, except for the faithful who may well continue the green religion well into the next glacial period. At that point the hindcasting of the models will be invalidated.
I expect a soul-crushing loss of “faith” in science when this falls apart. People never should have had faith in science to begin with…it truly has no place there. People will once again realize that academics aren’t the best people to consult on real-world problems. In an academic world you can twist the world to your will with thought experiments and ignore the consequences. But when you f*ck with the real world…the real world f*cks you back.
Poitsplace said “I expect a soul-crushing loss of “faith” in science when this falls apart.”
That’s already happened to some of us, thank God. But politicians have their sights on gaining a vast amount of power and wealth, with the Green movement as their justification. And so they’re dumping vast amounts of money into science to get the results they need. Even when more people awaken to the hoax, how can anyone stop that momentum?
What bothers me is that none of these global temperature anomalies derive their 90% confidence interval directly from the distribution of the underlying means- they appear to be applying the Central Limit Theorem to variables that do not have identical probability distributions. The CLT has rules and the method used to determine the uncertainty of the mean temperature anomalies appears to break those rules (in a number of ways).
How can a variable be independent when it is subject to homogenisation, for instance?
How can we assume that the distribution of monthly mean temperature changes (from an historical baseline) is identical for every calendrical month or at every location on the earth?
How can we accept that any change in monthly mean anomaly distribution occurs simultaneously at every place on the planet?
Surely local climate changes often do occur in isolation.
With that being the case, then any anomaly that is 0.32 lower than the 2014 anomaly average could be statistically tied with 2014. We therefore have a 21 way tie for first place according to Hadcrut4.3 with 1990 the earliest year.
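The “tie” test being applied here is simple enough to write down. Below is a small Python sketch of it, simply adding the two ±0.16 K ranges together as this comment does; the example values are made up for illustration and are not actual HadCRUT4 anomalies.

```python
# A tiny sketch of the "statistical tie" test described above.
def statistically_tied(a, b, uncertainty=0.16):
    """True if the two central estimates cannot be separated, i.e. their
    difference is smaller than the two uncertainty ranges combined."""
    return abs(a - b) <= 2 * uncertainty

# made-up example anomalies, purely for illustration
print(statistically_tied(0.56, 0.40))   # True: only 0.16 apart
print(statistically_tied(0.56, 0.20))   # False: 0.36 apart, separable
```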
Reply to Comment Thread started by Git ==> All good points. Monckton correctly points out that it is quite possible to find a greater overall interval for GAST per HADCRUT4 than the stated 0.1°C by delving deep into their papers.
However, I give them credit for setting the record even approximately straight in their warmest year announcement — a heart-warming occurrence.
Agreed – it is really important to science that results are [regularly] (ideally always) quoted with the error bars in the press. This has got to be so much better than the current norm – the papal-edict-like quotation in the media giving a false sense of absolute certainty to the public.
I also think the simplest method, based on the measurement error, as in this post, should be used. It is more easily verified and to my mind is the only valid method. I am not even sure I believe in the veracity of using anomalies rather than deriving trends from an individual station’s raw data.
“Does deriving a mean* of a data set reduce the measurement error? Short Answer: No, it does not.”
I have read the essay and I’m going to eventually look at it again. If I understand correctly, there is a difference between measurement error and the signal:noise ratio.
My understand of S:N is that if you want to reduce noise, you must take multiple measurements of the same thing. So in the case of temperature stations, the way to reduce noise would be to have several thermometers (e.g. ten of them) working in parallel, in the same spot. To get the purest possible signal, you would then derive the mean from these ten thermometers. But the measurement error would still be +/-0.1.
So, am I understanding correctly?
P.S. I am a bit obsessive about spelling and grammar, so can someone check that ‘heros’ is a legitimate plural of ‘hero’?
[‘heroes’ ~mod.]
Thank you. It may be basic arithmetic (which I agree with) but basic spelling is also important.
This is an original post. Not just a throwaway comment. This annoys me quite a lot.
reply re Heroes/heros ==> For those unable to get over the heroes/heros question. In English, the correct plural is HEROES, unless you are referring to the sandwich by the same name, in which case it is HEROS. In Greek, the original word is commonly believed to have been “heros” (with Greek letters).
Thanks to all those defending standardized English spelling.
Taking the average of a large number of readings does in fact give you a more accurate mean IF the readings are truly distributed along a bell shaped probability curve.
The faux science in the article is to ensure that in the specific examples given, they are not.
Years ago for O level physics we did this with everybody calculating the value of ‘g’ using pendulums. The distribution of answers was very bell curved. The mean answer was within 1% of the official value. None of the actual readings and individual calculations was, however.
If those “…large number of readings…” have been altered/adjusted/manipulated beyond recognition (Which is proven they have), calculating an “average” from that data is meaningless!
but one has to first establish that the errors are normally distributed, which may not be the case.
Think, for example, of self measurements of penis size. Do you seriously maintain that the errors in the repeated measurements of one man, or the errors of many men’s results, would be normally distributed?
A remarkably good proxy for measurements prone to confirmation bias.
Hi Leo, yes if you measure the same thing dozens of times or more you will get a bell shaped curve. But if the chronometer you are using is accurate to +/- 1 millisecond, then you will be stuck with this underlying error as per Kip’s arithmetic. Each data point will be a bit fuzzy.
It all depends on what you’re doing. A common signals experiment involves recovering a sine wave buried deep under noise (20 dB for instance). The detector is a comparator which tells you only if the signal is greater or less than zero. I’m not sure how you would express the precision of the comparator. You could argue that it is 50% of the measurement range.
The point is that a crude not-very-precise detector can give you any precision you want as long as you can average over enough cycles.
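Here is a rough Python sketch of that idea, with made-up parameters (an illustration of the averaging trick only, not the experiment from any paper linked in this thread): a one-bit comparator, averaged over many cycles, pulls a sine back out of noise roughly ten times its amplitude.

```python
# A rough sketch: recover a weak sine from 1-bit comparator samples
# by averaging over many cycles. Parameters are invented.
import math
import random

random.seed(0)

amplitude = 0.1           # weak sine
noise_sigma = 1.0         # noise roughly ten times the signal amplitude
bins = 64                 # phase bins per cycle
cycles = 20000            # number of cycles averaged

acc = [0.0] * bins
for _ in range(cycles):
    for k in range(bins):
        s = amplitude * math.sin(2 * math.pi * k / bins)
        sample = s + random.gauss(0.0, noise_sigma)
        acc[k] += 1.0 if sample > 0 else -1.0   # 1-bit comparator output

mean_sign = [a / cycles for a in acc]

# For small signal-to-noise ratios the averaged comparator output is
# roughly proportional to the signal: E[sign] ~ sqrt(2/pi) * s / sigma.
recovered = mean_sign[bins // 4]                # phase where sin() = 1
expected = math.sqrt(2 / math.pi) * amplitude / noise_sigma
print(f"averaged comparator output at the peak: {recovered:.4f}")
print(f"small-signal prediction               : {expected:.4f}")
```

It only works because the comparator keeps sampling the same underlying signal over and over, which is exactly the “multiple measurements of the same thing” condition discussed elsewhere in this thread.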
Kip says:
Leo Smith and I would both say, from our direct experience, that the statement is true for the right kind of data. I can also tell you (from painful experience) that you can’t just assume that you can improve accuracy by taking an average.
Matt Briggs points out that most folks who use statistical techniques don’t actually understand what they are doing. Amen Brother.
If you want to know how the math works in the signals case, here’s a URL: http://www.rand.org/content/dam/rand/pubs/papers/2005/P305.pdf
Sampling works for the sine wave because it has an underlying mean (zero). The same can be said for calculating a value for ‘g’ using pendulums.
It can be argued however that the earth does not have an underlying average temperature as measured by the MET or other services, because different average surface temperatures are possible for the same amount of energy (energy radiates as the 4th power of temperature, while the average uses the first power). Depending on how the energy is distributed, 2 identical earths, receiving and radiating the exact same amount of energy, could thus have two different average surface temperatures.
Thus, statistical inferences based on average surface temperature without regard for distribution could be meaningless. Worse, they could be misleading. And yet we have an entire scientific discipline using average surface temperature as a metric. Hardly the stuff of science.
Here is a simple test that one can apply to check that what I’m saying is true. Divide the earth into two parts. In one example, both parts have a temperature of 10. In the second example, one part has temperature of 1, the other 11.892.
(10+10)/2 = 10 average
(10^4+10^4) = 20000 energy out
(1^4+11.892^4) = 20000 energy out
(1+11.892)/2 = 6.4 average
Same energy out, different average temperatures!!!
Since energy in must equal energy out, depending on how you divide the earth, you can arrive at two different average surface temperatures for the exact same amount of energy in and out.
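For anyone who wants to check that arithmetic, here is a tiny Python sketch reproducing the two-box numbers above (a sketch with the same unitless temperatures, not a climate model):

```python
# A quick check of the two-box arithmetic above.
def energy_out(temps):
    return sum(t ** 4 for t in temps)        # radiation goes as T^4

def simple_average(temps):
    return sum(temps) / len(temps)           # the linear average

case_a = [10.0, 10.0]
case_b = [1.0, 11.892]

print("energy out, case A:", round(energy_out(case_a), 1))      # 20000.0
print("energy out, case B:", round(energy_out(case_b), 1))      # ~20000
print("average, case A   :", round(simple_average(case_a), 1))  # 10.0
print("average, case B   :", round(simple_average(case_b), 1))  # 6.4
```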
What this means is that techniques such as pairwise adjustment are invalid, because they adjust the temperature linearly, while the physical world is determined by radiative balance, which is a 4th power calculation.
What it also means is that average surface temperature is a meaningless metric, because an infinite number of different average surface temperatures are possible for the EXACT SAME forcings.
Leave the forcings unchanged, and the earth can vary its average surface temperature simply by changing the distribution of energy. Make one place very cold and another place a little bit hotter; and because radiated energy varies as the 4th power the energy balance remains the same, while the average changes drastically.
So for example, on Venus where CO2 concentrations are high, temperatures are almost the same between day and night, even though day and night each last 116 earth days, because CO2 radiates heat sideways around the planet.
The same thing is happening on earth. While the energy remains unchanged, the average appears to be increasing because the difference between min and max is changing. The max is not increasing, if anything it will be decreasing. What we are seeing is an arithmetic illusion. A linear measure is being used to describe a 4th power event, which delivers a nonsense result.
An entire branch of Science, Climate Science, has been misled because they are using a linear measure as a metric for a 4th power physical phenomenon.
Which would also explain why the climate models have gone off the rails. they are trained using surface temperatures, but this is only valid if the energy distribution remains unchanged as you add CO2 to the atmosphere.
But adding CO2 to the atmosphere doesn’t just affect radiation back to the surface. It also affects sideways radiation between night and day and the equator and the poles. Thus you would need to train the models using energy not temperature.
Otherwise, as CO2 increases, your models would see the change in temperature distribution as a change in energy, while in the physical world there would be no change. This would make the models run hot.
Kip, while what you say is true, it’s also irrelevant to the accuracy of either the measured temperatures, or the average of them. The problem with averaging temperatures isn’t just that there’s a rounding process, but that the devices cannot be perfectly accurate, and their inaccuracies do not deviate from the true temperature by equal measures on both sides. If the device says the temp is 70.501 degrees, and we round that up to 71, we risk having rounded up when the temp is actually 70.499 degrees, and should have rounded down to 70. The rounding process merely compounds the error, therefore.

And the real uncertainty is that we just don’t know how accurate our temperature device is. Nothing in physics requires these devices to be equally wrong in all directions. Most often, they will show a bias, but one that could only be detected by a much more accurate thermometer. And so there is an inherent uncertainty in any measuring device that must be well-known.

Rounding procedures, as well as homogenization, averaging, and so on, only compound that original uncertainty, and can actually make it larger rather than smaller. So applying more and more statistical fixes to the data can easily make it less accurate and more uncertain, because these processes introduce even more biases into the process, rather than eliminating them. The more one tampers with the data, the more one creates further problems, until the original relationship to what is being measured can be almost entirely lost, and only a sticky, statistical goo remains.
Karim
Good for you to be obsessive about grammar and spelling. I try to get it all correctly myself, often without success. I see and hear poor examples all the time, exactly where you shouldn’t, on TV and in newspapers.
You sentence starting “My understand..” might have been autocorrect. It should have been “My understanding…”
I just hand to correct my autocorrect 3 times just trying to type all that. 🙂
“Your sentence…”
Lol. Autocorrect got me again. Seniors and phones are a bad mix.
Reply to “hereos?” ==> Indeed….many thanks. I will correct my MS Word dictionary as well. ~ kh
As the system is never in equilibrium, there is no reason that if you use ten identical thermometers with no error at all, not situated at exactly the same place, you will find the same numbers.
the “temperature” is a function of the exact location, the temperature of the air, the wind, the humidity and the device…
This is my understanding: to reduce the S:N error you must take multiple readings of the “same thing.” But I think the climate scientists substitute multiple readings of temperature on different days from the same station. That is not the same thing. That is multiple readings of different “things” because each day’s temperature reading is a different “thing.” And multiple readings from different stations are obviously not the “same thing.” I would agree that if you had an array of thermometers at the same location that would increase the accuracy – but any other manipulation of the data cannot increase its accuracy. Tell me where I’m wrong.
Reply to Scott Scarborough ==> You are not wrong. But this is a hard pill to swallow for those bred and raised on statistics — which generally substitutes ethereal probabilities for an engineer’s favored realities.
No matter how you put it, temperature is an intensive property and “global average temperature” is a meaningless metric for a non-equilibrium system.
Indeed, I can only agree. But Kip’s point, I believe, is it still behoves us to get the arithmetic correct.
However Kip gets it wrong as far as I can tell.
Leo, all you’ve done with multiple readings in one place with similar instruments is minimize the probability that one or more of the readings were wonky. You have not increased the accuracy, only the confidence that the readings are inside the stated range of your sensor. And it’s only valid for a single space; it doesn’t work when you are comparing two different locations.
Take 100 thermometers with +/-0.1°C accuracy and place them in a room as close to one another as you can in a homogenous volume of air. Read all of them simultaneously. You will most likely get a bell curve out of the readings but it’s not guaranteed. You will see a difference as each device is in a different location. (Unless it’s digital with one digit right of the decimal point.) The center of your bell curve is at 20.0°C and your mean/average is also at 20.0°C. You have assumed that all these devices increase your accuracy, but you’d be wrong. They only increase your confidence that the reading is inside of the range 19.9°C to 20.1°C. Now have all of your thermometers checked against a standard that is +/-0.001°C. All of the following are valid for your original data:
1 – All 100 thermometers were calibrated at the factory by the same guy and the same standard one right after another and are all inside the stated tolerance, but 0.1°C too high making your true temperature 19.9°C.
2 – All 100 thermometers were calibrated at the factory by the same guy and the same standard one after another and are all inside the stated tolerance, but 0.1°C too low making your true temperature 20.1°C.
3 – Any combination of the above giving you a spread between the low end of the tolerance and the high end of the tolerance.
Your 100 thermometers did not increase your accuracy one whit. If one or two thermometers in the mix are NOT inside the stated tolerance, taking multiple readings from multiple devices (in the same location) dilutes their inaccuracies. The problem here is that they don’t have 10 thermometers at each location. That means you have no confidence check against instrument inaccuracy in any of these databases. Each reading stands alone accuracy-wise and no amount of mathematical or statistical gyrations can change that.
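A minimal Python sketch of that thought experiment, with invented numbers: one hidden calibration offset shared by all 100 instruments (same calibrator, same drift), plus a little independent read-to-read scatter.

```python
# A minimal sketch of the 100-thermometer thought experiment above.
import random
import statistics

random.seed(42)

true_temp = 20.00
shared_offset = random.uniform(-0.1, 0.1)   # unknown in practice
read_noise = 0.03                           # per-instrument scatter

readings = [true_temp + shared_offset + random.gauss(0.0, read_noise)
            for _ in range(100)]

mean = statistics.fmean(readings)
print(f"mean of 100 readings: {mean:.3f}")
print(f"true temperature    : {true_temp:.3f}")
print(f"hidden shared offset: {shared_offset:+.3f}")

# Averaging shrinks the random scatter (roughly 0.03/sqrt(100) = 0.003),
# but the shared offset stays in the mean untouched: the result is still
# only known to within the +/-0.1 calibration band.
```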
Yes but, when a single instrument isn’t giving the ‘expected’ readings, don’t they just adjust to correct for that? Who needs multiple thermometers at a location when you might have another one only 10km away?
In data quality, having 100 different addresses on file for the same person does not increase the accuracy of the result. In fact it reduces the accuracy as compared to having only 1 address on file.
“The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
They do not say that this accuracy is applicable to 1880 or even 1990. Measurement techniques have improved with Argos and satellites, which may be one of the reasons that the temperature is so flat.
“one tenth of a degree ” (THE WHOLE WORLD??)
The hubris of that statement is unfathomable.
It is indeed. And the ballyhoo over it is stupendous.
Most of the surface temperature recording stations in the U.S. have been shown by Anthony Watts and his crew to have errors greater than 1 degree C. I don’t see any reason to believe that the recording stations in the rest of the world are any better. It is not possible to take data with those kinds of errors in it and produce a world temperature that is accurate to 0.1 degree in any measurement system. You can play with the data mathematically for the rest of eternity and you still can’t claim you have data that is accurate to within 0.1 degree. Making the adjustments just biases that error in another direction. My degree is not in math, but you don’t need a degree to figure this out.
Due to bad siting there is no reason to assume modern surface readings are more accurate than those of the past. The error bar has to be huge when graphed for any period in the history of temperature recording in the United States. After all, Michael Mann used that error to draw his tricky hockey stick graph.
The argument for how warm it has gotten since 1890 used to be about whether we were talking about 0.3 or 0.7, or whatever. Now Warmists have inserted another zero into the mix to continue their claims that warming continues unabated. At least the Met Office papers will give skeptics ammunition from the Warmist side to throw into the mix.
How much better are the satellite records, from an accuracy, consistency of siting and lack of infilling point of view ?
Reply to Global Cooling ==> This point is important to the overall discussion of “What is the true Original Measurement Error in the entire global temperature record?” (We note that it will not be a single number, but different numbers, hopefully narrowing as the time approaches the present.)
That line baffles me, so we can measure the global average temperature?
I am shocked.
It may not be perfect, but matters would improve if GISS and NOAA followed the same path of being up front with their uncertainty ranges when publishing.
GISS and NOAA are too busy “adjusting” the data to match a progressivist agenda to bother calculating an uncertainty range. IF government agencies were honest about reporting, the range of uncertainty would probably be surprisingly large.
NOAA & GISS are probably quite uncertain about their uncertainties and as the public don’t want uncertainties they are only used to explain failed predictions and exploited to accentuate (or reverse) trends.
If the uncertainties were firstly acknowledged and secondly calculated/estimated honestly, my guess is that they would swamp any temperature change over the last 40+ years. That’s not in GISS and NOAA’s own interests so they don’t.
I agree fully with Kip Hansen. Whether you use 1 thermometer, or a billion evenly distributed thermometers around the globe, the OME is still as stated.
Reply to garymount ==> Thank you, sir!
Intensive properties don’t average. A 1 kg pot of water at 100° and a 1 kg block of iron at 0° will not average to 50° when placed in contact, because the water contains more heat (its specific heat is higher). So the average will be something higher. So arithmetically averaging different areas or objects is un-real and uninformative.
Except, the annual average daily high temperature at Hobart, Tasmania is 16.9°C. Svalbard is 4.7°C. I’d say Hobart is warmer than Svalbard. Apparently you wish to disagree. Why?
Where is he disagreeing with that? What he is saying is correct. The energy content of the air depends on humidity and pressure, so the resultant temperature of mixing Hobart and Svalbard air samples together isn’t simply the arithmetic mean.
A couple of questions. Wouldn’t this assume that atmospheric composition is uniform? Do the new CO2 maps coming out from satellites obviate this assumption. Is the atmosphere really “well-mixed” enough to allow this assumption? Are the differences between water and air accounted for in data sets? Humidity is mentioned above. Are the differences between salt water and fresh water accounted for – they would matter in the Great Lakes region. If these questions are stupid or irrelevant you can ignore me. We are, however, talking about hundredths and tenths of degrees.
@DaveS
I didn’t average the Hobart and Svalbard averages. That wouldn’t provide me with any useful information. However, the averages taken at Hobart and Svalbard do provide me with additional information, viz Hobart is warmer than Svalbard. Thus averaging intensive properties can convey useful information if you understand the limitations.
Please tell me what the average temperature is between Hobart and Svalbard, and how you arrived at that number?
TPG, you are bringing a different argument to the table. And it is silly, since everyone will agree that Hobart is warmer than Longyearbyen (which is what I guess you mean when you say Svalbard).
The point here is that energywise, the situation between Hobart and Svalbard is even more different due to the difference in composition and thermodynamic properties of the air sampled at both sites. AND, taking an average of the two locations, not only the average of each, is silly, meaningless and in no way substantiates a reduction in the original measurement error.
Brian H’s argument, as I read it, is that temperature is a proxy for energy – THUS you need to know the thermodynamic properties of what you are measuring the temperature of before you can compare temperatures of different things. Different things being, in this case, different masses of air.
@Anders
I argue against the claim put here that averaging intensive properties is completely without meaning.
Well, yes, a lot of the above, but I see this as a walk back. When a group like Met Office and UEA see, almost every day, analyses of their sources and methods (Homewood on South America recently where cooling was switched to warming, jo nova on Ozzie Met switcheroos….) that they have tried to keep secret (FOI requests ignored), they realize the jig is up.
I was first struck by the naivete of all the climateers with climategate. These folks of course didn’t expect the emails to be released, but there was a lack of understanding that the internet and the cell phone (recordings such as that of the Harvard adviser on Obamacare) had fundamentally and (hopefully forever – I’m nervous about that!) changed the landscape of the playing field. If your output is shoddy, mistaken, fraudulent…. you are uncloaked in minutes to days! You are performing in front of the world and missteps in method or integrity are detected and under analysis almost in real time!
Scary, but a boon to humankind. This move toward virtual perfection in quality control of human endeavor will give a quantum lift to excellence in science in particular and will also weed out those who can’t perform at this heady level. It also has proven to be the eyes and ears that governments have to work under (Please, don’t let them regulate this away…..There is a lot of thought being given to just that and centrally planned countries already have given the world a blueprint for doing it worldwide!)
Brian H: “So arithmetically averaging different areas or objects is un-real and uninformative.”
TPG: “Except, the annual average daily high temperature at Hobart, Tasmania is 16.9°C. Svalbard is 4.7°C. I’d say Hobart is warmer than Svalbard. Apparently you wish to disagree. Why?”
The second statement has nothing to do with the first statement. Pompous Git has set up a strawman and wrestled with it.
Too true J.A.
He is also sloppy with his statement “I’d say Hobart is warmer than Svalbard.”
That statement is not always true. Perhaps it is true more often than not. Probably.
You give an example calculation
1.9 + 1.6 / 2 = 1.75
However, you need to place an opening bracket before the 1.9 and a closing bracket after the 1.6 to make the calculation correct. For example:
(1.9 + 1.6) / 2 = 1.75
The other examples also need to be adjusted accordingly.
Regards
Climate Heretic
Reverse Polish makes brackets unnecessary
1.9 1.6 + 2 /
Well there is no equal either. In the Forth language “.” means print to the screen what is on the stack
So it is really: 1.9 1.6 + 2 / . 1.75 OK – “OK” being the standard “done” in Forth.
And of course Forth mainly uses integers unless you have a Floating Point Routine. But no point in getting overly complicated. For now.
Reply to Heretic ==> You are correct on notation . . . I am doing simple 6th Grade addition and division, written as if speaking it from a podium.
The Northern Hemisphere estimate is an anomaly of about 0.75 degrees and the SH about 0.38, so the OME must be greater than their difference of about 0.4 degrees? What would happen if one compared hemispheres divided by a meridian like 0 degrees Greenwich?
Remember that much data was converted from Fahrenheit to Celsius. That conversion, to and fro, introduces a rounding error of the magnitude of 0.1 degree that mathematicians better than I can calculate precisely. I’ve played with real data and it does make a difference if you know from the original record that many values were expressed in degrees F with nothing after the decimal.
Here is a quote from Australia’s BOM on rounding –
“To address this problem, a rounding-corrected version of the ACORN-SAT data set is to be prepared. Following the method of Zhang et al. (2009), it involves adding a random increment of between −0.5°C and +0.5°C to all data that have been rounded to the nearest degree (defined as location-months in which all values are in whole degrees).”
http://www.cawcr.gov.au/publications/technicalreports/CTR_049.pdf
There is more if you wish to dig enough.
One problem I have with BoM data is that, of the temperatures recorded since the decision to measure to the nearest one tenth of a degree, some 18% of the numbers end in zero.
Reply to Geoff Sherrington and The Git ==> It is incredible to me that Zhang (2009) recommends (I have not read it thoroughly, so I don’t claim it does say so) that the solution to the problem clearly described in my blockquote comment above is to add yet more random errors.
In my 6th Grade opinion, it would be better to accept the fact and limitations of the original measurements being ranges — Temp(+/-0.5).
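A minimal Python sketch of that point, assuming only that the original record was to the nearest whole degree Fahrenheit (the numbers are illustrative, not from the BOM report or Zhang et al.):

```python
# What a whole-degree Fahrenheit reading really represents in Celsius.
def f_to_c(f):
    return (f - 32.0) * 5.0 / 9.0

for f_recorded in (51, 52, 53):
    centre = f_to_c(f_recorded)
    low = f_to_c(f_recorded - 0.5)       # the whole-degree reading is
    high = f_to_c(f_recorded + 0.5)      # really a +/-0.5 F range
    print(f"{f_recorded} F -> {centre:.2f} C, "
          f"really {low:.2f} to {high:.2f} C (+/-{(high - low) / 2:.2f})")

# Each converted value carries roughly +/-0.28 C from the original
# whole-degree recording, before any further rounding to 0.1 C.
```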
Kip Hansen
You quote the UKMO as saying
And you say
On the basis of that you assert
Sorry, but providing a partial error estimate for one datum in a data set of more than 100 data points is NOT heroic: it is inadequate.
Furthermore, it is not only inadequate but is also misleading; as Monckton of Brenchley says above
An inadequate and misleading presentation is not scientific information. It is political spin which merits disdain and does not deserve respect bordering on hero worship.
Please note that I am providing a rebuttal of the main argument in your essay and I am not posting a “quibble” and/or a “one man crusade for truth”
Richard
You beat me to it Richard. You are correct and I am sorry Kip, you are mistaken. Met Office is sliding a false error range to their customers and the world overall.
Richard nails it directly; a measurement error of (+/- 0.1) is the maximum accuracy for most modern physical thermometers.
– That is, if read by a competent thorough and meticulous scientist.
– That is, if the thermometer is officially calibrated correctly on a regular basis!
– Which gives you a measurement error for one thermometer, one reading, one employee
Two thermometers give you the individual thermometer reading accuracy, cumulative.
When Met Office gives you an error range for the global temperature average, their error range must include all possible error ranges:
— For every temperature reading (every thermometer should have a sticker listing the ‘officially certified’ error range.)
— For every station (every station should have a history of ‘verified’ accuracy and an error range.)
— For every employee (every employee has a history of experience and education who becomes known for certain standards of work and an error range.)
All of the ‘modern’ networked stations are not immune to error readings, station installation, station certification, connectivity, etc… all of which establishes the error range for that particular node station
When Met Office gives an anomaly comparison they should be adding in the error ranges for the base years plus the error range for the current year.
One thing you can be absolutely sure of; when Met Office claims a measurement accuracy of (+/- 0.1) they are referring to a single, finest-case individual temperature reading, not their true accuracy.
Reply to Richard S and ATheoK ==> I think we all (nearly all) agree that the Met Office’s admitted / announced “+/-0.1°C” is in actuality too low, way too low, ridiculously too low, a fraud, a lie, or some other complaint.
They have made a giant leap towards truth and transparency, in my opinion. They are the first major governmental agency producing global temperature numbers to do so. Their doing so validates The Pause and invalidates all the “warmest year” nonsense. They knew it would and published anyway.
Give them credit for that alone — if you can find it in your hearts and minds. I do.
Kip:
You’re allowed your personal opinions and peccadillos; even if you do write excellent articles and put them up here with big bull’s eyes.
My heart and mind are somewhat open, (in my narrow opinion); but when a major player in the CAGW scam announces a different lie without explanation, apology or even regret, that lie is still fraudulent.
This is not the result of good intentions or better science. Credit should be given when those hacks actually earn credit, until then their F– might be raised to F-
Reply to ATheoK ==> Well, at least you raise their grade!
The Met Office has some of the most powerful computers in the world, but they calculate what they are told to calculate. I wasn’t entirely happy with their CET annual numbers and voiced my view last July. Now they have accepted the method suggested.
( Vukcevic July 2014: Since monthly data is made of the daily numbers and months are of different length, I recalculated the annual numbers, using weighting for each month’s data, within the annual composite, according to number of days in the month concerned. This method gives annual data which is fractionally higher, mainly due to short February (28 or 29 days). Differences are minor, but still important, maximum difference is ~ 0.07 and minimum 0.01 degrees C. The month-weighted data calculation is the correct method.
MetOffice 2015: Because February is a shorter month than all the others, and is usually colder than most all other months, our previous method, which was giving February equal weight alongside all other months, caused the annual temperature values to be pulled down, i.e. giving estimated annual values which were very slightly too cold (the difference varying between 0.01 and 0.07 degC. )
They also said that all data files, including graphs (seasonal temperatures etc.), are now recalculated according to the weighting method. Of course this applies not only to the CET temperatures, but to other UK composites, annual averages for a number of other climate-related data with monthly averages, etc.
It is a positive development that such a large institution is willing to undertake a major task to alter all their records.
It should be interesting to find out how GISS and NOAA have calculated the US and other annual averaged indices.
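A minimal Python sketch of the weighting difference being described, using invented monthly values purely for illustration (not CET data):

```python
# Day-weighted annual mean vs. a plain mean of twelve monthly means.
days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
monthly_means = [3.5, 3.2, 5.8, 8.0, 11.5, 14.5,
                 16.5, 16.2, 13.8, 10.2, 6.5, 4.0]   # invented values

unweighted = sum(monthly_means) / 12
weighted = (sum(d * t for d, t in zip(days_in_month, monthly_means))
            / sum(days_in_month))

print(f"plain mean of the twelve months: {unweighted:.3f}")
print(f"day-weighted annual mean       : {weighted:.3f}")
print(f"difference                     : {weighted - unweighted:+.3f}")

# A short, cold February drags the unweighted figure down slightly,
# which is the effect (of order 0.01 to 0.07 degC) described above.
```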
Why calculate to an arbitrary concept such as a ‘month’ which you say requires a length of month weighting?
Why not use the daily measurements to create a year-to-date figure then compare completed year or YTD figure to the corresponding previous periods?
In the early parts of the records no complete daily values were available, so monthly values were computed from what was available and the annual average calculated from those. I assume that the force of habit persisted until the end of 2014. The first data published at the beginning of 2015 were based on the improved method.
Good idea to check with your MO (is it BOM ?) what they do.
The Australian BoM is the “match” for the UK MetOffice. However, the BoM are on record for serious manipulation of temperature data. They use (used to use?) only 112 devices to calculate a “national average”, that’s 1 device for every ~68,500 square kilometers of land, largely at airports. And then they changed how that average was calculated in 2013, making 2013 hot and 2014 hotter still.
We now are faced with Perth, Western Australia, suffering severe fire danger. So far, it looks like almost all of the fires were started by arsonists! Where do these people get off?
RE: vukcevic February 8, 2015 at 1:46 am
You are also advocating a monthly figure which requires a length-of-month ‘adjustment/correction/weighting’, which must add another complication to determining accuracy.
Why do we need a number for every day to get an annual average? That seems as arbitrary as a measurement every second, minute, hour, month, etc.
In-filling cannot be accurate to any degree as it is merely a guess, albeit with a mathematical basis.
February is not “usually colder than most other months” in Australia, just the opposite. This looks to be just another “warming” adjustment.
Hi Bob
This above refers to the CET (I wasn’t entirely happy with their CET annual numbers), UK and possibly N. Hemisphere but not global temperatures.
However, if the adjustments are applied to the S. Hemisphere, then the adjustment would be downward rather than upward.
regards
Reply to V ==> Yes, that and a few thousand other odd bits…. with today’s computing abilities there certainly is no reason to bother with yet another arbitrary non-existent value, another interim calculation, called “monthly average”.
Mr. Hansen, thank you for the comment.
However, I do see some value (but only of local interest) in the monthly averages. The months of June-July (either side of the summer solstice, when insolation is greatest) for all of 350 years show no rising trend worth mentioning (less than 0.1C/century, suggesting ‘flat TSI’), while most of the warming trend was generated in December/January, around the winter solstice, the time of minimum insolation; implying that the orientation of the polar jet-stream may be responsible for the upward trend in the CET warming since 1660.
This was an invitation to look at the possibility of separate CET winter & summer forecasts, based on the natural variability.
with best regards
vukcevic
Vuk! I’m surprised you missed two important things.
1. It is only the Northern Hemisphere that is insolated in May-June-July-August. Not the whole planet.
Further, when the Northern Hemisphere land area is being irradiated in May-June-Jul, that land area is being hit with the yearly LOWEST solar radiation (about 1315 maximum watts/m^2).
2. When the Southern Hemisphere is being hit with its highest solar insolation in Dec-Jan-Feb each year, its sole bit of maximum-exposed 17 Mkm^2 land area is “all white”. Thus, the Antarctic continent is hit with 1410 watts/m^2, but ALL of it is reflected immediately back into space. There is no other land area between 40 south and the pole! (We will ignore New Zealand for a few minutes, and give a passing salute to the little corner of Argentina and Chile.) That’s it. So, no “dark land area” between 40 south and the south pole when it is being hit with the yearly high radiation.
Up north? Look again at the globe. Look at how much land in North America, China, Siberia, Russia, all of Europe, and Asia are “dark” and growing darker with ever-more greenery and vegetation! during the northern summer months. It is, what, 50% of the land area being hit by northern hemisphere summer radiation? And all of it darker and hotter due to extra CO2.
Reply to V ==> Yes, seasonal, monthly, regional calculations all have their places and are valuable information, just not a necessary step in calculating Global annual values — in which case they may add more error than they remove.
Cheers, thanks for checking in and talking to me.
RACookPE1978 says
Vuk! I’m surprised you missed two important things……
Hi RAC, thanks for your comment.
I was referring only to the CET records, a small but extremely well documented part of the Northern Hemisphere, and even smaller in global terms. I found that as one moves around the globe, even in the same longitude band, the natural variability drifts in and out of phase. This led Dr. J. Curry to formulate the ‘stadium wave’ hypothesis. I am sceptical about global averaging of data.
Good to see a more rational discussion of errors than that of Phil Jones. He even seems to have omitted a 1/(n-1) in his eq. 12. Phil’s approach cannot be used for heterogeneous data such as global temperatures. If it were true, all we’d have to do is to get enough people holding their middle finger up in the air (to estimate temperature) to get whatever accuracy we wanted.
More than one recent missive from the Met has caused me to think, “Can this be? I’m not sure I can trust it after all that’s gone before…”
You can’t Alan. Witness Betts’ tweets. They are still hell bent on maintaining their fund flows and that distorts their honesty.
“The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.” In the same way that New York is ‘around’ Boston?
‘Author’s Comment Policies: I already know that “everybody” thinks the UK Met Office’s OME is [pick one or more]: way too small’ Funny, I live in the UK and I have never heard anyone claim it’s ‘way too small’, and in reality you cannot ignore the manner in which it’s managed, for the head of any organisation, even if it is a science-based one, sets the tone for the whole organisation, and with Slingo you have a very fixed approach that puts one view above all others on AGW.
Am I alone in being confused as to their use of ‘measure’ and ‘calculate’?
Surely the global average temperature cannot be measured, as they claim, but can only be calculated from many measurements, all with varying accuracies. Also, cannot any calculation be carried to as many significant figures as you like, as long as one accepts that the resultant figure does not necessarily reflect reality?
The flexibility with terms and their definitions seems to be a major problem when debating the CAGW idea:
“When I use a word,” Humpty Dumpty said in rather a scornful tone, “it means just what I choose it to mean — neither more nor less.”
“The question is,” said Alice, “whether you can make words mean so many different things.”
“The question is,” said Humpty Dumpty, “which is to be master – that’s all.”
PS – be gentle with us non-mathematicians/statisticians
Reply to John in O ==> An interesting point — and if we still lived in the glass hi/low thermometer days it would be easier to answer.
To the credit of the Met Office UK, they have finally come around to admitting in a public way that global temperature measurements themselves come with “accuracy bars” — “Original Measurement Error” or “Measurement Accuracy Ranges” — by whatever name, it means what the Met Office says: “the accuracy with which we can measure”. They use lots and lots of measurements — most rounded to the nearest degree F or C (part of the problem), some from instruments that report to 0.1°C but have other built-in errors; untold data are not actually measured at all buy are calculated by in-filling methods, all with known inaccuracies; and with past sea surface temperatures — it is a guess.
They use all that data to calculate a result. The calculation itself has an error range or a CI, the measurements have their own measurement accuracies (obversely, measurement inaccuracies or error).
There are two very in-depth, complex scientific research papers at the bottom of the essay that are the basis for their obviously “good round estimate” of 0.1°C.
In the end it is considered by many to be improper to give as a result a number that is “more precise” than the precision of the measurements it depends on.
Errata ==> “….data are not actually measured at all but are calculated….
“The UK’s Met Office (officially the Meteorological Office until 2000) is the national weather service for the United Kingdom. Its Hadley Centre in conjunction with Climatic Research Unit (University of East Anglia) created and maintains one of the world’s major climatic databases, currently known as HADCRUT4 which is described by the Met Office as “Combined land [CRUTEM4] and marine [sea surface] temperature anomalies on a 5° by 5° grid-box basis”.” — with the passing of time the met network has changed. In urban areas the network has increased, and in rural areas — though it has increased for the measurement of rainfall — it has not changed that much, with more than two-thirds under rural areas. This is an important issue for the representativeness of the global average temperature. This varies from region to region, country to country. The data have observational errors as well as averaging errors, as well as a change of unit of measurement — prior to 1957 it was in °F and later in °C. With all these errors and inaccurate grids, etc., how can we say the temperature increase is caused by anthropogenic greenhouse gases, and then extrapolate to 2100 and beyond using models and sensationalization of their impact on nature? This sensationalization is diverting governments in the wrong direction.
Dr. S. Jeevananda Reddy
Kip: “I warn commenters against the most common errors: substituting definitions from specialized fields (like “statistics”) for the simple arithmetical concepts used in the essay and/or quoting The Learned as if their words were proofs. I will not respond to comments that appear to be intentionally misunderstanding the essay.”
Stand up for what you believe, Kip. It is your right to interpret the Met Office’s report in arithmetic terms, even though statistics is the conventional method for extracting “meaning from data” and used by all practicing scientists. Conventional statistics would be appropriate at a science blog, but aren’t necessary here, where everyone’s personal perspective is valued.
As for “original measurement error”, it is a systematic error on your part, but not the Met Office’s.
Reply to ItsMyTurn ==> Maybe you have some other mathematics that produces “means” by some other method. I’d love to see it. If you have performed the maths with the three-data data set described in the essay and come up with a different answer, I’d love to see the detailed work.
If you not only read my essay and do the homework, but also read the two papers on which the Met Office bases its statement (“The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”) and still retain your original viewpoint, please check in again and we’ll see if we can work it out.
That’s right. Claiming perfect accuracy would be pure nonsense even if the results were reported in absolute figures and not against some relative, moving and thus arbitrary ‘normals’.
Let alone in a gigantic and chaotic system like the Earth, where the blink of an eye can make a bigger difference than the claimed accuracy (0.1°C), even in a fixed location. How many square kilometers is one measurement point supposedly representing, anyway?
In addition, the sooner the underlying data ‘corrections’ are stopped, the sooner the hysteria about weather (and about what gets labelled climate) can be stopped and real-world problems can be identified instead.
“How many square kilometers is one measurement point supposedly representing, anyway?”
311,933.4 km² at the equator. (A box with a side of 558.5 km)
What? Just one, randomly chosen measurement point to represent 312,000 sq. km?
Think of the scope for adjustment in that!
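For readers who want to check that grid-box figure for themselves, here is a small Python sketch (a back-of-the-envelope of my own, assuming a spherical Earth of radius 6371 km) that computes the area of a 5° by 5° grid box as a function of latitude:

```python
import math

R = 6371.0  # mean Earth radius in km (spherical approximation)

def grid_box_area(lat_deg, box_deg=5.0):
    """Area in km^2 of a box_deg x box_deg lat/lon box whose southern edge is at lat_deg."""
    lat1 = math.radians(lat_deg)
    lat2 = math.radians(lat_deg + box_deg)
    dlon = math.radians(box_deg)
    # Exact area of a lat/lon cell on a sphere: R^2 * dlon * (sin(lat2) - sin(lat1))
    return R ** 2 * dlon * (math.sin(lat2) - math.sin(lat1))

print(f"5x5 box at the equator: {grid_box_area(0):,.0f} km^2")   # about 309,000 km^2
print(f"5x5 box at 60 N:        {grid_box_area(60):,.0f} km^2")  # less than half of that
```

The exact figure depends on the Earth model assumed, which is why it differs slightly from the number quoted above; either way, a single equatorial grid box covers an area larger than many countries.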
Is globally averaged temperature just a tool for social control, one that would have been the envy of the Catholic Church of the Middle Ages?
continuation — In the last hundred years there has been vast variation in rural and urban ecology [land use and land cover], and this plays a major role in temperature. You can see the sea change in the USA, the UK, or any other developed country. Are we able to represent these variations when averaging the temperature, given that they vary from region to region and country to country? Also, so far nobody has been able to explain why the temperature changes in a staircase [step-wise] fashion; year-to-year variation and natural cyclic variation are understood, but the step changes are not.
Dr. S. Jeevananda Reddy
I have performed a reanalysis of all this data, which shows the temperature record isn’t what the Hadley Centre thinks it is. My conclusion is really accurate: the temperature has increased and we are all going to die! I submitted my paper to Nature Climate Change, and I’m sure it will pass peer review and be the number one paper of the year. A rough draft is available for those who can understand very complex words:
http://21stcenturysocialcritic.blogspot.com.es/2015/02/planet-wide-reanalysis-shows-high.html
I like it, Kip. Great explanation and a nice way of separating measuring-instrument error from statistical error.
You could do another post on measurement error through the last 150 years, 50 years, etc. That would also be extremely interesting.
Many thanks
Reply to Stephen Richards ==> Thank you. That is one of my major projects — probably never to be realized. I have collected data for such a project and might post an essay giving highlights but not in-depth analysis.
The two Met Office papers cover a lot of that ground.
Is this before or after station data adjustments?????
http://www.telegraph.co.uk/news/earth/environment/globalwarming/11395516/The-fiddling-with-temperature-data-is-the-biggest-science-scandal-ever.html
“Following my last article, Homewood checked a swathe of other South American weather stations around the original three. In each case he found the same suspicious one-way “adjustments”.”
I was wondering the same, or similar if not exactly the same.
How does the homogenization affect the accuracy?
My guess is that homogenization lowers the accuracy.
Reply to urederra ==> “How does the homogenization affect the accuracy?” How does the in-filling affect the accuracy?
Interesting questions, not to be answered here, except to mention that, of course, any time the data itself is touched in a way that affects it, that changes the results. How much? I certainly don’t know.
Accuracy is the difference between the measured value and reality; homogenized values are synthetic and therefore have no defined accuracy.
You could use proof by induction to prove the main point in the post; no need to repeat oneself with bigger numbers.
As far as spread of measurements, I regularly drive about half an hour away, and, according to my car, the temperature between the two end points is vastly different to the temperatures along the way. There are weather stations at both those end points, and therefore any attempt to calculate the region’s temperature based on those two end points will be wrong.
Thus, they are just kidding themselves.
In the UK there was a 9°C difference between one side of the Pennines and the other.
Reply to Jarryd Beck ==> Absolutely correct — and there is a simple arithmetic proof as well.
However, actually forcing oneself to do these simple data-set experiments drives the point home in one’s mind much more effectively.
I will admit that I had to be forced to do this through the five-data data set (increasingly boring and painstaking to do by hand) before I really, really believed it.
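For anyone who would rather not grind through the five-data data set by hand, here is a minimal Python sketch of the same kind of exhaustive check (using made-up anomaly values of my own, not the essay’s exact data set): every reading is allowed to sit at its low end, its recorded value, or its high end, and the resulting spread of the mean is examined.

```python
from itertools import product

# Illustration only: five made-up anomaly values, each known only to +/- 0.1
values = [0.45, 0.52, 0.47, 0.50, 0.49]
delta = 0.1

means = []
# Each reading could be low, as recorded, or high: 3**5 = 243 combinations
for offsets in product((-delta, 0.0, +delta), repeat=len(values)):
    adjusted = [v + o for v, o in zip(values, offsets)]
    means.append(sum(adjusted) / len(adjusted))

centre = sum(values) / len(values)
print(f"Mean of the recorded values: {centre:.3f}")
print(f"Lowest possible mean:        {min(means):.3f}")   # centre - 0.1
print(f"Highest possible mean:       {max(means):.3f}")   # centre + 0.1
```

In the worst case the mean sits a full 0.1 above or below the mean of the recorded values, which is the arithmetic point the essay makes; whether that worst case is the right way to characterise the uncertainty is exactly what is argued in the comments that follow.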
“Importantly, Met Office states clearly that the Uncertainty Range derives from the accuracy of measurement and thus represents the Original Measurement Error (OME). “
No, they didn’t. Nothing of the kind. They said:
“The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
The Met paper of Morice et al 2012 sets out what they mean by the measurement error of the global average:
“A detailed measurement error and bias model was constructed for HadCRUT3 [Brohan et al., 2006]. This included descriptions of: land station homogenization uncertainty; bias related uncertainties arising from urbanization, sensor exposure and SST measurement methods; sampling errors arising from incomplete measurement sampling within grid–boxes; and uncertainties arising from limited global coverage. “
It didn’t suddenly come to mean OME for HAD 4.
“Here are the possible values: 1.8, 1.7, 1.6 (and all values in between)”
No, it isn’t. 1.8 and 1.6 are the 95% limits. They are not the ends of the range. And they certainly aren’t as likely as the central value.
Calculating the standard error of the mean is elementary statistics. And this is not how to do it.
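For reference, here is a minimal Python sketch of the standard-error-of-the-mean calculation being referred to (my own illustration with assumed numbers, not the HadCRUT4 procedure): if reading errors are independent and random with standard deviation sigma, the uncertainty of their mean shrinks as 1/sqrt(n).

```python
import math
import random

random.seed(0)

n = 100        # number of independent readings (assumed for illustration)
sigma = 0.5    # per-reading error standard deviation in degrees C (assumed)

# Analytic standard error of the mean for independent, random errors
sem = sigma / math.sqrt(n)
print(f"Analytic standard error of the mean: {sem:.4f} C")

# Monte Carlo check: repeat the "average n noisy readings" experiment many times
trials = 5000
true_value = 0.0
sample_means = [
    sum(true_value + random.gauss(0.0, sigma) for _ in range(n)) / n
    for _ in range(trials)
]
avg = sum(sample_means) / trials
spread = math.sqrt(sum((m - avg) ** 2 for m in sample_means) / trials)
print(f"Simulated spread of the mean:        {spread:.4f} C")

# Note: this 1/sqrt(n) shrinkage assumes the errors are independent and random;
# systematic biases shared by many readings do not average away like this.
```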
Where measurement accuracy (+/-0.1) becomes recording accuracy (+/-0.5), as it did for most of the last century, the distribution becomes more square, making all values roughly equally likely.
http://www.srh.noaa.gov/ohx/dad/coop/EQUIPMENT.pdf
I don’t believe that is measurement accuracy; it is more likely the reading you get. The link says, in big bold type, not to record the temperatures in tenths of degrees but in whole degrees. So now we get (or got) temperature records to the nearest degree, and we are talking about differences of hundredths of a degree being significant.
“The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
What this means is surely that it is just as likely for any reading that the correct value is 1.6 or 1.8 or any value in between. Before assuming that 1.7, the central value, is the mean, one should investigate the accuracy of the measuring agent. Perhaps there is a bias and each reading of 1.7 should be 1.8, or each reading of 1.7 should be 1.6.
If the accuracy of the instruments used can only be ascertained to 0.1, then baldly stating that the mean is 1.7 is wrong. The Met Office is correct in its statement, and Mr. Stokes’s interpretation of that statement
“No, it isn’t. 1.8 and 1.6 are the 95% limits. They are not the ends of the range. And they certainly aren’t equally likely to the center value”
is wishful thinking, based on the erroneous assumption that +0.1 is as likely as -0.1, which is totally unproven.
“What this means is surely that it is just as likely for any reading that the correct value is 1.6 or 1.8 or any value in between.”
So what do you think the 95% means?
Reply to Nick Stokes ==> Linguistic agreement aside — read my whole essay and see that the Met Office uses Uncertainty Range (not CI) and defines it in their FAQ, which I quote precisely, as “The accuracy with which we can measure the global average temperature”. If you wish to think this is a statistical confidence interval, you will have to do so on your own.
You end your quote of Morice et al. (2012) way too soon. Carrying on immediately from where you left off, it goes on to say:
“The uncertainty model of Brohan et al. [2006] allowed conservative bounds on monthly and annual temperature averages to be formed. However, it did not provide the means to easily place bounds on uncertainty in statistics that are sensitive to low frequency uncertainties, such as those arising from step changes in land station records or changes in the makeup of the SST observation network. This limitation arose because the uncertainty model did not describe biases that persist over finite periods of time, nor complex spatial patterns of interdependent errors.
To allow sensitivity analyses of the effect of possible pervasive low frequency biases in the observational near-surface temperature record, the method used to present these uncertainties has been revised. HadCRUT4 is presented as an ensemble data set in which the 100 constituent ensemble members sample the distribution of likely surface temperature anomalies given our current understanding of these uncertainties. This approach follows the use of the ensemble method to represent observational uncertainty in the HadSST3 [Kennedy et al., 2011a, 2011b] ensemble data set.”
and then, in the end, refers to this whole process, as well as that resulting from the sea surface paper, as “The accuracy with which we can measure the global average temperature”.
If you feel so moved, you should write them a public letter explaining your contrary viewpoint. Maybe they will issue a correction.
Kip,
” Uncertainty Range (not CI) and defines it in their FAQ, which I quote precisely, as “The accuracy with which we can measure the global average temperature”. If you wish to think this is a statistical confidence interval, you will have to do so on your own.”
They actually said:
“+-0.1° C is the 95% uncertainty range.”
If that isn’t a confidence interval, what do you think the 95% means?
From your quote:
” HadCRUT4 is presented as an ensemble data set in which the 100 constituent ensemble members sample the distribution of likely surface temperature anomalies given our current understanding of these uncertainties.”
What distribution could they be referring to?
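For what it’s worth, here is a minimal Python sketch of how a 95% range can be read off an ensemble like the one described in that quote (a generic illustration with simulated numbers of my own, not the Met Office’s actual code or data): each “ensemble member” is one plausible value of the year’s anomaly, and the quoted range comes from the spread across the members.

```python
import random
import statistics

random.seed(1)

# Stand-in for 100 ensemble members' estimates of one year's anomaly.
# Purely simulated for illustration; real HadCRUT4 members come from the
# Morice et al. (2012) ensemble method.
members = sorted(0.56 + random.gauss(0.0, 0.05) for _ in range(100))

lower = members[2]     # roughly the 2.5th percentile of 100 sorted values
upper = members[97]    # roughly the 97.5th percentile
median = statistics.median(members)

print(f"Median estimate:        {median:.2f} C")
print(f"Approximate 95% range:  {lower:.2f} to {upper:.2f} C")
```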
Reply to Nick Stokes ==> I guess you’ll have to take it up with the Met Office…. They repeatedly [said] the following:
…which is much less than the accuracy with which either value can be calculated.
It is not possible to calculate the global average temperature anomaly with perfect accuracy because the underlying data contain measurement errors and because the measurements do not cover the whole globe.
The accuracy with which we can measure…
They try to calculate the accuracy (which is another way to say the inaccuracy) and though it takes two learned papers to do so, they finally seem to agree, with a high degree of certainty, that 0.1°C represents that accuracy/inaccuracy.
Everyone is entitled to an opinion about the matter.
Met Office language may leave something to be desired, but 95% ranges for the true values of temperature pretty well have to be confidence intervals, or Bayesian credible intervals, which clearly do not apply here. Section 3.2 of the Morice paper describes the distribution assumptions used to generate these CIs using Monte Carlo methods – the 100-member ‘ensemble’. It seems heavily ‘statistical’.
Reply to basicstats ==> It seems perfectly clear to me that the UK Met Office bends over backwards to communicate that they are talking of measurement accuracy — not the confidence interval of the results.
They may well be coming up with an “accuracy of measurement” value which they hold to a certain degree of confidence, which seems to be what is confusing many here.
The Met Office is very statistically sophisticated — if they had meant CI, they simply would have said so, rather than using various forms of the words “measurement accuracy” over and over. If you read their two papers, it is strikingly clear that they are trying to find a way to quantify “the accuracy with which they can measure global temperature” — not to simply throw CI at their result.
Honest opinions may vary….
What is the probability, using the choices on your graph, of 2014 being 0.1°C colder, correct, or higher by 0.1°C? Answer: 1/3 for each.
How about 1998 higher and 2014 lower by 0.1°C? 1/9.
The chance that the data set shown is totally accurate for each and every year plotted? 1 in 1,162,261,467!
Take any specific combination you like; that’s the chance of it being totally accurate, because that’s the total number of different combinations available within the error margin given by the Met Office.
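The arithmetic behind that comment is easy to check with a few lines of Python, taking, as the commenter does, three equally likely discrete options per year across the 19 plotted years (1996-2014):

```python
years = 2014 - 1996 + 1      # 19 years on the plotted portion of the record
options_per_year = 3         # -0.1 C, as plotted, +0.1 C (the commenter's discretisation)

combinations = options_per_year ** years
print(f"Total combinations:       {combinations:,}")        # 1,162,261,467
print(f"One specific year:        1/{options_per_year}")    # 1/3
print(f"Two specific years:       1/{options_per_year**2}") # 1/9
print(f"Any one full combination: 1 in {combinations:,}")
```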