Guest Essay by Kip Hansen
Those following the various versions of the “2014 was the warmest year on record” story may have missed what I consider to be the most important point.
The UK’s Met Office (officially the Meteorological Office until 2000) is the national weather service for the United Kingdom. Its Hadley Centre, in conjunction with the Climatic Research Unit (University of East Anglia), created and maintains one of the world’s major climatic databases, currently known as HadCRUT4, which the Met Office describes as “Combined land [CRUTEM4] and marine [sea surface] temperature anomalies on a 5° by 5° grid-box basis”.
The first image here is their current graphic representing HadCRUT4 with hemispheric and global values.
The Met Office, in their announcement of the new 2014 results, made this [rather remarkable] statement:
“The HadCRUT4 dataset (compiled by the Met Office and the University of East Anglia’s Climatic Research Unit) shows last year was 0.56C (±0.1C*) above the long-term (1961-1990) average.”
The asterisk (*) beside (+/-0.1°C) is explained at the bottom of the page:
“*0.1° C is the 95% uncertainty range.”
So, taking just the 1996–2014 portion of the HadCRUT4 anomalies and adding in the Uncertainty Range as “error bars”, we get:
The journal Nature has a policy that any graphic with “error bars” – with quotes because these types of bars can be many different things – must include an explanation as to exactly what those bars represent. Good idea!
Here is what the Met Office means by Uncertainty Range with regard to HadCRUT4, from their FAQ:
“It is not possible to calculate the global average temperature anomaly with perfect accuracy because the underlying data contain measurement errors and because the measurements do not cover the whole globe. However, it is possible to quantify the accuracy with which we can measure the global temperature and that forms an important part of the creation of the HadCRUT4 data set. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius. The difference between the median estimates for 1998 and 2010 is around one hundredth of a degree, which is much less than the accuracy with which either value can be calculated. This means that we can’t know for certain – based on this information alone – which was warmer. However, the difference between 2010 and 1989 is around four tenths of a degree, so we can say with a good deal of confidence that 2010 was warmer than 1989, or indeed any year prior to 1996.” (emphasis mine)
This is a marvelously frank and straightforward statement. Let’s parse it a bit:
• “It is not possible to calculate the global average temperature anomaly with perfect accuracy …. “
Announcements of temperature anomalies given as very precise numbers must be viewed in light of this general statement.
• “…. because the underlying data contain measurement errors and because the measurements do not cover the whole globe.”
The reason for the first point is that the original data themselves – right down to the daily and hourly temperatures recorded in enormous data sets – contain actual measurement errors (including such issues as equipment accuracy and units of measurement), plus errors introduced by the various in-filling methods used to compensate for the fact that “measurements do not cover the whole globe”.
• “However, it is possible to quantify the accuracy with which we can measure the global temperature and that forms an important part of the creation of the HadCRUT4 data set. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
Note well that the Met Office is not talking here of statistical confidence intervals but of “the accuracy with which we can measure” – measurement accuracy and its obverse, measurement error. What is that measurement accuracy? “…around one tenth of a degree Celsius” or, in common notation, +/- 0.1 °C. Note also that this is the Uncertainty Range given for the HadCRUT4 anomalies around 2010 – this uncertainty range does not apply, for instance, to anomalies in the 1890s or the 1960s.
• “The difference between the median estimates for 1998 and 2010 is around one hundredth of a degree, which is much less than the accuracy with which either value can be calculated. This means that we can’t know for certain – based on this information alone – which was warmer.”
We can’t know (for certain or otherwise) which was warmer – and the same holds for any of the other 21st-century data points that are reported as within hundredths of a degree of one another. The values can only be calculated to an accuracy of +/- 0.1˚C.
• “However, the difference between 2010 and 1989 is around four tenths of a degree, so we can say with a good deal of confidence that 2010 was warmer than 1989, or indeed any year prior to 1996.”
It is nice to see them say “we can say with a good deal of confidence” instead of using a categorical “without a doubt”. When two data points differ by four tenths of a degree, they are confident both that there is a difference and of its sign, + or –.
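The overlap logic in the bullets above can be sketched in a few lines of Python. This is my own illustration, not Met Office code, and the anomaly values are made-up stand-ins for the median estimates discussed:

```python
def distinguishable(a, b, ome=0.1):
    """True only if the ranges a +/- ome and b +/- ome do not overlap,
    i.e. we can be confident which value is larger."""
    return abs(a - b) > 2 * ome

# Illustrative anomalies ~0.01 C apart (like 1998 vs 2010): ranges overlap
print(distinguishable(0.52, 0.53))  # False -- cannot say which was warmer

# Illustrative anomalies ~0.4 C apart (like 1989 vs 2010): no overlap
print(distinguishable(0.13, 0.53))  # True -- the warmer year is clear
```

One could argue about the exact overlap criterion, but the point stands either way: a one-hundredth-of-a-degree difference disappears inside two overlapping tenth-of-a-degree ranges.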
Importantly, the Met Office states clearly that the Uncertainty Range derives from the accuracy of measurement and thus represents the Original Measurement Error (OME). Their Uncertainty Range is not a statistical 95% Confidence Interval. While they may have had to rely on statistics to help calculate it, it is not itself a statistical animal. It is really and simply the Original Measurement Error – the combined measurement errors and inaccuracies of all the parts and pieces, rounded off to a simple +/- 0.1˚C – which they consider 95% reliable, leaving a one-in-twenty chance that it is larger or smaller. (I give links to the two supporting papers for HadCRUT4 uncertainty at the end of the essay.****)
The UK Met Office is my “Hero of the Day” for announcing their result with its OME attached – 0.56C (±0.1˚C) – and publicly explaining what it means and where it came from.
[ PLEASE – I know that many, maybe even almost everyone reading here, think that the Met Office’s OME is too narrow. But the Met Office gets credit from me for the above – especially given that the effect is to validate The Pause publicly and scientifically. They give their two papers**** supporting their OME number, which readers should read out of collegial courtesy before weighing in with lots of objections to the number itself. ]
Notice carefully that the Met Office calculates the OME for the metric and then assigns that whole OME to the final Global Average. They do not divide the error range by the number of data points, they do not reduce it, they do not minimize it, they do not pretend that averaging eliminates it because it is “random”, and they do not simply ignore it as if it were not there at all. They just tack it on to the final mean value – Global_Mean( +/- 0.1°C ).
In my previous essay on Uncertainty Ranges… there was quite a bit of discussion of this very interesting, and apparently controversial, point:
Does deriving a mean* of a data set reduce the measurement error?
Short Answer: No, it does not.
I am sure some of you will not agree with this.
So, let’s start with a couple of kindergarten examples:
Here’s our data set: 1.7(+/-0.1)
Pretty small data set, but let’s work with it.
Here are the possible values: 1.8, 1.7, 1.6 (and all values in between)
We state the mean = 1.7. Obviously, with one datum, it is itself the mean.
What are the other values, the whole range represented by 1.7(+/-0.1)?:
1.8 and every value down to and including 1.6
What is the uncertainty range?: + or – 0.1, or 0.2 in total
How do we write this?: 1.7(+/-0.1)
Here is our new data set: 1.7(+/-0.1) and 1.8(+/-0.1)
Here are the possible values:
1.7 (and its +/-s) 1.8, 1.6
1.8 (and its +/-s) 1.9, 1.7
What’s the mean of the data points? 1.75
What are the other possible values for the mean?
If both data are raised to their highest value +0.1:
1.7 + 0.1 = 1.8
1.8 + 0.1 = 1.9
If both are lowered to their lowest -0.1:
1.7 – 0.1 = 1.6
1.8 – 0.1 = 1.7
What is the mean of the widest spread?
(1.9 + 1.6) / 2 = 1.75
What is the mean of the lowest two data?
(1.6 + 1.7) / 2 = 1.65
What is the mean of the highest two data?
(1.8 + 1.9) / 2 = 1.85
The above give us the range of possible means: 1.65 to 1.85
0.1 above the mean and 0.1 below the mean, a range of 0.2
Of which the mean of the range is: 1.75
Thus, the mean is accurately expressed as 1.75(+/-0.1)
Notice: The Uncertainty Range, +/-0.1, remains after the mean has been determined. It has not been reduced at all, despite doubling the “n” (number of data). This is not a statistical trick; it is elementary arithmetic.
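For readers who prefer to check arithmetic by machine, here is a minimal Python sketch of the two-datum walk-through above, using the essay's worst-case interval arithmetic (my own illustration – nothing statistical here, just the additions and divisions shown above, with rounding to absorb floating-point noise):

```python
# A sketch of the essay's worst-case interval arithmetic (not Met Office code).
data = [1.7, 1.8]
ome = 0.1  # shared original measurement error, +/-0.1

mean = round(sum(data) / len(data), 10)                      # 1.75
highest = round(sum(x + ome for x in data) / len(data), 10)  # (1.8 + 1.9) / 2 = 1.85
lowest = round(sum(x - ome for x in data) / len(data), 10)   # (1.6 + 1.7) / 2 = 1.65

half_range = round((highest - lowest) / 2, 10)               # 0.1 -- unchanged
print(f"{mean}(+/-{half_range})")                            # 1.75(+/-0.1)
```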
We could do this same example for data sets of three data, then four, then five, then five hundred, and the result would be the same. I have actually done this for up to five data, using a matrix of the data – all the pluses and minuses, all the means of the different combinations – and I assure you, it always comes out the same. The uncertainty range, the original measurement accuracy or error, does not reduce or disappear when finding the mean of a set of data.
I invite you to do this experiment yourself. Try the simpler 3-data example using data like 1.6, 1.7 and 1.8, all +/- 0.1. Make a matrix of the nine values: 1.6, 1.6 + 0.1, 1.6 – 0.1, etc. Figure all the means. You will find a range of means with the highest possible mean 1.8, the lowest possible mean 1.6, and a midpoint of 1.7 – or, in other notation, 1.7(+/-0.1).
Really, do it yourself.
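If you would rather let a computer grind out the matrix, here is one way to do the 3-data exercise exhaustively in Python (my own sketch of the experiment described above, checking every combination rather than just the extremes):

```python
from itertools import product

# The nine-value matrix for the 3-data experiment, done exhaustively.
data = [1.6, 1.7, 1.8]
ome = 0.1

# Every combination of (datum - ome, datum, datum + ome): 3**3 = 27 cases.
means = [round(sum(combo) / len(combo), 10)
         for combo in product(*[(x - ome, x, x + ome) for x in data])]

print(min(means))  # 1.6 -- all three data at their lowest
print(max(means))  # 1.8 -- all three data at their highest
print(round((min(means) + max(means)) / 2, 10))  # 1.7 -- midpoint of the range
```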
This has nothing to do with the precision of the mean. You can figure a mean to whatever precision you like from as many data points as you like. If your data share a common uncertainty range (an original measurement error, a calculated ensemble uncertainty range such as that found in HadCRUT4, or one determined by whatever method), it will appear in your results exactly the same as the original – in this case, exactly +/- 0.1.
The reason for this is clearly demonstrated in our kindergarten examples of 1-, 2- and 3-data data sets – it is a result of the actual arithmetical process one must use in finding the mean of data, each of which represents a range of values with a common range width*****. No amount of throwing statistical theory at this will change it – it is not a statistical idea, but rather an application of common grade-school arithmetic. The result is a range of possible means, the midpoint of which we use as “the mean” – the same value we get by averaging the data points while ignoring the fact that each is a range. This range of means is commonly represented with the notation:
Mean_of_the_Data Points(+/- one half of the range)
– in one of our examples, the mean found by averaging the data points is 1.75, the mean of the range of possible means is 1.75, and the range is 0.2, one half of which is 0.1 – thus our mean is represented as 1.75(+/-0.1).
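The same worst-case arithmetic can be sketched for any number of data points. In the snippet below, `mean_shift` is a made-up helper name for illustration; it shows that shifting every datum to the top of its shared +/-0.1 range shifts the mean by exactly 0.1, regardless of n:

```python
# My own sketch of the essay's worst-case interval arithmetic, generalized:
# if every datum carries the same +/-ome, moving all of them to their
# extremes moves the mean by exactly ome, no matter how many data there are.
def mean_shift(data, ome=0.1):
    n = len(data)
    plain = sum(data) / n
    highest = sum(x + ome for x in data) / n
    return round(highest - plain, 8)  # equals ome, up to float rounding

for n in (2, 5, 50, 500):
    data = [1.0 + 0.01 * i for i in range(n)]
    print(n, mean_shift(data))  # the +/- half-range stays 0.1 for every n
```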
If this notation X(+/-y) represents a value with its original measurement error (OME), maximum accuracy of measurement, or any of the other ways of saying that the (+/-y) bit results from the measurement of the metric, then X(+/-y) is a range of values and must be treated as such.
Original Measurement Error of the data points in a data set, by whatever name**, is not reduced or diminished by finding the mean of the set – it must be attached to the resulting mean***.
# # # # #
* – To prevent quibbling, I use this definition of “Mean”: Mean (or arithmetic mean) is a type of average. It is computed by adding the values and dividing by the number of values. Average is a synonym for arithmetic mean – which is the value obtained by dividing the sum of a set of quantities by the number of quantities in the set. An example is (3 + 4 + 5) ÷ 3 = 4. The average or mean is 4. http://dictionary.reference.com/help/faq/language/d72.html
** – For example, HadCRUT4 uses the language “the accuracy with which we can measure” the data points.
*** – Also note that any use of the mean in further calculations must acknowledge and account for – both logically and mathematically – that the mean written as “1.7(+/-0.1)” is in reality a range and not a single data point.
**** – The two supporting papers for the Met Office measurement error calculation are:
Colin P. Morice, John J. Kennedy, Nick A. Rayner, and Phil D. Jones
J. J. Kennedy , N. A. Rayner, R. O. Smith, D. E. Parker, and M. Saunby
***** – There are more complicated methods for calculating the mean and the range when the ranges of the data (OME ranges) are different from datum to datum. This essay does not cover that case. Note that the HADCRUT4 papers do discuss this somewhat as the OMEs for Land and Sea temps are themselves different.
# # # # #
Author’s Comment Policies: I already know that “everybody” thinks the UK Met Office’s OME is [pick one or more]: way too small, ridiculous, delusional, an intentional fraud, just made up or the result of too many 1960s libations. Repeating that opinion (with endless reasons why) or any of its many incarnations will not further enlighten me nor the other readers here. I have clearly stated that it is the fact that they give it at all and admit to its consequences that I applaud. Also, this is not the place to continue your One Man War for Truth in Climate Science (no matter which ‘side’ you are on) – please take that elsewhere.
Please try to keep comments to the main points of this essay –
Met Office’s remarkable admission of “accuracy with which we can measure the global average temperature” and that statement’s implications.
“Finding the Mean does not Reduce Original Measurement Error”.
I expect a lot of disagreement – this simple fact runs against the tide of “Everybody-Knows Folk Science” and I expect that, if admitted to be true, it would “invalidate my PhD”, “deny all of science”, or represent some other existential threat to some of our readers.
Basic truths are important – they keep us sane.
I warn commenters against the most common errors: substituting definitions from specialized fields (like “statistics”) for the simple arithmetical concepts used in the essay and/or quoting The Learned as if their words were proofs. I will not respond to comments that appear to be intentionally misunderstanding the essay.
# # # # #