Guest Essay by Kip Hansen
It seems that every time we turn around, we are presented with a new Science Fact that such-and-so metric — Sea Level Rise, Global Average Surface Temperature, Ocean Heat Content, Polar Bear populations, Puffin populations — has changed dramatically ("It's unprecedented!"), and these statements are often backed by a graph illustrating the sharp rise (or, in other cases, sharp fall) as the anomaly of the metric from some baseline. In most cases the anomaly is actually very small, and the change is magnified by cranking up the y-axis to make this very small change appear to be a steep rise (or fall). Adding power to these statements and their graphs is the claimed precision of the anomaly — for Global Average Surface Temperature, it is often shown in tenths or even hundredths of a degree Centigrade. Compounding the situation, the anomaly is shown with no (or very small) "error" or "uncertainty" bars, which, even when shown, are not error bars or uncertainty bars at all but statistical Standard Deviations (and are only sometimes so marked or labelled).
I wrote about this several weeks ago in an essay here titled “Almost Earth-like, We’re Certain”. In that essay, which the Science and Environmental Policy Project’s Weekly News Roundup characterized as “light reading”, I stated my opinion that “they use anomalies and pretend that the uncertainty has been reduced. It is nothing other than a pretense. It is a trick to cover-up known large uncertainty.”
Admitting first that my opinion has not changed, I thought it would be good to explain more fully why I say such a thing — which is rather insulting to a broad swath of the climate science world. There are two things we have to look at:
1. Why I call it a "trick", and
2. Who is being tricked.
WHY I CALL THE USE OF ANOMALIES A TRICK
What exactly is "finding the anomaly"? Well, it is not what is generally thought. The simplified explanation is that one takes the annual averaged surface temperature and subtracts from it the 30-year climatic average, and what is left is "The Anomaly".
That’s the idea, but that is not exactly what they do in practice. They start finding anomalies at a lower level and work their way up to the Global Anomaly. Even when Gavin Schmidt is explaining the use of anomalies, careful readers see that he has to work backwards to Absolute Global Averages in Degrees — by adding the agreed upon anomaly to the 30-year mean.
“…when we try and estimate the absolute global mean temperature for, say, 2016. The climatology for 1981-2010 is 287.4±0.5K, and the anomaly for 2016 is (from GISTEMP w.r.t. that baseline) 0.56±0.05ºC. So our estimate for the absolute value is (using the first rule shown above) is 287.96±0.502K, and then using the second, that reduces to 288.0±0.5K.”
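The "rules" Schmidt refers to are not reproduced here; a reasonable reading is that the first combines the two independent uncertainties in quadrature and the second rounds the result to a sensible number of figures. Under that assumption (my reading, not Schmidt's own code), the arithmetic can be checked in a few lines of Python:

```python
import math

# Assumed reading of the quoted arithmetic: add the central values, combine the
# two independent uncertainties in quadrature, then round to sensible figures.
climatology, u_climatology = 287.4, 0.5    # 1981-2010 baseline, K
anomaly, u_anomaly = 0.56, 0.05            # 2016 anomaly w.r.t. that baseline, K

absolute = climatology + anomaly                          # 287.96
u_combined = math.sqrt(u_climatology**2 + u_anomaly**2)   # about 0.502

print(f"{absolute:.2f} +/- {u_combined:.3f} K")              # 287.96 +/- 0.502 K
print(f"{round(absolute, 1)} +/- {round(u_combined, 1)} K")  # 288.0 +/- 0.5 K
```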
But for our purposes, let's just consider the anomaly to be the 30-year mean subtracted from the calculated GAST in degrees.
As Schmidt kindly points out, the correct notation for a GAST in degrees is something along the lines of 288.0±0.5K — that is a number of degrees to tenths of a degree and the uncertainty range ±0.5K. When a number is expressed in that manner, with that notation, it means that the actual value is not known exactly, but is known to be within the range expressed by the plus/minus amount.

This illustration shows this in actual practice with temperature records: the measured temperatures are rounded to full degrees Fahrenheit — a notation that represents ANY of the infinite number of continuous values between 71.5 and 72.4999999…
It is not a measurement error; it is the measured temperature represented as a range of values, 72 +/- 0.5. It is an uncertainty range: we are totally in the dark as to the actual temperature — we know only the range.
Well, for the normal purposes of human beings, the one-degree-wide range is quite enough information. It gets tricky for some purposes when the temperature approaches freezing — above or below frost/freezing temperatures being Climatically Important for farmers, road maintenance crews and airport airplane maintenance people.
No matter what we do to temperature records, we have to deal with the fact that the actual temperatures were not recorded — we only recorded ranges within which the actual temperature occurred.
This means that when these recorded temperatures are used in calculations, they must remain as ranges and be treated as such. What cannot be discarded is the range of the value. Averaging (finding the mean or the median) does not eliminate the range — the average still has the same range. (See Durable Original Measurement Uncertainty.)
As an aside: when Climate Science and meteorology present us with the Daily Average temperature from any weather station, they are not giving us what you would think of as the "average", which in plain language refers to the arithmetic mean; rather, we are given the median temperature — the number exactly halfway between the Daily High and the Daily Low. So, rather than finding the mean by adding the hourly temperatures and dividing by 24, we get the result of Daily High plus Daily Low divided by 2. These "Daily Averages" are then used in all subsequent calculations of weekly, monthly, seasonal, and annual averages. These Daily Averages have the same 1-degree-wide uncertainty range.
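To see how the two "averages" can differ, here is a small sketch with an invented set of 24 hourly readings (not real station data); the arithmetic mean of the hourly values and the (High + Low)/2 figure need not agree:

```python
# Hypothetical hourly readings (deg F) for one station-day, invented purely to
# illustrate the difference between the two kinds of "daily average".
hourly = [55, 54, 53, 52, 52, 51, 53, 57, 61, 65, 68, 70,
          72, 73, 73, 72, 70, 67, 64, 61, 59, 58, 57, 56]

arithmetic_mean = sum(hourly) / len(hourly)       # mean of all 24 readings
high_low_mean = (max(hourly) + min(hourly)) / 2   # the (High + Low)/2 figure actually used

print(f"mean of 24 hourly readings : {arithmetic_mean:.2f} F")
print(f"(Daily High + Daily Low)/2 : {high_low_mean:.2f} F")
```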
On the basis of simple logic then, when we finally arrive at a Global Average Surface Temperature, it still has the original uncertainty attached — as Dr. Schmidt correctly illustrates when he gives Absolute Temperature for 2016 (link far above) as 288.0±0.5K. [Strictly speaking, this is not exactly why he does so — as the GAST is a “mean of means of medians” — a mathematical/statistical abomination of sorts.] As William Briggs would point out “These results are not statements about actual past temperatures, which we already knew, up to measurement error.” (which measurement error or uncertainty is at least +/- 0.5).
The trick comes in when the actual calculated absolute temperature value is converted to an anomaly of means. When one calculates a mean (an arithmetical average — the total of all the values divided by the number of values), one gets a very precise answer. When one takes the average of values that are ranges, such as 71 +/- 0.5, the result is a very precise number, with a high probability that the mean is close to this precise number. So, while the mean is quite precise, the actual past temperatures are still uncertain to +/- 0.5.
Expressing the mean with the customary "+/- 2 Standard Deviations" tells us ONLY what we can expect the mean to be — we can be pretty sure the mean is within that range. The actual temperatures, if we were to honestly express them in degrees as is done in the following graph, are still subject to the uncertainty of measurement: +/- 0.5 degrees.
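The essay's point here can be put in a few lines, using invented whole-degree readings: the computed mean is a precise number and its standard error is small, but since (on the argument above) every recorded value only locates the actual temperature within +/- 0.5, the mean of the actual temperatures could, in the worst case, sit anywhere within +/- 0.5 of that precise number:

```python
import statistics

# Hypothetical whole-degree readings (deg F); each recorded value stands for a
# range of +/- 0.5 around it, per the essay's argument.
readings = [71, 72, 72, 73, 71, 72, 74, 73, 72, 71]

mean = statistics.mean(readings)
sem = statistics.stdev(readings) / len(readings) ** 0.5

print(f"mean of the recorded values : {mean:.3f}")
print(f"standard error of that mean : {sem:.3f}")
# Worst case for the actual temperatures: every reading could sit at the same
# end of its +/- 0.5 range, so the true mean could be anywhere in:
print(f"possible true mean          : {mean - 0.5:.3f} to {mean + 0.5:.3f}")
```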

[ The original graph shown here was included in error — showing the wrong Photoshop layers. Thanks to “BoyfromTottenham” for pointing it out. — kh ]
The illustration was used (without my annotations) by Dr. Schmidt in his essay on anomalies. I have added the requisite I-bars for +/- 0.5 degrees. Note that the results of the various re-analyses themselves have a spread of 0.4 degrees — one could make an argument for using the additive figure of 0.9 degrees as the uncertainty for the Global Mean Temperature based on the uncertainties above (see the two greenish uncertainty bars, one atop the other.)
This illustrates the true uncertainty of Global Mean Surface Temperature — Schmidt’s acknowledged +/- 0.5 and the uncertainty range between reanalysis products.
In the real world sense, the uncertainty presented above should be considered the minimum uncertainty — the original measurement uncertainty plus the uncertainty of reanalysis. There are many other uncertainties that would properly be additive — such as those brought in by infilling of temperature data.
The trick is to present the same data set as anomalies and claim the uncertainty is thus reduced to 0.1 degrees (when admitted at all) — BEST doubles down and claims 0.05 degrees!



Reducing the data set to a statistical product called anomaly of the mean does not inform us of the true uncertainty in the actual metric itself — the Global Average Surface Temperature — any more than looking at a mountain range backwards through a set of binoculars makes the mountains smaller, however much it might trick the eye.
Here’s a sample from the data that makes up the featured image graph at the very beginning of the essay. The columns are: Year — GAST Anomaly — Lowess Smoothed
2010 0.7 0.62
2011 0.57 0.63
2012 0.61 0.67
2013 0.64 0.71
2014 0.73 0.77
2015 0.86 0.83
2016 0.99 0.89
2017 0.9 0.95
The blow-up of the 2000-2017 portion of the graph:

We see global anomalies given to a precision of hundredths of a degree Centigrade. No uncertainty is shown — none is mentioned on the NASA web page displaying the graph (it is actually a little app that allows zooming). This NASA web page, found in NASA's Vital Signs – Global Climate Change section, goes on to say that "This research is broadly consistent with similar constructions prepared by the Climatic Research Unit and the National Oceanic and Atmospheric Administration." So, let's see:
From the CRU:

Here we see the CRU Global Temp (base period 1961-90) — annoyingly, a different base period than NASA, which used 1951-1980. The difference offers us some insight into the huge differences that Base Periods make in the results. The columns, as before, are: Year — Anomaly — Smoothed
2010 0.56 0.512
2011 0.425 0.528
2012 0.47 0.547
2013 0.514 0.569
2014 0.579 0.59
2015 0.763 0.608
2016 0.797 0.62
2017 0.675 0.625
The official CRU anomaly for 2017 is 0.675 °C — precise to thousandths of a degree. They then graph it at 0.68°C. [Lest we think that CRU anomalies are really only precise to "half a tenth", see 2014, which is 0.579 °C.] CRU manages to have the same precision in their smoothed values — 2015 = 0.608.
And, not to discriminate, NOAA offers these values, precise to hundredths of a degree:
2010, 0.70
2011, 0.58
2012, 0.62
2013, 0.67
2014, 0.74
2015, 0.91
2016, 0.95
2017, 0.85
[Another graph won’t help…]
What we notice is that, unlike absolute global surface temperatures such as those quoted by Gavin Schmidt at RealClimate, these anomalies are offered without any uncertainty measure at all. No SDs, no 95% CIs, no error bars, nothing. And they are given precisely to the 100th of a degree C (or K if you prefer).
Let's review, then: The major climate agencies around the world inform us about the state of the climate by offering us graphs of the anomalies of the Global Average Surface Temperature showing a steady, alarmingly sharp rise since about 1980. This alarming rise consists of a global change of about 0.6°C. Only GISS offers any type of uncertainty estimate, and that only in the graph with the lime green 0.1 degree CI bar used above. Let's do a simple example: we will follow the lead of Gavin Schmidt in this August 2017 post and use GAST absolute values in degrees C with his suggested uncertainty of 0.5°C. [In the following, remember that all values have °C after them – I will use just the numerals from now on.]
What is the mean of two GAST values, one for the Northern Hemisphere and one for the Southern Hemisphere? To make a really simple example, we will assign each hemisphere the same value of 20 +/- 0.5 (remembering that these are both °C). So, our calculation: (20 +/- 0.5 plus 20 +/- 0.5) divided by 2 equals… The Mean is an exact 20. (Now, that's precision…)
What about the Range? The range is +/- 0.5. A range 1 wide. So, the Mean with the Range is 20 +/- 0.5.
But what about the uncertainty? Well, the range states the uncertainty — or the certainty, if you prefer: we are certain that the mean is between 19.5 and 20.5.
Let’s see about the probabilities — this is where we slide over to “statistics”.
Here are some of the values for the Northern and Southern Hemispheres, out of the infinite possibilities implied by 20 +/- 0.5: [we note that 20.5 is really 20.49999999999… rounded to 20.5 for illustrative purposes.] When we take equal values, the mean is the same, of course. But we want probabilities — so how many ways can the result be 20.5 or 19.5? Just one way each.
NH SH
20.5 —— 20.5 = 20.5 only one possible combination
20.4 20.4
20.3 20.3
20.2 20.2
20.1 20.1
20.0 20.0
19.9 19.9
19.8 19.8
19.7 19.7
19.6 19.6
19.5 —— 19.5 = 19.5 only one possible combination
But how about 20.4? We could have 20.4-20.4, or 20.5-20.3, or 20.3-20.5 — three possible combinations. 20.3? 5 ways. 20.2? 7 ways. 20.1? 9 ways. 20.0? 11 ways. Now we are over the hump: 19.9? 9 ways. 19.8? 7 ways. 19.7? 5 ways. 19.6? 3 ways. And 19.5? 1 way.
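That counting can be checked by brute force — enumerate every pairing of the eleven grid values and tally how many pairs produce each whole-tenth mean (a sketch; working in tenths of a degree avoids floating-point clutter):

```python
from collections import Counter

# The eleven grid values 19.5, 19.6, ..., 20.5, expressed in tenths of a degree
# (195 stands for 19.5, 205 for 20.5, and so on).
tenths = range(195, 206)

counts = Counter()
for nh in tenths:
    for sh in tenths:
        counts[nh + sh] += 1          # the mean of the pair is (nh + sh) / 2

# Report only the means that land on the whole-tenth grid, as in the text above.
for t in tenths:
    print(f"{t / 10:.1f}: {counts[2 * t]} way(s)")
```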
You will recognize the shape of the distribution:

As we've only used eleven values for each of the temperatures being averaged, we get a little pointed curve. There are two little graphs; the second (below) shows what would happen if we found the mean of two identical numbers, each with an uncertainty range of +/- 0.5, had they been rounded to the nearest half degree instead of the usual whole degree. The result is intuitive — the mean always has the highest probability of being the central value.

Now, that may seem so obvious as to be silly. After all, that's what a mean is — the central value (mathematically). The point is that with our values evenly spread across the range — and, remember, when we see a temperature record given as XX +/- 0.5 we are talking about a range of evenly spread possible values — the mean will always be the central value, whether we are finding the mean of a single temperature or of a thousand temperatures of the same value. The uncertainty range, however, is always the same. Well, of course it is! Yes, it has to be.
Therein lies the trick — when they take the anomaly of the mean, they drop the uncertainty range altogether and concentrate only on the central number, the mean, which is always precise and statistically close to that central number. When any uncertainty is expressed at all, it is expressed as the probability of the mean being close to the central number — and is disassociated from the actual uncertainty range of the original data.
As William Briggs tells us: “These results are not statements about actual past temperatures, which we already knew, up to measurement error.”
We already know the calculated GAST (see the re-analyses above). But we know it only as lying somewhere within its known uncertainty range, which, as stated by Dr. Schmidt, is +/- 0.5 degrees. Calculations of the anomalies of the various means do not tell us about the actual temperature of the past — we already knew that — and we knew how uncertain it was.
It is a TRICK to claim that by altering the annual Global Average Surface Temperatures to anomalies we can UNKNOW the known uncertainty.
WHO IS BEING TRICKED?
As Dick Feynman might say: they are fooling themselves. They already know the GAST as closely as they are able to calculate it using their current methods. They know the uncertainty involved — Dr. Schmidt readily admits it is around 0.5 K. Thus, their use of anomalies (or the means of anomalies…) is simply a way of fooling themselves that somehow, magically, the known uncertainty will simply go away, using the statistical equivalent of "if we squint our eyes like this and tilt our heads to one side…".
Good luck with that.
# # # # #
Author’s Comment Policy:
This essay will displease a certain segment of the readership here but that fact doesn’t make it any less valid. Those who wish to fool themselves into disappearing the known uncertainty of Global Average Surface Temperature will object to the simple arguments used. It is their loss.
I do understand the argument of the statisticians who will insist that the mean really is far more precise than the original data (that is an artifact of long division and must be so). But they allow that fact to give them permission to ignore the real-world uncertainty range of the original data. Don't get me wrong, they are not trying to fool us. They are sure that this is scientifically and statistically correct. They are, however, fooling themselves, because, in effect, all they are really doing is changing the values on the y-axis (from 'absolute GAST in K' to 'absolute GAST in K minus the climatic mean in K') and dropping the uncertainty, with a lot of justification from statistical/probability theory.
I’d like to read your take on this topic. I am happy to answer your questions on my opinions. Be forewarned, I will not argue about it in comments.
# # # # #
And what of the satellite measurements? Are they subject to this same self-foolery?
Thomas Graney ==> That’s a good question. Anytime we see a very precise value for a metric that is known to be highly uncertain — you know someone is fooling themselves.
Like Wayne’s World’s “No Stairway” sign in the guitar shop, some statistics departments need a “No Texas Sharpshooting” sign prominently displayed. 🙂
At least in the original story/joke, the Texas Sharpshooter knew what he was doing.
Satellite measures absolutely have to use anomalies, and they do. What do you think the average temperature of the troposphere would be? What would it mean? It would depend entirely on how you weight the various altitudes.
I liked it when GISS or NOAA was presenting annual estimates of global temperature to 0.01 degrees F… then one year they adjusted their methodology so that they shifted 3-4 degrees warmer. So much for that pretend 0.01 degree accuracy.
I had a great laugh at that Trenberth flux graph.
Firstly, it was 2D and non-rotating.
But the really funny thing was that their fluxes were often +/- two-digit numbers…
Then they claimed to get an imbalance of 0.6, or whatever it was.
So FUNNY !!
(someone might have access to that little piece of mathematical garbage)
If they can seriously produce something AS BAD as that, they should be immediately sacked from any position related to anything to do with data or maths.
Excellent article, Kip!
Which means that 0.5°K. is the absolute minimum uncertainty.
Several times, here on WUWT, engineers and others experienced in defining real world uncertainty have listed all uncertainties that should be included in Schmidt’s or the IPCC’s anomalies.
Perhaps, at some point, WUWT could capture those real world recommendations into a reference page?
One certain missing uncertainty is ‘adjustments’.
That daily temperatures undergo constant adjustments is an admission that the uncertainty is much higher.
ATheoK ==> I should have said, +/- 0.5K of course. And that is the “absolute minimum uncertainty” of the temperature record — that’s how we have recorded it — as uncertainty ranges of +/- 0.5….
We have to admit in here somewhere that the +/- 0.5 of the original record is +/- 0.5 Fahrenheit — other uncertainties make it add up to Schmidt’s +/-0.5K.
Kip,
My question is: so what if you are right? Even if the uncertainty is as large as you claim it to be, that doesn't affect any of the underlying facts, just our ability to measure them. Taking the central point of the temperature and looking at how it varies with time will still show a considerable rise over the past century, even if we are not 100% sure of the precise value. Furthermore, if the uncertainty is as large as you are claiming, then I can claim that climate models fit the measured temperatures to within the uncertainty and so they can be relied on to predict the future.
Furthermore, a rising global temperature is in agreement with all other measurements, such as global sea level rise, energy imbalance at the top of the atmosphere, earlier dates for spring, earlier harvesting of grapes, steady decline of arctic sea ice over decadal time scales, etc. Taken all together, the evidence for global warming over the past century is overwhelming.
Percy ==> So what? Not an unfair question. The uncertainty Gavin Schmidt admits means, among other things, that the whole "warmest year" contest is nonsense — we can't tell one year from the one before. Given the real uncertainties, most years of the last decade are indistinguishable from one another, and that also means it is uncertain whether it is getting warmer or staying the same on a decadal level.
We are pretty sure that it is generally a bit warmer now than it was at the end of the Little Ice Age. And, a little less certain, but still pretty sure, that the first 17 years of the 21st century are a little warmer than the flat-or-cooling 40 years from 1940-1980.
We are not certain at any quantitative level at a precision of hundredths or tenths or even halves of a degree K. Given that the real uncertainty according to Dr. Schmidt is +/- 0.5K, we are not yet sure that the 1980s temperature is really different from last year's temperature — it is within the uncertainty.
Thus Policy Recommendations based on the panicked claims of runaway global warming are not justified — we are not yet even certain about the last 40 years…
Kip,
I would disagree that we are not yet certain that the 1980s were different from last year's temperature. This is the difference between systematic errors and random errors. The systematic error in the global temperature might be +/- 0.5K, but as long as that is constant then we can certainly tell whether one year was hotter than another. Most of the errors you are discussing are systematic errors and thus cancel out when you take the difference to create an anomaly.
I would also disagree that the policy recommendations are not justified. Imagine if you went to the doctor and they told you that you had a tumour but weren't certain whether or not it was cancerous. And furthermore the doctor said that by the time it was big enough for them to be certain, it would have spread and your chances of dying would be 100%. Would you wait to be certain or take the doctor's recommendation and have it removed now?
The policy recommendations are based on the experts' best predictions about the future. Whether or not society should act on them depends on a wide range of things, such as a cost/benefit analysis, how much we care about people in other countries, people living in the future who haven't been born yet, etc.
Percy, it does make a difference, a huge difference. We don’t know what the global temperature was in 1850 and certainly not to one tenth of one degree Celsius. Policy decisions based on pseudo-science are bad decisions.
The whole “what about the children” argument is laughable. What about the 21 trillion dollar debt that our children and grandchildren will inherit? Do you care about that? It’s a far more tangible problem, but I don’t see anyone talking about that.
Reg,
Again you are forgetting the difference between systematic errors and random errors.
If I weigh myself on my scales at home I get an answer that is about 1 kg lighter than if I use the scales at the gym. Hence I might claim that I know my weight with an accuracy of +/- 0.5 kg. However, if I weigh myself repeatedly on the same scales (over a short period of time) I get the same answer to within the precision of the scales, which is 0.05 kg. Now the relevant question is: what is the smallest amount of weight I can gain and still be certain that I have gained weight — is it 0.5 kg or 0.05 kg? The answer is 0.05 kg, because I am measuring an anomaly, and the accuracy of that is 10 times the accuracy with which I know my actual weight.
The whole "what about the children" argument is the whole point. The question is how much do we care about the children, especially children who will be born in other countries and who will have a lower standard of living if the predictions about global warming are true. Which is a political and moral question, not a scientific one.
Percy, does your home scale have a digital readout? Does it display tenths of a kilogram? I recommend you test it by weighing yourself twice to be sure of consistency, then weigh yourself again holding 2 tenths of a kilogram (measuring cup + water). I predict your indicated weight will usually not change when you run this test.
I tested my home scale and determined the scale was rounding my weight to the nearest .5 kilogram, then converting that number to pounds – displayed to 1 decimal space. (I live in the US, so set the scale to indicate pounds.)
Even though my scale indicates tenths of a pound, it takes a gain of about a pound, on average, to make the display change. That change is always by 1.1 pound. I cannot say I know my weight within .05 pounds (per the precision of the display), I can only say I know my weight within .55 pounds – the precision of the rounding.
Yes, my scale must be a cheap model, but it shares a problem with the weather service: rounding the reading. The precision/accuracy of the display doesn't matter as much as the rounding of the measurement. It doesn't matter if an instrument measures within a tenth of a degree if the measurement gets rounded to the nearest degree when recorded.
Once rounded, the precision of the original measurement is degraded to that of the rounding method, never to be rehabilitated.
SR
Steve, You hit the nail on the head.
Steve,
You are missing the point about systematic errors and random errors. In the example above there is a systematic error between the two sets of scales of about 1 kg, so I do only know my absolute weight with an error of +/- 0.5 kg. However, each set of scales is consistent and can reproducibly tell me my weight to an error of 0.05 kg; hence I can measure any weight gain to an accuracy of 0.05 kg, and that value is the same with both sets of scales. So any systematic errors cancel out when you calculate the anomaly (i.e. weight gain), and thus I know my weight gain with an accuracy that is 10 times better than I know my actual weight.
The same is true with any measurement where there are systematic errors. Taking the difference between two readings will remove the systematic errors, resulting in a much more accurate measurement of differences.
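Percy's scale analogy can be sketched in code (the weights, offsets and noise figure below are invented for illustration): a constant offset shifts every reading, but it cancels when the gain is computed from readings on the same scale. Whether the analogy carries over to temperature records recorded as +/- 0.5 ranges is precisely what the rest of the thread disputes.

```python
import random

random.seed(0)

def read_scale(true_weight, offset, noise=0.05):
    """One reading: the true weight, a constant (systematic) offset, small random noise."""
    return true_weight + offset + random.uniform(-noise, noise)

true_before, true_after = 80.0, 80.3      # a real gain of 0.3 kg (hypothetical)

for name, offset in [("home scale", 1.0), ("gym scale", 0.0)]:
    before = read_scale(true_before, offset)
    after = read_scale(true_after, offset)
    print(f"{name}: {before:.2f} kg -> {after:.2f} kg, apparent gain {after - before:.2f} kg")

# The two scales disagree by about 1 kg on the absolute weight, but both report
# roughly the same gain: the constant offset cancels in the difference.
```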
Percy ==> And where, in the GAST world, do you think we have systematic errors that are fixed between "scales"? And what, pray tell, do we do with the ranges of the original individual measurements? (I suggest that "ignore them" is the wrong answer.)
Percy ==> One last time — the majority of the +/- 0.5K is NOT ERROR — it is uncertainty that stems from the original temperature records being recorded as 1-degree-Fahrenheit ranges rather than discrete measurements. You may choose to ignore the uncertainty, but you do so at the risk of fooling yourself about important matters that bear on public policy.
Because the +/- 0.5K stems from real uncertainty and not random error or systematic error, it does not resolve or reduce by long division, and we have the same "no idea" of the real values within the range. We could possibly ignore uncertainty on a range of 1K if we were looking at a change of 10-20K in the metric in question. Unfortunately, we are confronted with real uncertainty on the same scale as the change in the metric — GAST — which leaves us uncertain.
No amount of statistical samba will remove the uncertainty.
We are somewhat sure that it might be a little warmer now than 1940-1980 — in another ten years we ought to be able to be more certain.
Kip,
I have an experiment for you.
Step 1: Choose your favourite probability distribution.
Step 2: Using that distribution, choose 1000 random numbers in the range 0 to 10.
Step 3: Calculate the average of those 1000 numbers; call this number T1.
Step 4: Repeat steps 2 and 3 and call the resulting number T2.
Step 5: Calculate the temperature anomaly T2-T1. You will get an answer somewhere between -0.2 and 0.2 (with an exceedingly high probability).
Now imagine that this is the average of a set of 1000 temperature measurements with an accuracy of +/- 5 degrees. Despite the fact that each individual temperature reading has an error of 5 degrees, the difference between the two averages will be less than 0.2, which is much less than the individual errors. This is the advantage of calculating the anomaly.
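A minimal sketch of the experiment Percy describes, using a uniform distribution (the choice of distribution is left open above); whether the small spread of T2 - T1 answers the essay's point about recorded ranges is exactly what the exchange below argues about:

```python
import random

random.seed(42)

def mean_of_sample(n=1000, low=0.0, high=10.0):
    """Average of n random draws from [low, high] (uniform here; the choice is open)."""
    return sum(random.uniform(low, high) for _ in range(n)) / n

T1 = mean_of_sample()
T2 = mean_of_sample()
print(f"T1 = {T1:.3f}, T2 = {T2:.3f}, anomaly T2 - T1 = {T2 - T1:.3f}")
```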
Percy ==> I don't want probabilities — I want to know what the global average temperature was for 1990. Not what it probably might have been or how probable such and such a guess might be.
I want to look at the data set and arrive at the mean of the global temperatures (quite complicated, area weighted and all that) — that GAST will come with its own uncertainty, in a notation such as that suggested by Gavin Schmidt: 288K +/- 0.5K. That complicatedly arrived at mathematical/arithmetical mean is the data, as close as we are going to get it. Its uncertainty is not reduced by switching to probabilities — we can't UNKNOW the KNOWN UNCERTAINTY — we can only pretend that we can…
Kip,
Again you are missing the point of the experiment. While it is not possible to say with a high degree of accuracy what the numbers T1 and T2 will be (since I do not know what probability distribution you chose), I do know that T1-T2 will be very small and will lie within +/- 0.2. Thus I can calculate temperature anomalies with a much higher degree of accuracy than I can calculate absolute temperatures.
“We use the period 1981-2010 to determine the average which is World Meteorological Organisation standard. It is a moving 30 year average so in 2021 we will begin using 1991-2020 as the new “average” if that makes sense.”
The above quote is from NIWA NZ. It is a reply I received to my query on how they establish baselines (the “average”) .
Neither the media nor NIWA describes the baseline when making statements such as "above average" in relation to weather events. But they have a problem looming.
In three years' time they shift to the 1991-2020 baseline, which (depending on what data you use) is around 0.2 C higher than 1981-2010.
What with the flattening of the trend over the last 19 years, results (e.g. for the monthly mean) may well be below average. Oh dear!
What will they do then?
Regards
M
They’re not fooling themselves, they know exactly what they are doing.
Kip,
For more than 6 months now, I have been asking the Australian BOM to provide a figure expressing the total error expected when you want to compare 2 past daily temperatures from Acorn stations in the historic record. As in T deg C +/- X deg C, what is the overall best value of X?
They say they have a report in progress for later this year that will have the answer. They have declined some invitations to quote one from every-day working experience.
It might be interesting for folk in other countries to ask, or, if the answer already exists, to dig it up for their countries as an attributable statement.
For years now I have been pointing to the existence of formalism in error estimation, as by the Paris-based BIPM (Bureau of Weights and Measures). There are others. Signatories to some of these that are Conventions are expected to report findings derived in ways laid out in the Convention Agreement. Sometimes there is domestic legislation that requires country bodies of a government nature to conform to certain procedures relating to uncertainty, for which audits are available.
The whole business of error has been badly downplayed in climate work, and probably in other fields as well, where I have not looked so closely. There are very good reasons why we have procedures to measure and to estimate error, but lip service or less seems the name of the game these days. In an ideal world, proper error estimates are good for forming initial appraisals of the standard of scientific work, just as their misuse is a sign of poor science.
Thank you, Kip, for again shining a light that is needed. Geoff.
Geoff ==> Thank you. It is a difficult thing for those (over-)trained in statistical theory and practice to wrap their heads around — they truly believe that uncertainty can be reduced by the methods they have learned — that even known uncertainty can be reduced to wee tiny uncertainty for the same metric by a little subtraction and division…
Kip, nice write up. Thanks.
I challenge Schmidt’s assumption of +/- 0.5C uncertainty. I think it is much larger and probably not symmetrical. I suspect a larger error range to the minus side due to warming bias mentality that affects their processing algorithms.
There are several reasons for my suspicion that the error range is greater. First, the instruments for much of the record were not calibrated. This might add another degree or 2. That error for a particular instrument would be constant up to the point the instrument broke or was replaced, and then a step discontinuity would be introduced with a different error. There is reading error (like parallax, or not reading the meniscus, assuming a glass thermometer). If a 6'6″ person is reading the thermometer at the post office and then the reader is replaced with a 5'1″ person, unless they get some training you get a different reading. What happens if the readers alternate days? Of course, there is thermal corruption of various sorts (bad siting, UHI, internal transmitters, etc.).
Another one that bothers me is the violation of the Nyquist Sampling Theorem. To accurately recreate the original signal without aliasing, then the signal must be sampled at a frequency greater than twice the highest frequency component of the signal. The daily temperature signal is not a purely sinusoidal 1Hz signal. There are lots of higher frequency components. Therefore, it must be sampled much more than twice a day – perhaps every 5 minutes or so.
The lower the thermal mass of the measuring device (thermocouple for example) the more quickly it responds to changes. And therefore, it needs to be sampled more often. Instruments with different thermal masses will measure the same event with different results. I’m pretty sure there is no standardization in this regard.
The goal for modern instrumentation readings should be first to start off with calibrated instruments. Each instrument should be required to adhere to a certain thermal mass response model – perhaps modeled after glass/mercury thermometers if the desire is to best match the historical instruments. Then, using the proper sample rate, the “area under the curve” should be determined to come up with an accurate “average daily temperature”. In some quick experiments I have run, I can get 0.5C – 1.0C difference using this method vs. the simple (max+min)/2 method. (And rarely is the actual max or min captured for the day.)
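A toy version of that comparison, using a synthetic day (a smooth daily cycle plus a brief afternoon warm spell, both invented) sampled every 5 minutes:

```python
import math

# One hypothetical day sampled every 5 minutes: a smooth daily cycle plus a
# short warm spell, chosen only to show that the two "means" need not agree.
samples = []
for i in range(288):                    # 288 five-minute steps in 24 hours
    hours = i / 12
    temp = 15 + 8 * math.sin(2 * math.pi * (hours - 9) / 24)   # smooth cycle
    if 13 <= hours < 15:                # brief afternoon warm spell
        temp += 4
    samples.append(temp)

integrated_mean = sum(samples) / len(samples)      # "area under the curve" average
max_min_mean = (max(samples) + min(samples)) / 2   # the (max + min)/2 shortcut

print(f"mean of all 5-minute samples : {integrated_mean:.2f} C")
print(f"(max + min) / 2              : {max_min_mean:.2f} C")
```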
This method would create a discontinuity with the older record, I know. But for the past 40 years we could have easily performed measurements along 2 tracks: 1 to align with past methods and 1 to align with some real engineering methods. Note I didn’t say scientific methods because as I see it the world of “science” is often about publishing papers and not much is ever grounded in reality. Engineers get numbers right or people die. (“I know the plane crashed, but the publisher loved my paper!”)
What do you think about the uncertainty range? If it is much larger, then what does that do to the Alarmist Narrative?
Willard ==> The true uncertainty of GAST (its absolute value or its “anomaly from the climatic mean”) is certainly greater than +/-0.5K — how much? Don’t know.
The failure is believing that statistical definitions can change physical facts — something an engineer understands is incorrect and dangerous.
‘Another one that bothers me is the violation of the Nyquist Sampling Theorem. To accurately recreate the original signal without aliasing, then the signal must be sampled at a frequency greater than twice the highest frequency component of the signal. The daily temperature signal is not a purely sinusoidal 1Hz signal. There are lots of higher frequency components. Therefore, it must be sampled much more than twice a day – perhaps every 5 minutes or so.’
I reckon that matters not, at least for the purpose of 'global averages'. They don't want to actually reconstruct the signal — in this sense, the variation of the daily temperature as a function of time. All they do is take a simple median between the highest and lowest recorded temperatures per day. Whether that's good enough — that's a different question. On the other hand, quantisation inevitably introduces errors as we approximate and assign values to the discrete levels, so yes, quantisation is always lossy — some information from the original signal (in our case the temperature measurement) is irreversibly lost.
Paramenter, I think you miss the point. You can't calculate the average accurately unless you have all of the information to do so. Undersample and you alias. Your average is wrong. I just posted to someone else so won't repeat it all here, but this is the summary. If you integrate/average the 5-minute NOAA reference data for any location over a 24 hour period, that average may differ from the NOAA reference data's stated daily mean by 0.5C – 1.0C.
Quantization error is an additional problem. As is reading error, thermal corruption, data infill, homogenization, etc.
The instrument record is badly sampled temporally. The instrument record is badly sampled spatially. It is not adequate for scientific endeavor.
Hey William, I reckon I've got the point quite nicely. I can't see just yet how this point applies to the problem we discuss. But I'm a slow thinker. Will get there, eventually.
‘Under sample and you alias. Your average is wrong.’
For older temperature records you don't even sample. All you've got is a midpoint between the minimal and maximal amplitude. You don't even know how many times, or when, the signal crossed this value. When we capture the daily median we eventually build another signal consisting of several datapoints in the form of daily medians. This signal obviously has very little in common with the underlying temperature signal, due to massive undersampling and quite substantial quantisation errors. Therefore the uncertainties must be huge, and the daily, monthly and yearly average series are utterly unreliable, at least for the orders of magnitude we're concerned with. Is that what you are trying to say?
‘If you integrate/average the 5-minute NOAA reference data for any location over a 24 hour period, that average may differ from the NOAA reference data stated daily mean by 0.5C – 1.0C.’
I know it is easy to check, but would you mind providing some examples?
Paramenter, (and Kip)
Thanks for the opportunity to expand on this. This is a vitally important issue, because it undercuts the entire measurement record used in climate alarmism.
I doubt this will be easy to digest if you have not studied it but here is a basic link:
https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem
I recommend you don’t give up if this is confusing. It takes some formal study for it to make sense. It is not esoteric – it is not some abstract thing that is wrong to apply here. This is what underlies so much of the technology you enjoy today. Digital music, digital movies and digital instrumentation would not be possible without the application of Nyquist. This is not math that can be ignored – but somehow it is ignored by climate science.
Here are some fundamentals. This has to do with basic signal analysis. It’s an “engineering thing” – it is a “science thing” and a “math thing”. Engineers use it because engineers make things that work in the real world. Scientists and mathematicians may choose to ignore it because what they do never comes back to reality. Mathematicians can take numbers and do operations on them. They never stop to realize that the numbers they take are COMPLETELY INVALID from the start. I’ll explain.
First: Any signal that is continuous and varies with time has frequency components. The term “signal” may not be a familiar one to many, but temperature measured at a point in space is continuous – it doesn’t stop and start – and it varies with time. It is a signal, even if that vocabulary is not widely understood.
Second: Any measurement of that signal is a “sampling”. This too may be an unfamiliar vocabulary term. Whether the sampling is done by a human or a piece of electronic equipment is not important. Whether it was done today or in 1850 it was sampling.
Third: The sampling must not violate Nyquist, or the data is corrupted with aliasing. The sampling must be performed at a frequency (rate) that is at least twice the highest frequency component in the signal. If the temperature, measured at a point in space, varies exactly as a 1Hz pure sine wave then you can sample 2 or more times a day and not violate Nyquist. However, the temperature does not vary as a pure 1Hz sine wave, so the twice per day measurement/sampling of the signal by definition violates Nyquist.
Summary: The twice a day measurements:
1) Violate Nyquist
2) Cannot be averaged to get an accurate mean
3) Cannot be used to understand the thermal energy present with any accuracy
4) Are not valid for scientific use.
When people started recording temperature in the mid 1800’s they didn’t understand this. Measuring the high and low of the day served them as it related to planting crops and living life in the mid 19th century. It was not intended to be used for science. Now, we hear “that is the data we have – so we have to use it”. You know, we don’t have to accept that position! The correct response to that is “no we don’t – that data is corrupted and not valid for scientific or mathematical work!” The Nyquist requirements have been known since 1928. There is no good reason why we have continued to use aliased (NYQUIST VIOLATED) data since then. Even our satellite data does a 2x/day measurement. How can so-called scientists allow this?
Now for the example you request. Go to the NOAA climate reference network data. This is supposed to be some of the best data available.
Start with this data: DAILY (2017/Cordova Alaska/August 17, 2017):
https://www1.ncdc.noaa.gov/pub/data/uscrn/products/daily01/2017/CRND0103-2017-AK_Cordova_14_ESE.txt
You can import this into Excel.
In columns 39-45 you see “DAILY MAX TEMP”: 13.7C
In columns 47-53 you see “DAILY MIN TEMP”: 6.1C
In columns 55-61 you see “DAILY MEAN TEMP”: 9.9C
9.9C is the stated DAILY MEAN temperature. You can calculate it yourself (13.7+6.1)/2 = 9.9C.
Now, compare that to the NOAA climate reference data for the same location (Cordova Alaska), on the same date (8/17/2017) – but use the “SUB-HOURLY” data (taken every 5 minutes):
https://www1.ncdc.noaa.gov/pub/data/uscrn/products/subhourly01/2017/CRNS0101-05-2017-AK_Cordova_14_ESE.txt
Import this data into an Excel file. In columns 58-64 you see “AIR TEMP”. You can integrate (average) this data (sampled every 5 minutes) over the 24-hour period. You will see the Max and min temps (13.7 and 6.1) used in the DAILY data above.
HOWEVER – if you average the data using the higher sample rate you get an average of 9.524C which can be rounded to 9.5C.
Note, “quantization error” (due to the limits of the ADC) is already included in these measurements. The quantization error is small and not worth discussing at this point. But the error introduced due to ALIASING – VIOLATING NYQUIST – equates to 0.4C.
IMPORTANT: Without the 5-minute sampling, which is used for the daily data, it is unlikely that the actual MIN and MAX will be captured. This will introduce even more error.
Run more examples and you will find the error can be even greater.
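For anyone who wants to repeat that comparison, here is a rough sketch. It assumes the two files linked above have been saved locally, that the character columns quoted in the comment are correct, and that the date sits in the second whitespace-separated field of the daily rows and the fourth field of the sub-hourly rows — assumptions to verify against the station readme, not a definitive parser:

```python
# Sketch only: column positions and field layout are assumptions (see above).
DAILY_FILE = "CRND0103-2017-AK_Cordova_14_ESE.txt"
SUBHOURLY_FILE = "CRNS0101-05-2017-AK_Cordova_14_ESE.txt"
TARGET_DATE = "20170817"

# Daily file: pull max/min/stated mean from the row for the target date,
# using the character columns quoted above (39-45, 47-53, 55-61, 1-indexed).
with open(DAILY_FILE) as f:
    for line in f:
        if line.split()[1] == TARGET_DATE:      # assumed: date is the 2nd field
            t_max = float(line[38:45])
            t_min = float(line[46:53])
            t_mean_stated = float(line[54:61])
            break

# Sub-hourly file: average every 5-minute air temperature (columns 58-64) for
# that local date, skipping missing-value flags (large negative numbers).
temps = []
with open(SUBHOURLY_FILE) as f:
    for line in f:
        if line.split()[3] == TARGET_DATE:      # assumed: local date is the 4th field
            t = float(line[57:64])
            if t > -999:
                temps.append(t)

print(f"(Tmax + Tmin)/2      : {(t_max + t_min) / 2:.2f} C (stated mean {t_mean_stated:.2f} C)")
print(f"mean of 5-min values : {sum(temps) / len(temps):.2f} C")
```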
The twice a day sampling of temperature is mathematically invalid. All of the data, whether taken in 1850 by a farmer or in 2018 by a satellite is invalid/wrong/flawed.
Sure, you can take the measurement and do all kinds of math on those numbers, but you are dead from the start. The fact that this is not well known and beat like a drum is simply confounding.
I hope this helps and inspires more discussion.
William==> I am out of the office for the rest of the day — I will try to get you a complete reply in the morning.
Mr. Ward says: "…so the twice per day measurement/sampling of the signal by definition violates Nyquist."
…
and Mr. Ward says: “The twice a day sampling of temperature is mathematically invalid.”
…
But what Mr. Ward does not understand is the Tmin, and Tmax are not two “samples.” The instrument that collects the data samples continuously throughout the day and records two distinct values, namely Tmin and Tmax for a given 24 hour period.
Mr. Dirkse,
Tmax and Tmin are 2 samples. There is no need to place “” quotes around the word samples. Most older instruments did not accurately capture the actual max or min temperature for the day, if for example the instruments required a person to be reading it at the time the max or min occurred. But this is not relevant to the core point. Even if you assume you arrive at 2 accurate samples of the highest and lowest recorded temperature for a 24-hour period, 2 samples are not enough to not violate Nyquist.
No one will stop you from taking those 2 numbers, adding them together and dividing the result by 2. But this will not yield the same result that sampling according to Nyquist will yield. Your (Max+Min)/2 result may be more accurate than a scenario where the highest or lowest numbers are not accurately obtained, but your number will be highly aliased and full of error — wrong. But if it makes you feel any better, you will be in the company of countless climate scientists who likewise are content with using mathematically flawed data.
If you care about accurate data, being scientifically and mathematically correct, then digest this point about Nyquist. It is a huge gaping hole in climate science.
Tmin and Tmax are not two "samples". You are inappropriately applying Nyquist in a place it should not be applied. The "climate" at any given location can be wholly specified by the daily Tmin and the daily Tmax as an anomaly with respect to the 30-year average at said location. When you collect 30 years' worth of daily data, the Tmax-Tmin number exactly represents the sinusoidal waveform of daily averages that are exhibited on a yearly (seasonal) cycle.
….
Now if you wish to disprove the appropriateness of the existing methodology, then you would need to collect 30 years' worth of data at a given location with a higher "sampling" rate, then compare the derived averages with the Tmin/Tmax methodology. Good idea for a study/paper that you can get published; shouldn't be too hard for someone like you. I'm sure that someone like you, with a wealth of experience in signal processing, can overturn decades of how meteorology has been done.
Another reason, Mr. Ward, that the application of Nyquist in this situation is invalid is that both Tmin and Tmax are auto-correlated with the prior day. To illustrate this, consider a monitoring station located 50 miles north of New York City on January 20th. If Jan 20 Tmin is 15 degrees F, and Jan 20 Tmax is 42 degrees F, then Jan 21 most likely will not have a Tmin of 52 degrees F and a Tmax of 78 degrees F. This phenomenon is known as "Winter." Nyquist doesn't work in this scenario.
Mr. Dirkse,
Of course they are samples. You have 2 discrete numbers measured from a continuous time varying signal. By definition they are samples.
Of course Nyquist applies. The signal is continuous, time varying, band-limited and is being sampled.
I don’t need to collect data, write a paper or prove anything. Nyquist did that (and Shannon).
But what people like you and I can do, and others on this forum can do, is first, understand that Nyquist exists; second, see that it applies; third, see that it results in big measurement error; and fourth, start to cry foul and refuse to accept a body of alarmism and manipulation built upon a foundation of violating mathematical principles.
I’m just one person. I present this here to help spread the knowledge with the hope our voices against manipulation can be stronger.
The "signal" you are applying Nyquist to is auto-correlated. How does Nyquist handle an auto-correlated "signal"? What modifications to your application of the theory must you make, given that the input signal is technically not random because of the auto-correlation?
Ward says: “they are samples. You have 2 discrete numbers measured from a continuous time varying signal.”
…
Not exactly. The values have been selected from a set of samples.
…
Suppose you have a station that collects hourly data. You have 24 samples. You then select the highest value and the lowest value from the 24, then discard the other 22 samples.
…
You technically do not have 2 samples, because you are ignoring the 22 you’ve discarded.
…
You can’t apply Nyquist in this situation.
Mr. Dirkse,
Regarding your comments about auto-correlation all I can say is that you are speaking mumbo-jumbo. I don’t mean to be disrespectful. You are resisting a new concept, apparently. When you are sampling a signal, it doesn’t matter what you had for breakfast or what is going on in New York. If you want to make an analog signal digital, then Nyquist is your man. The max and min would be of value if the day was a constant max temp and then at a certain hour the temp snapped to the min temp – AND if the time of high and low were exactly equal then a simple average would yield the beloved mean temperature. The fact is that temperature is varying up and down all day and night. The goal is to capture all of the energy in those “nooks and crannies” of the changes. If you follow Nyquist, you get the right answer – the real Mean temp. If you violate Nyquist, you get a number – but not an accurate one. Why are you fighting this? I’m glad to offer the knowledge I have on this if it is helpful, but if you want to throw mumbo-jumbo because the concepts I bring up rattle your cage then I can’t help you. Further mumbo-jumbo will not be responded to. Peace to you.
Dear Mr. Ward, please review the following “mumbo-jumbo” before you post your next comment. You obviously do not understand that time series data and a “signal” in your Shannon-world are not orthogonal.
..
..
https://en.wikipedia.org/wiki/Autocorrelation
..
What I like about the wiki post is that it talks in something you’ll “understand,” namely signals.
You see, Mr. Ward, I'm not ignorant of your Shannon/Nyquist worldview. Suppose you have a communications channel that is one byte wide. If you receive a byte of value X at time t, then in your world, the value of the byte you get at time t+1 has a probability of 1/256. In an auto-correlated time series of climate data, this is not true. The value at time t+1 is dependent on, and correlated to, the value at time t. In the real world what this means is that on January 20th, if the Tmax is 20 degrees F, the probability of the Tmax on January 21st being 25 degrees F and the probability of Tmax on January 21st being 83 degrees F are NOT THE SAME. This is why you can't apply Shannon's theory. In your world the probabilities would be identical, but auto-correlation (WINTER) says this is not the case. In your "Shannon"/signal-processing world, the temp on January 21st could be 73 degrees F with the same probability as it being 22 degrees F.
Mr. Dirkse,
I assume you know a great deal about doing digital signal processing in the field of digital communications, so I mean no disrespect.
I’m really not sure how to address what you say. We seem to be on a different wavelength. Nyquist is not a worldview. Probability and auto-correlation have nothing to do with sampling. Digital communication protocols are already in the digital domain and don’t involve Nyquist. You can’t just arbitrarily throw away samples or cherry pick 2 (Max and Min).
Maybe we should try to engage again later on another subject. Peace to you.
Kip,
I added some good data in replies to Phil and Paramenter. In the latest I provide NOAA supplied data – from their REFERENCE network, showing the extent of the error in the (Tmax+Tmin)/2. NOAA already provides a mean that appears to comply with Nyquist, but they still use the old method. Take a look at the amount of error. It should turn your head. Help me figure out how to upload a chart or Excel file and I’ll show a graph of the error over a 140 day record from NOAA.
You are probably drinking from the fire hose on this post – so reply when you get a chance to first take in the info.
William ==> see my other reply — email the whole kit…
Well, the use of a Max-Min thermometer means that the temperature is monitored continuously with a time constant of 20 sec, so I would expect that Nyquist is adequately covered.
Phil,
You said a time constant of 20 seconds. That might mean it samples every 20 seconds, but time constant can mean other things. We really need to know the frequency content of the temperature signal. Some days might have much more high frequency content than others and therefore would require a higher sample rate. I’ll make a generalization that if we sample every 5 minutes then the result is probably good – but that is a guess. So, a device that samples every 20 seconds would likely satisfy Nyquist – HOWEVER:
While this 20 second sample rate may give you an accurate Max and Min Temperature, just getting 2 samples (Max and Min) from this system does NOT satisfy Nyquist. You must use all of the samples that the Nyquist rate would deliver if you are to do any processing or calculations. Once you have all of the data required by Nyquist then you are free to do any mathematical operation on that information and do so without the added error of aliasing. You may not want to reconstruct the original analog signal, but if Nyquist is followed you could. With just a Max and Min you cannot – even if the Max and Min are accurate. If you want an accurate Mean Temperature from that day, then you can use the samples provided by following Nyquist and get an accurate Mean Temperature. Averaging Max and Min alone does not give you an accurate Mean Temperature.
This would be a lot easier if I could find a good picture to illustrate and help increase the intuitive factor. Let me try to create an example that illustrates.
If you have an electric range with settings from 0-10, set it to 10 for 30 seconds and 0 for 30 seconds. The average or mean for that 1-minute period is 5. The amount of energy the burner put out for that 1 minute could be exactly duplicated if you ran the burner at 5 for 1 minute. If you have the correct Mean, then you understand the amount of energy delivered over the time of interest. Now what happens if you operate the electric burner with a more complex pattern? What if it changes every second and goes 0, 10, 7, 3, 5, 2, 9, 3, 2, 1, 0, 8, 6, etc.? Can you just take the Max of 10 and the Min of 0, average them to 5, and know you have captured the Mean correctly? Let me make it easier. It operates for a minute, changes every second, starts at 0 for 1 second, goes to 10 for 1 second and then goes to 2 and oscillates between 2 and 3 for the remainder of the minute. You intuitively know that the average/mean is somewhere between 2 and 3 (probably close to 2.5). If you just use the Max and Min, you get 5. That is not correct. In this example the highest frequency component is 1Hz — the signal changes once every second. Sampling must be done fast enough to capture all of the transitions, and you have to use at least the minimum required samples for math operations on that data to be correct.
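The last burner pattern, worked numerically:

```python
# One minute of burner settings, one value per second: 0 for a second, 10 for a
# second, then oscillating between 2 and 3 for the remaining 58 seconds.
settings = [0, 10] + [2, 3] * 29        # 60 one-second values

true_average = sum(settings) / len(settings)
max_min_average = (max(settings) + min(settings)) / 2

print(f"true average setting : {true_average:.2f}")    # about 2.6
print(f"(max + min) / 2      : {max_min_average:.1f}")  # 5.0
```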
One more thing about the word accuracy: There are a lot of things that can reduce accuracy. Quantization was discussed in other posts. Aliasing can reduce accuracy. But even with a high-resolution instrument sampled at the right rate, we still need to know that all components in the signal chain are not introducing error. Assuming a digital system, we need a calibrated instrument that takes into consideration the thermocouple, thermal mass of the front end, amplifier chain, power supply and converters. Noise or error can come from any of these things. Noise can also come from external factors related to siting, Stevenson screen, external thermal corruption, etc. It can also come from making up and manipulating data after the fact. All of these sources of noise reduce the accuracy of the measured signal.
(Max+Min)/2 only works if the temperature signal is a pure sine wave that changes once per day (which is never).
[Note: I incorrectly referred to the 1 cycle per day as 1Hz in an earlier post, and that is not correct. 1 cycle per day is actually 0.000012 cycles/second, or 0.000012Hz. I just mention it to be correct — not that it matters to the points made.]
Phil,
One more example to try to bring in the intuition.
Imagine (or better yet, draw on scrap paper) a rectangle with a base of 10 and a height of 2. Now draw an isosceles triangle with a base of 10 and a height of 2. They both have the same height (Max+Min) and average height (Max+Min)/2, but which shape encloses more area? The rectangle. We can't know this from (Max+Min)/2, no matter how carefully we calculate it.
When the temperature changes throughout the day, the signal is not a nice smooth curve. It will go up and down a degree or so every 5 or 10 minutes if you have clouds moving overhead and breezes blowing. So the curve becomes irregularly shaped. These irregularities in the shape are caused mathematically by higher frequency components of the signal. Nyquist helps you collect up all of the thermal energy in all of the irregular spaces of the signal. We know intuitively that if it is 100F for 1 minute in a day it isn't as "hot" as another day where the temperature is 100F for 180 minutes. Nyquist allows us to know not only what temperature but effectively how long at that temperature.
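A Python sketch (assuming numpy is available) of that "100F for 1 minute versus 180 minutes" point, using a purely hypothetical day that sits near 75F, dips to 60F overnight, and touches 100F for only a few minutes:

import numpy as np

temp = np.full(1440, 75.0)   # one sample per minute for 24 hours
temp[:360] = 60.0            # cool overnight hours
temp[840:845] = 100.0        # a brief 5-minute spike to the daily Max

full_mean = temp.mean()                       # about 71.3
max_min_mean = (temp.max() + temp.min()) / 2  # 80.0, from Max and Min alone

print(round(full_mean, 1), round(max_min_mean, 1))

The brief spike sets the Max, so (Tmax+Tmin)/2 lands well above the mean of the full record; how long the signal spends at each temperature is exactly what Max/Min throws away.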
Here’s a biological anomaly – a beluga whale in the Thames:
https://www.bbc.com/news/uk-england-kent-45649502
A tad further south than normal, one would think.
That CO2 global warming is negatively warming the north Atlantic and making whale populations move negatively north.
Phil ==> Interesting …is it a small anomaly as beluga are small whales?
Yes – the Beluga is like an oversized white dolphin.
In fact they sometimes travel together with dolphins or porpoises.
It is thought that this is how the beluga got into the Thames.
It’s not lost or distressed apparently – just happily eating fish and other stuff in the central deeper colder more saline parts of the river.
https://binged.it/2zxIY5p
Kip,
This is interesting and speaks to the long term concern many have had about how anomalies are presented and used in climate science.
Hunter ==> Yes, that is the problem – how they are used. They may have some utility for something (though I doubt that the “global anomaly” has any physical meaning).
” So, while the mean is quite precise, the actual past temperatures are still uncertain to +/-0.5.”
— I’m not sure this is true. If you take one measurement and get 72 degrees +/- 0.5, then that is the range of your uncertainty. If you take 1,000 measurements, and the result is 72 degrees 51% of the time and 71 degrees 49% of the time, your average is 71.5, but the error bars are no longer +/- 0.5. If your mean is precise, then so is the estimate of actual past temperature.
Your comment is incorrect. If you take 1000 measurements of which 500 are 72 and 500 are 71 the mean is indeed 71.5 but the standard deviation is 0.5 (try it out in excel).
To calculate uncertainty you need first to identify all uncertainties in the measurement process, categorize them as Type A or Type B, determine the standard uncertainty for each, combine them (if uncorrelated, use RMS), and then apply a coverage factor (2 is recommended for 95% confidence).
The uncertainty you arrive at is then a +/- figure about the mean of the measurements, which is your best estimate of the measurement. That doesn't mean, btw, that your measurement is accurate. Being precise just means that taking a number of measurements results in a relatively low value of standard deviation. It says nothing about the accuracy of the measurement. Imagine throwing darts at a dartboard but without the board present. How do you know if you are hitting the bull, even if all the darts seem very close together? The mean is only the best estimate of the measurand (e.g. temperature).
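A minimal Python sketch of that GUM-style procedure, with purely hypothetical numbers: a Type A component estimated from 30 repeated readings with s = 0.5, and a Type B component from a 1-degree instrument resolution treated as a uniform distribution:

import math

u_typeA = 0.5 / math.sqrt(30)   # standard uncertainty from repeated readings
u_typeB = 0.5 / math.sqrt(3)    # +/-0.5 resolution half-width, uniform distribution

u_combined = math.sqrt(u_typeA**2 + u_typeB**2)  # RMS combination (uncorrelated)
U_expanded = 2 * u_combined                      # coverage factor k = 2 (~95%)

print(round(u_combined, 3), round(U_expanded, 3))

The numbers are illustrative only; the point is the recipe, not the values.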
Your comment is incorrect. If you take 1000 measurements of which 500 are 72 and 500 are 71 the mean is indeed 71.5 but the standard deviation is 0.5 (try it out in excel).
And the standard error of the mean is 0.016.
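Those two figures are easy to check with a few lines of Python (standard library only):

import statistics

data = [72] * 500 + [71] * 500
mean = statistics.mean(data)            # 71.5
sd = statistics.pstdev(data)            # 0.5
sem = sd / len(data) ** 0.5             # about 0.016

print(mean, sd, round(sem, 3))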
ADE and Phil ==> Therein we see the confusion of mixing measurements with statistical theory. Standard error of the mean vs Standard Deviation — both totally ignoring the original uncertainty of the temperature record which is the DATA. In temperature records we are dealing with values that have been recorded as ranges – the ranges are not errors.
No, they've been recorded as lying within a range; a set of whole numbers taken as data certainly can be averaged and treated statistically.
For example, I generated 100 random numbers between 71 and 73 and the mean was 72.0367 (se=0.06); when I rounded the set of numbers to the nearest integer and took the mean again, it was 72.030 (se=0.07). Repeat it multiple times and the mean still falls within +/- 2 se.
There is a reason why we apply statistics when we make scientific measurements.
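A Python sketch of that experiment (standard library only; the exact values vary from run to run, but the standard errors come out near the 0.06 and 0.07 quoted above):

import random
import statistics

random.seed(0)
actual = [random.uniform(71, 73) for _ in range(100)]   # "true" temperatures
rounded = [round(x) for x in actual]                    # recorded to whole degrees

for label, data in (("actual", actual), ("rounded", rounded)):
    m = statistics.mean(data)
    se = statistics.stdev(data) / len(data) ** 0.5
    print(label, round(m, 3), round(se, 3))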
First, I took +/- 0.5 as the error range, not the standard deviation. So I have an error of 0.5 being 2 std dev from the mean. But in any case, the more measurements you take, the better will be your estimate of the mean.
Using your dartboard analogy, the more darts you throw, the more likely I’ll be able to tell you exactly the position of the bullseye even if you never actually hit the bullseye. That will be true for any distribution pattern — normal, random, skewed, etc.
Let's say you have an unfair coin that doesn't come up heads or tails evenly. It comes up heads 86.568794% of the time. Give me enough tosses and I'll tell you exactly how unfair your coin is, to as many decimal places as you wish to know.
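A sketch of that coin claim in Python (the heads probability is the hypothetical one quoted above):

import random

p_true = 0.86568794
random.seed(1)
for n in (1_000, 100_000, 1_000_000):
    heads = sum(random.random() < p_true for _ in range(n))
    print(n, heads / n)   # the estimate sharpens as n grows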
Steve O
In your experiment you know the answer, heads or tails. But without the dartboard present you don't know where the bull is, no matter how many darts you throw. The mean is only the best estimate of the location of the bull. But throwing 1000 darts does not make this more accurate. If an instrument has a built-in systematic error you cannot remove it by repeating the measurements. If a sensor reads high by 1 degC it will always do so. The only way to improve on this is to use a better meter or method. Why do you think we use 4.5, 6.5 and 8.5 digit multimeters? It's for the better specified accuracy, not the resolution. Otherwise we could just take lots of measurements with a cheap meter. And when calculating the error you must take into account the uncertainty of the meter as well as the distribution of your readings, plus all other uncertainties in the measurement process.
Steve ==> I give a lot of examples in the essay and in Durable Original Measurement Uncertainty.
Each of your measurements (temperature records) is a range 1 degree wide within which the real temperature existed but is totally unknown.
Kip,
But a real temperature does exist.
Pick a value between 71.0 and 73.0.
Use a random number generator to add or subtract an amount to simulate 1,000 “measurements” so that the measurement accuracy is +/- 0.5.
Give me the 1,000 measurements, all of which now have inaccuracy built in.
How close do you think I can get to estimating the true value?
Now let’s make it harder. Only give me the whole number results of the 1,000 measurements. Round all the measurements to the nearest whole number.
Now how close do you think I can get to estimating the true value? You might want to try this in Excel before you bet me any money, but I can get a heck of a lot closer than +/- 0.5
Also, the accuracy of my estimate is seemingly not impacted by the fact that you rounded all the measurements.
Steve ==> We don’t know the “real” temperature — it only existed for a moment, which has passed. What we know is that at that moment of measurement, we wrote down “72” for any temperature between 72.5 and 71.5. We can not somehow “know” the true temperature by applying either arithmetic or statistics. The actual true temperature of that moment will forever remain unknown — it is lost data.
The ONLY thing we know is the range in which the true temperature would have been found. We can not ever recover the actual temperature to any value of greater precision than the original uncertainty range.
Kip
“We can not ever recover the actual temperature to any value of greater precision than the original uncertainty range.”
No-one claims you can. The claim is that the average of a lot of different readings is a lot less variable than is each individual one.
Nick ==> That is just an artifact of averaging — averaging any set of numbers, related or not, gives a single answer, quite precisely. What it doesn't do is add new information (the data is the data), nor does it reduce the uncertainty of the data. Averaging creates a "fictional data point" — with an equally fictional smaller variability, in the sense used by Wm. Briggs in "Do not smooth time series."
Averaging creates a "fictional data point" — with an equally fictional smaller variability, in the sense used by Wm. Briggs in "Do not smooth time series."
And it’s an equally nonsensical statement.
When I measured the velocity of a gas jet using Laser Doppler Velocimetry I would seed the flow with extremely small particles and measure the velocity of each particle as it passed through the control volume; I would then calculate the mean and standard deviation of the distribution. Nothing 'fictional' about the data, and it's much more useful than knowing an individual velocity.
Phil ==> Did you read Briggs?
“We can not ever recover the actual temperature to any value of greater precision than the original uncertainty range.”
— I can prove this to be an untrue statement with an experiment in Excel. The actual value will be revealed by the distribution. While it is true that you will never improve the variability of any particular measurement, you can determine the underlying value. But you don’t need to take my word for it because you can see for yourself.
In one column, for 1,000 rows enter =Rand() to create a random number between 0 and 1.
In the next column enter any fixed value you choose.
In the third column, add the two columns together. You can subtract 0.5 if you want to eliminate the net effect of the addition while keeping the randomness. These are our inaccurate "measurements."
Calculate the average of the range and see how far it is from the value you chose.
Now for the magic.
Add another column that rounds your randomized "measurement" result to the nearest whole number, and at the bottom of that column calculate the average again.
You should see that it is possible to determine the exact number you chose by averaging all the inaccurate measurements, and you’ll also see that rounding each inaccurate measurement had no effect on your ability to determine the true value.
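A Python version of that spreadsheet experiment (a sketch, not Steve O's actual workbook; the chosen value 71.7 and the seed are arbitrary):

import random

random.seed(42)
true_value = 71.7
measurements = [true_value + random.random() - 0.5 for _ in range(1000)]  # +/-0.5 noise
rounded = [round(m) for m in measurements]                                # whole degrees only

print(sum(measurements) / len(measurements))  # close to 71.7
print(sum(rounded) / len(rounded))            # still close to 71.7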
Steve 0 ==> The world is full of maths tricks…. you can not recover an unknown value from an infinite set. Since the actual thermometer reading at an exact time was not recorded, and only the range was recorded (“72 +/- 0.5”) there are an infinite number of potential values for that instantaneous temperature. No amount of statistic-ing or arithmetic-ing will reveal the reading that would have been recorded had they actually recorded it.
I guess that most here know of Aesop's fable: "The Tortoise and the Hare." Basically it's a story where a hare challenges a tortoise to a race. At the start of the race the tortoise begins a slow, plodding gait while the hare takes off, returns, circles the tortoise several times, chastises and ridicules the tortoise, and then takes off again for the finish line. About halfway to the finish line the hare decides to take a nap. Unfortunately for the hare, he sleeps too long and lets the tortoise gain a major lead. Although the hare easily catches up to the tortoise, he doesn't quite make it before the tortoise crosses the finish line. The hare loses by a hair.
Now what is the average speed of the race participants? The tortoise is easy to calculate–it’s the distance from start to finish divided by the duration of the race. What about the hare? Except for the width of a hair, the hare traveled the same distance in the same amount of time. So the two animals have the same average speed (minus the width of a hair).
Is the average speed representative of either animal? It seems to apply to the tortoise, but we don’t know if the tortoise took rest breaks. In any case, the average speed of the tortoise matches pretty well with his actual speeds during the race.
What about the hare? If anything, the hare’s average speed was something he was passing through to another greater or smaller speed. The hare was probably never going that average speed during the entire race.
Anyone who claims averages tell all is kidding themselves.
There’s no way to tell if a changing temperature is taking a tortoise-like route or a hare-like route, and an average temperature tells you nothing about the actual path of those two routes.
Jim
Kip posts: "you can not recover an unknown value from an infinite set."

FALSE.

Here is an infinite set: { 3/10, 3/100, 3/1000, 3/10000, 3/100000, … }. Unknown is the sum of all of the elements of this set. Put in more familiar terms: 0.33333…

We all know the sum of this infinite set: it is 1/3. Or, in more precise terms:

3/10 + 3/100 + 3/1000 + 3/10000 + 3/100000 + … = 1/3
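For reference, this is just the standard geometric series identity:

\[ \sum_{k=1}^{\infty} \frac{3}{10^{k}} \;=\; \frac{3/10}{1 - 1/10} \;=\; \frac{1}{3} \]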
Phil ==> Did you read Briggs?
Yes.
Phil ==> So do you now understand what “fictional data” is?
Phil ==> So do you now understand what “fictional data” is?
Yes, irrelevant sophistry.
“The claim is that the average of a lot of different readings is a lot less variable than is each individual one.”
If they are not taken at the same time and place, then they really are not related, and dividing the sum of them is silly.
Or maybe averaging a tangerine from 1402 with a kiwi fruit from 1989 is how to get a nasty salad. I wouldn't eat it.
A related discussion of error estimation in temperatures arises when conversions are made from F to C and the reverse. A group of us did a lot of work on this about 5 years ago.
It is not feasible to make an umbrella summary. However, as in a lot of climate work, the official view, when stated, was often optimistic. Our BOM admits to a possible rounding inaccuracy of about 0.15 degrees C, but notes that metrication happened while the great Pacific climate shift was in progress, so attribution was equivocal and the solution was to do nothing about it. At least we know that there is some probability that a known mechanism has created an error now forgotten. Geoff
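A Python sketch of the kind of double-rounding artifact Geoff describes. The rounding steps below are hypothetical, not a description of the BOM's actual procedure: a temperature recorded to the nearest whole degree F is converted to C and then re-rounded to the nearest 0.1 C.

def double_rounded_error(true_c):
    f_recorded = round(true_c * 9 / 5 + 32)            # nearest whole degree F
    c_converted = round((f_recorded - 32) * 5 / 9, 1)  # converted, re-rounded to 0.1 C
    return c_converted - true_c

errors = [double_rounded_error(t / 100) for t in range(1000, 4000)]  # 10.00 C to 39.99 C
print(max(abs(e) for e in errors))   # worst-case error, roughly 0.3 C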
There was a lengthy post on here by Pat Frank who has done a lot of work on the subject.
See
https://wattsupwiththat.com/2016/04/19/systematic-error-in-climate-measurements-the-surface-air-temperature-record/
There was also a lengthy reply by Mosher saying the same things about Large Numbers here.
https://moyhu.blogspot.com/2016/04/averaging-temperature-data-improves.html
However, I do not see how the Law of Large Numbers applies when the subject being measured is ever changing, so that no two measurements are actually measuring the same thing, with the same instrument, in the same location.
In Mosher’s case he takes the maximum daily temperature for Melbourne using a modern instrument, which is absolutely nothing like an old Max/Min Thermometer being read by different people.
One is very Accurate but reads fast transients without interpretation whereas the other is very slow and relies on the interpretation of a Human Eyeball.
And yet they want to apply the same rules to both, which makes absolutely no sense whatsoever.
The problems with the accuracy of modern BOM data has been completely exposed by many in Australia.
I worked in a Government Metrology Lab and the one thing about physical measurement I learned there was that a Controlled Environment is essential for both Accuracy and Repeatability.
Anyone who thinks a wooden box with a screen is a "Controlled Environment" is crazy; they were not designed to measure "Global Temperatures" to tenths of a degree, just to give the very, very local conditions.
Even 20 yards away could be completely different conditions.
Interesting, but I’m not sure how much this small uncertainty really matters relative to other larger uncertainties with the data. The actual measurement error is much larger than Gavin admits, as is obvious from the much larger changes they’ve made to temperature just since he made the original claim of .1 degrees — and every adjustment adds its own new potential sources of error.
https://stevengoddard.wordpress.com/2013/12/21/thirteen-years-of-nasa-data-tampering-in-six-seconds/
TallDave==> It matters only to the degree that we have a whole scientific field (almost the whole field) fooling themselves with the “precision and accuracy” of this GAST(anomaly) metric — and insisting that the itty-bitty changes in GAST(anomaly) are exempt from the actual uncertainty of the metric GAST(absolute).
It leads to a vast underestimation of uncertainty — and thus the field is over-confident about things that are still pretty vague.
Read the paper that is discussed at Dr. Curry’s blog “The lure of incredible certitude”.
I know this is a bit off topic, but can someone provide a link to how satellites measure temperature?
Thanks
Mike ==> You will probably find this answer unsatisfying, but it is accurate. See the two paragraphs on the Wiki page for "Satellite temperature measurements" under Measurements.
Hint = Satellites do not measure Earth surface temperature.
Kip, thanks for an informative post, accessible to an average human mind. For my benefit I would summarize how I understand matters (Kip it in seven points ;-):
1. As for the global average temperature, 'official' climatology freely admits an uncertainty range (measurement error) of +/- 0.5 K (Schmidt).
2. Which means that for the last 37 years there has been no detectable increase in the global average temperature, i.e. no increase beyond the uncertainty range (your reanalysis chart).
3. Official climatology tries to hide this uncertainty range in the graphs carefully prepared for public presentations where uncertainty bars are either missing or greatly reduced.
4. To reduce measurement/rounding errors many climatologists apply various statistical techniques in the hope of 'improving' the quality of the input data and removing a large part of the uncertainty.
5. Alas, no such reliable techniques exist.
6. It appears that there is a widespread belief (superstition?) among many climatologists that by applying such techniques the calculated global average temperature values will start to converge on the actual values, and that they can somehow disperse the gloomy shadows of measurement uncertainty.
7. Accuracy to thousandths of a degree C in global average temperature values is a mere artifact of mathematical operations, where we can define any arbitrary number of decimals. Unfortunately, it has very little to do with the true values, which may be significantly different, i.e. anywhere within the large uncertainty range.
Have I got the picture, Kip?
Paramenter ==> Close enough for CliSci.
The problem is not limited to Climate Science (CliSci). Epidemiology is rife with nonsensical findings claiming great precision from extremely messy data. Psychology has for many years relied on studies of a handful of subjects to make population wide proclamations about the mind and psyche. Researchers in almost all fields have used pre-packaged statistical software — without any real understanding of its limits — to make all kinds of unwarranted conclusions.
This is referred to in the science press as The Crisis in Science.
I'm glad to hear that an uncertainty range of 0.5 K means there is no detectable warming in almost the last 40 years. That's settled science and rather good news! This chart definitely requires more attention.
Kip, the uncertainty of the GAST is not +/- 0.5 degrees. The standard error is equal to "s" (in your case 0.5, representing the thermometer SD) divided by the square root of the number of obs. Please stop confusing instrument error with sampling error.
David ==> You really should read the essay. Gavin Schmidt put the uncertainty of GAST(absolute) at 0.5K — I think it is higher, but he is the Climate science expert, so I’ll use his figure.
Kip,
You may or may not have seen this from 7 years ago, : https://wattsupwiththat.com/2011/01/22/the-metrology-of-thermometers/
The increase in global average temperature over the last 100 years is less than the noise threshold of the surface measuring instruments…
Mark ==> Yes, the post by Pat Frank "deals with the inherent uncertainty of temperature measurement, establishing a new minimum uncertainty value of ±0.46 C for the instrumental surface temperature record" — pretty close to Gavin's +/- 0.5K.
Kip, nice work. Sorry I made it late to the game, but from my engineering education and from over 30 years experience working with air quality and meteorological measurements, I remember a lot of references to "precision" and "accuracy". Precision is much easier to determine than accuracy, and most of the discussion here seems to involve what I understand to be "precision". From what I recall, precision has more to do with repeatability than anything else. Good repeatability does not necessarily mean good accuracy. Instruments that are assumed to have a linear response may not have a linear response, for example. Calibrations often drift over time and affect the overall "accuracy". Accuracy in my mind must also include "representativeness" relative to what you are trying to measure. A while back I wrote down some more detailed thoughts about accuracy of global temperature and temperature anomaly estimates here:
https://oz4caster.wordpress.com/2015/02/16/uncertainty-in-global-temperature-assessments/
For sake of brevity, I won’t copy that long post here, but please take a read to consider some of the many additional issues I have not seen mentioned in the discussions here. I’m sure there are some I missed as well.
Bryan ==> Thanks for the informative link — I too believe that Gavin’s 0.5K is the minimum uncertainty…lots to add to it.
Propagation of uncertainties isn’t that complicated https://en.wikipedia.org/wiki/Propagation_of_uncertainty#Simplification
Uncertainty for the mean is reduced by a factor of 1/√N; however, uncertainty for an anomaly is again √((0.5/√N)² + 0.5²) = 0.502, as correctly stated by Gavin.
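Written out, with N = 100 taken purely for illustration, that quadrature combination is:

\[ u = \sqrt{\left(\frac{0.5}{\sqrt{N}}\right)^{2} + 0.5^{2}} \;=\; \sqrt{0.05^{2} + 0.5^{2}} \;\approx\; 0.502\ \mathrm{K} \quad (N = 100) \]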
vandoren ==> Well, at least you end up with an approximately true Known Uncertainty for GAST = 0.5K.
I was speaking though — and this whole essay is about — the real uncertainty (We Really Just Don’t Know) that results from recording the original temperature measurements as 1-degree-F ranges.
Kip: Consider trying this experiment for yourself. Find a random number generator. In my case, it is part of the free add-on for Excel for Mac (2010) and was built into the previous version. I generated 10,000 random numbers with five digits after the decimal point, requesting a mean of 60 degF (and got 59.96342) and a standard deviation of 20 degF (and got 19.88341). Round all of the numbers to the nearest unit using "=round(A1,0)". Calculate the mean (59.96510) and standard deviation (19.88491). Does rounding make any difference in the mean?
Then I added about 12 degF of GW and generated 10,000 new temperatures with a mean of 71.76008 and std of 20.07188, then rounded as before: 71.76070 and 20.07334. Did rounding ruin the precision of the mean warming between the two data sets, which was 11.79666 degF? Hardly. The difference in means after rounding was 11.79560, an error of only 0.00106 degF caused by rounding.
Not arguing.
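A Python sketch of Frank's experiment (normally distributed artificial data; the exact figures will differ from his Excel run):

import random
import statistics

random.seed(7)
before = [random.gauss(60, 20) for _ in range(10_000)]
after = [random.gauss(72, 20) for _ in range(10_000)]

true_diff = statistics.mean(after) - statistics.mean(before)
rounded_diff = (statistics.mean([round(x) for x in after])
                - statistics.mean([round(x) for x in before]))

print(round(true_diff, 4), round(rounded_diff, 4))  # agree to within a few thousandths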
Frank ==> We are not really rounding — we are dropping the uncertainty altogether and using ranges. It is NOT about the VALUE of the MEAN — it is about the UNCERTAINTY RANGE that devolves correctly to the MEAN when using values that are RANGES.
We are not using ranges; we are rounding a value to the nearest mark on the thermometer. As I showed above, if you take for example 100 random numbers between 71 and 73 to be the actual values, with a uniform distribution, the mean for that set was 72.0367 (se=0.06). When I rounded the set of numbers to the nearest integer and took the mean again, it was 72.030 (se=0.07). Repeat it multiple times and the mean still falls within +/- 2 se.
So when taking readings from within a uniform range to the nearest integer value the uncertainty range that is appropriate is derived from the standard error of the mean of that set. As shown the mean and standard error of the rounded set will converge on that of the original set, in the sets of 100 values that I studied they’re virtually indistinguishable.
Phil ==> I am talking about the REAL WORLD — not an idealized example. In REAL meteorology, the recorded value for a single temperature reading is a RANGE — in Fahrenheit — 1 degree wide. Always has been. Still is.
It's not an idealized example, it's what happens. The temperature, if it lies between two limits, say 71 and 72, and fluctuates within those bounds, will be recorded as one or other of those values. As I have shown, if sufficient readings are sampled you end up with the same mean as if you'd sampled the original temperatures with higher precision, and with a standard error that depends on the square root of the number of samples.
Phil ==> One more time — I don't give a hoot about the fictitious value "the Mean" and its Standard Error — both are statistical animals — unrelated to our real world uncertainty about the item of our concern, the temperature. We are not dealing with millions of fictitious values, we are dealing with real world measurements recorded as ranges. You are using the Gambler's Fallacy in a sense — "I would have beat the house if I had made a billion bets — the odds were in my favor." (He may be right — but he made one big bet and lost.)
You may end up with the same mean — but you will be just as uncertain in the real world (statisticians will be certain as long as they have enough numbers and a big enough computer — but you wouldn't want to blast off to the Moon in a rocket built by statisticians — you'd want one built by cynical hard-boiled engineers.)
Kip says: "both are statistical animals — unrelated to our real world uncertainty"

Kip, did you know that "temperature" is defined by "statistics"?

https://en.wikipedia.org/wiki/Statistical_thermodynamics
Phil ==> One more time — I don't give a hoot about the fictitious value "the Mean" and its Standard Error — both are statistical animals — unrelated to our real world uncertainty about the item of our concern, the temperature.
Rubbish, they are essential to understanding the uncertainty of the measured quantity in the real world.
You may end up with the same mean — but you will be just as uncertain in the real world
No you will not.
but you wouldn’t want to blast off to the Moon in a rocket built by statisticians — you’d want one built by cynical hard-boiled engineers.)
Built by some of the engineers I've taught, no problem, but they understood the underlying statistics of measurements; one of them even flew on the Space Shuttle as a payload specialist a couple of times. Also, over 40 years ago I was the co-inventor of a measurement technique/instrument which revolutionized the field and is still the pre-eminent technique; knowledge of the statistics of measurement was essential.
Kip: Respectfully, the question – as best I understand it – is whether we can accurately measure a change (such as 0.85+/-0.12 K) between two states that we know quite inaccurately (such as 287.36+/-0.73 and 288.21+/-0.69). The answer depends on what causes the error:
1) Obviously, I've proven that our inability to read a thermometer more accurately than +/-0.5 degF doesn't limit our ability to measure a CHANGE IN MEAN to within +/-0.001 degF in 10,000 measurements. (I'm disappointed to see you bring up this red herring.)
2) Constant systematic error: If a thermometer always reads 5 degF too high, we can still take 10,000 daily measurements starting at noon of 1/1/2019 and 10,000 measurements starting at noon of 1/1/2119 and measure change over a century between these two 27.4-year means to within +/-0.001 degF. A constant systematic bias causes no problems. Andy’s station project surveyed potential station bias, but only CHANGING bias will bias a trend.
3) Varying systematic error: Based on what we know about time-of-observation bias with min-max thermometers, a systematic error of this type can introduce a bias of 0.4 degF in a change.
4) Another varying systematic bias: Suppose Stevenson screens GRADUALLY get dirty (decreasing albedo) and clogged with leaves (reducing circulation). The average temperature reading will gradually increase. After ten years or so, the station is cleaned, removing the upward bias in the trend AND introducing a break point. The data is homogenized, removing the break point that restored initial unbiased measurement conditions – keeping the biased trend and removing a needed correction.
5) Calculating local temperature anomalies for a grid cell helps remove some systematic errors. At the end of the Soviet era, a hundred or so stations in cold Siberia stopped reporting.
If Schmidt is talking about a CONSTANT systematic uncertainty of 0.5 K, then we can measure change to within 0.1 degK. CHANGING uncertainty this size is a disaster.
All of the data adjustments add 0.2 K to 20th century warming. Maybe half those adjustments weren’t caused by systematic errors that should be corrected (station moves, TOB, new equipment).
Frank ==> You can read what Gavin Schmidt says: I give the link above and here once more — there is no misunderstanding him.
Here is the quote again:
We don’t need any “stinkin’ anomalies” to see the CHANGE in the GAST(absolute) — we just look at the values:
2016 288.0±0.5K
2015 287.8±0.5K
2014 287.7±0.5K
The total change between 2014 AND 2016 is 0.3K. As the known uncertainty for each value is +/- 0.5K — all the values “appear to be the same within the uncertainty”.
There — that’s it — we are done.
It is KNOWN UNCERTAINTY in the GAST(absolute) — we can not UNKNOW that through statistical manipulation and definition.
Current figures from NOAA for the anomaly say the difference between 2014 and 2016 is 0.21K, CRU says the same difference is 0.218K, and NASA/GISS says 0.26K — all pretty close to Gavin's figures, for which the difference is 0.3K.
So to say that the Anomaly of the Annual GAST is more certain than the actual GAST(absolute) — all figured from the same uncertain data is a self-deluding statistical trick.
We already KNOW the real uncertainty of GAST ….. it is ten to one hundred times the uncertainty claimed for the “anomaly” of the same metric. That’s a lot of “fooling oneself” — the justification for that “fooling” does not matter at all. Being able to justify it doesn’t change the facts.
Frank — an aside: I am a father of four, and was a Cubmaster and Scoutmaster for ten years — I have been a Marriage Counselor — I have a been a Personal Behavior Counselor — I have a LOT of experience with justification — I have listened to hundreds and hundreds of hours of kids and adults justifying their own behavior and justifying their actions and justifying their opinions. Nearly everyone (at least at first) is absolutely sure that their justification makes their actions right. Seeing scientists justifying their actions to make their results look more certain saddens me — if they want more certain results they have to have more certain data, not more justification.
Kip,
“So to say that the Anomaly of the Annual GAST is more certain than the actual GAST(absolute)…”
For Heaven’s sake, don’t you ever learn anything? No one says that about the Anomaly of the Annual GAST. They say that about the mean of the anomalies, not the anomaly of the mean. Gavin is saying that GAST is unusable because of its large error, and they don’t use it. GISS long ago explained why.
Nick ==> You are still fooling yourself.
I know they don’t use the GAST(absolute) because of its uncertainty. So they try an end run around the uncertainty by looking at a smoothing of smoothing of tiny numbers — that can only have a tiny variance — and claim that that tiny variance is the real world uncertainty about changes to Global Average Surface Temperature. It still isn’t — we know the real uncertainty — we can not unknow it using statistics on anomalies.
Kip: Thank you for respectful reply and pointing out the link to Schmidt again.
Above I discussed the difference between a CONSTANT systematic error and other kinds of errors. I explained why a thermometer that consistently read 5 degF too high could perfectly accurately measure a CHANGE in MEAN from a 10,000 measurements to within less than 0.01 degF and proved this with some normally distributed artificial data. Were you convinced by this?
As I understand it, Dr. Schmidt is discussing constant systematic errors between different methods of calculating global mean temperature. These methods use different criteria for collecting temperature records, processing that data to get a single value of each grid cell, different sets of grid cells, different methods of dealing with grid cells with no data (interpolation?), different methods for dealing with breakpoints, and probably a dozen other things I know nothing about. Nick Stokes has constructed TWO of his own temperature indices, which differ. He could speak more authoritatively about this. (Nick and I probably disagree about some aspects of homogenization.)
By analogy, Schmidt is talking about 5 different thermometers (processing methods) that measure the Earth's temperature. The NCEP1 "thermometer" reads the lowest at all times and the JRA55 "thermometer" reads the highest all the time. They all go up and down in parallel – they all show a 97/98 El Nino followed by a La Nina. To some extent, they all have a CONSTANT systematic error that doesn't interfere with our ability to quantify change (which is exactly analogous to my thermometer that always reads exactly 5 degF too high).
What Schmidt doesn’t show us is the difference between each of these records. Is it really constant? If these systematic errors (differences in processing) are not constant, then there is greater uncertainty in our assessment of change. As usual, in blog posts Schmidt leaves out the critical information needed to judge whether there is a constant systematic error between these records or not. There is no point in providing fodder for the skeptics. He is preaching to the choir.
As a scientist, I don’t listen to justifications. What is the data? How was it processed? How much difference is there between processing methods? Which method(s) do I judge more reliable for correcting systematic errors? You may not be aware, but the global mean temperature (not temperature anomaly) rises and falls 3.5 degC every year, peaking during summer in the Northern Hemisphere. (FWIW, Willis has written about this. The cause is the smaller heat capacity in the NH, more land and shallower mixed layer due to weaker winds.) Taking anomalies removes this seasonal signal and lets us see changes of 0.1 or 0.2 degK/decade. There are vastly more thermometers in the NH than the SH and the set of available thermometers varies with time. Temperature anomalies are merely a practical way of dealing with these systematic problems in the data that interfere with properly measuring change. One can calculate that anomaly by subtracting a 1951-1980 mean or a 1961-1990 mean as determined by any of Schmidt’s five “thermometers” discussed above. If the warming calculated by any of these ten methods differs from another by 0.5K, climate science would be in big trouble. The 0.5 K Schmidt is referring to is not this difference.
If I were your Counselor, I would be saying you and Schmidt are failing to communicate what is meant by an error of 0.5 K, and it may not be possible to bridge that gap because confirmation bias (about Schmidt’s motivations, which also make me deeply suspicious) makes it difficult to process new information.
Frank ==> I have had no communication with Gavin Schmidt, and he isn’t involved here today.
Gavin’s +/-0.5K is the uncertainty of the metric GAST(absolute). Freely admitted, acknowledged by all, including Mr. Stokes. What this means is that the Global Average Surface Temperature (GAST) is an uncertain metric — uncertain to within a range 1 degree C (or 1K) wide. This is why they don’t use it — because the entire CHANGE in GAST since 1980 is LESS THAN the uncertainty.
Now, just follow the logic here — the GAST is uncertain — everyone admits it and complains about it and refuses to use it because of its uncertainty.
Despite that KNOWN UNCERTAINTY, they wish to claim that they can statistically derive an anomaly that represents the change in GAST ( see https://www.ncdc.noaa.gov/cag/global/time-series/globe/land_ocean/p12/12/1880-2018.csv ) to an "error-less" one hundredth of a degree C. The title of the data file linked is "Global Land and Ocean Temperature Anomalies". It does not say "The mean of the world's individual station monthly or annual anomalies". It gives the GLOBAL anomaly from the "Base Period: 1901-2000". This is the oft-graphed GAST Anomaly (2018 = 0.74°C).
The referring page says:
“What can the mean global temperature anomaly be used for?
This product is a global-scale climate diagnostic tool and provides a big picture overview of average global temperatures compared to a reference value.”
So, regardless of Mr. Stokes' rambling insistence about the vast difference between the mean of anomalies and anomalies of means, the data is presented to the rest of the world as "average global temperatures compared to a reference value" and presented without any statement of uncertainty.
We know that the GAST(absolute) is subject to +/-0.5K uncertainty. Yet the 2018 anomaly from 1901-2000 is presented as 0.74°C (no uncertainty admitted at all).
You see the disconnect…..
It is not that I misunderstand…it is that I don’t agree that the method is valid — particularly as it is specifically carried out to rid the true metric of its real uncertainty. It is a trick that allows Climate Science to ignore the known uncertainty about Global Temperature and its change.
Kip: No scientist calculates warming by subtracting the global mean temperature anomaly in, say, January 1901 from the global mean temperature anomaly in January 2001; that is the only way one could get an answer accurate to 0.01 degC. Someone might average all of the months between 1890 and 1910 and then between 1990 and 2010. There is a statistical method for properly calculating the "confidence interval around the difference in two means". The confidence intervals add "in quadrature". Google the phrases in quotes if they aren't familiar. The method used by the IPCC is to do a least squares linear fit to the data, which gives a slope and a 95% confidence interval for that slope. The slope might be 0.0083+/-0.0007 degC/y; then multiply by 100 years – or 90 years or 118 years.
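A sketch of that least-squares approach on a purely hypothetical annual series (assumes scipy is available; 1.96 times the standard error is used as a rough 95% confidence interval):

from scipy import stats

years = list(range(1900, 2019))
temps = [0.008 * (y - 1900) + 0.1 * ((y * 37) % 13 - 6) / 6 for y in years]  # toy data

fit = stats.linregress(years, temps)
ci95 = 1.96 * fit.stderr
print(f"slope = {fit.slope:.4f} +/- {ci95:.4f} degC/yr")
print(f"century change ~ {100 * fit.slope:.2f} +/- {100 * ci95:.2f} degC")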
The right way to think about this is to say we have 5 different thermometers for measuring the temperature of the planet named MERRA2, CFSR, NCEP1, JRA55 and ERAi. Each thermometer is slightly different and gives a slightly different temperatures for the globe, say because the diameters of the tube the liquid rises through varies slightly (a CONSTANT systematic error). The JRA thermometer almost always reads the highest and the NCEP1 thermometer reads the lowest. They always move in the same direction and nearly the same amount from year to year and over the entire period.
http://www.realclimate.org/images/rean_abs.png
Click on the Figure (here or at RC) so that it opens full screen in a new window. Print the graph. Measure the amount of warming from the first year to the last. The differences are all between about 0.7 and 0.8 degC. NCEP1 shows the most warming. MERRA2 the least. That is the error in the amount of warming. Not 0.5 degC.
Now look at the year to year shifts. How consistent is the gap between the lowest two lines, which averages about 0.3 degC (0.25-0.35 degC)? You are getting an idea of how consistent (or inconsistent) the systematic error is between these two records.
Good luck accepting the analogy with five different thermometers. I’ll try to leave you the last word.
Frank ==> You have correctly determined the measurement difference between your five thermometers — there is some systematic difference between the measurements apparently, which may be caused by the thermometers themselves, and if so, and evenly spread across the quantitative values, can be identified as systemic.
What we haven't identified yet is the real uncertainty of the temperature of MERRA2 as measured by the five thermometer method — which may be (and almost always is) much different from the simple variation between measurements.
Kip wrote: "What we haven't identified yet is the real uncertainty of the temperature of MERRA2 as measured by the five thermometer method — which may be (and almost always is) much different from the simple variation between measurements."
What does “real uncertainty” mean here? We have real uncertainty in the absolute measurement of temperature and we have uncertainty in the CHANGE in temperature – warming. The uncertainty in warming is not +/-0.5 K.
To understand why there are two different kinds of uncertainty, let's look at how a thermometer is made. You start with a piece of glass tubing with a very uniform internal diameter and attach a large bulb on the bottom that is filled with a specified amount of mercury (or some other working fluid), evacuate and seal. You put the bulb in an ice-water bath and mark 0 degC. The distance the mercury rises per degree depends on how much mercury is in the bulb and the accuracy and uniformity of the inner tube. If your ice bath is mostly melted, or you didn't let the thermometer equilibrate in the ice bath for long enough, or parallax interfered with proper placement of the 0 degC line, all of your measurements could be 1 or even a few degC off. This is a constant systematic error that won't interfere with your ability to quantify warming, if the amount of mercury and the diameter of the tube through which it rises are correct.
The five "thermometers" that Schmidt is referring to are actually reanalyses that process a variety of temperature and other data from around the world and reconstruct the temperature everywhere in the atmosphere. He is showing you their output for the surface. Each reanalysis does things slightly differently. HadCRUT4, GISS, NOAA, etc. create surface temperature records using only surface temperature data. There are a lot of mathematical techniques available for combining (incomplete) data from a non-uniformly spaced set of surface stations involving grids, and separate grids for land and ocean. The Climategate email scandal forced the release of a lot of previously secret data and methods, and a bunch of technically competent bloggers constructed their own temperature indices. Below is a link to some of their earlier results. You might recognize Jeff_Id (skeptical owner of The Air Vent), RomanM (statistician? and then frequent commenter at ClimateAudit), Nick (supporter of consensus), Steve (probably Mosher, later author of a scathing book on Climategate). Each independently created their own temperature record from the same data set used by Phil Jones. Many thought that kriging was a better statistical method for processing this data, but it was challenging to apply. When Muller got funding from the Koch brothers to develop kriging, Mosher and Zeke joined that project, which became BEST.
https://moyhu.blogspot.com/2010/03/comparison-of-ghcn-results.html
As a scientist who has watched some scientifically sound skeptical attacks on the consensus succeed (the hockey stick) and others fail (the amount of warming) for more than a decade, I find it distressing to read naive posts questioning the validity of means reported to tenths of a degree from data recorded to the nearest degree, calling temperature anomalies "tricks", and asserting we don't know warming to better than +/-0.5 degC. IMO, the real controversy today should be with homogenization: correcting abrupt breaks in station data without understanding their cause. Some are caused by a documented change in TOB, some by a documented station move, some by new instrumentation, but most are not. Some could be caused by maintenance that restores earlier measurement conditions. Correcting such breaks introduces bias into the record. BEST splits records at breakpoints, effectively homogenizing them. One estimate of the size of the homogenization problem can be found here:
https://moyhu.blogspot.com/2015/02/breakdown-of-effects-of-ghcn-adjustments.html
Despite his biases (we all have them), Nick is enough of a scientist to publish warming with and without adjustments, even though he believes in those adjustments. He has been refining his methodology over nearly a decade and it is absurd to ignore his expertise in technical aspects of this subject.
Frank ==> Let your mind out of the narrow box of statistical and scientific definitions for a moment.
What is “real uncertainty”? It is how uncertain you really are about something. It has nothing to do with how many thermometers or measurements or accuracy of solid state thermistors or whatever. It is your answer to “How uncertain am I?” or “How certain am I?”
This shifts into the scientific realm when we look at the scope of the problem, the tools for investigating it, the accuracy and precision we can expect, the logistics of getting those measurements, the methods of recording the measurements, the Murphy’s Law factors in our field, etc etc.
I would suggest that we also have to take into account the questions raised in my essay “What Are They Really Counting?”.
The acknowledged problems of CliSci with GAST(absolute) should inform us about our real uncertainty in understanding the changes in that metric. It is simple common sense, often abandoned in the trench wars of science, that if we can not figure/calculate/discover the GAST(absolute) any closer than +/-0.5K (or in simple English, "to within a single degree"), then we must realize that the claim of knowing the annual "change" in Global Temperature to hundredths of a degree is a highly unlikely prospect — no matter how much jargon is used to support the claim.
Kip writes: “… we must realize that the claim of knowing the annual “change” in Global Temperature to hundredths of a degree is a highly unlikely prospect — no matter how much jargon is used to support the claim”
However, I didn’t use jargon to support my claim. I explained how a thermometer works and how an inaccurate 0 degC mark logically could make absolute temperature readings less accurate than temperature change. And I showed you how to demonstrate for YOURSELF how thousands of measurements rounded or not rounded to the nearest degree nevertheless produce the same mean temperature to one hundredth of a degree.
No one claims hundredths of a degree accuracy, but if they did, relying on intuition isn’t good enough. Doesn’t your intuition tell you that it is impossible for GPS to calculate your location to within a few feet using the time difference in signals from four satellites (and the speed of light)?
When you analyze Schmidt's 5 temperature records, it is absolutely true that they disagree by more than 0.5K in absolute temperature, but agree on the total amount of warming for every possible period to within about 0.1 K. (There are about 25 one-year periods on the graph, 24 two-year periods, 23 three-year periods… See how consistent the differences are.)
I suggest you think about confirmation bias. Wikipedia:
“Confirmation bias … is the tendency to search for, interpret, favor, and recall information in a way that confirms one’s preexisting beliefs or hypotheses. It is a type of cognitive bias and a systematic error of inductive reasoning. People display this bias when they gather or remember information selectively, or when they interpret it in a biased way. The effect is stronger for EMOTIONALLY CHARGED ISSUES AND FOR DEEPLY HELD BELIEFS.”
Kip writes: “Let your mind out of narrow box of statistical and scientific definitions for a moment.”
Are you kidding? Precise definitions are essential to discussing any scientific subject. Statistics has been referred to as “obtaining meaning from data”. This scientific discipline is what helps scientists – or used to help scientists – avoid confirmation bias. Since I’ve read Climateaudit for more than a decade, I’m probably a skeptic at heart. However, science isn’t about selecting a few observations or ideas that appear to support what you want to believe, it’s about what experiments and the data say.
Jack of all trades Hansen
was a Cubmaster,
Scoutmaster,
Marriage Counselor,
Personal Behavior Counselor
Just add magician, and your background
will be perfect for “modern” climate science,
run by those pesky government bureaucrats
with science degrees, or so they claim:
( “when we need data, we’ll just pull the numbers out of a hat” )
Richard ==> Honestly, that’s the short list…..
Climate scientists claim more magic than I've ever been able to master.
“Steve 0 ==> The world is full of maths tricks…. you can not recover an unknown value from an infinite set. Since the actual thermometer reading at an exact time was not recorded, and only the range was recorded (“72 +/- 0.5”) there are an infinite number of potential values for that instantaneous temperature. No amount of statistic-ing or arithmetic-ing will reveal the reading that would have been recorded had they actually recorded it.”
— Kip, it's not a trick. It's just arithmetic. We don't have to go back and forth anymore with "yes you can, no you can't, yes you can." I described a simple way you can prove it to yourself in five minutes using Excel.
You absolutely can determine the true underlying value very precisely, even though you don’t have a single accurate measurement.
Steve ==> I used to do stage magic too…..
Kip
In the past month I had been concerned
that articles on this website were not
as good as they used to be.
Then your article showed up,
and lifted this website to a higher level.
Your many contributions to the comments
were equally good.
So, who did you hire as your ghostwriter?
Just kidding, sometimes
I can’t help myself
Richard ==> My wife is my best editor and critic — she keeps me as honest as she can — it is an uphill battle but she has stuck it out for 44 years so far.
“she has stuck it out for 44 years so far”.
Cruel and unusual punishment?
Richard ==> It looks like it will be a life sentence….
Kip,
I’m very sorry – I didn’t go through all the comments, and I may be repeating what others have said. There are so many comments discrediting scientists, I don’t have patience to go through it all. I’ve forgotten much of what I learned about calculating error, but for my purposes it doesn’t matter. (I hope this is coherent, I sort of cobbled it together.)
I have a hard time believing that you don’t know why anomalies are used, but just to make sure, this is a reasonable explanation:
“Absolute estimates of global average surface temperature are difficult to compile for several reasons. Some regions have few temperature measurement stations (e.g., the Sahara Desert) and interpolation must be made over large, data-sparse regions. In mountainous areas, most observations come from the inhabited valleys, so the effect of elevation on a region’s average temperature must be considered as well. For example, a summer month over an area may be cooler than average, both at a mountain top and in a nearby valley, but the absolute temperatures will be quite different at the two locations. The use of anomalies in this case will show that temperatures for both locations were below average.
“Using reference values computed on smaller [more local] scales over the same time period establishes a baseline from which anomalies are calculated. This effectively normalizes the data so they can be compared and combined to more accurately represent temperature patterns with respect to what is normal for different places within a region.
“For these reasons, large-area summaries incorporate anomalies, not the temperature itself. Anomalies more accurately describe climate variability over larger areas than absolute temperatures do, and they give a frame of reference that allows more meaningful comparisons between locations and more accurate calculations of temperature trends.”
https://data.giss.nasa.gov/gistemp/faq/abs_temp.html
…………………………….
” In REAL meteorology, the recorded value for a single temperature reading is a RANGE”
Why would you record a range? Don’t you mean it represents a range? Are you assuming that the measuring instruments used are marked only by whole degrees?
“As Schmidt kindly points out, the correct notation for a GAST in degrees is something along the lines of 288.0±0.5K — that is a number of degrees to tenths of a degree and the uncertainty range ±0.5K. When a number is expressed in that manner, with that notation, it means that the actual value is not known exactly, but is known to be within the range expressed by the plus/minus amount.”
This has little to do with rounding error, imprecision, inaccuracy, etc., it’s a function of the spread of the data in the base period.
“The rest of us can know that the uncertainty of the GAST(absolute) of minimally +/-0.5K means that we can not be certain whether or not the average global temperature has changed much (or how much) over the last decade, and we may be uncertain about it over the last 40 years.”
NOT AS LONG AS YOU USE ANOMALIES!
The CRU and GISS series from 2010 to 2017 in your post look different, but the graphs of the data are almost exactly the same (apart from the offset). This is an illustration of why the chosen base period is (grrr, can’t think of the word. Starts with “a.” I hate that! Something like “irrelevant”) when looking at trends.
One could use this as an illustration of what might happen if you imagine the two datasets were actually absolute temperature from a mountain and a valley in the same area. One set of temperatures is lower, but the trends are almost exactly the same. If you subtracted the baseline from each (say, 1.7 and 1.56, respectively), your trends would overlap almost perfectly.
The reanalysis products are somewhat different, though with the same trends. Why did you only post the reanalyses of the absolute temperatures, and not the anomalies? Why did you not post this about the reanalyses:
“In contrast, the uncertainty in the station-based anomaly products [reanalyses] are around 0.05ºC for recent years, going up to about 0.1ºC for years earlier in the 20th century. Those uncertainties are based on issues of interpolation, homogenization (for non-climatic changes in location/measurements) etc. and have been evaluated multiple ways – including totally independent homogenization schemes, non-overlapping data subsets etc.”
>>>"evaluated multiple ways"<<< In other words, we don't know exactly how errors are determined from looking at this post alone. And the BEST reanalysis is not alone in its calculation of error, which changes over time; it is 0.05 only during part of the period, when there were more (and more reliable) sources of observation. Intuitively, you would expect error to decrease when you have readings from ASOS, satellites, radiosondes, as well as correlation among neighboring sites that all support a given measurement. (I don't know how many reanalyses include this variety of information for land measurements.)
(Reanalyses in brief: http://www.realclimate.org/index.php/archives/2011/07/reanalyses-r-us/)
What about the overlap in their errors?
Schmidt doesn't address the error when you use absolute temperatures as they are recorded (without "reverse engineering" them from the anomalies) to do a reanalysis, which is what most of Schmidt's post is about. This would be a whole different process from using anomalies to calculate absolutes, using an extremely large and variable dataset – in the original, it would contain biases, errors, different instruments… This is why people rely on adjusted data and reanalyses: all the work of figuring out the myriad measurement effects has already been done – including the error.
Kip: "It does not reduce the uncertainty of the values of the data set — nor of the real world uncertainty surrounding the Global Average Surface Temperature — in any of its forms and permutations."
Perhaps it would be helpful to conceptualize the issue by distinguishing between the actual variance in the data, and the uncertainty associated with measurements. The difference between a station in Zimbabwe and one in Iceland is a source of variance, but not a source of uncertainty. Likewise the mean of August temps vs January. What anomalies do is remove this variance.
When discussing trends in global temperature, the error estimate of absolute temperatures is largely an artifact of the wide range of climate norms. This source of error is not attributable to measuring accuracy or precision. (NH and SH temperatures would also tend to cancel each other out. )
I agree that there should be error estimates on some graphs, but the fact that there aren't doesn't mean they don't know what they are doing. In the graphs of running means, error bars might not even be appropriate: they would just be confusing, since the data that the running mean is based on are already represented in the graph, and the error would simply be a measure of year-to-year variability that is already obvious from the plot. Graphing the error of each year shown in the plot, as in the BEST graph, might be more informative, but that could be supplied in a different graph or in the text, rather than cluttering the original with too much data. It depends on what one wants to represent, and also on the audience. Those who want more detail can read the literature. The graphs presenting reanalyses in Schmidt's essay are for illustration only.
…………………………..
Anomalies are appropriate for looking at global (or regional) trends; absolute temperatures are not. It is not a trick.
Once you know how error is calculated in the reanalyses, THEN you can make comments about whether it's appropriate or not. It is not right to make assumptions that scientists are trying to hide something without knowing what they are doing. Here, maybe this will help:
https://www.ncdc.noaa.gov/monitoring-references/docs/smith-et-al-2008.pdf
Kristi ==> Sorry I missed this from days ago … weekends are busy for me. You have handicapped yourself by not reading all the links and background information, and thus not realizing, for example, that the reanalysis graph is from Gavin Schmidt's post at RealClimate.
I assure you, I know all of the justifications for using anomalies. Gavin makes it very clear, and I have no reason to doubt him: the real calculation of GASTabsolute is just too uncertain to prove the AGW hypothesis; therefore, we must use something (anything) else that we can justifiably claim is less uncertain.
I don't think they are trying to hide something; I think they are falling back on statistical theory to allow them to fool themselves into believing that they can know that the Earth's climate system is retaining more energy from the Sun due to CO2, thus raising the annual average temperature by hundredths of a degree.
Kip, real-life field examples are always a good antidote to purely mental exercises from behind the desk! In another comment, however, Mr Frank argues that the known uncertainty matters less when detecting a change: when we have a sufficiently large sample of readings with a known, large, constant uncertainty (readings randomly distributed, I presume), we can determine the mean very precisely. Consequently, we can detect variations in the mean that are orders of magnitude smaller than the uncertainty of the individual readings. From the other side, you argue that while we can indeed detect tiny variations in the mean, that tells us nothing about the change, because those variations will still fall within the wide range of the original known uncertainty. Right?
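For reference, the statistical claim summarized here can be sketched in a few lines of Python with invented numbers: if the reading errors really are random and independent, the standard error of the mean shrinks as 1/sqrt(N), even though each individual reading is only good to +/-0.5. Whether that assumption holds for the temperature record is, of course, exactly the point in dispute in this exchange.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical readings: true value 15.0, each reading carries a random error
# with sigma = 0.5 (the +/-0.5 of the essay), and the errors are independent.
true_value = 15.0
n = 10_000
readings = true_value + rng.normal(0.0, 0.5, size=n)

mean = readings.mean()
standard_error = readings.std(ddof=1) / np.sqrt(n)

print(mean)            # very close to 15.0
print(standard_error)  # ~0.005, far smaller than the 0.5 spread of any single reading
```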
Paramenter ==> Yes, that is basically correct.
Paramenter ==> Sorry, hit “send” too soon.
The tiny variations in the mean do tell us something: they tell us that there have been tiny variations in the mean. But if the original data were quantitatively uncertain (in our case, +/-0.5K), then variations in the mean on the same scale or smaller still fall within that uncertainty and cannot be distinguished from one another.
When we know the uncertainty of the metric itself (our +/-0.5K), and we know the values of the metric (the three years given by Gavin Schmidt, for example), we can see the anomaly for ourselves, about 0.3K, and that anomaly falls within the uncertainty known to exist for that metric.
A lot of fancy footwork with statistical definitions and statistical theory can't defeat the basic scientific facts: we know the values and their uncertainty; the difference between those values (either from one another or from some base) is the anomaly; and the uncertainty of the metric devolves to the anomaly.
You can check this visually: put the metric with its uncertainty (our three values) on a graph, draw the base period value as a straight horizontal line, and the anomaly is the distance between that baseline and the metric. Notice that the uncertainty did not disappear when you measured the distance from the baseline to each data point; your measurement is just as uncertain whether taken baseline to value, baseline to the bottom of the uncertainty bar, or baseline to the top of the uncertainty bar.
No uncertainty is lost in the process.
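A small numeric sketch of this argument, using Schmidt's round figures from earlier in the essay and ignoring the baseline's own uncertainty for simplicity: subtracting a constant baseline shifts the interval but does not narrow it.

```python
# Schmidt's round figures: absolute value 288.0 +/- 0.5 K, 1981-2010 baseline
# 287.4 K (the baseline's own +/-0.5 K is ignored here to keep things simple).
value, half_width = 288.0, 0.5
baseline = 287.4

anomaly = value - baseline                 # ~0.6 K
low = (value - half_width) - baseline      # ~0.1 K
high = (value + half_width) - baseline     # ~1.1 K

print(f"anomaly: {anomaly:.1f} K, range: {low:.1f} K to {high:.1f} K")
# Subtracting a constant baseline shifts the interval but does not narrow it:
# the anomaly is still only known to within the same +/-0.5 K.
```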