# The ‘Trick’ of Anomalous Temperature Anomalies

Guest Essay by Kip Hansen

It seems that every time  we turn around, we are presented with a new Science Fact that such-and-so metric — Sea Level Rise, Global Average Surface Temperature, Ocean Heat Content, Polar Bear populations, Puffin populations — has changed dramatically — “It’s unprecedented!” — and these statements are often backed by a graph illustrating the sharp rise (or, in other cases, sharp fall) as the anomaly of the metric from some baseline.  In most cases, the anomaly is actually very small and the change is magnified by cranking up the y-axis to make this very small change appear to be a steep rise (or fall).  Adding power to these statements and their graphs is the claimed precision of the anomaly — in Global Average Surface Temperature, it is often shown in tenths or even hundredths of a Centigrade degree.  Compounding the situation, the anomaly is shown with no (or very small) “error” or “uncertainty” bars, which are, even when shown,  not error bars or uncertainty bars  but actually statistical Standard Deviations (and only sometimes so marked or labelled).

I wrote about this several weeks ago in an essay here titled “Almost Earth-like, We’re Certain”.   In that essay, which the Science and Environmental Policy Project’s Weekly News Roundup characterized as “light reading”,  I stated my opinion that “they use anomalies and pretend that the uncertainty has been reduced.   It is nothing other than a pretense.  It is a trick to cover-up known large uncertainty.”

Admitting first that my opinion has not changed, I thought it would be good to explain more fully why I say such a thing — which is rather insulting to a broad swath of the climate science world.   There are two things we have to look at:

1. Why I call it a “trick”, and 2.  Who is being tricked.

WHY I CALL THE USE OF ANOMALIES A TRICK

What exactly is “finding the anomaly”?  Well, it is not what it is generally thought to be.  The simplified explanation is that one takes the annual averaged surface temperature and subtracts from that the 30-year climatic average and what you have left is “The Anomaly”.

That’s the idea, but that is not exactly what they do in practice.  They start finding anomalies at a lower level and work their way up to the Global Anomaly.  Even when Gavin Schmidt is explaining the use of anomalies, careful readers see that he has to work backwards to Absolute Global Averages in Degrees — by adding the agreed upon anomaly to the 30-year mean.

“…when we try and estimate the absolute global mean temperature for, say, 2016. The climatology for 1981-2010 is 287.4±0.5K, and the anomaly for 2016 is (from GISTEMP w.r.t. that baseline) 0.56±0.05ºC. So our estimate for the absolute value is (using the first rule shown above) is 287.96±0.502K, and then using the second, that reduces to 288.0±0.5K.”
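The arithmetic in that quote can be checked with a few lines of Python. This is only a sketch: it assumes the “first rule” Schmidt refers to is adding independent uncertainties in quadrature (the rule itself is not reproduced in this essay), and the variable and function names are mine, not his.

```python
import math


def add_in_quadrature(*uncertainties):
    """Root-sum-of-squares of independent uncertainties (assumed 'first rule')."""
    return math.sqrt(sum(u * u for u in uncertainties))


# Figures taken from the quoted passage.
climatology_k = 287.4    # 1981-2010 climatology, in K
climatology_unc = 0.5
anomaly_c = 0.56         # 2016 anomaly relative to that baseline
anomaly_unc = 0.05

absolute_k = climatology_k + anomaly_c
absolute_unc = add_in_quadrature(climatology_unc, anomaly_unc)

# Unrounded result, then rounded to the precision the uncertainty supports.
print(f"{absolute_k:.2f} ± {absolute_unc:.3f} K")
print(f"{absolute_k:.1f} ± {absolute_unc:.1f} K")
```

Under that assumption the two printed lines reproduce the quoted 287.96±0.502K and 288.0±0.5K.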

But for our purposes, let’s just consider that the anomaly is just the 30-year mean subtracted from the calculated GAST in degrees.

As Schmidt kindly points out, the correct notation for a GAST in degrees is something along the lines of 288.0±0.5K — that is a number of degrees to tenths of a degree and the uncertainty range ±0.5K.  When a number is expressed in that manner, with that notation, it means that the actual value is not known exactly, but is known to be within the range expressed by the plus/minus amount.

This illustration shows this in actual practice with temperature records….the measured temperatures are rounded to full degrees Fahrenheit — a notation that represents ANY of the infinite number of continuous values between 71.5 and 72.4999999…

It is not a measurement error, it is the measured temperature represented as a range of values 72 +/- 0.5.  It is an uncertainty range, we are totally in the dark as to the actual temperature — we know only the range.

Well, for the normal purposes of human beings, the one-degree-wide range is quite enough information.  It gets tricky for some purposes when the temperature approaches freezing — above or below frost/freezing temperatures being Climatically Important for farmers, road maintenance crews and airport airplane maintenance people.

No matter what we do to temperature records, we have to deal with the fact that the actual temperatures were not recorded — we only recorded ranges within which the actual temperature occurred.

This means that when these recorded temperatures are used in calculations, they must remain as ranges and be treated as such.    What cannot be discarded is the range of the value.  Averaging (finding the mean or the median) does not eliminate the range — the average still has the same range.  (see Durable Original Measurement Uncertainty ).

As an aside:  when Climate Science and meteorology present us with the Daily Average temperature from any weather station, they are not giving us what you would think of as the “average”, which in plain language refers to the arithmetic mean — rather we are given the median temperature — the number that is exactly halfway between the Daily High and the Daily Low.   So, rather than finding the mean by adding the hourly temperatures and dividing by 24, we get the result of Daily High plus Daily Low divided by 2.  These “Daily Averages” are then used in all subsequent calculations of weekly, monthly, seasonal, and annual averages.  These Daily Averages have the same 1-degree wide uncertainty range.
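To see how far the two kinds of “Daily Average” can drift apart, here is a minimal sketch. The 24 hourly readings are invented for illustration (a cool day with a brief mid-afternoon spike) and are not taken from any station record.

```python
from statistics import mean

# Hypothetical hourly temperatures for one day, in whole degrees F.
hourly_f = [55, 54, 53, 52, 52, 51, 52, 54, 57, 60, 63, 65,
            66, 67, 75, 76, 68, 66, 63, 61, 59, 58, 57, 56]

# Plain-language average: the arithmetic mean of all 24 readings.
hourly_mean = mean(hourly_f)

# Reported "Daily Average": halfway between the Daily High and Daily Low.
high_low_average = (max(hourly_f) + min(hourly_f)) / 2

print(f"mean of 24 hourly readings: {hourly_mean:.1f}")
print(f"(High + Low) / 2:           {high_low_average:.1f}")
```

For this made-up day the two figures differ by several degrees, because the High/Low method sees only the two extreme readings.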

On  the basis of simple logic then, when we finally arrive at a Global Average Surface Temperature, it still has the original uncertainty attached — as Dr. Schmidt correctly illustrates when he gives Absolute Temperature for 2016 (link far above) as   288.0±0.5K.  [Strictly speaking, this is not exactly why he does so — as the GAST is a “mean of means of medians” — a mathematical/statistical abomination of sorts.] As William Briggs would point out  “These results are not statements about actual past temperatures, which we already knew, up to measurement error.” (which measurement error or uncertainty is at least +/- 0.5).

The trick comes in  where the actual calculated absolute temperature value is converted to an anomaly of means. When one calculates a mean (an arithmetical average — total of all the values divided by the number of values), one gets a very precise answer.  When one takes the average of values that are ranges, such as 71 +/- 0.5, the result is a very precise number with a high probability that the mean is close to this precise number.   So, while the mean is quite precise, the actual past temperatures are still uncertain to +/-0.5.

Expressing the mean with the customary ”+/- 2 Standard Deviations” tells us ONLY what we can expect the mean to be — we can be pretty sure the mean is within that range.  The actual temperatures, if we were to honestly express them in degrees as is done in the following graph, are still subject to the uncertainty of measurement:  +/- 0.5 degrees.

[ The original graph shown here was included in error — showing the wrong Photoshop layers.  Thanks to “BoyfromTottenham” for pointing it out. — kh ]

The illustration was used (without my annotations) by Dr. Schmidt in his essay on anomalies.  I have added the requisite I-bars for +/- 0.5 degrees.  Note that the results of the various re-analyses themselves have a spread of 0.4 degrees  — one could make an argument for using the additive figure of 0.9 degrees as the uncertainty for the Global Mean Temperature based  on the uncertainties above (see the two greenish uncertainty bars, one atop the other.)

This illustrates the true uncertainty of Global Mean Surface Temperature — Schmidt’s acknowledged +/- 0.5 and the uncertainty range between reanalysis products.

In the real world sense, the uncertainty presented above should be considered the minimum uncertainty — the original measurement uncertainty plus the uncertainty of reanalysis.   There are many other uncertainties that would properly be additive — such as those brought in by infilling of temperature data.

The trick is to present the same data set as anomalies and claim the uncertainty is thus reduced to 0.1 degrees (when admitted at all) — BEST doubles down and claims 0.05 degrees!

Reducing the data set to a statistical product called anomaly of the mean does not inform us of the true uncertainty in the actual metric itself — the Global Average Surface Temperature  — any more than looking at a mountain range backwards through a set of binoculars makes the mountains smaller, however much it might trick the eye.

Here’s a sample from the data that makes up the featured image graph at the very beginning of the essay.  The columns are:  Year — GAST Anomaly — Lowess Smoothed

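| Year | GAST Anomaly | Lowess Smoothed |
|------|--------------|-----------------|
| 2010 | 0.7  | 0.62 |
| 2011 | 0.57 | 0.63 |
| 2012 | 0.61 | 0.67 |
| 2013 | 0.64 | 0.71 |
| 2014 | 0.73 | 0.77 |
| 2015 | 0.86 | 0.83 |
| 2016 | 0.99 | 0.89 |
| 2017 | 0.9  | 0.95 |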

The blow-up of the 2000-2017 portion of the graph:

We see global anomalies given to a precision of hundredths of a degree Centigrade.  No uncertainty is shown — none is mentioned on the NASA web page displaying the graph (it is actually a little app, that allows zooming).   This NASA web page, found in NASA’s Vital Signs – Global Climate Change section, goes on to say that “This research is broadly consistent with similar constructions prepared by the Climatic Research Unit and the National Oceanic and Atmospheric Administration.”   So, let’s see:

From the CRU:

Here we see the CRU Global Temp (base period 1961-90) — annoyingly a different base period than NASA which used 1951-1980.  The difference offers us some insight into the huge differences that Base Periods make in the results.

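| Year | CRU Anomaly | Smoothed |
|------|-------------|----------|
| 2010 | 0.56  | 0.512 |
| 2011 | 0.425 | 0.528 |
| 2012 | 0.47  | 0.547 |
| 2013 | 0.514 | 0.569 |
| 2014 | 0.579 | 0.59  |
| 2015 | 0.763 | 0.608 |
| 2016 | 0.797 | 0.62  |
| 2017 | 0.675 | 0.625 |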

The official CRU anomaly for 2017 is 0.675 °C, precise to thousandths of a degree.  They then graph it at 0.68°C.  [Lest we think that CRU anomalies are really only precise to “half a tenth”, see 2014, which is 0.579 °C. ]   CRU manages to have the same precision in their smoothed values: 2015 = 0.608.

And, not to discriminate, NOAA offers these values, precise to hundredths of a degree:

| Year | NOAA Anomaly |
|------|--------------|
| 2010 | 0.70 |
| 2011 | 0.58 |
| 2012 | 0.62 |
| 2013 | 0.67 |
| 2014 | 0.74 |
| 2015 | 0.91 |
| 2016 | 0.95 |
| 2017 | 0.85 |

[Another graph won’t help…]

What we notice is that, unlike absolute global surface temperatures such as those quoted by Gavin Schmidt at RealClimate, these anomalies are offered without any uncertainty measure at all.  No SDs, no 95% CIs, no error bars, nothing.  And precisely to the 100th of a degree C (or K if you prefer).

Let’s review then:   The major climate agencies around the world inform us about the state of the climate through offering us graphs of the anomalies of the Global Average Surface Temperature showing a steady alarmingly sharp rise since about 1980.  This alarming rise consists of a global change of about 0.6°C.  Only GISS offers any type of uncertainty estimate and that only in the graph with the lime green 0.1 degree CI bar used above. Let’s do a simple example: we will follow the lead of Gavin Schmidt in this August 2017 post and use GAST absolute values in degrees C with  his suggested uncertainty of 0.5°C.  [In the following, remember that all values have °C after them – I will use just the numerals from now on.]

What is the mean of two GAST values, one for the Northern Hemisphere and one for the Southern Hemisphere?  To make a really simple example, we will assign each hemisphere the same value of 20 +/- 0.5 (remembering that these are both °C). So, our calculation:  (20 +/- 0.5) plus (20 +/- 0.5), divided by 2, equals ….. The Mean is an exact 20.  (now, that’s precision…)

What about the Range?  The range is +/- 0.5.  A range 1 wide.  So, the Mean with the Range is 20 +/- 0.5.

But what about the uncertainty?     Well the range states the uncertainty — or the certainty if you prefer — we are certain that the mean is between 20.5 and 19.5.
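The range arithmetic above can be sketched as interval arithmetic, in which every recorded value is carried as a (low, high) pair. This shows the worst-case bounds as described in the text, not a statistical estimate, and the helper names are hypothetical.

```python
def as_range(value, half_width):
    """Represent 'value +/- half_width' as a (low, high) pair."""
    return (value - half_width, value + half_width)


def mean_of_ranges(ranges):
    """Worst-case bounds on the mean: average the lows and average the highs."""
    n = len(ranges)
    low = sum(lo for lo, _ in ranges) / n
    high = sum(hi for _, hi in ranges) / n
    return (low, high)


northern = as_range(20.0, 0.5)
southern = as_range(20.0, 0.5)

low, high = mean_of_ranges([northern, southern])
centre = (low + high) / 2

print(f"mean = {centre} +/- {(high - low) / 2}  (between {low} and {high})")
```

Carried this way, the mean of the two hemispheres keeps the full one-degree-wide range, 19.5 to 20.5.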

Let’s see about the probabilities  — this is where we slide over to “statistics”.

Here are some of the values for the Northern and Southern Hemispheres, out of the infinite possibilities implied by 20 +/- 0.5:  [we note that 20.5 is really 20.49999999999…rounded to 20.5 for illustrative purposes.]  When we take equal values, the mean is the same, of course.  But we want probabilities, so how many ways can the result be 20.5 or 19.5?  Just one way each.

NH           SH
20.5 —— 20.5 = 20.5 only one possible combination
20.4         20.4
20.3         20.3
20.2         20.2
20.1         20.1
20.0         20.0
19.9         19.9
19.8         19.8
19.7         19.7
19.6         19.6
19.5 —— 19.5 = 19.5 only one possible combination

But how about 20.4 ?  We could have 20.4-20.4, or 20.5-20.3, or 20.3-20.5 — three possible combinations. 20.3?  5 ways    20.2?  7 ways   20.1?  9 ways   20.0?  11 ways .  Now we are over the hump and 19.9? 9  ways  19.8? 7 ways  19.7? 5 ways  19.6? 3 ways  and 19.5? 1 way.
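The tally above can be reproduced by brute force. The sketch below works in integer tenths of a degree to avoid floating-point noise, and, like the count in the text, it only keeps pairs whose mean lands exactly on the 0.1 grid (pairs such as 20.5 and 20.4, whose mean is 20.45, are skipped).

```python
from collections import Counter
from itertools import product

# The eleven allowed values 19.5, 19.6, ..., 20.5, held as integer tenths.
tenths = range(195, 206)

ways = Counter()
for nh, sh in product(tenths, repeat=2):
    total = nh + sh
    if total % 2 == 0:            # mean falls exactly on the 0.1 grid
        ways[total // 2] += 1     # key is the mean, in tenths of a degree

for mean_tenths in sorted(ways, reverse=True):
    print(f"{mean_tenths / 10:.1f}: {ways[mean_tenths]} ways")
```

The counts rise from 1 at 20.5 to 11 at 20.0 and fall back to 1 at 19.5.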

You will recognize the shape of the distribution:

As we’ve only used eleven values for each of the temperatures being averaged, we get a little pointed curve.   There are two little graphs….the second (below) shows what would happen if we found the mean of two identical numbers, each with an uncertainty range of +/- 0.5, if they had been rounded to the nearest half degree instead of the usual whole degree.  The result is intuitive — the mean always has the highest probability of being the central value.

Now, that may seem so obvious as to be silly.  After all, that’s what a mean is: the central value (mathematically).  The point is that with our evenly spread values across the range (and, remember, when we see a temperature record given as XX +/- 0.5 we are talking about a range of evenly spread possible values), the mean will always be the central value, whether we are finding the mean of a single temperature or a thousand temperatures of the same value.  The uncertainty range, however, is always the same.  Well, of course it is!   Yes, has to be.

Therein lies the trick — when they take the anomaly of the mean, they drop the uncertainty range altogether and concentrate only on the central number, the mean, which is always precise and statistically close to  that central number.   When any uncertainty is expressed at all, it is expressed as the probability of the mean being close to the central number — and is disassociated from the actual uncertainty range of the original data.

As William Briggs tells us:  “These results are not statements about actual past temperatures, which we already knew, up to measurement error.”

We already know the calculated GAST (see the re-analyses above).  But we only know it being somewhere within its known uncertainty range,  which is as stated by Dr. Schmidt to be +/- 0.5 degrees.   Calculations of the anomalies of the various means do not tell us about the actual temperature of the past — we already knew that — and we knew how uncertain it was.

It is a TRICK to claim that by altering the annual Global Average Surface Temperatures to anomalies we can UNKNOW the known uncertainty.

WHO IS BEING TRICKED?

As Dick Feynman might say: They are fooling themselves.  They already know the GAST as close as they are able to calculate it using their current methods.  They know the uncertainty involved; Dr. Schmidt readily admits it is around 0.5 K.  Thus, their use of anomalies (or the means of anomalies…) is simply a way of fooling themselves that somehow, magically, the known uncertainty will simply go away, utilizing the statistical equivalent of “if we squint our eyes like this and tilt our heads to one side….”.

Good luck with that.

# # # # #

Author’s Comment Policy:

This essay will displease a certain segment of the readership here but that fact doesn’t make it any less valid.  Those who wish to fool themselves into disappearing the known uncertainty of Global Average Surface Temperature will object to the simple arguments used.  It is their loss.

I do understand the argument of the statisticians who will insist that the mean is really far more precise than the original data (that is an artifact of long division and must be so).  But they allow that fact to give them permission to ignore the real world uncertainty range of the original data. Don’t get me wrong, they are not trying to fool us.  They are sure that this is scientifically and statistically correct.   They are however, fooling themselves, because,  in effect, all they are really doing is changing the values on the y-axis (from ‘absolute GAST in K’  to ‘absolute GAST in K minus the climatic mean in K’) and dropping the uncertainty, with a lot of justification from statistical/probability theory.

# # # # #

## 447 thoughts on “The ‘Trick’ of Anomalous Temperature Anomalies”

1. I’ve always used anomaly analysis to identify anomalous data, where in this case, the anomalous errors being identified are methodological.

There can be no doubt that the uncertainty claimed by the IPCC and its self-serving consensus is highly uncertain. Consider the claimed ECS of 0.8C +/- 0.4C per W/m^2 of forcing. Starting from 288K and its 390 W/m^2 of emissions, this means that 1 W/m^2 of forcing can increase the surface emissions by between 2.25 W/m^2 and 6.6 W/m^2, where even the lower limit of 2.25 W/m^2 of emissions per W/m^2 of forcing is larger than the measured steady state value of 1.62 W/m^2 of surface emissions per W/m^2 of solar forcing. Clearly, the IPCC’s uncertain ECS, as large as the uncertainty already is, isn’t even large enough to span observations!

2. Thomas Homer says:

“… rather than finding the mean by adding the hourly temperatures and dividing by 24, we get the result of Daily High plus Daily Low divided by 2. ”

Taking the Daily High plus the Daily Low and dividing by 2 can easily give a trend in the opposite direction of actual temperatures. IOW, a set of these values could show a warming trend when it is actually cooling.

• Steve Reddish says:

Many times I have worked outside all day for several days in a row. Often my perception was that one particular day was cooler than the others, because most of that day was partly cloudy.

Often, when I see the weather record for those days, I am surprised to learn that the record shows the day I thought was cooler was actually just as warm, and sometimes warmer, than the other days. This happens when there is a short period of full sunshine during mid-afternoon.

Even though many hours of that day were cooler than the corresponding hours of the other days, that day got recorded as just as warm, and sometimes warmer, than the other days.

SR

• Vrager says:

That is why a world average temperature is a nonsense… too many variables and too many adjustments have been made. One can only get a reliable record from a set of single sources that are comparable and which have been obtained using the same method. For the world the oceans are the only places always at sea level so land based measures have to be adjusted for altitude to be comparable and of course land radiates more heat which bounces back off clouds than sea. It’s all nonsense based on groggy data.

One can only really look at one place and compare it over time, so long as the Stevenson screen remains in an open place away from development and close by trees.

• Mark Cooper says:

That feeling of being cooler is because you are in the shade of the clouds right? Weather thermometers are always in the shade, so are not affected much by the cloud cover per-se…

• Kip Hansen says:

Mark ==> The phenomenon that Steve is mentioning is related to determining the “Daily Average” temperature from the median of the Max/Min for the day. A relatively cool day with a short period of high temps (a spike in late afternoon, say) can be reported as “warmer” (Daily Average) than a day with the same Min but an evenly, moderately warm afternoon with no temp spike.

The cooler day was cooler most of the time but is reported higher because of a temp spike.

• Kip Hansen says:

co2 and Steve ==> Both examples of what happens when we use the Hi+Low/2 method — pretending it is the daily average.

This problem is magnified throughout the rest of the “average temperature system.”

• Willem69 says:

Hi Kip,

quick question, are the daily min and max actually recorded in the data archives or just the midpoint? Plotting both min/max and how they develop over time would give a better indication of where the climate might be going, at least in my opinion but i have never seen such a graph.

Best,
Willem

• Kip Hansen says:

Willem ==> Usually the Min/Max are recorded along with the Daily Average. Modern AWS/ASOS stations record a great deal more detail, but still figure the Daily Average the same way. There are some such graphs out there somewhere, but in my opinion they are only useful at a local level.

• Steve Keohane says:

It would seem to be seasonally exaggerated, longer cold nights in winter would average colder with 24 hourly readings averaged, as would summer’s long days be actually warmer than recorded. The real difference between summer and winter is much larger than what is recorded by averaging two daily measurements, which is all our records show.

• AndyHce says:

Do I not recall correctly, in a number of articles on adjustments to temperature measurements, there is always listed a time of day adjustment?

If only the min and max temperatures are recorded, how could any adjustment for time be relevant?

• AndyHce says:

Kip,
Ideally one would replicate the work to verify its consistency and accuracy but I believe I get the idea, at least to the extent that there are different results from the same data depending on how one uses the data. The TOB adjustment described may be done entirely honestly and with good intentions but the result is still a calculated guess of what might have happened because of how things were done, not actually know values, no?

• Kip Hansen says:

Andy ==> I am suspicious of the standardized size of the adjustment. If you have the energy and the time, you might take some modern complete station record with six-minute values and calculate the Daily Average ((Max + Min)/2) using each of the 24 hours of the day as the start of day, and see if the change in Daily Average is really as great as they say.

• “not actually know values, no?”
Values are known, as explained here. The time and amount of TOBS is known. The effect can be deduced from the diurnal pattern. There is often hourly or better data now at the actual site, from MMTS; if not, there are usually similar MMTS locations nearby.

• Averaging temperature is misleading at best no matter how many samples are used. To compute averages with any relation to the energy balance and the ECS, you must convert each temperature sample into equivalent emissions using the SB Law, add the emissions, divide by the number of samples and convert back to a temperature by inverting the SB Law. 2 samples per day are nowhere near enough to establish the average with any degree of certainty, even if they are the min and max temperatures of the day. If the 2 samples are provided with time stamps allowing the change from one to another to be reasonably interpreted, it gets a little better, but is still no where near as good as 24 real samples per day.
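The procedure described in this comment can be sketched as follows. The two sample temperatures are made up for illustration, the function names are hypothetical, and the surface is assumed to be an ideal black body (emissivity 1).

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def emission(temp_k):
    """Black-body emission (W/m^2) for a temperature in kelvin."""
    return SIGMA * temp_k ** 4


def temperature(emission_wm2):
    """Invert the Stefan-Boltzmann law: emission (W/m^2) back to kelvin."""
    return (emission_wm2 / SIGMA) ** 0.25


# Hypothetical samples: a daily low and a daily high, in kelvin.
samples_k = [278.0, 298.0]

# Ordinary arithmetic mean of the temperatures.
arithmetic_mean_k = sum(samples_k) / len(samples_k)

# Emission-weighted mean: convert, average the emissions, convert back.
mean_emission = sum(emission(t) for t in samples_k) / len(samples_k)
emission_mean_k = temperature(mean_emission)

print(f"arithmetic mean of temperatures: {arithmetic_mean_k:.2f} K")
print(f"temperature of mean emission:    {emission_mean_k:.2f} K")
```

Because emission goes as the fourth power of temperature, the emission-based average comes out above the plain average whenever the samples differ.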

• Jeff Alberts says:

It’s worse than that. Temperature is an intensive property of the point in time and space the measurement was taken. Averaging that with other points in time and space doesn’t give you anything meaningful.

• AGW is not Science says:

Yes, isn’t that convenient? Allows them to continue to bray about nothing for far longer that way.

• Thomas Homer
I wonder … for the majority of Earth’s surface
where there are no thermometers, and the
temperature numbers are wild guesses,
by government bureaucrats with science degrees,
do they wild guess a Daily High and Daily Low number
for each “empty grid”, or do they save time and
just wild guess the Daily Average,
or save even more time
and just wild guess the Monthly Average
for that grid?

That question has been keeping me up at night.

No other field of science would take the
surface temperature numbers seriously,
and it always amazes me when people here,
who are skeptics, and should know better,
do just that.

Surface “temperature numbers” =
Over half wild guess “infilling” plus

No real raw data are used in the average.

When you “adjust” data, you are claiming
the initial measurements were wrong,
and you believe you know what they
should have been.

That’s no longer real data.

raw data move closer,
or farther away,
from reality.

Interesting that having weather satellite
data, that requires far less infilling,
and correlates well with weather balloon data,
that any people here would use the horrible
surface temperature “data”
that DOES NOT CORRELATE WELL
with the other two measurement methodologies.

That’s something I’d expect only of Dumbocrats
— They’ll always choose the measurement methodology
that produces the scariest numbers !

Because truth is not a leftist value.

3. BoyfromTottenham says:

Thanks for the interesting and informative article, Kip, but shouldn’t the error bars on your ‘Global mean temperatures from reanalyses’ be twice as long as you have shown? Each one appears to me to be equal to about 0.5 degrees on the left hand scale, not 1 degree (+/-0.5 degrees).

• Kip Hansen says:

BoyfromTottenham ==> Well Done, sir. You have caught me using the wrong version of this graph, with the wrong Photoshop layers showing. I’ll replace it with an explanation in the text.

I do so appreciate readers who really look at the graphs and words, and thus see mistakes that I have missed.

• Crispin in Waterloo says:

Kip, I will take you up on your request.

First, degrees C means Celsius not Centigrade. But that is too easy.

I appreciate that you have made a good presentation, but you have a problem.

“These “Daily Averages” are then used in all subsequent calculations of weekly, monthly, seasonal, and annual averages.  These Daily Averages have the same 1-degree wide uncertainty range.”

This is not correct. That is not how uncertainties are carried forward through calculations. You could perhaps look at some worked examples in Wikipedia which for mathematical things is a pretty reliable source.

See here for examples:
https://en.m.wikipedia.org/wiki/Propagation_of_uncertainty

I will first present an oversimplified example of adding two numbers, each with an uncertainty of one degree: 20±1 + 20±1 = 40±2

Possible values are 19+19 = 38, up to 21+21 = 42

This ±2 is not the correct answer, but it is a demonstration that adding two equally uncertain numbers doesn’t mean the final uncertainty is from only one, unless the second number is absolutely known, which is not the case with an anomaly.

The correct answer is the square root of the sum of the squares of the two uncertainties.

Square root of (1.0 squared + 1.0 squared) = 1.41

If you want the average, it is 20±0.71 because both values have to be divided by the same divisor so the Relative Error remains the same.

This is the uncertainty of the sum of the two inputs. Similarly, subtraction generates the same increase in uncertainty. Try it.

An anomaly is the result of a subtraction involving two numbers that each have an uncertainty. You do not mention this. Errors propagate. The anomaly does not have an uncertainty equal to the new value because the baseline also has an uncertainty.

Suppose the baseline is 20.0 ±0.5 and the new value is 21.0 ±0.5.

The anomaly is 1 ±0.71. Why? Because the propagated error magnitude is

±SQRT( 0.5^2 + 0.5^2) = ±0.71

In all cases the anomaly has a greater uncertainty than the two input numbers because it involves a subtraction.
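The figures in this comment follow from the root-sum-of-squares rule it states, which applies to independent uncertainties. A small sketch, with a hypothetical function name:

```python
import math


def combined_uncertainty(u_a, u_b):
    """Uncertainty of a sum or difference of two independent quantities."""
    return math.hypot(u_a, u_b)  # sqrt(u_a**2 + u_b**2)


# Adding 20±1 and 20±1, then halving to get the average.
sum_unc = combined_uncertainty(1.0, 1.0)
average_unc = sum_unc / 2

# Anomaly: new value 21.0±0.5 minus baseline 20.0±0.5.
anomaly = 21.0 - 20.0
anomaly_unc = combined_uncertainty(0.5, 0.5)

print(f"sum:     40 ± {sum_unc:.2f}")
print(f"average: 20 ± {average_unc:.2f}")
print(f"anomaly: {anomaly:.0f} ± {anomaly_unc:.2f}")
```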

• William Ward says:

Crispin,

I think your math is good for “random” errors. What about “systemic” errors? I think much of what is being pointed out by some here has to do with systemic errors.

• Greg Goodman says:

Good point , well presented Crispin.

Another point where Kip goes wrong is claiming that the uncertainty of the original measurement can not be reduced: the +/-0.5 is always there.

While it is correct to call this an “uncertainty”, it can also be more precisely described as quantisation error. The smallest recorded change or “quantum” is one degree. No fractions. This adds a random error to the actual temperature. A rounding or quantisation error.

Measuring at different times in a day ( min, max ) or at different sites will involve unstructured, random errors. If you average a number of such readings the quantisation errors will be distributed both + and – and of varying magnitudes and will ‘average out’. This allows reducing the expected uncertainty by dividing by SQRT(N), as Crispin does above for N=2.

This is based on the assumption that the errors are “random” or normally distributed. There are other systematic errors but that is a separate question.

If you have sufficiently large number of readings the effect of quantisation error in the original readings will become insignificantly small. There are many other errors involved in this process and the claimed uncertainties are very optimistic. However the nearest degree issue Kip goes into here is not one of them and his claim this propagates is incorrect.

• Greg Goodman says:

auto-correction. This is not a case of normally distributed errors. The root N factor derives from the normal distribution and is not the correct factor here.

As Kip correctly shows in the article, the distribution is flat and finite. There is just the same chance of having a 0.1 error as a 0.5 error. This does not negate that the errors will average out over a large sample and become insignificant.

To say the uncertainty in the mean is +/- 0.5 is to say that there is an equal chance that all values had an error of +0.5 as there was a chance that there was an even mix which averaged out. That is obviously not the case for a flat distribution.

As long as the number of samples is sufficiently large for this theoretical flat distribution to be well represented in the sample, the error in the mean will tend to zero. That is where the stats theory comes in, in relation to the sample size, the expected distribution and the uncertainty levels to attribute to the mean.

• Micky H Corbett says:

Greg

You are violating the Central Limit Theorem which is what the Error in The Mean relies on.

The uncertainty in a measurement is a hard limit based on characterisation and repeatability. This is defined by metrology.

It means that the “resolution” of the sample distribution is +/- 0.5 degrees. To demonstrate identically distributed samples they need to vary by more than this.

You have assumed that other errors are random and follow a similar distribution. The MET Office made the same assumption with SST.

It is an unverified assumption. If applied your data becomes hypothetical and unfit for real life use.

It doesn’t matter how many measurements you have. It’s like measuring a human hair multiple times with a ruler marked in cms and claiming you can get the value to microns.

Beware of slipping into hypothetical.

• A C Osborn says:

Greg, are you sure that “Large Number” theory applies to a measurement of something that changes every minute of every day, with different equipment and in different locations all over the world?
I thought it applied to repetitive measurement of an object.

• A catch with temperatures is that, typically, there is only one sensor involved. Each temperature is a sample of one, at one place and at one time. You could use the CLT if you had 30 or more sensors in that Stevenson screen, for that screen’s value; but only for that value. Extrapolation and interpolation add their own errors and uncertainties to the mix. NB that errors and uncertainties are not synonyms. In the damped-driven, mathematically chaotic, dynamic system that is Earth’s weather, ceteris paribus will almost always be false.

• Kip Hansen says:

cdquarles==> You point out one of the fallacies of climate modeling — which attempts prediction of the future by running their models with one factor changing, all else ceteris paribus.

See mine @ Dr. Curry’s blog “Lorenz validated” https://judithcurry.com/2016/10/05/lorenz-validated/

• Jaap Titulaer says:

These are not repeated measures for the same phenomena, these are singular measures (with ranges) for multiple phenomena (temperature measured at multiple locations).

The reduction in uncertainty by averaging ONLY applies for repeated measures of the (exact) same thing, i.e. the temperature at a single location (at a single point in time).
NOT to the averaging of singular measures for multiple phenomena.

• Kip Hansen says:

Jaap ==> Yes, exactly right.

• Jaap Titulaer says:

Also, measurement error is nice to know, but when it is small compared to the stddev of the sample population it hardly matters. What one does when averaging temperatures across the globe is ask what the typical temperature is (on that day).

Say you do that with height of recruits for the army. We use a standard procedure to get good results and a nice measurement tool. Total expected measurement error is say 0.5 cm. We measure 20 recruits (not same recruit 20x).
Here are the results:
# height
1 182
2 178
3 175
4 183
5 177
6 176
7 168
8 193
9 181
10 187
11 181
12 172
13 180
14 175
15 175
16 167
17 186
18 188
19 193
20 180

Average 179.85
StDev (s) 7.19

95% range
min max
165.47 194.23

Remember that these are different individuals, so not repeated measures of the same thing, but multiple measures of different things, which are then averaged to get an estimate of the midpoint (average) and range (variance).
The min, avg and max all still carry that measurement error, but we usually forget all about that because it is so small compared to the range of the sample set.
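For anyone who wants to check the recruit figures, here is a minimal sketch that reproduces them from the list above. It assumes the "95% range" was taken as the mean plus or minus two sample standard deviations, since that is what matches the quoted 165.47 and 194.23; the variable names are only illustrative.

```python
import statistics

# Heights (cm) of the 20 recruits listed in the comment above.
heights = [182, 178, 175, 183, 177, 176, 168, 193, 181, 187,
           181, 172, 180, 175, 175, 167, 186, 188, 193, 180]

mean = statistics.mean(heights)
s = statistics.stdev(heights)  # sample standard deviation (divides by n - 1)

# Assumption: the quoted 95% range is mean +/- 2 s (not 1.96 s).
low, high = mean - 2 * s, mean + 2 * s

measurement_error = 0.5  # cm, the figure given in the comment

print(f"Average {mean:.2f}, StDev {s:.2f}")
print(f"95% range {low:.2f} to {high:.2f}")
print(f"Spread is about {s / measurement_error:.0f}x the measurement error")
```

The last line just puts a number on the point being made: the scatter between individuals is more than ten times the stated measurement error.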

• Roberto says:

The Central Limit Theorem gets misunderstood a lot. It doesn’t mean that measuring each temperature a million times yields any different distribution from the base data. That is the accursed error called autocorrelation. Instead, what it means is that if each sample is hundreds of random measurements from the entire overall population, THOSE sample means have a different distribution from the overall population. But that’s never how these samples are taken.

• Kip Hansen says:

Crispin ==> Caught me with my age showing — “cen·ti·grade

As for the rest — you are doing statistics — not mathematics.

We are not dealing with “error” here, we are dealing with temperatures which have been recorded as ranges 1 degree (F) wide. That range is not reduced — ever.

This is not a matter of “propagation of error”.

• Crispin in Waterloo says:

Kip, cdquarles, A C Osborn, Mickey, Greg and William

There are some good points made above, by which I mean issues are raised that have to be considered when determining the reliability of a calculated result.

The most important to the conversation is that the calculation of an “anomaly” requires subtracting one value with an uncertainty from another value with its own uncertainty and there is a standard manner in which to do this correctly. Invariably the final answer will have a greater uncertainty than the two inputs because we do not know the absolute values.

A separate matter is how the temperature “averages” were produced. Addressing the example given above:

Measure the temperature of one object 30 times in quick succession using the same instrument which has a known uncertainty about its reporting values. The distribution of the readings may be Normal. One could say they will “probably be Normal”.

Let’s assume that the instrument was calibrated perfectly at the start. If the time period is long, perhaps a year, the manufacturer usually provides information on the drift of the instrument so the uncertainty of the measurement can be reported as different from when it was last calibrated.

Now consider using 30 instruments to measure 30 different objects that are 30 different temperatures to find the “average” of a large object like a bulldozer. It is not true to claim that the measurement errors are “Normally distributed” because there is no distribution pattern available for each single measurement. You could assume that the drift of the instruments over time has a normal distribution, but it probably isn’t. So we have two things to address: measurement uncertainty and instrument drift. The first is a random error and the second is a systematic error. The easy answer is to increase the expressed uncertainty with time to accommodate drift and that is what people (should) do.

I cannot possibly present all the considerations that go into the production of a global surface temperature anomaly so let’s stick to the topic of the day.

“Error propagation : A term that refers to the way in which, at a given stage of a calculation, part of the error arises out of the error at a previous stage. This is independent of the further roundoff errors inevitably introduced between the two stages. Unfavorable error propagation can seriously affect the results of a calculation.”

https://www.encyclopedia.com/computing/dictionaries-thesauruses-pictures-and-press-releases/error-propagation

There are no “favourable” error propagations. Unless the calculation involves a constant such as dividing by 100, the uncertainty increases with each processing step. And don’t get me going about the “illegal” averaging averages. The example of the diameter of a human hair and the centimetre ruler is helpful. Using that instrument read to within 1 mm, the diameter of a single hair is 0±0.1 centimetres, every time. That includes the rounding error which is in addition to the measurement uncertainty. Averaging 30, nay, 300 measurements does not improve the result at all.
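The hair-and-ruler example can be sketched as a small simulation. This is only an illustration under stated assumptions: the hair diameter (0.007 cm), the size of the reading jitter and the function name `read_ruler` are all hypothetical, and the jitter is assumed to be much smaller than the 1 mm graduation.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_DIAMETER_CM = 0.007  # hypothetical hair diameter (about 70 microns)
RESOLUTION_CM = 0.1       # ruler read to the nearest 1 mm

def read_ruler(true_value, jitter=0.01):
    """One reading: true value plus a small random jitter,
    rounded to the nearest graduation of the ruler."""
    noisy = true_value + random.uniform(-jitter, jitter)
    return round(noisy / RESOLUTION_CM) * RESOLUTION_CM

readings = [read_ruler(TRUE_DIAMETER_CM) for _ in range(300)]
average = sum(readings) / len(readings)

print(f"distinct readings: {sorted(set(readings))}")
print(f"average of 300 readings: {average} cm")
```

Under these assumptions every reading comes back as zero, so the average of 300 of them carries no more information than a single one. If the jitter were comparable to the graduation the picture would change, which is the assumption being argued over further up the thread.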

Claiming to have calculated a global temperature anomaly value with 10 times or 50 times lower uncertainty than for the two values used in the subtraction is hogwash. If the baseline is 20.0±0.5 with 68% confidence and the new value is 20.3±0.5 also with the same confidence index, then the anomaly is 0.3±0.71 [68% CI]. If you want 95% confidence you need a more accurate instrument and more readings for each initial value in the data set being averaged. There is a trend towards doing exactly this: multiple instruments at each site.
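The 0.3±0.71 figure above follows from adding the two uncertainties in quadrature. A minimal sketch, assuming the two uncertainties are independent and quoted at the same confidence level (the function name is just for illustration):

```python
import math

def subtract_with_uncertainty(value, u_value, baseline, u_baseline):
    """Return (value - baseline) and its uncertainty, combining the two
    input uncertainties in quadrature (assumes they are independent)."""
    anomaly = value - baseline
    u_anomaly = math.hypot(u_value, u_baseline)  # sqrt(u1^2 + u2^2)
    return anomaly, u_anomaly

anomaly, u = subtract_with_uncertainty(20.3, 0.5, 20.0, 0.5)
print(f"anomaly = {anomaly:.1f} +/- {u:.2f}")
```

Because the squares are summed, the result can never be smaller than the larger of the two inputs.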

• Kip Hansen says:

Crispin ==> Thanks for your exposition on errors, their propagation, and uncertainty.

• Crispin in Waterloo says:

Kip,

I just re-read the whole article again and if you had sent it to me for review, I would have insisted on at least a dozen changes. The wording is too casual, given that it attempts to point out something quite technical.

One more example:

“When one calculates a mean (an arithmetical average — total of all the values divided by the number of values), one gets a very precise answer.”

This confuses accuracy and precision. First, these are not multiple measurements of a single thing. Second, the average of a number of readings cannot be more precise than the precision of the contributing values. That would be false precision. In any case, precision has to be stated within a range defined by the accuracy.

What you are alluding to (Willis often does the same thing so don’t feel lonely) is that one can claim “to more precisely estimate” (not “know”) the position of the centre of the range of uncertainty with additional measurements. It is not an increase in the precision or accuracy of the reported value. This is basic metrology, freely abandoned in the climate science community when it comes to anomalies. They are making fantastical claims.

One cannot laugh hard enough at the silly claim that an anomaly is known to 0.05 C using a baseline value subtracted from the current average, both with uncertainties of 0.5 C. They literally cannot do the math. One cannot treat calculated values based on measurements as if they are known constants.

This key error of claiming false precision for anomalies needs to be addressed in a reviewed article (posted here) delineating the steps taken to calculate an anomaly and where rules are being broken, and what the real answers are. As one of my physicist friends says, “They are trying to rewrite metrology.”

4. Bruce of Newcastle says:

Very easy to debunk the GISS global “temperature” record. Snow cover.

Can’t fool snow, it melts at 0 C. It ignores adjustments. Because snow cover trend has been flat since the late ’90’s it means GISS’s data is fiction.

Incidentally snow cover anomaly is consistent with the UAH temperature anomaly dataset, which along with the radiosonde balloon measurements doubly verifies UAH’s accuracy.

• Trevor in Ontari-owe says:

Thanks for this – didn’t know that anyone was tracking Snow Cover Extent.

It’s easy for ordinary people like me to understand a “big picture” story to the effect that significant increases in temperatures should cause significant decreases in SCE.

At least unchanging SCE ought to raise questions.

• Greg says:

Your proxy is about as convincing as tree rings being thermometers. What you are looking at is the geographic distribution of the 0 deg C isotherm. This is not a measure of global average temperature.

• Kip Hansen says:

Greg ==> I think that Bruce is using a pragmatic, “works for me” , rule-of-thumb standard when he mentions “snow extent”…. I don’t think he really believes it is a scientifically defendable idea.

• Bruce of Newcastle says:

Kip – No, I’m completely serious. I’ve worked with data for forty years in my field of science.

Unfortunately I can’t display a graph with the current WUWT commenting system, but I’ve just put it on an old Flickr account I’ve had for a while:

UAH NH land anomaly and Rutgers NH snow anomaly

That is an apples to apples comparison*. But if you check the UAH global dataset it is still a pretty good match.

I’ve added vertical gridlines so you can see how the peaks line up. The UAH data does seem to be too warm – it warmed a bit going from UAH 5.0 to 6.0 as I recall. It looks like that adjustment isn’t very supportable on the snow cover evidence.

Even so the UAH data is much closer to the snow cover data than the lurid NASA GISS data is.

(* Rather than 2m temperature anomalies I’ve used the lower troposphere UAH data as it was easier to get from Roy Spencer’s blog. Note that I’ve inverted the temperature graph so that it’s easier to line up the peaks.)

• Kip Hansen says:

Bruce ==> Trying to understand what you are on about with this — what I see is that when NH Tropo is warmer there is generally less snow cover. Have I got that right so far?

• Bruce of Newcastle says:

Kip – Snow cover is a direct measurement by satellite. Difficult to get wrong. Temperature by AMSU is an indirect measurement with a lot of data processing required, but better than the adjusted UHIE contaminated mess of GISStemp.

Therefore the flat trend of snow cover anomaly indicates essentially no warming since the late 1990’s. It is a crosscheck for the temperature datasets: if a temperature dataset doesn’t match the trend of the snow cover anomaly graph then the temperature dataset is wrong, and the adjustments of it are wrong.

Perhaps I should have dug out the 2m UAH anomaly data, but the lower troposphere data is probably pretty good. Clouds are some way up in the atmosphere, even if not quite LT level.

I am actually agreeing with you. You are addressing the relative errors in the surface temperature datasets, I am pointing out that snow cover represents a crosscheck of the systematic errors – ie the actual variance from the real temperature. Snow cover extent represents a metric of the area of land at or below 0 C. Thus it can be regarded as an internal standard if you like.

• taxed says:

Bruce
Yes, it’s a real world guide to temperature changes in the NH landmasses.
It gives extended real world data on how much of the NH land mass is at or below 0 C at any one time and, just as important, unlike climate science it has no agenda to peddle.

• Pierre says:

I have lived in the same location south west of Cleveland Ohio, USA now for 30 years. Over that period of time freezing has gone from 32 degrees F to 37 degrees F. Thirty years ago a prediction of 32-33 degrees F meant bring in the freezables. Now it is a prediction of 37 degrees for frost.

When I have checked the official temperatures for those frost days a month or two later, the official temperatures are always well above freezing. I can’t tell you what is going on but I can tell you that they correctly predict frost even though the low is supposed to be not less than 37 degrees.

• Pierre says:

Ok, that throws everything I thought I knew into disarray. So, water can freeze above freezing. Now I gotta figure out why. In the end it will probably make sense.

• hunter says:

Kip, read what Pierre is actually saying.

• Kip Hansen says:

Hunter ==> Read the NOAA pdf linked. It answers his specific question, which paraphrased is “How can there be a frost (which is freezing dew, basically) at a temperature above 32F?”

• michael hart says:

The ice/snow-cover issue illustrates another important point which I don’t think Kip addressed directly: By using anomalies the dishonest can forever claim that, say, this year/decade is x degrees hotter than last year/decade. But if absolute temperatures are always quoted then sooner or later the quoted temperature will become so high as to be clearly erroneous to even the casual observer. The melting point of frozen water is thus a useful internal standard to keep them honest when the temperatures of interest are close to 0°C/32°F

And, needless to say, the boiling point of water at 100°C is sufficient to prove that people like James Hansen are speaking out of where the sun don’t shine when they talk of the earth becoming so hot that the oceans boil away.

• Water in a bucket can freeze on a clear, still night at air temperature up to 59 degrees F.

5. Stephen Skinner says:

Thank you for this article. Using anomalies is a trick because the real world is compared to an artificial ideal. Therefore the real world will easily become anomalous.
While we get years that are warm or cool or wet or dry the experience of these has now become an anomaly.
Anomaly definition – something that deviates from what is standard, normal, or expected. Based on that definition, how does one define something like the UK weather where variation is the norm; the standard, normal and expected weather is now anomalous?

• AGW is not Science says:

That’s why I’ve always detested the use of “anomalies.” It assumes any departure from an AVERAGE of some 30-year period to be some kind of “yardstick” against which any departure is “anomalous.” Which is ridiculous. “Average” weather metrics, be they temperature, precipitation, whatever, are not “expected norms,” as the use of “anomalies” suggests – they are nothing more than MIDPOINTS OF EXTREMES.

6. Latitude says:
7. David Dirkse says:

Kip says : ” remember, when we see a temperature record give as XX +/- 0.5 we are talking about a range of evenly spread possible values”

Kip has made an assumption here that may or may not be correct. He says “evenly spread” but that is not necessarily the case. Specifically, it is not the case when the values are spread as in a normal distribution. Kip’s error is the assumption that they are distributed uniformly.

• Randy Bork says:

But isn’t that precisely what accuracy means, ie: for any given reading recorded, the true value is equally likely to be anywhere along the range (uniformly distributed), NOT normally distributed along the range.

• Kip Hansen says:

David ==> We don’t know what the actual temperature was when we see a record of “72”. The “72” is really the range from 72.5 down to 71.5 and any of the infinite possible values in between. Since temperature is a continuous, infinite value metric, when we know nothing except the range, then all possible values within that range are equally possible — Nature has no preference for any particular value within the range.

The possible temperatures between 72.5 and 71.5 are NOT distributed in a normal distribution.

This is an extremely important point. Any and all values within the range are equally possible.

• David Dirkse says:

Wrong Kip, taking the measurement is normally distributed. There is a very low non-zero probability that the actual temperature is 70 degrees, and the human reader observes and records 71. There is also a low non-zero probability that the actual temperature is 72 degrees, and the human again reads 71. Pretty hard to get +/- 0.5 degree 95% confidence interval when the standard deviation of a uniform distribution on [a, b] is (b-a)/sqrt(12)

• OweninGA says:

David,

Let’s try this one more time. A single thermometer is sitting in a box in the middle of a grassy field. You go out to read the thermometer. You carefully read and see the alcohol or mercury line is between the major marks on the scale. You write down the value on the closest major mark.

What in that scenario is going to give the temperature inside that box a preference to line up with one of the arbitrary (as far as nature is concerned) major marks over some random space in between? When you say “normally distributed” , you are saying that the air inside the box has a preference for heating the alcohol or mercury to expand or contract until it lines up with those major lines but is sometimes a bit off. This is most definitely a uniform distribution.

Now if you want to say you send 1000 people out to read that thermometer and each of them read a value, then you would have a normal distribution of readings about that major line. That is a totally different scenario and not applicable in the reading of thermometers. They were always read once by one person, and if the top of the alcohol or mercury were between 69 and 70, but closer to 70, then 70 is what was recorded.

• Greg Cavanagh says:

It is impossible for it to be a normal distribution around a value. The temperature is a continuous linear value between its possible range for any given area. There may be a normal distribution within the whole range, but not for any given temperature reading as you are stating. As Kip points out, it could be any value within X to X+1 with no bias toward any value centre.

• David Dirkse says:

OweninGA and Greg…..did both of you miss the word MEASUREMENT?
..
The actual temperature is unknown. The reading you get off of the measuring device is normally distributed.
..
Because you cannot measure any other way, you do not have any evidence or data on what the actual distribution of the temperature really is. Assuming it is uniformly distributed is not proof that it is.
..
Why don’t one or both of you give me the explanation of how you would determine the actual distribution is uniform when the only way you can measure it is with something that provides you with a normally distributed result?

• Greg Cavanagh says:

David. We could be arguing the same thing using different language. My attempt at an explanation wasn’t a good one, I’ll admit.

Between the minimum and maximum of the temperature range, there will be assumed to be a normal distribution curve. Within a single degree of measurement error (24.5-25.49999) there will be a near linear probability of any given REAL value. We can’t measure those real values, so it gets rounded to the nearest 0.5C.

• Kip Hansen says:

Greg, Owen, David ==> For any one temperature record officially recorded as “72” there are an infinite number of possible values between 72.5 and 71.5. All of those infinite values have an equal probability of having been the real temperature at the moment of measurement. The record “72” literally means “one of the infinite values between 72.5 and 71.5” — no particular value has a higher probability. There is no Normal Distribution involved.
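One way to look at the rounding part of this disagreement is a quick simulation. The sketch below makes assumptions that are not in the thread: the "true" temperatures are drawn from a normal distribution centred on 70 F with a spread of 8 F (hypothetical numbers), and the only thing modelled is rounding to the nearest whole degree. It says nothing about reading error, calibration or drift.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical "true" temperatures: normally distributed over the whole range.
true_temps = [random.gauss(70.0, 8.0) for _ in range(100_000)]

# What gets written down: the nearest whole degree.
recorded = [round(t) for t in true_temps]

# Where the true value sat relative to the record (between -0.5 and +0.5).
offsets = [t - r for t, r in zip(true_temps, recorded)]

sd = statistics.pstdev(offsets)
inner_half = sum(abs(o) < 0.25 for o in offsets) / len(offsets)

print(f"std dev of offsets: {sd:.3f} (uniform on a 1-degree range gives {1 / 12 ** 0.5:.3f})")
print(f"share of offsets within +/-0.25 of the recorded value: {inner_half:.3f}")
```

Under these assumptions the offsets within the recorded degree come out very close to flat, even though the temperatures themselves are normally distributed over the wider range, which is roughly the distinction Greg Cavanagh draws above.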

• AndyHce says:

How many of the official temperature stations are human read rather than electronically recorded? Probably not many in the US or other more technically advanced societies.

• Micky H Corbett says:

I’m afraid that’s wrong.

You are assuming a normal distribution not demonstrating it with a more accurate instrument to calibrate it.

This is a fundamental problem with applying theory rather than characterisation. You also need to account for drift and other effects.

This is basic metrology.

• David Dirkse says:

Kip says: , “then all possible values within that range are equally possible — Nature has no preference for any particular value within the range.”

Thank you Kip for clearing up your misconception. You obviously don’t understand Quantum Mechanics. Based on QM, Nature actually HAS a preference for particular values, and most of the time they are discrete integer values.

• Clyde Spencer says:

DD,
You said, “…most of the time they are discrete integer values.” That is true at the level of quantum effects, but not at the macro-scale of degrees Celsius or Fahrenheit.

• David Dirkse says:

Clyde, please do not forget that the macro property of “temperature” is the statistical average of the normally distributed velocity of a discrete number of particles. Now all I ask is that you tell me how you determine that the value of said temperature is uniformly distributed on the interval between N and N+1 degrees on the measuring instrument? Also please tell me how you measure the individual velocity of one of these particles so that you can arrive at the average. My understanding of QM says you can’t even do that.

• Clyde Spencer says:

DD,
Yes, Heisenberg’s uncertainty principle implies that the act of measuring a single particle will alter its properties. When dealing with a very large number of them, one expects a probability distribution that smears out the quantum velocity fluctuations and provides individual particle ‘temperatures’ that are much smaller than can be measured with any thermometer. You are NOT going to see a preference for an integer temperature change!

• David Dirkse says:

At the macro level you might get the impression that the “smearing out” makes the measured item continuous, but the underlying physical theory says it is quantized and actually has values in the interval that cannot be. When Kip states: “then all possible values within that range are equally possible” he is wrong as dictated by QM. QM says there are values in the interval that are not possible, or that the value has two different measures at the same time (i.e. Schrödinger’s cat)

• Kip Hansen says:

Clyde ==> Thank you for trying to help David. Like many, he is confusing and conflating Quantum Mechanics theory with real world macro effects.

Even if QM effects were seen in 2-meter air temperature readings, the probability of those Quantum effects landing preferentially at our arbitrarily assigned whole degree values would still be infinitesimal.

• AndyHce says:

And QM nature is always clued into whatever human devised scale each instrument is using — and how accurately each instrument is manufactured and calibrated!
That is at least as good a trick as noting the fall of every sparrow, probably better.

• A C Osborn says:

They also forgot that the Human is “adjusting the data” either up or down, plus the temperature is as perceived by the human, someone 6″ taller may see it slightly differently and “adjust” it in the opposite direction.

• Rick C PE says:

No, Kip is correct. The uncertainty that arises due to recording data rounded to the nearest whole number is properly characterized by the uniform (or rectangular) distribution. However, uncertainty due to instrument calibration always includes both a systematic and random component. The systematic component is the difference between the reference’s stated and true values. The random component is determined by repeated comparisons between the reference and instrument measurement and is normally distributed.

So Kip’s essay is actually very generous in only looking at the +/- 0.5 half interval uncertainty. The real uncertainty would be the root of the sum of the squares of the half interval plus systematic plus random uncertainties. e.g. Assume half interval MU = 0.5, MU of reference = 0.2, MU due to random error = 0.3, then overall MU = 0.62
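The root-sum-of-squares figure above is easy to check. A minimal sketch using the three example values from the comment; the function name is only illustrative, and the components are assumed to be independent:

```python
import math

def combined_uncertainty(*components):
    """Root of the sum of the squares of independent uncertainty components."""
    return math.sqrt(sum(c * c for c in components))

# Half interval, MU of reference, MU due to random error (values from the comment).
overall = combined_uncertainty(0.5, 0.2, 0.3)
print(f"overall MU = {overall:.2f}")
```

Adding components this way can only widen the result, so the 0.5 half interval acts as a floor.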

• Rick C PE says:

I probably should have said that we never actually know the systematic component since we never know the “true value” of calibration references. Calibration certificates report the MU of the reference which is used.

• Crispin in Waterloo says:

Rick, thanks for supporting the “normally distributed method” of error propagation. 🙂

I presume you agree that the anomaly cannot have a lower uncertainty than the contributing measurements.

The most egregious case of misrepresentation of facts (I know of) is the NASA/GISS claim that 2015 was 0.001 C warmer than 2014. That is a true to life example of making a silk purse out of a sow’s ear.

8. David L Hagen says:

For NOAA’s official details, see the post:
Global Temperature Uncertainty

Introduction
What is a range of uncertainty?
Evaluating the temperature of the entire planet has an inherent level of uncertainty. Because of this, NCEI provides values that describe the range of this uncertainty, or simply “range”, of each month’s, season’s or year’s global temperature anomaly. These values are provided as plus/minus values. For example, a month’s temperature anomaly may be reported as “0.54°C above the 20th Century average, plus or minus 0.08°C.” This may be written in shorthand as “+0.54°C +/- 0.08°C.” Scientists, statisticians and mathematicians have several terms for this concept, such as “precision”, “margin of error” or “confidence interval”.

Background Information – FAQ: https://www.ncdc.noaa.gov/monitoring-references/faq/anomalies.php

How is the average global temperature anomaly time-series calculated?
The global time series is produced from the Smith and Reynolds blended land and ocean data set (Smith et al., 2008). This data set consists of monthly average temperature anomalies on a 5° x 5° grid across land and ocean surfaces. These grid boxes are then averaged to provide an average global temperature anomaly. An area-weighted scheme is used to reflect the reality that the boxes are smaller near the poles and larger near the equator. Global-average anomalies are calculated on a monthly and annual time scale. Average temperature anomalies are also available for land and ocean surfaces separately, and the Northern and Southern Hemispheres separately. The global and hemispheric anomalies are provided with respect to the period 1901-2000, the 20th century average.
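To make the area-weighting step concrete, here is a rough sketch. It is not NOAA's code: it assumes each grid box is weighted by the cosine of its centre latitude (a common approximation for the area of a latitude-longitude box), and the three box anomalies are made-up numbers.

```python
import math

def area_weighted_mean(boxes):
    """boxes: iterable of (centre latitude in degrees, anomaly in deg C).
    Assumption: weight each box by cos(latitude) as a stand-in for its area."""
    total = 0.0
    weight_sum = 0.0
    for lat_deg, anomaly in boxes:
        w = math.cos(math.radians(lat_deg))
        total += w * anomaly
        weight_sum += w
    return total / weight_sum

# Hypothetical anomalies for one tropical, one mid-latitude and one polar box.
boxes = [(2.5, 0.40), (47.5, 0.60), (87.5, 1.50)]

simple = sum(a for _, a in boxes) / len(boxes)
weighted = area_weighted_mean(boxes)

print(f"simple mean of boxes:   {simple:.3f}")
print(f"area-weighted mean:     {weighted:.3f}")
```

With these made-up values the polar box, which covers very little area, pulls the simple mean up much more than it pulls the weighted mean.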

The two citation titles mentioning uncertainties are:
Folland, C. K., and Coauthors, 2001: Global temperature change and its uncertainties since 1861. Geophys. Res. Lett., 28, 2621–2624.
Rayner, N. A, P. Brohan, D. E. Parker, C. K. Folland, J. J. Kennedy, M. Vanicek, T. J. Ansell, and S. F. B. Tett, 2006: Improved analyses of changes and uncertainties in sea surface temperature measured in situ since the mid-nineteenth century: The HadSST2 dataset. J. Climate, 19, 446–469.
See BIPM’s JCGM_100_2008_E international standard on how to express uncertainties:
GUM: Guide to the Expression of Uncertainty in Measurement
Guide to the Expression of Uncertainty in Measurement. JCGM_100_2008_E, BIPM
https://www.bipm.org/utils/common/documents/jcgm/JCGM_100_2008_E.pdf

• Kip Hansen says:

David L Hagen ==> And they really truly believe that that represents the real uncertainty. Unfortunately, it is simply the “uncertainty” that the MEAN (a mean of means of means of medians) is close to the value given as the anomaly.

The anomaly of the mean and their uncertainty say nothing about the temperature of the past (past year, month, or whatever). It only speaks for the uncertainty of the mean — the actual temperature, at the Global Average Surface Temperature level, is still uncertain to a minimum of +/- 0.5 K.

Your citations do show exactly how badly they have fooled themselves and how convinced they are that it makes sense.

They know very well that the absolute GAST (in degrees K) carries a KNOWN UNCERTAINTY of at least 0.5K. That known uncertainty does not disappear just because they choose to look at the anomaly of the GAST.

• Clyde Spencer says:

Kip,
As to how badly they are fooling themselves, I’d suggest what I have written before. A probability distribution function for all the temperatures for Earth for a year is an asymmetric curve with a long tail on the cold side. The peak of the curve is close to the calculated annual mean temperature. However, Tchebycheff’s Theorem provides an estimate of the standard deviation based on the range of values. Fundamentally, any way you cut it, the standard deviation about the mean is going to be some tens of degrees, not hundredths or thousandths of a degree.

https://wattsupwiththat.com/2017/04/23/the-meaning-and-utility-of-averages-as-it-applies-to-climate/

• Kip Hansen says:

Clyde Spencer ==> I have no doubt that you are right with “A probability distribution function for all the temperatures for Earth for a year is an asymmetric curve with a long tail on the cold side”.

If only we were dealing with something as simple as that…..a data set of the temperature of every 5 degree grid of the Earth taken accurately every ten minutes then we might be able to come up with something that might pragmatically be called the “Global Average Surface Temperature” to some functional degree of precision.

I agree that the true uncertainty surrounding GAST is far greater than +/- 0.5K — and have stated that this is the absolute minimum uncertainty…. The true total range of uncertainty is probably greater than the whole change since 1880.

• Alan Tomalty says:

The real reason that NASA and the other agencies use anomalies is to be able to extrapolate temperatures to areas where there are no temperature stations. Read below.

https://data.giss.nasa.gov/gistemp/faq/abs_temp.html

Read the following from the NASA site

“If Surface Air Temperatures cannot be measured, how are SAT maps created?
A. This can only be done with the help of computer models, the same models that are used to create the daily weather forecasts. We may start out the model with the few observed data that are available and fill in the rest with guesses (also called extrapolations) and then let the model run long enough so that the initial guesses no longer matter, but not too long in order to avoid that the inaccuracies of the model become relevant. This may be done starting from conditions from many years, so that the average (called a ‘climatology’) hopefully represents a typical map for the particular month or day of the year.”

So in the end temperature datasets like the above are computer generated with FAKE data.

Kip has correctly pointed out the junk science of dropping the uncertainty range, but the whole anomaly method was started by James Hansen in 1987; see below.
https://pubs.giss.nasa.gov/docs/1987/1987_Hansen_ha00700d.pdf

In the above paper, Hansen has admitted in his own words that he did not follow the scientific method of testing a null hypothesis when it comes to analyzing the effects of CO2. I quote his paper.

“Such global data would provide the most appropriate comparisons for global climate models and would enhance our ability to detect possible effects
of global climate forcings, such as increasing atmospheric CO2.”

In that one statement he has admitted that up to then he had no evidence that CO2 affects temperature. The only indication that it might was from a US Air Force study (see below). This is true even in the face of him producing 8 prior different studies on CO2 and the atmosphere starting in 1976. It seems that somebody in the World Meteorological Organization actually beat Hansen to the alarmist podium, since Hansen references a paper (in his 1st study on CO2 in 1976) by the WMO introduced at their Stockholm conference in 1974. However, Hansen in his 1976 paper gave the 1st clue that he had already condemned CO2 and the other trace radiative gases.

https://pubs.giss.nasa.gov/docs/1976/1976_Wang_wa07100z.pd

In that study Hansen said “By studying and reaching a quantitative understanding of the evolution of planetary atmospheres we can hope to be able to predict the climatic consequences of the accelerated atmospheric evolution that man is producing on Earth.”

He had already developed a 1-dimensional radiative-convective model to compute the climate sensitivity of each radiative gas by 1976. It is interesting that his model divided the solar radiation into 59 frequencies and the thermal spectrum (IR) into 49 frequencies. However, it seems that we can blame the US Air Force with their 8 researchers who came up with an actual greenhouse temperature effect in 1973. So it seems that Hansen just took their numbers and ran with them. The same numbers are probably in the code today in all the world’s climate models.

They are, quoting from Hansen’s paper above:

CO2 doubling greenhouse effect:
Fixed cloud top temperature: 0.79K
Fixed cloud top height: 0.53K
Factor modifying concentration: 1.25
This was based on the then concentration of 330ppm in 1973.

H2O:
Fixed cloud top temperature: 1.03K
Fixed cloud top height: 0.65K

Don’t forget that if you are looking at table 3 in that study where I quote the above figures, according to Hansen you have to add up all the temperatures if there are also doublings of the other trace gases. It is interesting that doubling of ozone gives negative temperature forcings -0.47K and -0.34K.

Also interesting are the methane numbers 0.4K and 0.2K.

If you add the highest doubling forcings of both CO2 and methane you get 0.79K + 0.4K = ~1.2K. That is suspiciously close to the climate sensitivity numbers of many present-day researchers.

• Kip Hansen says:

Alan ==> A great deal of the calculated data about climate can correctly be called “fictional data” or “fictitious data sets” — in which the data is neither measured nor observed, but depends on functions based on assumptions not in evidence. Some of those fictitious data sets are useful, some not.

• Alan Tomalty says:

That person who beat Hansen to the alarmist podium was the Swedish scientist Bert Bolin. However, Bolin himself didn’t have any experimental proof of CO2 raising temperature. He basically took Hansen’s numbers, which as I said came from the 8 US Air Force researchers in a study done in 1973. How they came up with the forcing temperature numbers from a doubling I don’t know, because I can’t find that study, and since I am not an American I cannot make use of their Freedom of Information requests. The names of those 8 Air Force researchers are 1) R.A. McClatchey 2) W.S. Benedict 3) S.A. Clough 4) D.E. Burch 5) R.F. Calfee 6) K. Fox 7) I.S. Rothman 8) J.S. Garing

The only reference to the study is Air Force Camb. Res. Lab. Rep. AFCRL-TR-73-0096 (1973)

That has to be the most important document in the history of mankind, seeing that the CO2 scam is the most costly scam in human history.

• lee says:

Can you explain how an estimate is “data”?

• Kip Hansen says:

lee ==> An estimate is an estimate — less generously called a “guess”. Hopefully the estimate will be based on real data which has been measured or scientifically observed in some way.

• David L. Hagen says:

Solar Changes
For context, paleo-reconstruction estimates of solar insolation show the change from the Maunder Minimum to the present to be about twice that of the Maunder Minimum to the Medieval Warm Period insolation. See:
Estimating Solar Irradiance Since 850 CE, J. L. Lean
Space Science Division, Naval Research Laboratory, Washington, DC, USA

https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1002/2017EA000357

9. EdB says:

Add to the uncertainties discussed here the FACT that up until 2003 (ocean buoys) the ocean temperatures are simply made up. The likely error from 1850 to 1978 is about +/- 3C, and after 1978, with satellites, probably +/- 1.5C, and with buoys after 2003, +/- 0.1C.

That is for 70% of the earth.

Then there is the Arctic and Antarctic. Add in more made up numbers.

Other than a few long term, good quality Stevenson screen readings, climate science has little to work with to calculate long term GAT. See: Lansner and Pepke Pederson 2018

http://notrickszone.com/2018/03/23/uncertainty-mounts-global-temperature-data-presentation-flat-wrong-new-danish-findings-show/

10. MIKE MCHENRY says:

Can someone tell me how you can come up with a global/land temperature index? The heat capacities of the 2 are hugely different.

• MIKE MCHENRY says:

I meant to say land/sea temperature

• Kip Hansen says:

Mike ==> The usual response is that it is an INDEX — like the Dow Jones Stock Index — of unlike values but looking at the combined index can tell us something.

Mixing sea surface (skin) temperature with 2-meter land air temperatures is one of the extremely odd practices of CliSci.

• William Ward says:

Mike, what you ask is a big part of my counter argument against alarmists. Don’t forget the thermal capacity of the polar ice caps as well. The stored thermal energy in the ice below 0C is close to the stored thermal energy in the oceans above 0C and both are approximately 1000 times the thermal energy stored in the atmosphere above 0C. My cynical view is that global average temperature is used as the metric because the end goal is to sell this to (force this on) the population. Temperature is “intuitive”. People can be scared with a story about temperature. If thermodynamics is brought into the discussion or Joules of energy is the metric then it will not be possible to con the public – because they can’t understand it.

11. coaldust says:

“As an aside: when Climate Science and meteorology present us with the Daily Average temperature from any weather station, they are not giving us what you would think of as the “average”, which in plain language refers to the arithmetic mean — rather we are given the median temperature — the number that is exactly halfway between the Daily High and the Daily Low. So, rather than finding the mean by adding the hourly temperatures and dividing by 24, we get the result of Daily High plus Daily Low divided by 2. These “Daily Averages” are then used in all subsequent calculations of weekly, monthly, seasonal, and annual averages.”

This is a major problem in climate science. It hides what is really going on by taking a false average at the very beginning and using it going forward. That number is not the average at all. The high/low can occur at different times of the day depending on the weather. For instance, on a mostly cloudy day, the high might occur when the sun peeks through the clouds. It may be the high for the day but using it and one other to compute the average is bogus. The average should be over many samples. Perhaps one sample per minute giving 1440 samples per day, each one with equal weight. Even better, keep all the samples. Storage is cheap.

Another example is when a cold front comes through at 2:00 AM, and the high for the 24-hour period (“day”) occurs in the middle of the night (at midnight!). Using that high and averaging with the low hides the fact that the day was cold.

The fact that this is how it has been done for a long time is no excuse. It is a problem, so fix it.

• pochas94 says:

It was fixed. That (and other “fixes”) is why we have Global Warming.

• Kip Hansen says:

coaldust ==> The Hi+Low/2 daily average is an historical artifact left over from when weather stations used Hi/Low recording thermometers. The His and Lows were all that they recorded, and the daily average was figured from them. In order to be able to compare modern records with older records they have continued with the same method — nutty as it is — out of necessity.

The Hi+Low/2 method does not give what your sixth grader would call the average temperature for the day. It is really the median of a two-value record.
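The difference between the two "averages" is easy to see with numbers. The sketch below uses hypothetical hourly readings (invented for illustration, not taken from any station) for a day on which a cold front arrives before dawn, and compares the (Hi+Low)/2 figure with the arithmetic mean of all 24 readings. The function names are likewise made up for this example.

```python
from statistics import mean

# Hypothetical hourly readings (deg F) for one day: a mild start, then a
# cold front arrives before dawn and the rest of the day stays cold.
hourly = [50, 48, 40, 34, 30, 28, 27, 26, 26, 27, 28, 29,
          30, 30, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21]


def midrange(temps):
    """Daily 'average' as traditionally reported: (high + low) / 2."""
    return (max(temps) + min(temps)) / 2


def hourly_mean(temps):
    """Arithmetic mean of all readings, each weighted equally."""
    return mean(temps)


print(f"(Hi+Low)/2 : {midrange(hourly):.1f}")
print(f"hourly mean: {hourly_mean(hourly):.1f}")
```

On a day like this one the two figures differ by several degrees, because a brief high just after midnight carries as much weight in (Hi+Low)/2 as the whole cold remainder of the day.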

• Phil R says:

Kip,

I wasn’t sure where to put this, or whether you will see it, or even whether it’s relevant, but I live in SE Virginia. Several years ago in January (not sure which year now, but can look it up) when I first heard of the Polar vortex, we had temperatures drop over 50 °F in less than 24 hours (mild winter day, ~67 °F one afternoon to 14 °F early the next morning). If you just looked at the daily or even weekly average temperature, you would have never picked this up. The average T for the first day was, I think in the 40’s or 50’s (have calcs, but not with me). The second day was colder, but probably around 18-19°.

• Kip Hansen says:

Phil ==> Weather is highly changeable and can be wild. Daily Averages hide more information than they reveal.

• Remy Mermelstein says:

Tell us Kip, what do daily averages “hide?”

• Kip Hansen says:

Remy ==> Such a basic question…. Daily averages hide everything about the daily temperatures except the Daily Median…they even hide the Max and Min used to derive them. We no longer know whether we had a cool morning or a warm morning, an overnight freeze followed by a warm spring day, or a mild night followed by a mild day; we lose all the temperature information except that one tiny bit of information, the Median between the Max and the Min.

(The other case is the full daily record of the temperatures as measured, say at six-minute intervals by an ASOS automatic weather station.)

• Remy Mermelstein says:

No Kip, you are building a huge strawman. The “daily average” from the National Climatic Data Center/NESDIS/NOAA tells me that the value for Sept 27th (tomorrow), where I live, is 59. None of the things you mention are “hidden,” because none of them HAVE HAPPENED YET!

What it does tell me is that shorts and a tee shirt might be uncomfortable for a wardrobe choice for outside activities.

• Remy Mermelstein says:

That is why Kip, in Math/Stat the average (arithmetic mean) is referred to as the “Expected Value” of a random variable.

https://en.wikipedia.org/wiki/Expected_value

Emphasis on the word EXPECTED

12. R Shearer says:

Arrhenius gave the average surface temp of earth as 15C in 1896 and 1906 papers. Today it is no different within error (15C is 288.15K). NOAA gave earth’s temperature as 14.4C in 2011. If one is to believe NOAA’s precision, it might have actually cooled in the past 100 years or so.

• Kip Hansen says:

R Shearer ==> “Arrhenius gave the average surface temp of earth as 15C in 1896 and 1906 papers.” and we are almost there — just a little warmer and we will be Earth-like!

13. markl says:

So climate is not only getting worse than we could ever imagine, we also know less about it than we ever have.

14. “But for our purposes, let’s just consider that the anomaly is just the 30-year mean subtracted from the calculated GAST in degrees.”
I don’t know what your purposes are, but that is strawman stuff. No-one does that.

“The trick comes in where the actual calculated absolute temperature value is converted to an anomaly of means.”
Wearily, no, it is a mean of anomalies.

“Reducing the data set to a statistical product called anomaly of the mean does not inform “
Wearily, again…

“No matter what we do to temperature records, we have to deal with the fact that the actual temperatures were not recorded — we only recorded ranges within which the actual temperature occurred.”
Literally, not true. Ranges were not recorded, only the estimate. Every measurement ever made, of anything, could have that said about it. You never know the actual …. You have an estimate.

• Kip Hansen says:

Nick ==> Thanks for checking in — sorry to weary you so.

When you wake up, you can admit to the real uncertainty in the Global Average Surface Temperature. Gavin did….

• Kip,
You’ve been writing about this for a long time, so you should have got on top of the basic difference between a mean of anomalies and an anomaly of means. It matters.

Gavin was saying that an anomaly of mean temperature would, like the mean itself, have a large uncertainty. He isn’t “admitting” to anything – he’s simply explaining why neither GISS, nor anyone else sensible, calculate such a mean. A mean of anomalies does not have that error. That is a basic distinction that you never seem to get on top of.

• fred250 says:

“A mean of anomalies does not have that error.”

Sorry, but it DOES. !!

The mean itself has an error margin of +/- 0.5, so the anomalies can be no better.

The laws of large numbers DO NOT APPLY

This is a basic fact you never seem to comprehend.

• LdB says:

For both of you.

An anomaly of means is a meaningless quantity on an unknown sample space. It has no error because it has no meaning without putting it into a background and in doing so you have to plot it in an error range.

Nick is correct but it appears to me he does not know the second part that the moment you try and use that errorless number you have to put it in an error range.

Want to try it, roll a dice 6 times and each number is supposed to come up once. So a number not turning up or any number coming up more than once is your anomaly count. Now take the mean of the anomalies and it tells you what?

Even if you were trying to work out if a dice was loaded, to use the mean of the anomalies you have to now bring in the distribution range and deviation you would expect, and now you get your error back. Your errorless, meaningless number when put into a background now has an error range.

If you want to see real scientists do it, here is the Higgs discovery in its background distribution
http://cms.web.cern.ch/sites/cms.web.cern.ch/files/styles/large/public/field/image/Fig3-MassFactSoBWeightedMass.png?itok=mrA7uJV2

• David Dirkse says:

“Want to try it, roll a dice 6 times and each number is supposed to come up once. ”
..
FALSE.
..
You do not have a basic understanding of probability theory. Probability theory says that if you get six ones in a row, the chances of that happening are 1 in 46656. The probability of getting each number to come up once is (6*5*4*3*2*1)/46656 = 720/46656 = 0.015432

• LdB says:

Yes and that is the point you need to bring in the distribution and to prove the dice is loaded you would very quickly establish you have to roll the dice a lot more than 6 times.

• LdB says:

I guess for you David I can reverse the question: how do I record the mean of the anomalies of a range of rolls, and what is it relative to?

• David Dirkse says:

This is where you lose it LdB: “So a number not turning up or any number coming up more than once is your anomaly count.”

That happens with a probability of 0.984568, so in 2000 tries of six rolls each, your anomaly count would be about 1969. Taking a mean of this number over a bunch of tries doesn’t tell you anything. You are not using the correct procedure to detect a loaded die.
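The figures quoted in this exchange can be checked with exact arithmetic. The sketch below assumes a fair six-sided die and reads "2000" as 2000 separate trials of six rolls each; the variable names are illustrative only.

```python
from fractions import Fraction
from math import factorial

SIDES = 6
ROLLS = 6

# Number of equally likely outcomes for six rolls of a fair die: 6**6.
total_outcomes = SIDES ** ROLLS

# Outcomes in which every face appears exactly once: 6! orderings.
p_each_once = Fraction(factorial(SIDES), total_outcomes)

# Complement: at least one face repeats, so at least one face is missing.
p_anomaly = 1 - p_each_once

# Assumption: 2000 separate six-roll trials.
trials = 2000
expected_anomalous_trials = float(p_anomaly * trials)

print("outcomes:", total_outcomes)
print("P(each face once):", float(p_each_once))
print("P(some repeat):", float(p_anomaly))
print("expected anomalous trials:", expected_anomalous_trials)
```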

• LdB says:

The topic at hand is not about any of that .. can we stick to the subject, this is just junk discussion about the bleeding obvious 🙂

• LdB says:

So getting this back on track .. if anyone wishes to pick it up

So if we wish to talk about means of anomalies you must first define what our definition of anomaly is. The only way to define an anomaly is by reference.

Now in David’s case he objects to how I defined and measured the anomaly (it’s wrong apparently 🙂 ). Instead of getting into a long argument I asked him to make his own answer, which he ignored, but had he attempted it he would have had to define a reference.

• “So if we wish to talk about means of anomalies you must first define what our definition of anomaly is.”

For usual temperature averaging (HAD, GISS etc) it is clear. It is the historic average, for the month and for that station, of the temperature over a reference period, eg 1951-80 for GISS. There is some further analysis if the data for that time is incomplete.

• LdB says:

Yep, so now you need to add in the errors for that background. This is Mosher’s unicorns: correlation does not equal causation, so you need to pull apart the background and assign errors to your anomaly measurement.

So let’s ask the question: your reference is moving in the period, and if you want it to be absolute, then write the mathematical formula for the curve (for points that don’t sit on the line you have an error). If we don’t have a formula then we have David’s distribution problem, so what is the standard deviation you are claiming for the period?

I am not interested in the actual answer, just the process, and it shows you get back to the situation Kip was describing: your mean of anomalies has an error, and it does when you put it in a proper background.

Your claim it doesn’t have an error is trite because you want to not talk about the background.

• Kip Hansen says:

Nick ==> Look at your own methods — in the end, you take a mean of anomalies, true — before that, you had anomalies between means, the means were means of medians.

That’s the TRICK. Shifting to a statistical animal — a mean of anomalies — allows you to ignore the basic KNOWN UNCERTAINTY of the metric and pretend that the SDs of the Mean are the sum total of the uncertainty.

Thus you give yourself permission to UNKNOW the KNOWN UNCERTAINTY.

• “Literally, not true. Ranges were not recorded, only the estimate. Every measurement ever made, of anything, could have that said about it. You never know the actual …. You have an estimate.”

Might I suggest that an “estimate” is a chosen value within a range, where the range is an understood field from which the estimate is taken. It is a convention to write down the focal point first, and then the “plus/minus” is placed beside it to show this very fact. Your focus might be on the “estimate”, but the reality that this “estimate” represents is a RANGE. Hence, the estimate REPRESENTS the focal point of the range, and, as such, is an indication that a RANGE is what the measure really is.

• Kip Hansen says:

Robert ==> Nicely put — say it enough times and the statisticians will find the mean of it to great precision.

15. Dave Bidwell says:

I thought that averaging the data points to arrive at a very highly probable mean value only works when you are making the same measurement of the same thing over and over again. For example, what is the weight of this screw? If we take 100 measurements, we will come up with a very accurate (probable) measure of its weight even though we acknowledge there is an error in our scale. On the other hand, to come up with a global average temperature we are taking many, many measurements and comparing them on different days. Apples and oranges.

• Kip Hansen says:

David ==> You are speaking about the Law of Large Numbers. It deals with multiple measurements of the same thing at the same time being averaged to arrive closer and closer to the actual size of the thing.

You are right that it does not apply to multiple measurements of different things at different times.

What taking a mean does is predict the probability of the mean being at a certain value, with higher probabilities being closer to the calculated mean.
Means however, do not inform us about the thing measured — only about the probabilities of the mean being near such and such a result.

See the links in the essay to William Briggs on the topic.

• Gary Wescom says:

Kip,
I’ve got to call you on this one:

“It deals with multiple measurements of the same thing at the same time being averaged to arrive closer and closer to the actual size of the thing.”

The averaging provides a more precise value but the final accuracy is still determined by the accuracy of the measurements. You can’t measure a stick to nano-meter accuracy with a yard stick no matter how many million measurements you take with that yard stick. You can, however feel happy about how PRECISE your measurement calculates out to be. Just don’t claim you have improved its accuracy.

• David Dirkse says:

Gary, if you take a stick that is 10 feet tall with markings at one-foot intervals, you can measure the average height of adult males if you take enough samples, measuring each individual with the stick to the nearest foot. You cannot accurately measure any individual’s height with the stick, but you can get any degree of accuracy measuring the population mean with it by sample size. Your data set will be a series of numbers like 5,6,5,5,4,5,6,5,5, ……. When you take 10,000 measurements the sum of this series will be between 57,910 and 57,920. Do the math and you’ll see 5 ft, 9 and 1/2 inches is the result. Want more accuracy? Use 20,000 samples.
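This thought experiment can be tried numerically. The sketch below is only an illustration under stated assumptions: heights are drawn from a normal distribution, and the two standard deviations (3 inches and 12 inches) are arbitrary choices, not figures from this thread. Each simulated height is read to the nearest whole foot and the readings are averaged.

```python
import random
from statistics import mean


def coarse_mean(true_mean_ft, spread_ft, n, seed=0):
    """Average of n heights, each read to the nearest whole foot.

    Heights come from a normal distribution; that choice and the spread
    are assumptions of this sketch, not figures from the discussion.
    """
    rng = random.Random(seed)
    readings = [round(rng.gauss(true_mean_ft, spread_ft)) for _ in range(n)]
    return mean(readings)


TRUE_MEAN_FT = 5.7915  # 5 ft 9.5 in, the figure used in the comment

for spread_in in (3, 12):  # assumed standard deviations, in inches
    est = coarse_mean(TRUE_MEAN_FT, spread_in / 12, n=10_000)
    print(f"spread {spread_in:>2} in: estimate {est:.3f} ft, "
          f"error {12 * (est - TRUE_MEAN_FT):+.2f} in")
```

In this sketch the outcome depends on how the spread of heights compares with the one-foot marking interval. When the spread is about as wide as the interval, the average of the coarse readings lands close to the true mean. When the spread is much narrower, an offset remains that additional samples do not remove.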

• Clyde Spencer says:

David,
A surveyor of some reputation who wrote a textbook (Smirnoff) disagrees with you. To wit, he said, ”… at a low order of precision no increase in accuracy will result from repeated measurements.” He expands on this with the remark, “…the prerequisite condition for improving the accuracy is that measurements must be of such an order of precision that there will be some variations in recorded values.” The implication here is that there is a limit to how much the precision can be increased. Thus, while the definition of the Standard Error of the Mean is the Standard Deviation of samples divided by the square-root of the number of samples, the process cannot be repeated indefinitely to obtain any precision desired!”

https://wattsupwiththat.com/2017/04/12/are-claimed-global-record-temperatures-valid/

• Geoff Sherrington says:

Partially correct, David Dirkse.
The first extra error.
The type of flagrant error that people like Nick want to ignore arises if your stick has the wrong size, when traced back to say the standard metre that used to be a platinum rod in Paris. Or, the markings on it are inaccurate. You have to test for this, and you have to report your findings.
It is an essential part of error analysis to use, wherever the circumstances permit, more than one type of calibration. Let’s try for an example from climate work. The radiation balance at Top of Atmosphere (TOA, in W/m2) has been measured by detectors on satellites to be in the 1300 W/m2 range. The scientists want to see the effects of tiny differences, some even doing math with figures of 0.008 W/m2. The half-dozen satellite devices have drift and orbit problems, as well as slightly different designs, so in absolute terms they differ by some +/- 6 W/m2.
The semi-philosophic question here in Kip’s article is whether that +/- 6 W/m2 is immutable. I say it is. The experts say no, we know reasons why some of the satellites were wrong, so we can adjust. But, after they adjust, how is the error calculated? They seldom say. Surely, they introduce even more error because of the unknowns in the assumption that adjustment can be done.

The second extra error.
People say that even using a coarsely-calibrated stick, with enough measurements of enough people, you can deduce the average height of people in a defined population. Wrong. This can only be done if the distribution of heights of people is known beforehand. To know that, you first must have, then use, a more accurate method of measurement.
It is all a frightful mess.
If classical, established scientific error treatment had been used from the start, many, many papers would never have been published and by now the climate change bogey would have been put on the back shelf.

• “A surveyor of some reputation … disagrees with you.”

But note that last sentence of Smirnoff
“Thus, while the definition of the Standard Error of the Mean is the Standard Deviation of samples divided by the square-root of the number of samples, the process cannot be repeated indefinitely to obtain any precision desired!”

He, like the whole scientific world, sets out the process by which increasing sample size reduces the uncertainty of the mean, by a factor of 1/√n. Common knowledge, but endlessly disputed here. His proviso about how it can’t be repeated indefinitely does not mean that it doesn’t give the effect desired. It’s true that, while the one foot divisions work quite well, one metre divisions would not work. The spacing cannot be too far beyond the range of variation of the measurand.

• Kip Hansen says:

All ==> If all you want is “the uncertainty of the mean” — then have at it — you are welcome to it.

The “uncertainty of the mean” tells us nothing about the true uncertainty of the temperature — which was and remains, after all the hoopla, at least +/- 0.5K.

That is the TRICK — by claiming that the statistical definition of the “uncertainty of the mean” is a true reflection of the uncertainty about the global temperature. GASTabsolute has an uncertainty of AT LEAST +/- 0.5K. You cannot UNKNOW that uncertainty by closing your eyes to it or putting on statistical blinders.

• Clyde Spencer says:

Nick,
As usual, you are being disingenuously selective in your facts. The approach of improving the precision of a measurement by taking many readings only applies to something with a fixed value. The multiple +/- readings cancel the random errors introduced by the observer’s judgement, small inaccuracies in the scale of the measuring instrument, etc. The process of using the Standard Error of the Mean assumes a normal distribution of the random errors. However, in the world of climatology, one is not measuring a single temperature many times. One is measuring many temperatures one time, and synthesizing a representative temperature that is closer to being a mode than a mean. One is measuring a variable that may well have other than a normal distribution. Using the Standard Error of the Mean with a variable (versus a constant) is not warranted because the requirements for its use are not met by something that is always changing. At best, the Law of Large Numbers predicts that the accuracy of the synthetic number will be improved by many readings, but the precision still remains low. Do you also believe in “The wisdom of crowds?”

As to the last statement by Smirnoff, which you quote, consider the following: Take a meter stick with no scale markings, compare it to a piece of lumber. You find that the piece of lumber is ALMOST the same length, but not quite. And without any markings on the meter stick, you are required to record the length as one (1) meter. Now, by your logic, if you take 100 readings, each 1 meter, you can now claim that the piece of lumber (which obviously is not exactly 1 meter), has a length of 1.0 meters.

• Clyde,
“The approach of improving the precision of a measurement by taking many readings only applies to something with a fixed value.”

“The process of using the Standard Error of the Mean assumes a normal distribution of the random errors.”

Often asserted here, but with no authority cited in support. And it is just wrong. Kip cited above an article in a “simple” wikipedia in which “a random variable is repeatedly observed” is cited as an example of application of the Law of Large Numbers. But the proper Wikipedia article is much more thorough and sets out the history and theory properly. And there is no mention of any such restrictions. It describes, for example, how a casino can get steady income from operating a roulette wheel, though the outcome of any one spin is highly uncertain. This is not taking repeated measurements of the same thing. It is just adding random variables, which is the process described mathematically in Wiki.

Basically, for independent variables, the variances add. That fact has no requirement of normal distribution. If you add N variables of equal variance, the combined variance increases by factor N. The sd is the sqrt, so increases by √N. When you take the average, you end up dividing by N, so net effect is 1/√N. Nothing about normality, and the mention of equal variance was only to simplify the arithmetic. You can sum unequal variances and the effect of reducing the sd of the mean will be similar.
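The 1/√N scaling described here can be checked by simulation. The sketch below is a rough illustration, not a proof: it draws independent values from an exponential distribution with unit rate (chosen only because it is strongly skewed and has a standard deviation of 1) and measures how much the sample mean varies across repeated trials. The function name, trial count, and seed are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def sd_of_sample_means(n, trials=2000, seed=1):
    """Spread of the mean of n independent draws, estimated over many trials.

    Draws come from an exponential distribution with rate 1, which is
    skewed (not normal) and has standard deviation 1.
    """
    rng = random.Random(seed)
    means = [mean([rng.expovariate(1.0) for _ in range(n)])
             for _ in range(trials)]
    return pstdev(means)


for n in (25, 100):
    observed = sd_of_sample_means(n)
    predicted = 1.0 / sqrt(n)  # sigma / sqrt(N) with sigma = 1
    print(f"N={n:>3}: observed {observed:.4f}, predicted {predicted:.4f}")
```

Note what this does and does not address: it concerns the spread of the mean of independent random draws. It says nothing about errors shared by every reading, such as a calibration offset.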

” disingenuously selective in your facts”
In fact, I just pointed out what your quote actually said.

• Clyde Spencer says:

Nick,
I believe that you are misinterpreting the Law of Large Numbers. It principally applies to probabilistic discrete events, such as flipping a coin or throwing a die. Mandelbrot gave considerable attention to ‘runs’ in such activities, suggesting that the behavior was fractal. However, the important thing is, if you only toss a coin a few times, it is probably more likely to get a short run of heads or tails than to get an equal number of both. It is ONLY after a large number of tosses that one can expect the ratio to approach 1:1.

A similar thing can be observed with a sampled population with a probability distribution function. It is only after a large number of samples that the shape of the distribution is resolved and one can say anything about the probability of any sample being close to the mean. That is, after a large number of samples, one can have confidence in what the true mean is, or how accurate the sample is. However, for measured values that are not integers or discrete, the large number of samples tells one little about the precision of the measurement of the individual samples. It is only when what is being measured has a singular fixed value, that one can gain insight on the precision of the measurements because the measured values will have a small range and approach the mean as a limit.

• Crispin in Waterloo says:

Kip wrote:

>>All ==> If all you want is “the uncertainty of the mean” — then have at it — you are welcome to it.

>The “uncertainty of the mean” tells us nothing about the true uncertainty of the temperature — which was and remains, after all the hoopla, at least +/- 0.5K.

This is correct, except for the “+/- 0.5K” at the start of the article and now “at least +/- 0.5K”. Yes it is “at least” but it is known to be larger, not just “at least”.

If anyone has a correctly calculated value for the baseline and a similarly correctly calculated current value, both with correctly stated uncertainties, the +/- part of the anomaly is easily and correctly calculated using the formula above.

The main thrust of the article is that the anomaly cannot be known with greater accuracy than the values from which it was calculated. Nick is persistently trying to defend some version of, “Oh yes it can in certain cases”.

David Dirkse provides an example of the average height of males made using a method that can deliver a “falsely precise” result. You might say the average height is 6 ft. Or 5’10”, or 5’9.5″, or 5’9.53″ or 5’9.528″ and so on.

Which is permissible? None. That is not the answer, it is the location of the centre of the range within which the true answer probably lies.

There is a formula for determining how many digits of precision one can use to express the centre of the range. It is important to have this concept clear in one’s mind: There is a difference between the accuracy (a range), and the precision (number of significant digits) with which one can state the value for the centre of the uncertainty band.

You may have two instruments each giving, based on the number of measurements, exactly the same value for the location of the centre of the uncertainty band, and two very different widths of those bands. The broader band will be from the less accurate, less precise, instrument. Taking additional readings can provide a value for the center of the uncertainty range that is identical to the value produced by a more accurate instrument with fewer readings. The more accurate instrument saves time because fewer readings are needed to get that level of precision about where “the middle” is. But…that in no way alters the width of the band of uncertainty because it is inherent in the instrument and any calculations that were used to generate the output.

Ideally one has an instrument that gives an answer of acceptable accuracy, expressed with the required precision with a single measurement.

• Gary Wescom says:

Interesting. Aren’t you folks forgetting the possible errors in the calibration of your measuring stick? How can averaging remove calibration errors?

Let’s go back to measuring a stick with a meter stick. If the accuracy of the markings on the meter stick cannot be guaranteed to a couple millimeters, do you really believe measuring the same stick with the same meter stick a large number of times will improve on that? The best you can achieve is improving the PRECISION of measurement relative to the calibration marks on the meter stick. Any error in the marks remains. ACCURACY will be determined by the markings on the meter stick.

This concept of averaging many instrument measurements to increase accuracy would allow a rubber meter stick to be used as an instrumentation standard!
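The distinction Gary draws can be sketched numerically. The following is a minimal simulation, assuming a hypothetical stick of 500 mm, a hypothetical fixed calibration error of 2 mm, and Gaussian reading scatter of 1 mm; none of these numbers come from the thread. It shows the mean of many readings settling on the true value plus the calibration error, not on the true value itself.

```python
import random
import statistics

def simulate_readings(true_length_mm, calibration_bias_mm, noise_sd_mm, n, seed=0):
    """Return n readings: true value + fixed calibration bias + random scatter."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [
        true_length_mm + calibration_bias_mm + rng.gauss(0.0, noise_sd_mm)
        for _ in range(n)
    ]

# All three values are hypothetical, chosen only for illustration.
TRUE_LENGTH = 500.0  # mm
BIAS = 2.0           # mm, error in the markings of the meter stick
NOISE_SD = 1.0       # mm, random scatter from reading the stick

for n in (10, 1000, 100000):
    readings = simulate_readings(TRUE_LENGTH, BIAS, NOISE_SD, n)
    mean = statistics.fmean(readings)
    # The scatter of the mean shrinks as n grows; the 2 mm offset does not.
    print(f"n={n:>6}  mean={mean:.3f}  error vs true={mean - TRUE_LENGTH:+.3f}")
```

Under these assumptions, more readings tighten the estimate of where the stick's markings say the length is, while the offset from the true length stays put.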

• Kip Hansen says:

Gary ==> Yes, the whole topic is a mess in Climate Science….so many issues and so many claims of absolutely physically impossibly small uncertainty ranges.

• AndyHce says:

Assuming some manufacturing control and reasonable calibration, there is some specifiable accuracy to an instrument, generally different than its precision (e.g. a digital-readout voltmeter may have three digits of precision but, within a particular range, only two digits of accuracy). Measurements involve both random and biased errors. Sometimes it is possible to know the bias distribution, such as the relative interval of each measurement graduation, so that can be included in the final calculation.

However, setting that aside and assuming for sake of discussion a highly accurate measuring instrument (that is, each mm of length, degree of temperature, etc. that is depicted has a particular accuracy), multiple measurements of the same thing made in the same manner by one person within a short time frame will be more or less randomly distributed about the true value. Thus averaging the measurements provides greater accuracy.

• Kip Hansen says:

Andy ==> The first issue here is not a matter of error — it is a matter of recording individual temperature readings as ranges. The range is exactly correct (to the accuracy of the thermometer — in modern days pretty good). The range is the primary uncertainty — uncertainty, NOT error.

• William Ward says:

Kip,

I don’t think we are in disagreement, but I think it might be helpful to point out that different disciplines use terminology differently. What you are referring to as “uncertainty” is called quantization “error” or quantization “noise” when dealing with an analog to digital converter in an electronic design. As the resolution of the instrument is increased (number of bits in an ADC) the theoretical quantization noise or error is decreased. At least theoretically. In practice, other forms of noise can limit the practical resolution (precision) of the instrument. Of course, accuracy is a different issue from precision.

• Kip Hansen says:

William ==> I do understand where the signal-to-noise ratio comes from — which is why I point out that it is inappropriate to be applied to measurements of continuous variables like temperatures taken at scheduled times at some particular point.

There is no signal and no noise. The measured temperature is the data and there is no “noise” in the data set. It is just the data.

The uncertainty first arises because we don’t record the temperature measurement, we record the range in which the temperature occurred. What the temperature was is now unknown — uncertain — in the most real sense…we just don’t know, we are entirely uncertain as to where in the range the temperature at the moment was.

This is the most basic uncertainty — information that was never recorded.

We are not dealing with an analog to digital converter — we just have a numerical data set, handicapped by all the records being ranges.

• Remy Mermelstein says:

LOL @ KIP: ” We are not dealing with an analog to digital converter”

Obviously you don’t understand a thing about A to D converters. Temperature is an analog item. For example, it can be 72.01 degrees F, or it can be 72.13 degrees F, and any value in between. The thermometer will read 72 in both cases. In fact the thermometer only gives you integral values for readings, exactly what an A-to-D converter does with the input signal.

PS, the thermometer might have 1001000 on the scale when the reading was taken.
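The rounding step described here can be written out as a short sketch. The function name `record_reading` is hypothetical, and rounding to the nearest whole degree is the assumption taken from the comment. Both example temperatures land on the same recorded value, whose binary form matches the `1001000` mentioned above.

```python
import math

def record_reading(temp_f):
    """Quantize a continuous temperature to the nearest whole degree.

    Uses floor(x + 0.5) so that halves round upward, which differs from
    Python's built-in round() (round-half-to-even).
    """
    return int(math.floor(temp_f + 0.5))

for actual in (72.01, 72.13):
    recorded = record_reading(actual)
    quantization_error = actual - recorded  # what the written record loses
    print(f"actual={actual:.2f}  recorded={recorded}  "
          f"binary={format(recorded, 'b')}  lost={quantization_error:+.2f}")
```

Once only the recorded integer is kept, the part printed as `lost` cannot be recovered from the record, which is the point both sides of this exchange agree on.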

• William Ward says:

Hi Kip,

I’m glad you replied because I see we actually do have a disagreement here. I think you are missing some important fundamentals based upon what you said. I’m not referring to SNR (signal-to-noise ratio), but every system has a signal and noise. The signal is the thermal energy in the air from the sun, available to the thermometer or thermocouple. The noise could be the blast from the jet engine or heat from the idling car right next to the Stevenson Screen housing the instrument. (Both cases assuming a very badly sited instrument). Whether a person is looking at a mercury thermometer or an ADC is reading the voltage from the attached thermocouple, the signal being read has noise in it – in this case the additional thermal energy that is not what we really want to read. This is error relative to what the measurement would be if the noise were not present. There is no way to go back after the fact and figure out what was signal and what was noise but knowing there was significant noise means that there is uncertainty in the data. If you prefer to reserve the word “uncertainty” to only pertain to quantization error I won’t argue against that, but the net effect is no different.

The ADC (analog-digital converter) is exactly applicable – and ADCs are used to measure the temperature if the instrument is electronic. ADCs are at the heart of electronic (digital) instrumentation. What an ADC does is exactly what you describe. If the instrument is set to sample once every 5 minutes, then know that what the instrument does is exactly what a human does when reading a thermometer. It looks at a signal at an instant in time and measures that signal and fits it to the closest increment on its measuring scale. If the digital output is a 16-bit code it might be 0100 1101 0111 1010. It is understood the least significant bit (the last digit) is not correct per the limited resolution of the ADC. This process happens over and over at 5-minute intervals. It is exactly what happens when a person looks at the thermometer every 5 minutes.

You said, “The uncertainty first arises because we don’t record the temperature measurement, we record the range in which the temperature occurred.” This is technically not correct. We do record a measurement representing the temperature (along with noise), but there is an implied quantization error or uncertainty in that measurement. We are saying the same thing, but I think your perspective is tripping you up.

You keep referring to the dataset as being “numbers handicapped by a range of values”. I think a better way to talk about this is quantization error/noise/uncertainty. For this is what it is. Whether it is an ADC or a thermometer or a meter stick, quantization noise is inherent in any reading. Every measuring device has limited resolution/precision and every reading will have corresponding error resulting from that limited resolution. On top of this you can have reading error – such as parallax error. It’s hard to screw up reading a digital readout so I’m not sure what the equivalent reading error is for a digital instrument.

I hope this makes sense Kip because these are important points and I don’t think your understanding is complete without them. Every system has signal and noise and all measurements have quantization error. (The signal is the continuous thing that is sampled periodically. The noise can be thought of as any quantity of signal that would not be there under ideal conditions: UHI, siting, instrument generated heat, etc.) Not seeing the fact that we are dealing with a signal allows “scientists” to violate Nyquist – and in so doing guarantees that their average calculations just have even more noise/error/uncertainty. Said simply their numbers “are more wrong”.

I hope this is helpful and not pedantic.
William

• Kip Hansen says:

William ==> I do understand what you are saying. And, signal-to-noise is an analogy for something we wish to measure and the small(ish) (usually) perturbations are “noise” that are overlaid on our “signal”. However, like all analogies, the fit only goes so far.

With temperature, in the old days, the guy looked at the Min/Max thermometer, saw a value — let’s say 71.7 (a bit over 71.5) — and as instructed, carefully wrote down “72”, meaning specifically that the temperature he saw was between 71.5 and 72.5. This use of the range in place of the more accurate discrete number (to the best he could discern it) is not perfectly analogous to noise. We don’t need to “get rid of” or “reduce” it. We do need to take it into account, and do all subsequent calculations treating “72” as a true range.
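Treating each record as a range, as described above, can be sketched with simple interval arithmetic. The helper names and the four recorded values are hypothetical. This follows the range treatment argued for in this comment; it is not the standard-error treatment other commenters argue for.

```python
def as_range(recorded, half_width=0.5):
    """Expand a whole-degree record into the (low, high) range it stands for."""
    return (recorded - half_width, recorded + half_width)

def mean_of_ranges(ranges):
    """Average the low ends and the high ends separately (interval arithmetic)."""
    n = len(ranges)
    low = sum(r[0] for r in ranges) / n
    high = sum(r[1] for r in ranges) / n
    return (low, high)

recorded = [72, 68, 75, 70]  # hypothetical log-book entries
low, high = mean_of_ranges([as_range(t) for t in recorded])
print(f"mean lies between {low} and {high}")  # 70.75 and 71.75
```

Under this treatment the mean of the four records is itself a range one degree wide, the same width as each individual record.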

There is ALSO noise added to our temperature record — of course there is. Some of it is “systemic” (siting, UHI, etc), some of it is “random” (idling ice cream truck with hot air from the freezer heat exchanger). We can try to deal with the systemic and random bits with the common solutions used in other data sets.

The problem with analogies is well known — sometimes we get so stuck on using our analogy that we can mess up our results by not realizing that the analogy doesn’t exactly fit — we end up using techniques that are specifically designed for the real situation of the analogy (radio signals, phonograph needle output) that are not fit for the purpose to which we apply them.

• William Ward says:

Hi Kip,

I hope you will read my reply to Paramenter (just a few minutes ago). There is a lot there that I shouldn’t repeat here.

You said “The problem with analogies is well known — sometimes we get so stuck on using our analogy that we can mess up our results by not realizing that the analogy doesn’t exactly fit — we end up using techniques that are specifically designed for the real situation of the analogy (radio signals, phonograph needle output) that are not fit for the purpose to which we apply them.”

What I’m saying is not an analogy. I’m not sure why you use quotes around the words noise and signal. I used quotes to introduce the concepts, but now that we are past that point, the quotes only serve to somehow place them in a position that doesn’t respect the science and the math. In the context of your reply, the quotes tell me you don’t understand. Maybe that is why you chose to refer to it as an analogy. What I’m presenting is Signal Analysis – which is an engineering discipline. The lay person – or even people who know quite a bit about math are not informed that temperature measurement falls under signal analysis and must comply with the math and laws that govern it. Climate science fails in this regard – and this is something that we need to amplify the understanding about.

Limitations to resolution or precision are inherent in any instrument and whether done in 1850 by a farmer or done by an electronic instrument in 2018 is no different. The result is quantization error or quantization noise. You may not be comfortable with the vocabulary – but that doesn’t change that this is what it is. The story about the guy writing down the nearest whole number is a very non-technical way of saying we have quantization noise. We agree, the quantization noise from human reading of thermometers with 1C resolution cannot be removed after the fact. We do need to take it into account as you say.

Measuring temperature is sampling it whether done in 1850 by a farmer or by a satellite in 2018. Sampling must not violate the Nyquist Theorem, or another type of noise is introduced – called aliasing. I talked more about this to Paramenter – I hope you will read.
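Aliasing can be shown with a small sketch. The numbers are hypothetical and are not a claim about how any station samples: a pure 24-hour cycle is sampled every 18 hours, which is slower than the Nyquist requirement of more than two samples per cycle. The samples come out identical to those of a 72-hour cycle, so the two cannot be told apart afterwards.

```python
import math

def sample_cosine(period_h, interval_h, n):
    """Sample cos(2*pi*t/period) at t = 0, interval, 2*interval, ..."""
    return [math.cos(2.0 * math.pi * (k * interval_h) / period_h) for k in range(n)]

SAMPLE_INTERVAL_H = 18.0  # hypothetical: longer than half the 24 h period

daily_cycle = sample_cosine(24.0, SAMPLE_INTERVAL_H, 12)
slow_cycle = sample_cosine(72.0, SAMPLE_INTERVAL_H, 12)

# The largest gap between the two sample sets is at floating-point noise level.
max_difference = max(abs(a - b) for a, b in zip(daily_cycle, slow_cycle))
print(f"largest difference between the two sample sets: {max_difference:.2e}")
```

The alias period follows from the difference of the two frequencies: 1/18 minus 1/24 per hour equals 1/72 per hour.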

• Clyde Spencer says:

Kip,
There is an old engineering joke that the US television standard, NTSC, stands for “Never Twice the Same Color.” The reality is, it is extremely rare to have the same temperature measured more than once at the same station. We have a data set of a large number of single measurements that one is not justified in averaging to try to improve the precision. The precision is typically +/- 0.5 deg F. While performing the arithmetic to calculate a mean will provide more digits, one is NOT justified in implying two to three orders of magnitude greater precision than the original data. Once again, the Rule of Thumb is that in any string of calculations the answer is not justified to contain more significant figures than the least precise factor in the calculation. Albeit, sometimes a ‘guard digit’ is retained if subsequent calculations might be performed with the numbers. However, properly, after the mean is calculated, it should be rounded off to the same number of significant figures as the original temperature measurements!

• William Ward says:

Clyde – I was looking for the “+” button… I wanted to mash it a few times in support of your post. The buttons seem to have been removed when I wasn’t paying attention. So I’ll use the old fashioned method.

+1

• Kip Hansen says:

William ==> The commenting functions changed a while back after a server crash (hack?) — the good news is that CA Assistant is working on WUWT again.

16. Graphs in this article apparently end at about the peak of the El Niño, which is somewhat misleading. The long-term temperature trend is still up based on University of Alabama at Huntsville satellite measurements, but current temperatures are down to about what they were a decade and a half ago. https://pbs.twimg.com/media/Dn-e5d5UcAEH9_G.jpg

• Kip Hansen says:

Dan ==> Yes, you are right — as an author, one must use what graphics are available and are acceptable to both sides of the climate divide. The most current graphs are updated to end of 2017 — so, there we are.

Of course, this essay is not about what the temperature is — it is about how uncertain it is — or rather, how uncertain we should be about the values presented to us.

All the latest and greatest guesses at GAST (and associated fictitious metrics) are available from the navigation links at the top of every WUWT page: Reference Pages — Global Temperature — Climate.

17. There is a basic fundamental of averaging (see e.g. here) that is missing in all this. You are rarely interested, for its own sake, in the actual average of the entities you calculate. You want it as an estimate of a population mean, which you get by sampling. And if there is systematic variation in the population, the sampling has to be done carefully.

Suppose you wanted to know whether people in the US are becoming taller. So you’d like to know if the average is rising. But sampling matters. You need to get the right number of men and women. You probably need the right number of Asians, Dutch etc. Ages will matter. And of course, the proportions keep changing. So comparing an average in absolute height from one of twenty years ago is problematic.

You can instead collect anomalies. That is, for each person, calculate the difference from some estimated type. 55 yo man of European ancestry, say. And of course you’ll never get that perfect. But if you average those anomalies, you have taken out a lot of the variation that otherwise might cause distortion from imperfect sampling. And your average of anomalies will be a much better indicator of change.
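The height example can be made concrete with a toy sketch. All figures are made up: two groups with fixed reference heights, nobody's height changes, and only the mix of the sample changes between the two surveys. The absolute average moves, while the average of anomalies does not.

```python
# Hypothetical reference heights ("estimated type") for each group, in cm.
BASELINE_CM = {"man": 178.0, "woman": 165.0}

def mean(values):
    return sum(values) / len(values)

def absolute_mean(sample):
    """Average of raw heights; sensitive to who happens to be sampled."""
    return mean([height for _, height in sample])

def anomaly_mean(sample):
    """Average of (height - reference height for that person's group)."""
    return mean([height - BASELINE_CM[group] for group, height in sample])

# Same people-types, same heights; only the proportions differ.
earlier = [("man", 178.0)] * 5 + [("woman", 165.0)] * 5
later = [("man", 178.0)] * 7 + [("woman", 165.0)] * 3

print("absolute:", absolute_mean(earlier), "->", absolute_mean(later))
print("anomaly: ", anomaly_mean(earlier), "->", anomaly_mean(later))
```

The sketch only addresses the sampling-mix effect described above. It assumes the reference heights are known exactly, which is the very assumption the reply that follows disputes.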

• Kip Hansen says:

Nick ==> Wearily, again…

No matter what you do with anomalies, if the individual measurements are not discrete values but significantly wide ranges, and the “estimated types” are only known to within a wide range of uncertainty — then your anomalies will not be precise.

This is the situation we have with surface temperatures.

Your means will always be precise. That will not change the uncertainty of the actual measurements nor of the calculations of the indicator of change — it too will have a wide range of uncertainty.

• Steven Mosher says:

Kip you are not even wrong, Again.

• Geoff Sherrington says:

Well, Steven,
What is your fundamental view on whether interpolated values can and should be used in the calculation of overall error?
That is, are they data from a nominated population of values?
Cheers Geoff.

• Kip Hansen says:

Mosher ==> Thank you for your restraint. I didn’t expect to change your ossified opinion.

• John Endicott says:

Mosh, you drive-by again

• David Dirkse says:

Kip, you really don’t get it. When it comes to surface temperatures, averaging cannot be dismissed. To get a global average, you need multiple measurements. You have the problem of spatial and temporal sampling to determine what the “global” temperature is. You have to do an average, because as you know the poles have ice and there is no ice in the tropics. So the “global” temperature is an idealized theoretical concept that can only be arrived at by sampling. When it comes to sampling, you know that the accuracy is dependent on sample size, and the equation for such is the one for “standard error.” The “s” in the numerator is the instrument SD, but the sqrt of “N” (number of obs) is in the denominator. So to get a good reading on “global” temperature, what you need is a wealth of spatially distinct samples all measured at the same time. The beauty of “anomalies” is that by using them, you eliminate systemic error.

• Retired_Engineer_Jim says:

“The beauty of “anomalies” is that by using them, you eliminate systemic error.”

Only if the systemic error for each reporting station’s instrument and reporting protocol is the same. However, if, for example, various reporting stations are using different instrumentation (e.g., an old-fashioned whitewashed Stevenson Screen and alcohol thermometer v an automated system), and each is read in a different manner (Mark 1 Eyeball and pencil on paper v electronic reading, recording and reporting), it is conceivable that the systemic error for each reporting station is different. Averaging them in any way will not remove the systemic error. And the historical record of temperature at each of the reporting stations has been measured and recorded / reported using different instruments and protocols over the years, so there is no way that averaging across the years will remove systemic error.
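A toy sketch of the two cases, with hypothetical numbers throughout: the true temperature is held constant at 15.0, station A reads 1.0 too high for the whole record, and station B's offset changes by 0.5 when its instrument is replaced after year five. Each station's anomaly is taken against its own first-five-year baseline.

```python
def anomalies(series, baseline_years):
    """Subtract the station's own mean over the first `baseline_years` values."""
    baseline = sum(series[:baseline_years]) / baseline_years
    return [round(value - baseline, 6) for value in series]

TRUE_TEMPS = [15.0] * 10  # hypothetical: no real change over ten years

# Station A: constant systematic error of +1.0 throughout.
station_a = [t + 1.0 for t in TRUE_TEMPS]

# Station B: systematic error steps from 0.0 to +0.5 after year five
# (a hypothetical instrument or protocol change).
station_b = [t + (0.0 if year < 5 else 0.5) for year, t in enumerate(TRUE_TEMPS)]

print("A:", anomalies(station_a, 5))  # all zeros: the constant offset cancels
print("B:", anomalies(station_b, 5))  # a 0.5 step that is not in the true temps
```

In this sketch the anomaly removes an offset that stays fixed for the whole record, and leaves in place a step caused by a change in the offset.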

• Paramenter says:

‘ Averaging them in any way will not remove the systemic error. And the historical record of temperature at each of the reporting stations has been measured and recorded / reported using different instruments and protocols over the years, so there is no way that averaging across the years will remove systemic error.’

Following the discussion, I reckon a counterargument to that is as follows: eventually all those errors become randomly distributed and cancel out. So, say at the beginning of the 20th century a guy who was reading the thermometer at a weather station always rounded numbers up, if the readout was between marks. He was doing that from February to May. At the other station another fellow did the opposite, i.e. was always flooring readings. He was doing that from June to October. I reckon some climatologists believe that eventually, given a large sample, all those errors cancel out and the calculated average will converge pretty closely on the true values.

Interesting.

• Kip Hansen says:

Paramenter ==> Only we don’t have that situation. What we have is that any temperature between 71.5 and 72.5 was correctly entered into the record as “72” so that the real reading by the Min/Max thermometer or the guy reading it is lost forever — he only wrote in his notebook “72”.

On top of that uncertainty (this is real uncertainty, we don’t know what the reading was except that it was between 71.5 and 72.5) we have the fact that the short guy looked up through the glass thermometer and read values a little high and the tall guy looked down and read them a little low.

This is a real life example, by the way. I did the Surface Station survey of the Santo Domingo, Dominican Republic weather station. The Stevenson Screen (still in use with glass thermometers, the ASOS blew away in a hurricane) had a concrete block next to the base. The venerable senior meteorologist explained to me that the block was for the short guys to stand on so they could read the thermometer at eye level — but that very few of the short guys did so, it was an embarrassment, so many of the readings of the station were off by a degree or more. He was quite cheerful about it — it was always “hot” there so a degree or two didn’t make much difference.

• William Ward says:

Kip – loved the story about the short guy.

• Kip Hansen says:

William ==> True story, too….

• >>
Kip, you really don’t get it.
<<

Well, if it isn’t David who doesn’t understand the truth table for implication. Sorry David (and Nick), but you can’t average intensive properties like temperature. The average of intensive properties has no physical meaning. Of course, mathematically, you can average any list of numbers.

Jim

• https://chiefio.wordpress.com/2011/07/01/intrinsic-extrinsic-intensive-extensive/

“An ‘average of temperatures’ is not a temperature. It is a property of those numbers, but is not a property of an object. If it is not a temperature, it is fundamentally wrong to call it “degrees C”.”

““Global Average Temperature” is an oxymoron. A global average of temperatures can be calculated, but it is not a temperature itself and says little about heat flow. Calculating it to 1/100 of a unit is empty of meaning and calling it “Degrees C” is a “polite lie”. It is a statistical artifact of a collection of numbers, not a temperature at all. (There is no thing for which it is the intrinsic property).”

Hence, if there is no thing that is an “average global temperature”, then there is nothing to the concept of “average global temperature” or the anomalies thereof. We are just talking about mathematics and statistics of no … things.

• Mike Borgelt says:

Kernodle, here we go with Units Analysis 101:

1) Adding or subtracting two temperatures gives you a temperature as a result. For example, yesterday it was 60 degrees F, and today is 55 degrees F, for a difference of 5 degrees F.

2) Dividing a temperature by an integer results in a temperature. For example, 1/2 of 30 degrees F is 15 degrees F.

3) An average of a set of temperatures is the sum of all of the temperatures divided by how many there are.

So you are wrong, the average of a set of temperatures is itself a temperature.

• >>
So you are wrong, the average of a set of temperatures is itself a temperature.
<<

It is a meaningless number–physically. In thermodynamics, a temperature is something where you can invoke the Zeroth Law of Thermodynamics. There is no way to place a single thermometer in equilibrium with the entire atmosphere. Therefore, the entire atmosphere does not have a single temperature. Meteorologists get around that fact by assuming that LTE (local thermodynamic equilibrium) holds. So you can measure the temperature of a smaller region–if it is in local thermodynamic equilibrium. To assume the entire atmosphere is in thermodynamic equilibrium is nonsense.

Averaging temperatures is also nonsense.

Jim

• Kip Hansen says:

Jim ==> The average of a data set of temperature will carry the same unit of measurement as the data points in the data set.

The average though is not a temperature actually experienced in the physical world at some average time. It is, in a sense, fictional data — data about an event that did not take place.

Try reading the Briggs essay on smoothing. See if you can get his concept.

• Mike Borgelt says:

Wrong Masterson.

You seem to think that “temperature” is what a thermometer measures. It is not. Temperature is the average kinetic energy of the substance being measured. Since the atmosphere has a finite number of particles, the temperature of the atmosphere would be the sum of all the kinetic energy of each particle in the atmosphere divided by the number of atmospheric particles.

Because of this argument, a quantity called the “global average temperature” exists. Measuring it is done by statistical sampling. If a suitable quantity of geographically dispersed thermometers are read at a given instant in time, the average of their readings is an estimator of the “global average temperature.” Now, increase the number of observations and you get a more precise estimate of the global average temperature.

• Mike Borgelt says:

Kip says: “It is, in a sense, fictional data — data about an event that did not take place.”

Wrong.

For example, suppose you had a hotel with 100 rooms. In each room you had a thermometer. If you take all 100 readings from these thermometers at 10 am in the morning, and averaged the readings, you would be measuring the average temperature of the interior of the entire building. Now some guests have the AC on, and others may have the heat on. If you were to take all of the air in the building, and put it into a container, and let that air reach equilibrium (without heat loss/gain to the external environment), you’d find that the average you measured would be equal to this equilibrium temperature.
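Whether the plain average equals the mixed-air temperature depends on what is assumed about the rooms, and a small sketch makes those assumptions explicit. It assumes ideal-gas air with constant heat capacity and no heat exchange during mixing, so the mixed temperature is the mole-weighted mean. The two-room setup and the temperatures (in kelvin) are hypothetical.

```python
def mixed_temperature_k(temps_k, moles):
    """Equilibrium temperature after adiabatic mixing: mole-weighted mean.

    Assumes an ideal gas with constant heat capacity (an assumption of this
    sketch, not something stated in the comment above).
    """
    return sum(n * t for n, t in zip(moles, temps_k)) / sum(moles)

temps_k = [290.0, 300.0]  # hypothetical room temperatures

# Case 1: every room holds the same amount of air.
equal_moles = [1.0, 1.0]

# Case 2: rooms have equal volume and pressure, so by PV = nRT the
# amount of air in each room is proportional to 1/T.
equal_volume_moles = [1.0 / t for t in temps_k]

print("arithmetic mean:      ", sum(temps_k) / len(temps_k))
print("mixed, equal air:     ", mixed_temperature_k(temps_k, equal_moles))
print("mixed, equal volumes: ", mixed_temperature_k(temps_k, equal_volume_moles))
```

With equal amounts of air per room the two numbers agree exactly; with equal-volume rooms the mixed temperature comes out slightly below the arithmetic mean, by less than a tenth of a degree for this ten-degree spread.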

• Kip Hansen says:

Mike ==> You are missing the point. One can take an average of any set of numbers — that average does not necessarily represent the idea you might wish to assign it — there are so many unphysical assumptions in your “example” that it is self-disproving.

That, however, is not the point. The point is that since the air in the interior of the building is not continuous and not homogeneous and does not exist as a physical entity in and of itself, any numerical value assigned to its “temperature” is, in a sense, fictitious — as the object does not exist in the real world and thus the value of a property of it also does not exist in the real world.

It may or may not exist conceptually — which is a different fish of another color.

• Mike Borgelt says:

Kip says: “any numerical value assigned to its “temperature” is, in a sense, fictitious — as the object does not exist in the real world”

Wrong, and I’ll prove you wrong by a slight modification to my example. Suppose it is 40 degrees F outside. Now, if you take the 100 readings, and obtain the average, you can calculate the amount of BTUs you would need to maintain an average interior temperature of 72 degrees F since the R-value of the building’s insulation would be a known and constant factor. From an HVAC point of view, the “object” we are discussing exists in that building. It will determine the amount of energy you’ll have to expend to keep the hotel guests comfortable.

• >>
Temperature is the average kinetic energy of the substance being measured.
<<

Yeah, well your statement isn’t exactly correct. The definition of temperature in kinetic theory is:

$\displaystyle k\cdot T=\frac{1}{3}\cdot m\cdot \overline{{{C}^{2}}}$;

where k is Boltzmann’s constant, T is the temperature, m is the mass of a gas particle/molecule, and C is the random velocity of a gas particle/molecule.

If we multiply through by 3/2 we get:

$\displaystyle \frac{3}{2}\cdot k\cdot T=\frac{1}{2}\cdot m\cdot \overline{{{C}^{2}}}$

Physicists will recognize the term on the right as the expression for kinetic energy. The expression of the right is the energy of a gas particle/molecule with exactly three degrees of freedom.

So a more correct statement is that temperature is proportional to the average kinetic energy of a gas particle/molecule. An even more correct statement is that temperature is proportional to the kinetic energy of the average velocity of a gas particle/molecule.
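As a worked instance of the relation above, the following sketch solves $k\cdot T=\frac{1}{3}\cdot m\cdot \overline{C^{2}}$ in both directions. Treating air as pure nitrogen and picking 288 K are assumptions made only for illustration; the function names are hypothetical.

```python
import math

BOLTZMANN_K = 1.380649e-23             # J/K (exact SI value)
ATOMIC_MASS_UNIT = 1.66053906660e-27   # kg
N2_MASS = 28.0134 * ATOMIC_MASS_UNIT   # kg; assumption: air treated as pure N2

def mean_square_speed(temperature_k, particle_mass_kg):
    """Solve k*T = (1/3)*m*<C^2> for the mean square speed <C^2>."""
    return 3.0 * BOLTZMANN_K * temperature_k / particle_mass_kg

def temperature_from_speed(mean_sq_speed, particle_mass_kg):
    """Solve the same relation for T, given <C^2>."""
    return particle_mass_kg * mean_sq_speed / (3.0 * BOLTZMANN_K)

c2 = mean_square_speed(288.0, N2_MASS)
print(f"root-mean-square speed: {math.sqrt(c2):.0f} m/s")   # roughly 500 m/s
print(f"temperature recovered:  {temperature_from_speed(c2, N2_MASS):.1f} K")
```

The relation links temperature to the mean of the squared speeds over all the particles, which is why it applies to a gas in equilibrium and not directly to a handful of thermometer readings.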

>>
Since the atmosphere has a finite number of particles. the temperature of the atmosphere would be sum of all the kinetic energy of each particle in the atmosphere divided by the number of atmospheric particles.
<<

Okay, in principle that is correct. You may now sum the kinetic energy of all the gas particles/molecules in the atmosphere and take the average. Averaging a few thermometers near the surface isn’t going to get you there–not by a long shot.

Jim

• Mike Borgelt says:

“Averaging a few thermometers near the surface isn’t going to get you there–not by a long shot.”

Wrong

Statistical Sampling says: you tell me how precise you want the average to be, and I’ll tell you how many thermometers you need. See the equation for “standard error,” it’s proportional to the reciprocal of the sqrt of the number of obs.
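The standard-error relation cited here can be sketched in a few lines. The function names and the example values are hypothetical. The formula assumes independent, random errors; it says nothing about systematic errors or about whether the quantity being estimated is physically meaningful, which is what the replies dispute.

```python
import math

def standard_error(instrument_sd, n_obs):
    """Standard error of the mean: s / sqrt(N)."""
    return instrument_sd / math.sqrt(n_obs)

def observations_needed(instrument_sd, target_se):
    """Smallest N with s / sqrt(N) <= target, i.e. N >= (s / target)^2."""
    return math.ceil((instrument_sd / target_se) ** 2)

# Hypothetical values: instrument SD of 1.0 degree, target of 0.25 degree.
print(standard_error(1.0, 16))          # 0.25
print(observations_needed(1.0, 0.25))   # 16
```

Because of the square root, halving the standard error takes four times as many observations.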

• I wish we had the edit feature back.

My statement: The expression of the right is the energy of a gas particle/molecule with exactly three degrees of freedom.

Should read: The expression on the left is the energy of a gas particle/molecule with exactly three degrees of freedom.

Jim

• >>
Statistical Sampling says:
<<

Says nothing about the physics of the problem. You just claimed to know what temperature is. Temperature is an intensive property in thermodynamics. You can’t average intensive properties. The definition of temperature you were referring to doesn’t use thermometers. So make up your mind. Are we averaging kinetic energy, or using thermometers? Notice that thermodynamic temperature must equal kinetic temperature at equilibrium. When was that atmosphere last in equilibrium?

Jim

• “Wrong Masterson.

You seem to think that “temperature” is what a thermometer measures. It is not. Temperature is the average kinetic energy of the substance being measured. Since the atmosphere has a finite number of particles. the temperature of the atmosphere would be sum of all the kinetic energy of each particle in the atmosphere divided by the number of atmospheric particles.”

If you define “temperature” to mean “average kinetic energy of the substance being measured”, then temperature is exactly the same as “average kinetic energy of the substance being measured”, and so “temperature” (defined as such) is exactly what the thermometer measures. Merely substituting the definition of “temperature” in place of the word does not change what the thermometer measures — you have just stated what the thermometer measures in a different way (multiple words of a definition, as opposed to one word that the definition explains).

Reducing the word to its definition and focusing on molecules instead of average kinetic energy changes nothing. The average kinetic energy of a collection of molecules does NOT exist anywhere. There is NO planet existing where this average kinetic energy has any physical meaning. Compare this average kinetic energy to any selected REAL position on Earth, and you will find a difference that is ALWAYS NOT that average. Sometimes it will be exceedingly higher than this average, sometimes it will be exceedingly lower than this average. There is NO physical place in reality, however, where such an average exists — it is a mathematical fantasy that tells you nothing meaningful about any particular REAL point that went into its fabrication.

“Because of this argument, a quantity called the “global average temperature” exists.”

“This argument” does NOT give any further cause to establish any reality to the quantity called “global average temperature”. Yes, of course, the QUANTITY called “global average temperature” exists, but the QUANTITY says nothing meaningful about any real thing in the world. Too many different circumstances exist where temperatures/average-kinetic-energies are measured to say that these different measures are representing the same controlling factors of physical reality.

Now if you had, say, 100 different ideal planets, where somehow each planet had a uniform temperature throughout the ENTIRE planet of ONE TEMPERATURE, and THEN you measured the temperatures of each of your hundred planets, you might be able to speak of an “average global temperature” that had some semblance of physical meaning. I would call it an “average planetary temperature”. But, for this Earth we live on, where there is NOT a uniform temperature field over the entire planet, any talk of averaging to represent the whole planet is misguided.

“Measuring it is done by statistical sampling.”

I could statistically sample the velocities of every form of mechanical conveyance on Earth that could transport humans from point A to point B — the legs we walk on, automobiles, speed boats, fighter planes, ocean liners, etc — and I could come up with a huge data base of velocities, which I then could average to find the “average global velocity” for human conveyances. Could I then, from this average, determine anything about human conveyance? — say people who move at an average velocity of 1 kilometer per hour at such-and-such compass direction are more likely to die at a particular age?

I submit that something like this is going on, when you do what you say here:

“At a given instant in time if a suitable quantity of geographically dispersed thermometers are read at a given instant in time, the average of their readings is an estimator of the “global average temperature.” Now, increase the number of observations and you get a more precise estimate of the global average temperature.”

The precision of the STATISTICAL manipulation of the numbers is of no consequence, when the result of this manipulation makes no real sense for the particular circumstances to which the method applies.


• Paramenter says:

‘The average of intensive properties has no physical meaning.’

Still, averages may carry some information about physical reality? Assume that all a sun worshiper has is the October monthly temperature average and the average number of sunshine hours for Glasgow and for Malaga, Spain. From the averages (s)he immediately sees that (s)he should head for Malaga. And that’s right.

• Kip Hansen says:

Paramenter ==> Yes, the Mosher-ism applies to GAST calculations: “The global temperature exists. It has a precise physical meaning. It’s this meaning that allows us to say…
The LIA was cooler than today…it’s the meaning that allows us to say the day side of the planet is warmer than the nightside…The same meaning that allows us to say Pluto is cooler than earth and mercury is warmer.”

But that may be all it tells us.

• Clyde Spencer says:

Mike Borgelt,

You said, “See the equation for “standard error,” it’s proportional to the reciprocal of the sqrt of the number of obs.”

I have a question for you. I assert that the annual Earth temperature distribution looks approximately as shown in my essay at: https://wattsupwiththat.com/2017/04/23/the-meaning-and-utility-of-averages-as-it-applies-to-climate/

In the article I state, “Immediately, the known high and low temperature records … suggest that the annual collection of data might have a range as high as 300° F, although something closer to 250° F is more likely. Using the Empirical Rule to estimate the standard deviation, a value of over 70° F would be predicted for the SD. Being more conservative, and appealing to Tchebysheff’s Theorem and dividing by 8 instead of 4, still gives an estimate of over 31° F.” That is, the standard deviation (SD) is tens of degrees, not even tenths of a degree.

The point is that there is a relationship between the range of a distribution and the SD. So, a stated sample mean value should have an associated SD, which is interpreted as meaning that there is a high probability that the true value of the population mean is within two or three SD of the sample mean. That is, there is uncertainty about the sample mean, and no matter how many decimal places are present, that uncertainty is related to the shape and range of the PDF. You are arguing that taking 100 times as many readings will allow the precision of the mean to be increased 10-fold. Assuming that the original sampling protocol was appropriate, there is no reason to assume the range will be changed significantly (except possibly making it larger).

So, my question to you is, “What is the meaning or importance of a claimed ‘precision’ of the mean that is orders of magnitude smaller than the SD, which speaks to the uncertainty of the accuracy of the sample mean?”
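
The two quantities being contrasted in this exchange can be put side by side with a few lines of arithmetic. The sketch below is only an illustration: the function names and the observation count `n_obs` are hypothetical, the ranges are the ones quoted above, and the standard-error formula assumes independent observations. It does not settle which number is the right measure of uncertainty; it only shows that both sets of figures follow from the formulas cited.

```python
import math

def sd_from_range(value_range, divisor):
    """Rule-of-thumb SD estimate: the range divided by a chosen number of SDs."""
    return value_range / divisor

def standard_error(sd, n_obs):
    """Standard error of the mean, assuming n_obs independent observations."""
    return sd / math.sqrt(n_obs)

# Ranges quoted in the comment above (degrees F).
sd_empirical = sd_from_range(300.0, 4)     # Empirical Rule: range / 4
sd_conservative = sd_from_range(250.0, 8)  # more conservative: range / 8

# Hypothetical observation count, chosen only for illustration.
n_obs = 1_000
se_small = standard_error(sd_conservative, n_obs)
se_large = standard_error(sd_conservative, 100 * n_obs)

print(f"SD estimates: {sd_empirical:.2f} F and {sd_conservative:.2f} F")
print(f"Standard error with {n_obs} obs: {se_small:.3f} F")
print(f"Standard error with {100 * n_obs} obs: {se_large:.3f} F")
print(f"Ratio of the two standard errors: {se_small / se_large:.1f}")
```

With 100 times as many readings the standard error shrinks tenfold while the SD estimate is unchanged, which is the gap the question above asks about.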

• >>
Robert Kernodle
September 27, 2018 at 6:07 am

If you define “temperature” to mean “average kinetic energy of the substance being measured”, then temperature is exactly the same as “average kinetic energy of the substance being measured”, and so “temperature” (defined as such) is exactly what the thermometer measures.
<<

Not exactly. The kinetic definition of temperature is valid whether or not the system being measured is in thermodynamic equilibrium. The kinetic temperature is only equal to thermodynamic temperature at equilibrium. A thermometer will only agree with the “average kinetic energy” definition when the system being measured is at equilibrium.

Jim

• >>
A thermometer will only agree with the “average kinetic energy” definition when the system being measured is at equilibrium.
<<

“Only” may be too strong a statement. A thermometer may agree with the “average kinetic energy” definition at times other than equilibrium, but it is only guaranteed to agree at equilibrium.

Jim

• Kip Hansen says:

David ==> We do have systematic error, on the individual station level — and maybe on the ASOS instrumentation level. But we start with “uncertainty” — not error — in that all temperatures between 71.5 and 72.5 are recorded as the same “72 +/-0.5” — a range.

There is no error at all — the ranges are exactly correct (barring the small possible instrumental errors). There is no standard error — we are dealing with averaging ranges. The true uncertainty remains uncertain — we know the uncertainty for GAST = 0.5K (ref: Gavin Schmidt) — that uncertainty cannot be unknown by shifting attention to “means of anomalies”.

Doing so is a method of fooling ourselves about our uncertainty — like the politician who answers every question from the Special Prosecutor with “Not as I recall”. He claims to UNKNOW his own known past.

But — we DO KNOW the uncertainty — and it is a MEAN about which we know the uncertainty, as GAST(absolute) is a mean, and Dr. Schmidt very graciously admits its minimum uncertainty as +/- 0.5K.

The change in the mean is known by simply looking at the GAST(absolute) data — it is a simple graph, with little variation, there is no reason to do anything other than look at it.

The ONLY reason to shift to anomalies is to circumvent the fact that GAST(absolute) comes with the +/-0.5K uncertainty….Gavin Schmidt admits this freely.

• Clyde Spencer says:

Kip,

The anomalies are being presented with more precision than is rigorously warranted. If you take the average (of averages of averages) of a 30-year period, and subtract it from the daily median, computational protocol demands that you retain no more significant figures to the right of the decimal point than the least precise temperature (daily median=1 +/-0.5 deg). That is to say, the daily anomaly is no more precise than the daily median!

• Kip Hansen says:

Clyde ==> I’ll stick to the Mosher-ism: “The global temperature exists. It has a precise physical meaning. It’s this meaning that allows us to say…
The LIA was cooler than today…
it’s the meaning that allows us to say the day side of the planet is warmer than the nightside…
The same meaning that allows us to say Pluto is cooler than earth and mercury is warmer.”

We got that precision nailed down….

• Jeff Alberts says:

David D, you don’t get it. Temperature is an intensive property of the item, or location, measured. Averaging measurements from one station with any other station is physically meaningless, as is any attempt at global temperature.

• Kip Hansen says:

Jeff ==> There is a lot of smearing (spreading of intensive properties across wide areas) in Climate Science. The BEST project uses the geology tool, kriging, to guess at temperatures not sampled — this for a property that is not necessarily evenly spread. Lots of nutty things go on in the attempt to find a ONE NUMBER solution to the question: What is the Global Temperature? The current solution is to use anomalies to determine global change.

• Jim Gorman says:

You are using a sort of strawman here yourself. If your measuring stick only measured in say 50 cm chunks and you rounded your measurement to the nearest 50 cm value, would your anomaly be accurate, or would it have an uncertainty of +- 25 cm? Could you tell from that uncertainty what was happening with your population to within 50 mm? Remember, lots of quotes of temperatures go out to the hundredths of a degree even though the readings are only accurate to +- 0.5 degrees.

• Jim Gorman says:

The “mean” of a population of measurements does not remove the measurement error. Read the article again and try to understand. For example, take two readings, one 24 deg +- 0.5 and another 30 deg +- 0.5 degrees. The mean is 27 deg but the error term is still +- 0.5 degrees and no amount of averaging will remove that error. So you don’t know the actual temperature inside the range of 27.5 deg to 26.5 deg.

Read Kip’s explanation again. You can calculate the mean out to 10 decimal places but all you’re doing is saying the mean of your calculations is very accurate. That is not the same as saying the mean is an accurate measurement of the temperatures.
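
The two-reading example above can be written out as interval arithmetic. This is a minimal sketch under one stated assumption: each reading’s +/- 0.5 is treated as a hard bound on where the true value lies (the worst case), not as a random error with a known distribution. The function name `mean_interval` is hypothetical.

```python
def mean_interval(readings, half_width):
    """Average the lower and upper bounds of readings recorded as value +/- half_width.

    Assumption: half_width is a hard bound on each reading (worst case),
    so the true mean can lie anywhere inside the returned interval.
    """
    n = len(readings)
    low = sum(r - half_width for r in readings) / n
    high = sum(r + half_width for r in readings) / n
    return low, high

# The two readings from the comment above: 24 +/- 0.5 and 30 +/- 0.5 degrees.
low, high = mean_interval([24.0, 30.0], 0.5)
print(f"mean lies between {low} and {high}")  # mean lies between 26.5 and 27.5
```

Under that worst-case assumption the interval around the mean is exactly as wide as the interval around each reading, however many readings are averaged.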

• Jim Gorman says:

The only way you can claim the accuracy of the measurements is reduced is to have multiple measurements of THE SAME THING. In other words, line up 100 people to read one, individual thermometer as quickly as they can. Then you may claim the accuracy of the measurement is better by averaging all 100 readings.

• David Dirkse says:

If you are concerned with anomalies, Jim, you can use my 10 foot stick measurement once, then come back and do the same measurement 5 years later and, with as much accuracy as you want, tell if the average height of the population increased, decreased or stayed the same over the five year interval. The difference between the first and the second samples would constitute the “anomaly.”

• Jim Gorman says:

If you mean the only measurements I can use are 0 feet and 10 feet, there is no way to determine where the actual measurement is, and therefore the average is meaningless. Sure, you can calculate a mean, but tell everyone what it means.

Even if you use the same people, although they grow, will enough of them do so to make the person taking the measurement round up to 10 feet? Remember, with your device you can only round to 0 feet or 10 feet. In other words, is the temperature 27 degrees or 28 degrees?

• David Dirkse says:

Gorman says: “Remember, with your device you can only round to 0 feet or 10 feet.”
..
Please re-read upthread where I posted: ” if you take a stick that is 10 feet tall with markings at one foot intervals”

• Clyde Spencer says:

Nick,

• Thanks, Clyde. The link is here. Actually, it’s a silly reason – I had made a spelling mistake in the original title, echoed in the link. The checker noted it, so I fixed it, and broke the link 🙁

• Clyde Spencer says:

Nick,
Thank you for the corrected link. There is some interesting material there. However, what seems to me to be a glaring error, is not addressing the error of raw measurements. For example, you present the theory of integration and how it can be used to calculate the volume of a sand pile. Like any good mathematician, you assume exact numbers. That is, you ignore the inevitable errors in the real world. In your sand pile case, the apparent depth of the measuring stick may vary by as much as two or three sand grain diameters. Additionally, if the sand is not well-sorted, a large grain on the bottom may keep the stick from getting as close to the bottom as in the other locations sampled. So, you have a mix of known confounding factors (a range of two or three grains where one must make a subjective estimate of the best average), and unknown factors (what is going on at the bottom where it can’t be observed). What you AREN’T doing is providing a rigorous analysis of the errors that modify the results of your theoretical, ideal analysis. In the real world, engineers and scientists don’t have the luxury of idealizing a situation and dealing with only exact numbers.

• RACookPE1978 says:

The engineer can, however, give you a very “accurate and precise” calculation for the number of sand grains in the sandpile.
For a perfect cone of an average pile of average damp sand at an average recline angle of the sand on a perfectly flat surface consisting only of average sized sand grains.

Now for your ACTUAL pile of sand ….. Give me the money and the time, and I WILL count every grain of sand in that pile. And be able to tell you EXACTLY how many grains of sand were in it. But your actual pile of sand may, or may not, have ANY value near the AVERAGE value of that theoretical classroom-lecture pile of sand.

You will then have a data set of 1.

• Clyde,
“In the real world, engineers and scientists don’t have the luxury of idealizing a situation…”
Well, the sand might be a place to start talking about the real world. In fact, estimating the volume of a pile of sand is a common real world activity. People buy it by the cubic yard. How is the payment determined? Probably mostly by eye and trust. But a buyer who wanted to check would have to do something pretty much as I described. And he wouldn’t be worrying about a few grains stopping the probe reaching the bottom. And you don’t get to put error bars on your payment.

But in fact I am describing basic calculus, as it goes back to Newton. Engineers do integration too. And they use the theory that I describe.

• Clyde Spencer says:

Nick,

And when you buy sand by the truckload, you don’t need a very accurate or precise estimate of the amount of sand, because you are going to be paying something like 20 per ton at the source, and only have the ability to pay to the nearest penny, assuming that the supplier is going to be worried about a price that precise. What I’m castigating you for is not paying attention to the details of reality, and pretending that everything is exact. What you should do is calculate the area under the curve for the mean value, and then do the integration with an upper-bound and a lower-bound, as determined by the error in the measurements. You only do the calculation for the mean value, which you treat as being exact.

• Jaap Titulaer says:

Repeating what I said above, now updated to use the same for anomalies:

These are not repeated measures of the same phenomenon; these are singular measures (with ranges) of multiple phenomena (temperature measured at multiple locations). The reduction in uncertainty by averaging ONLY applies to repeated measures of the (exact) same thing, i.e. the temperature at a single location (at a single point in time). NOT to the averaging of singular measures of multiple phenomena.

Also, measurement error is nice to know, but when it is small relative to the standard deviation of the sample population it hardly matters.

What one does when averaging temperatures across the globe is asking what the typical temperature is (on that day).

Say you do that with the height of recruits for the army. We use a standard procedure to get good results and a nice measurement tool. Total expected measurement error is, say, 0.5 cm. We measure 20 recruits (not the same recruit 20 times).
Here are the results:

| Recruit # | Height (cm, all +/-0.5 cm) |
|---|---|
| 1 | 182 |
| 2 | 178 |
| 3 | 175 |
| 4 | 183 |
| 5 | 177 |
| 6 | 176 |
| 7 | 168 |
| 8 | 193 |
| 9 | 181 |
| 10 | 187 |
| 11 | 181 |
| 12 | 172 |
| 13 | 180 |
| 14 | 175 |
| 15 | 175 |
| 16 | 167 |
| 17 | 186 |
| 18 | 188 |
| 19 | 193 |
| 20 | 180 |
| Average | 179.9 |
| StDev (s) | 7.2 |
| 95% range, min | 165.5 |
| 95% range, max | 194.2 |

| Statistic | Value (cm) | Measurement error (cm) |
|---|---|---|
| Average | 179.9 | +/-0.5 |
| Min | 165.5 | +/-0.5 |
| Max | 194.2 | +/-0.5 |

Remember that these are different individuals, so not repeated measures of the same thing, but multiple measures of different things, which are then averaged to get an estimate of the midpoint (average) and range (variance). The min, avg and max all still have that measurement error, but we usually forget all about that because it is so small compared to the range of the sample set.

Now we go for a height ‘anomaly’. Let’s say that over the last 30 years the typical, average height of recruits (at age 18) in county X was 178.2 (and that had the same measurement error, honest).

| Quantity | Value | Measurement error | Low | Mid | High |
|---|---|---|---|---|---|
| Average | 179.9 | +/-0.5 | 179.4 | 179.9 | 180.4 |
| Min | 165.5 | +/-0.5 | 165.0 | 165.5 | 166.0 |
| Max | 194.2 | +/-0.5 | 193.7 | 194.2 | 194.7 |
| Base | 178.2 | +/-0.5 | 177.7 | 178.2 | 178.7 |
| Avg Anomaly | 1.7 | (0) | 1.7 | 1.7 | 1.7 |
| Min Anomaly | -12.7 | (0) | -12.7 | -12.7 | -12.7 |
| Max Anomaly | 16.0 | (0) | 16.0 | 16.0 | 16.0 |

Notice how measurement inaccuracy magically disappears? That’s because I subtracted an earlier average (with the same measurement issue) which happens to have the same range. That is just the measurement uncertainty that I make disappear. But the range within the actual data is very large, and that does not disappear by using anomalies. The correct anomaly is not 1.7 but is in a 95% range of -12.7 to +16.0.

Let’s redo the averaging, now with anomaly adjustment before calculating averages and standard deviations:

| Recruit # | Height anomaly (cm) |
|---|---|
| 1 | 3.8 |
| 2 | -0.2 |
| 3 | -3.2 |
| 4 | 4.8 |
| 5 | -1.2 |
| 6 | -2.2 |
| 7 | -10.2 |
| 8 | 14.8 |
| 9 | 2.8 |
| 10 | 8.8 |
| 11 | 2.8 |
| 12 | -6.2 |
| 13 | 1.8 |
| 14 | -3.2 |
| 15 | -3.2 |
| 16 | -11.2 |
| 17 | 7.8 |
| 18 | 9.8 |
| 19 | 14.8 |
| 20 | 1.8 |
| Average | 1.7 |
| StDev (s) | 7.2 |
| 95% range, min | -12.7 |
| 95% range, max | 16.0 |

And the result is exactly the same.

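
The recruit-height arithmetic above can be checked with a short script. This is a sketch rather than the commenter’s own calculation: it assumes the sample standard deviation and a mean +/- 2 SD rule for the “95% range”, which is what reproduces the quoted figures, and the names `heights_cm`, `baseline_cm` and `summarize` are introduced here for illustration only.

```python
import statistics

# The 20 recruit heights quoted above (cm, each recorded as +/- 0.5 cm).
heights_cm = [182, 178, 175, 183, 177, 176, 168, 193, 181, 187,
              181, 172, 180, 175, 175, 167, 186, 188, 193, 180]
baseline_cm = 178.2  # the 30-year average height used as the baseline

def summarize(values):
    """Return mean, sample SD, and an approximate 95% range (mean +/- 2 SD)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    return mean, sd, mean - 2 * sd, mean + 2 * sd

# Statistics of the raw heights.
mean_h, sd_h, low_h, high_h = summarize(heights_cm)

# Subtract the single baseline from every recruit, then summarize again.
anomalies = [h - baseline_cm for h in heights_cm]
mean_a, sd_a, low_a, high_a = summarize(anomalies)

print(f"heights:   mean {mean_h:.2f}, SD {sd_h:.2f}")
print(f"anomalies: mean {mean_a:.2f}, SD {sd_a:.2f}")
print(f"95% range of anomalies: {low_a:.1f} to {high_a:.1f}")
```

Subtracting one constant from every value shifts the mean by that constant and leaves the standard deviation unchanged, so the spread of the anomalies equals the spread of the heights. The script covers only this single-baseline case.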
• Kip Hansen says:

Jaap ==> Thanks for your long explanation…you just can’t get rid of that original uncertainty ….

• Jim Gorman says:

+100

• “Let’s redo the averaging, now with anomaly adjustment before calculating averages and standard deviations”

Like Kip, you just don’t get the difference between anomaly of mean and mean of anomalies. It makes no difference if you subtract one number from the whole sample. The idea is to subtract estimates that discriminate between the items sampled, to take out that aspect of variation which might confound with your sampling process.

To modify your example, suppose they were school cadets, age range 14-18 say. Then the average will depend a lot on the age distribution. That is a big source of sampling uncertainty. But if you calculate an anomaly for each relative to the expected height for that age, variations in age distribution matter a lot less. Then you really do reduce variance.

I’m surprised people just can’t see this distinction, because it is common in handicapping for sport. In golf, you often calculate your score relative to an expected value based on your history. If you want to know whether the club is improving, that “anomaly” is what you should average. We have a Sydney-Hobart yacht race each year. The winner is determined by “anomaly”, i.e. which did best relative to the expected time for that yacht class. Etc.

• Kip Hansen says:

Nick ==> A little more honesty wouldn’t be misplaced. In your system you first find medians, then you find averages (means), then you find anomalies of those means from other means, and only then do you find the mean of anomalies.

• Kip,
“A little more honesty wouldn’t be misplaced”
More reading, thinking, and less aspersion from you, would be better. There is no measure of medians or interim means in the example here on heights. You simply subtract the age expectation from the measured height.

But if you are referring back to temperature measurement, your assertion is the usual nonsense too.
There is no median calculated. It’s true that conventionally the anomaly is formed at the monthly average stage, to save arithmetic operations. But you could equally form daily anomalies, since at that level it is another case of subtracting the same number from each day in the month. So it doesn’t make the slightest arithmetic difference in which order it is done.

Where anomaly makes a difference is in forming the global average for the month. That is when different references (normals) are being subtracted from the different locations in the average. That is when anomaly makes a difference.

• Kip Hansen says:

Nick ==> The Daily Averages are MEDIANS — read the ASOS User’s Guide, I linked it above. If official Daily Averages are used in your calculations anywhere, then you have started with Medians.

The MONTHLY average, according to the specs, is the MEAN of the Daily Medians.

So — we have, as I stated, first medians, then means of medians, then the Anomaly of Means (Monthly Mean subtracting Climatic Mean), then somewhere up ahead, you take the Mean of Anomalies.

Please correct me if that is not an accurate description of the steps.

I only point it out because of your silly carping about how “Means of Anomalies” is different from “Anomalies of Means”. In effect and in fact, your method does both.

• Kip
“The Daily Averages are MEDIANS”
You just get this stuff wrong in every way. An average of daily min and max is not a median in any sense. A median is an observed value that is the midway point in order (so it never makes sense to talk of the median of two values). The daily average is not an observed value.

But in any case, people often don’t calculate an actual daily average. GHCN publishes an average max for the month, and an average min, and combines those. However, it doesn’t matter in which order you do it.

But you are continually missing the real point of anomalies. They have no useful effect for a station on its own.
You can average min and max whenever, add or subtract normals, whatever. Anomalies matter when you combine different locations. They take out the expected component of the variation, which otherwise you would have to treat very carefully to be sure of correct area representation of values.

• Kip Hansen says:

Nick ==> Did you answer the question on your method for arriving at a Global Average Temperature anomaly?

I thought we were talking about whether you do or do not take anomalies of means and/or means of anomalies…..I see that you wish to ignore all the early steps — medians/means of daily Min/Max — and rely on the fact that these statistically dicey parts are auto-magically done by “the GHCN computer”.

So the GHCN — or your software — takes the Mean of Daily Maxes and the Mean of Daily Mins and then takes the Mean (mid-point — the median of a two number set is incidentally the same as the Mean, in all cases) of those two means (we could rightly call this a Mean of Means….) to arrive at what they are then calling the Average Monthly Temperature for the Station?

So, next, are you taking the Anomaly of that Mean of the Means by subtracting the long term Mean of those same Means for a 30-year period? And then the Means of the Anomalies of all the stations to get the Global Mean?

All in all, the system involves Means of Maxes and Means of Mins, then the Median of the two number set (you may call them Means if you wish — both are correct and both the same for a two number set), then the Anomaly of the same hodge-podge for 30 years on the same month (which I point out is an Anomaly of Means), and then at some distant step later you take the Mean of the Anomalies — which you will label the Global Average Surface Temperature anomaly.

So, we see here BOTH Anomalies of Means and Means of Anomalies…..correct?

Of course all that finding of Means reduces variability — it is just a form of smoothing.
It does not reduce the uncertainty of the values of the data set — nor of the real world uncertainty surrounding the Global Average Surface Temperature — in any of its forms and permutations.

• Kip,
“So the GHCN — or your software — takes the Mean of Daily Maxes and the Mean of Daily Mins and then takes the Mean (mid-point …”
I can’t see why you are worrying about how the monthly average is calculated at a particular site. The various ways that it could be done will produce essentially the same arithmetic done in different order. But anyway:

My program, TempLS, uses as input GHCN Monthly TAVG, unadjusted (and ERSST V5). That is a published file, as are the monthly averages of max and min (TMAX and TMIN). Most of the older data in GHCN comes from a digitisation (of print or handwriting) program in the early 1990’s. There was very little digitised daily data then; GHCN used printed monthly data from Met Offices etc. Mostly that was recorded as monthly average max and monthly min, which they would average. GHCN Daily, which is a databank of digitised daily data, has become substantial only in the last decade.

GHCN still takes in monthly data. It is submitted by Mets on CLIMAT forms, which you can see e.g. here. They submit Tx (max), Tn (min) and T (avg). Whether they calculate T as (Tx+Tn)/2 or daily is between them and their computers. It is always true that T=(Tx+Tn)/2 (rounded), but it would be so either way.

• Clyde Spencer says:

Nick,

When there are only two numbers in a list, the median and mean are numerically equal. However, the reason I suggested that median should be used to describe the daily ‘average’ is because it will ALWAYS be half-way between the two extremes. Whereas, in a typical arithmetic mean, it would be unusual for the mean to be in the middle of the list, except for a symmetrical distribution.
That is, in the real world of temperatures, I wouldn’t expect the mean of temperatures taken every minute to be half-way between the diurnal high and low. (Maybe over the equatorial oceans at the equinox.)

• Kip Hansen says:

Clyde and Nick ==> The process for determining the Daily Average [ (Tmax+Tmin)/2 ] calls for arranging the data set in order of magnitude, highest to lowest (or vice versa) and then finding the middle value between the two — if there is no middle value (as in the case every time there are only two values — which is our case) then one calculates (V1 + V2)/2. Since the process involves identifying the highest and lowest values, we know we are finding a median.

Finding an arithmetic mean does not involve any sorting of the data prior to the procedure.

So, Daily Average is a median between the Max and Min for the 24-hour period.

• Clyde Spencer says:

Nick,

You said, “A median is an observed value that is the midway point in order (so it never makes sense to talk of the median of two values)”

That definition only works for a list with an odd number of items. For all even numbered lists, one still has to find the midpoint of two numbers. That is, the median.

• Clyde,
Yes. The median of 1:9 is 5, but what about 1:10? 5 or 6 or what? You could make an arbitrary choice, ensuring that the median is still a member of the set, or you could split the difference. Fortunately with large disordered sets, there is a good chance that the choice will be between equal values.

Often you use median precisely because it is a member of the set. The median number of children per family might be 2. That might be a more useful figure than saying the mean is 1.7.

But it still doesn’t make sense to talk of the median of two points. It can only be one or the other or the mean. You aren’t adding any meaning.

• Clyde Spencer says:

Nick,

We must live in alternate universes because once again we see things differently. Maybe it is because everything is upside down in Oz.
🙂 The sources I have checked recommend that for lists with an even number of elements, one should interpolate between the two central numbers to determine the median. So, one is ALWAYS dealing with a pair of numbers to calculate the median, for even-numbered lists. A list of just two numbers (T-max, T-min) becomes a degenerate special case that, nevertheless, follows the same rule as all even-numbered lists.

You seem to imply that one can only use numbers that actually occur in a list. That isn’t the case with the mean. That is, one does not select the measurement that is closest to the calculated mean. And it isn’t the case with the mode, where one typically has to round off or bin the measurements to ensure that a given range of values has elements that occur more than once. The main reason we have these measurements of central tendency is to be able to quantify the skewness of a distribution. With coarse quantization of a measurement (+/- 0.5 deg F), calculations of skewness using a median restricted to actual measured values will have less accuracy than those using the interpolated median (particularly for small sample sizes).

• Kip Hansen says:

Nick ==> Referring to the currently used Daily Average as the median of the Max and Min is a better description of both the process and the result — and disambiguates it from what would normally be called a “daily average” (all values added/number of values).

• Kip Hansen says:

Nick ==> After some digging around in the GHCN and US daily summaries I have confirmed that your GHCN_Monthly TAVG is in fact the median of the Monthly Mean of Daily Maxes and the Monthly Mean of the Daily Mins.

So you start your whole program with the Median of two Means — a value that does not represent the Average Temperature at the weather station for the month in question in any scientific way at all — and is just an artifact of the sorry state of past weather records — which, unfortunately, can not be made any better.
There is a point in time when records began to be better kept from which sensible daily averages and monthly averages could be calculated — even annual averages. Instead we continue to use the inadequate, unscientific method which does not reflect the thing we want to know — how much energy is being retained by the climate system as sensible heat? and is that really increasing? decreasing? or staying the same? And, if we have a scientifically defensible answer, how uncertain are we about its value and the value of change?

• Kip Hansen says: Nick ==> Regarding your point “But it still doesn’t make sense to talk of the median of two points. It can only be one or the other or the mean. You aren’t adding any meaning.” We are most certainly adding meaning — at least clarifying meaning. Daily Avg is NOT the average of daily temperatures. But calling it Daily Average implies that it is. Specifically calling it the Median between the Daily Max and the Daily Min makes this clear. A median is a type of average, but we are not taking an average of daily temperature — we are finding the median between the Max and the Min — a different animal altogether.

• Jaap Titulaer says: Nonsense. Any time you take an average you will also have variance. You can’t get rid of that variance magically. As I just showed above, it does not matter when you switch to using anomalies, you’ll keep the variance. All you do is that you remove the measurement uncertainty, which is something else.

18. old engineer says: Does GISS really get daily temperatures by averaging the daily max and min temperature? I suppose this would be okay if the daily temperature distribution was Gaussian. Our newspaper used to publish the hourly temperatures for the previous day. Their distribution was almost never Gaussian. And the lowest temperature was occasionally not at night. I always wondered what would be a good representation of the “average” temperature for the day. I don’t think this is a trivial problem.
If you can’t get the daily temperature right, how are you going to get the monthly and yearly temperatures right? • Kip Hansen says: old engineer ==> Daily Averages are in fact determined by the Hi+Low/2 method. Check any station data set. • Clyde Spencer says: Kip, Which fundamentally is the median of two numbers. So, the data processing includes calculating a daily median (and its associated anomaly) followed up by weekly and/or monthly means (which may or may not account for the months having different number of days), and followed up by annual means. Each step applies a low-pass filter, which makes the variance look smaller. The calculated anomalies are used with more significant figures than warranted, and then used to infill over large areas where there is no data. A sorry state of affairs! • Kip Hansen says: Clyde ==> I’ll second that! • Clyde Spencer says: Kip, I should have said the first step is calculating the median of two extremes, without regard to how long the temperatures lasted, and then claiming that we have correctly characterized the average annual temperature. • Kip Hansen says: Clyde ==> Yes, that is exactly how I see it and this is confirmed by the User Guide for the ubiquitous ASOS system, at least for the Daily Station Averages. • GISS doesn’t measure temperatures. But historically, the people who do recorded the daily max and min. It isn’t perfect information, but it isn’t nothing. And it is what we have. • Reg Nelson says: And what we have is bad (and incomplete) data. Bad data leads to bad science. A true objective scientist would seek the truth, seek good data and good science. A politically motivated scientist would say “it is what we have” and push forward with their non-scientific agenda driven propaganda. • Michael Jankowski says: “…GISS doesn’t measure temperatures…” He didn’t say they measure anything. He asked if they calculate an average based on min and max. 19. 
“It is a trick to cover-up known large uncertainty” I think that there is a flaw in the use of temperature anomalies in general because the assumed constancy of the seasonal cycle is not found in the data. Please see On the uncertainty issue, Hadcrut4 does include uncertainties. The uncertainty is presented here in graphical form and their impact assessed. But there is something funny about them. Please see https://tambonthongchai.com/2018/09/17/hadcrut4-uncertainty/

• Kip Hansen says: Chaamjamal ==> The actual “uncertainty” they discuss is “THE 95% CONFIDENCE INTERVAL OF UNCERTAINTY.” It is a statistical animal — it is not the real uncertainty.

20. HD Hoese says: Where and when did the use of the word anomaly arise in statistics? As in the first statistics book. The only thing I know of close to precision use is in astronomy, but most use is for an irregular type number, not too confident. Sounds too much like the use of normal for average, which gets drilled (or should) into students as the mean. The most critical first and important application before you apply all these is (or used to be) care in sampling of attributes. Agree with S. Skinner above but the dictionary definition is different.

21. Robert B says: Something that I noticed with Australia’s BOM records in recent times. The mean of half hour readings can be a few degrees different to the mean of the highest and lowest values, but the official maximum is usually higher by up to two degrees, and the mean of the official max and min comes very close to the mean of half hour readings. If I was more cynical, I would say that this was deliberate. Temperature is an intensive property, but of the probe at any particular time. The mean of the max and min is not. It’s not as if it’s the mean of a well spread out sampling. It’s the mean of two different times when the thermometer was affected by two very different air masses.
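The comparison Robert describes can be tried on synthetic data. The sketch below is only an illustration under stated assumptions: the diurnal curve is invented (a 24-hour sinusoid plus a narrow afternoon warm bump, sampled every half hour), it is not BOM data, and the names `synthetic_day` and `compare_means` are hypothetical. It shows how the midpoint of the daily max and min can drift away from the mean of evenly spaced readings once the daily curve is not symmetric.

```python
import math

def synthetic_day(samples_per_day=48):
    """Invented diurnal curve in deg C: a sinusoid plus a short afternoon bump."""
    temps = []
    for i in range(samples_per_day):
        hour = 24.0 * i / samples_per_day
        # Smooth daily cycle: coolest near 03:00, warmest near 15:00.
        base = 15.0 + 5.0 * math.sin(2.0 * math.pi * (hour - 9.0) / 24.0)
        # Narrow warm bump centred on 15:00 (assumption, for asymmetry only).
        bump = 4.0 * math.exp(-((hour - 15.0) ** 2) / 2.0)
        temps.append(base + bump)
    return temps

def compare_means(temps):
    """Return (mean of all readings, midpoint of the max and min)."""
    mean_all = sum(temps) / len(temps)
    midrange = (max(temps) + min(temps)) / 2.0
    return mean_all, midrange

temps = synthetic_day()
mean_all, midrange = compare_means(temps)
print(f"mean of half-hour readings: {mean_all:.2f}")
print(f"(max + min) / 2:            {midrange:.2f}")
print(f"difference:                 {midrange - mean_all:.2f}")
```

With the bump removed, the curve is a pure sinusoid and the two numbers coincide; the gap in this toy case comes entirely from the asymmetry of the invented curve, so it says nothing about the size of the effect at any real station.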
• OweninGA says: True, but they have also been caught using instantaneous readings instead of 2 minute averages on their electronic gauges, so anything is possible.

• Kip Hansen says: Robert ==> You can find these same differences in any complete record of station data. It is not intentional — it is an historic artifact from when temperatures were measured with a Hi/Low (or Max/Min) thermometer — like this one (at least, the same principle). So they still figure the Daily Average the same way from the only two values that were recorded in times past.

• Robert B says: I checked quite a few years ago how close the mean of the min and max would be to the mean of 24 hours of readings. I was surprised that they were so close, considering how much effect humidity and cloud cover have on the measurements (for an arid area, so mins and maxes tend to vary a lot): for the few days that I checked, within a fraction of a degree. I was expecting to show that the mean of the two extreme readings was not anything like an intensive property that a global average of would be meaningful, even with a complete record of evenly spaced measurements. Still bugs me that it was close with the official maximum rather than the highest temp recorded on the published temps, e.g. http://www.bom.gov.au/products/IDS60801/IDS60801.94648.shtml The official max was 19.1. A couple of degrees difference and it’s clear both are independent indicators of changes but together are meaningless.

• William Ward says: Robert B, Check out some numbers from the NOAA Climate Reference Network. https://www.ncdc.noaa.gov/crn/qcdatasets.html Using DAILY data for a particular city and date, look up the DAILY MEAN Temp. Then look at the SUB-HOURLY data for the same city and same day. Data captured at 5-minute sample rate. Calculate the average for that day (integrate). Compare to the DAILY MEAN. In the limited examples I have run, I see ~0.4C, 0.7C, 0.9C, 1.0C differences between the 5-minute and daily data.
As I said in another post: the temperature signal is not a purely sinusoidal 1Hz signal. There are higher frequency components. Changes happen over 5 minute spans. It is beyond me how climate “science” gets to ignore the Nyquist Sampling Theorem and basically violate it with (Max+Min)/2. The instrument temperature record is a dumpster fire. It’s a monument to how to not be scientific, despite all of the ivory-tower mathematicians deployed in its name.

22. Dr. S. Jeevananda Reddy says: Does this have any significance with manipulated/adjusted data series? At individual station level there is a standard procedure to get averages. The same procedure is used to get state averages and country averages. The basic problem here is: they don’t represent the climate system existing in the region or country. The majority of the met stations are located in urban areas. Few stations are located in rural areas. Also, with the passing of time the network non-linearly increased and now the numbers are coming down. To overcome this problem at regional, country or global level, satellite data serve the purpose, but unfortunately here also the warmist groups entered to manipulate data. In fact more than a decade back, at an INDIAMET symposium, some groups presented analysis results based on the satellite data released by the Ahmedabad group. At the symposium, they said the data is not correct. Then I questioned them: why didn’t you tell the users of your data before coming to the conference? But no response. Also, there is some other problem: drawing the isolines. Dr. S. Jeevananda Reddy

• Kip Hansen says: Dr. Reddy ==> Your comment is spot on. The Daily Averages of stations do not reflect the climatology of their geographic regions — for many reasons, including station location, UHI effect, and the fact that the “daily averages” are really medians of the Hi/Low.
What the temperature records, up to and including GAST, don’t represent faithfully is whether or not, and if so by how much , the planetary climate system is gaining energy as sensible heat. Because of the disconnect between recorded temperatures and actual local temperature climatology, the isolines (isotherm lines) will be out of kilter with reality. 23. Jim G says: Dear Mr. Hansen. I think there is another, equally important property of measuring instruments that is missing. That is the property of repeatability. Typically, an instrument will read inaccurately, but they will display the same inaccuracy each time. eg. a NIST traceable instrument that accurately measures to .01C is used to calibrate a mercury thermometer. At 25.00C, the thermometer visually reads between 25 and 25.5C, it is thus within the measurement accuracy of the thermometer. The next time the temperature is brought to 25.00C, the thermometer will still show ~25 1/4C, it would not read 24.5 which is still in the tolerance of the instrument. Thus the temperature is inaccurate, but the inaccuracy is repeatable. As for trends, if the repeatability of the instrument is high, say .05C, then the trend would have a precision that is 10x more accurate than the actual values. In the Navy, a reading considered accurate of an analog scale would limit precision to 1/2 the value of the graduation increments. I also do not recall ever seeing signal to noise ratio discussed when it comes to the trend lines and natural variation. All of that said, when temperature adjustments that are equal to or greater than the anomaly, well, all bets are off. You might as well throw darts. • OweninGA says: Jim, That is assuming that the person reading the thermometer is likewise repeatable. My experience with humans on analog scales is one person can be all over the map reading the exact same reading. It isn’t the instrument which is necessarily picking up the +/-0.5, it is the eyeball reading it. 
That is even neglecting the days when it is -20 and the person just wants to get his readings and get back into the warmth inside, or it is 102 in the shade and the person wants to get out of the sun and cool off, but has to get this task done. I think +/- 0.5 is probably a little generous in the extremes of summer and winter, but probably about right for the more clement conditions.

• Kip Hansen says: Jim G ==> Nowadays, almost all important stations are using Automated Weather Stations — Automated Surface Observing Systems. Full details at https://www.weather.gov/asos/ with the User’s Guide (really exactly what they do) in a pdf here. In my opinion, noise to signal ratios apply perfectly fine to radio signals and other messy data. Weather and temperature do not have signal and noise — the record of temperature is simply data — it does not “contain” signal obscured by noise.

24. drednicolson says: Anomalous anomolies anomolize anomolously.

• Kip Hansen says: dred ==> Good one!

25. steve case says: First the night and day time anomalies are averaged up, from which the annual winter and summer anomalies are computed, which in turn are averaged from the tropics to the poles to produce a number that we are expected to think actually means something.

26. Thomas Graney says: And what of the satellite measurements? Are they subject to this same self-foolery?

• Kip Hansen says: Thomas Graney ==> That’s a good question. Anytime we see a very precise value for a metric that is known to be highly uncertain — you know someone is fooling themselves.

• drednicolson says: Like Wayne’s World’s “No Stairway” sign in the guitar shop, some statistics departments need a “No Texas Sharpshooting” sign prominently displayed. 🙂 At least in the original story/joke, the Texas Sharpshooter knew what he was doing.

• Satellite measures absolutely have to use anomalies, and they do. What do you think the average temperature of the troposphere would be? What would it mean?
It would depend entirely on how you weight the various altitudes.

27. Michael Jankowski says: I liked it when GISS or NOAA was presenting annual estimates of global temperature to 0.01 degrees F…then one year adjusted their methodology so that they shifted 3-4 degrees warmer. So much for that pretend 0.01 degree accuracy.

28. fred250 says: I had a great laugh at that Trenberth flux graph. Firstly, it was 2d non rotating. But the really funny thing was that their fluxes were often +/- 2 digit numbers, then they claimed to get an imbalance of 0.6 whatevers it was. So FUNNY !! (someone might have access to that little piece of mathematical garbage) If they can seriously produce something AS BAD as that, they should be immediately sacked from any position related to anything to do with data or maths.

29. ATheoK says: Excellent article, Kip! “Dr. Schmidt readily admits it is around 0.5 K.” Which means that 0.5 K is the absolute minimum uncertainty. Several times, here on WUWT, engineers and others experienced in defining real world uncertainty have listed all uncertainties that should be included in Schmidt’s or the IPCC’s anomalies. Perhaps, at some point, WUWT could capture those real world recommendations into a reference page? One certain missing uncertainty is ‘adjustments’. That daily temperatures undergo constant adjustments is an admission that uncertainty is much higher.

• Kip Hansen says: ATheoK ==> I should have said, +/- 0.5K of course. And that is the “absolute minimum uncertainty” of the temperature record — that’s how we have recorded it — as uncertainty ranges of +/- 0.5…. We have to admit in here somewhere that the +/- 0.5 of the original record is +/- 0.5 Fahrenheit — other uncertainties make it add up to Schmidt’s +/-0.5K.

30. Percy Jackson says: Kip, My question is so what if you are right? Even if the uncertainty is as large as you claim it to be, that doesn’t affect any of the underlying facts, just our ability to measure them.
Taking the central point of the temperature and looking at how it varies with time will still show a considerable rise over the past century even if we are not 100% sure of the precise value. Furthermore if the uncertainty is as large as you are claiming then I can claim that climate models fit the measured temperatures to within the uncertainty and so they can be relied on to predict the future. Furthermore a rising global temperature is in agreement with all other measurements such as global sea level rise, energy imbalance at the top of the atmosphere, earlier dates for spring, earlier harvesting of grapes, steady decline of arctic sea ice over decadal time scales etc. Taken all together the evidence for global warming over the past century is overwhelming.

• Kip Hansen says: Percy ==> So what? Not an unfair question. Gavin Schmidt says the uncertainty he admits means, among other things, that the whole “warmest year” contest is nonsense — we can’t tell one year from the one before — given the real uncertainties, most years of the last decade are indistinguishable from one another — and that also means that it is uncertain whether it is getting warmer or staying the same on a decadal level. We are pretty sure that it is generally a bit warmer now than it was at the end of the Little Ice Age. And, a little less certain, but still pretty sure, that the first 17 years of the 21st century are a little warmer than the flat/or/cooling 40 years from 1940-1980. We are not certain at any quantitative level at a precision of hundredths or tenths or even halves of a degree K. Given that the real uncertainty according to Dr. Schmidt is +/- 0.5K — we are not yet sure that the 1980s temperature is really different from last year’s temperature — it is within the uncertainty. Thus Policy Recommendations based on the panicked claims of runaway global warming are not justified — we are not yet even certain about the last 40 years….
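The "within the uncertainty" reasoning in Kip's reply amounts to simple interval arithmetic, which can be sketched as follows. This is an illustration only, assuming each annual value is treated as a range of plus or minus 0.5 K as in the comment; the function name `ranges_overlap` and the sample values are hypothetical, and the check is not a statistical significance test.

```python
def ranges_overlap(value_a, value_b, half_width=0.5):
    """True if [a - w, a + w] and [b - w, b + w] share at least one point."""
    return abs(value_a - value_b) <= 2.0 * half_width

# Hypothetical annual means in kelvin, for illustration only.
print(ranges_overlap(287.9, 288.4))  # ranges 287.4..288.4 and 287.9..288.9 -> True
print(ranges_overlap(287.0, 288.4))  # ranges 286.5..287.5 and 287.9..288.9 -> False
```

Whether the ranges should be treated this way, or whether part of the uncertainty cancels when two years are differenced, is exactly what the replies below dispute.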
• Percy Jackson says: Kip, I would disagree that we are not yet certain that the 1980s were different from last year’s temperature. This is the difference between systematic errors and random errors. The systematic error in the global temperature might be +/- 0.5K but as long as that is constant then we can certainly tell whether one year was hotter than another. Most of the errors you are discussing are systematic errors and thus cancel out when you take the difference to create an anomaly. I would also disagree that the policy recommendations are not justified. Imagine if you went to the doctor and they told you that you had a tumour but weren’t certain whether or not it was cancerous. And furthermore the doctor said that by the time it was big enough for them to be certain then it would have spread and your chances of dying would be 100%. Would you wait to be certain or take the doctor’s recommendation and have it removed now? The policy recommendations are based on the experts’ best predictions about the future. Whether or not society should act on them depends on a wide range of things such as a cost/benefit analysis, how much we care about people in other countries, people living in the future who haven’t been born yet etc.

• Reg Nelson says: Percy, it does make a difference, a huge difference. We don’t know what the global temperature was in 1850 and certainly not to one tenth of one degree Celsius. Policy decisions based on pseudo-science are bad decisions. The whole “what about the children” argument is laughable. What about the 21 trillion dollar debt that our children and grandchildren will inherit? Do you care about that? It’s a far more tangible problem, but I don’t see anyone talking about that.

• Percy Jackson says: Reg, Again you are forgetting the difference between systematic errors and random errors. If I weigh myself on my scales at home I get an answer that is about 1kg lighter than if I use the scales at the gym.
Hence I might claim that I know my weight with an accuracy of +/- 0.5 kg. However if I weigh myself repeatedly on the same scales (over a short period of time) I get the same answer to within the precision of the scales, which is 0.05 kg. Now the relevant question is what is the smallest amount of weight I can gain and still be certain that I have gained weight – is it 0.5 kg or 0.05 kg? The answer is 0.05kg, which is because I am measuring an anomaly and the accuracy of that is 10 times the accuracy with which I know my actual weight. The whole “what about the children” argument is the whole point. The question is how much do we care about the children, especially children who will be born in other countries and who will have a lower standard of living if the predictions about global warming are true. Which is a political and moral question, not a scientific one.

• Steve Reddish says: Percy, does your home scale have a digital readout? Does it display tenths of a kilogram? I recommend you test it by weighing yourself twice to be sure of consistency, then weigh yourself again holding 2 tenths of a kilogram (measuring cup + water). I predict your indicated weight will usually not change when you run this test. I tested my home scale and determined the scale was rounding my weight to the nearest .5 kilogram, then converting that number to pounds – displayed to 1 decimal place. (I live in the US, so set the scale to indicate pounds.) Even though my scale indicates tenths of a pound, it takes a gain of about a pound, on average, to make the display change. That change is always by 1.1 pound. I cannot say I know my weight within .05 pounds (per the precision of the display), I can only say I know my weight within .55 pounds – the precision of the rounding. Yes, my scale must be a cheap model, but it shares a problem with the weather service: rounding the reading. The precision/accuracy of the display doesn’t matter as much as the rounding of the measurement.
It doesn’t matter if an instrument measures within a tenth of a degree if the measurement gets rounded to the nearest degree when recorded. Once rounded, the precision of the original measurement is degraded to that of the rounding method. Never to be rehabilitated. SR

• Jim Gorman says: Steve, You hit the nail on the head.

• Percy Jackson says: Steve, You are missing the point about systematic errors and random errors. In the example above there is a systematic error between the two sets of scales of about 1kg so I do only know my absolute weight with an error of +/- 0.5 kg. However each set of scales is consistent and can reproducibly tell me my weight to an error of 0.05 kg and hence I can measure any weight gain to an accuracy of 0.05 kg and that value is the same with both sets of scales. So any systematic errors cancel out when you calculate the anomaly (i.e. weight gain) and thus I know my weight gain with an accuracy that is 10 times better than I know my actual weight. The same is true with any measurement where there are systematic errors. Taking the difference between two readings will remove the systematic errors, resulting in a much more accurate measurement of differences.

• Kip Hansen says: Percy ==> And where, in the GAST world, do you think we have systematic errors that are fixed between “scales”? And what, pray tell, do we do with the ranges of the original individual measurements? (I suggest that “ignore them” is the wrong answer.)

• Kip Hansen says: Percy ==> One last time — the majority of the +/-0.5K is NOT ERROR — it is uncertainty that stems from original temperature records being recorded at 1-degree-Fahrenheit ranges rather than discrete measurements. You may choose to ignore the uncertainty but you do so at the risk of fooling yourself about important matters that bear on public policy.
Because the +/-0.5K stems from real uncertainty and not random error or systematic error, it does not resolve or reduce by long division and we have the same “no idea” of the real values within the range. We could possibly ignore uncertainty on a range of 1K if we were looking at a change of 10-20K in the metric in question. Unfortunately, we are confronted with real uncertainty on the same scale as the change in the metric — GAST — which leaves us uncertain. No amount of statistical samba will remove the uncertainty. We are somewhat sure that it might be a little warmer now than 1940-1980 — in another ten years we ought to be able to be more certain.

• Percy Jackson says: Kip, I have an experiment for you.

Step 1: Choose your favourite probability distribution.
Step 2: Using that distribution choose 1000 random numbers in the range 0 to 10.
Step 3: Calculate the average of those 1000 numbers; call this number T1.
Step 4: Repeat steps 2 and 3 and call the resulting number T2.
Step 5: Calculate the temperature anomaly T2-T1.

You will get an answer somewhere between -0.2 and 0.2 (with an exceedingly high probability). Now imagine that this is the average of a set of 1000 temperature measurements with an accuracy of +/- 5 degrees. Despite the fact that each individual temperature reading has an error of 5 degrees, the difference between the two averages will be less than 0.2, which is much less than the individual errors. This is the advantage of calculating the anomaly.

• Kip Hansen says: Percy ==> I don’t want probabilities — I want to know what the global average temperature was for 1990. Not what it probably might have been or how probable such and such a guess might be. I want to look at the data set and arrive at the mean of the global temperatures (quite complicated, area weighted and all that) — that GAST will come with its own uncertainty as a notation such as suggested by Gavin Schmidt 288K +/-0.5K.
That complicatedly arrived at mathematical/arithmetical mean is the data as close as we are going to get it. Its uncertainty is not reduced by switching to probabilities — we can’t UNKNOW the KNOWN UNCERTAINTY — we can only pretend that we can…..

• Percy Jackson says: Kip, Again you are missing the point of the experiment. While it is not possible to say with a high degree of accuracy what the numbers T1 and T2 will be (since I do not know what probability distribution you chose) I do know that T1-T2 will be very small and will lie within +/- 0.2. Thus I can calculate temperature anomalies with a much higher degree of accuracy than I can calculate absolute temperatures.

31. Michael Carter says: “We use the period 1981-2010 to determine the average which is World Meteorological Organisation standard. It is a moving 30 year average so in 2021 we will begin using 1991-2020 as the new “average” if that makes sense.” The above quote is from NIWA NZ. It is a reply I received to my query on how they establish baselines (the “average”). Neither the media nor NIWA describe the baseline when making statements such as “above average” in relation to weather events. But they have a problem looming. In 3 years’ time they shift to baseline 1991-2020 which (depending on what data you use) is around 0.2 C higher than the 1981-2010. What with the flattening of the trend over the last 19 years, results (for e.g. monthly mean) may well be below average. Oh dear! What will they do then? Regards M

32. Stan says: They’re not fooling themselves, they know exactly what they are doing.

33. Geoff Sherrington says: Kip, For more than 6 months now, I have been asking the Australian BOM to provide a figure expressing the total error expected when you want to compare 2 past daily temperatures from Acorn stations in the historic record. As in T deg C +/- X deg C, what is the overall best value of X? They say they have a report in progress for later this year, that will have the answer.
They have declined some invitations to quote one from every-day working experience. It might be interesting for folk in other countries to ask, or if the answer already exists, to dig it up for their countries as an attributable statement. For years now I have been pointing to the existence of formalism in error estimation as by the Paris-based BIPM (International Bureau of Weights and Measures). There are others. Signatories to some of these that are Conventions are expected to report findings derived in ways laid out in the Convention Agreement. Sometimes there is domestic legislation that requires country bodies of a government nature to conform to certain procedures relating to uncertainty, for which audits are available. The whole business of error has been badly downplayed in climate work, probably in other fields as well, where I have not looked so closely. There are very good reasons why we have procedures to measure and to estimate error, but lip service or less seems the name of the game these days. In the ideal world, proper error estimates are good for forming initial appraisals of the standard of scientific work, just as their misuse is a sign of poor science. Thank you, Kip, for again shining a light that is needed. Geoff.

• Kip Hansen says: Geoff ==> Thank you. It is a difficult thing for those (over-) trained in statistical theory and practice to wrap their heads around — they do truly believe that uncertainty can be reduced by the methods they have learned — even known uncertainty can be reduced to wee tiny uncertainty for the same metric by a little subtraction and division….

34. William Ward says: Kip, nice write up. Thanks. I challenge Schmidt’s assumption of +/- 0.5C uncertainty. I think it is much larger and probably not symmetrical. I suspect a larger error range to the minus side due to warming bias mentality that affects their processing algorithms. There are several reasons for my suspicion that the error range is greater.
First, the instruments for much of the record were not calibrated. This might add another degree or 2. That error for a particular instrument would be constant up to the point the instrument broke or was replaced and then a step discontinuity would be introduced with a different error. There is reading error (like parallax – or not reading the meniscus – assuming glass thermometer). If a 6’6″ person is reading the thermometer at the post office and then the reader is replaced with a 5’1″ person, unless they get some training you get a different reading. What happens if the readers alternate days. Of course, there is thermal corruption of various sorts (bad siting, UHI, internal transmitters, etc.). Another one that bothers me is the violation of the Nyquist Sampling Theorem. To accurately recreate the original signal without aliasing, then the signal must be sampled at a frequency greater than twice the highest frequency component of the signal. The daily temperature signal is not a purely sinusoidal 1Hz signal. There are lots of higher frequency components. Therefore, it must be sampled much more than twice a day – perhaps every 5 minutes or so. The lower the thermal mass of the measuring device (thermocouple for example) the more quickly it responds to changes. And therefore, it needs to be sampled more often. Instruments with different thermal masses will measure the same event with different results. I’m pretty sure there is no standardization in this regard. The goal for modern instrumentation readings should be first to start off with calibrated instruments. Each instrument should be required to adhere to a certain thermal mass response model – perhaps modeled after glass/mercury thermometers if the desire is to best match the historical instruments. Then, using the proper sample rate, the “area under the curve” should be determined to come up with an accurate “average daily temperature”. 
In some quick experiments I have run, I can get 0.5C – 1.0C difference using this method vs. the simple (max+min)/2 method. (And rarely is the actual max or min captured for the day.) This method would create a discontinuity with the older record, I know. But for the past 40 years we could have easily performed measurements along 2 tracks: 1 to align with past methods and 1 to align with some real engineering methods. Note I didn’t say scientific methods because as I see it the world of “science” is often about publishing papers and not much is ever grounded in reality. Engineers get numbers right or people die. (“I know the plane crashed, but the publisher loved my paper!”) What do you think about the uncertainty range? If it is much larger, then what does that do to the Alarmist Narrative?

• Kip Hansen says: William ==> The true uncertainty of GAST (its absolute value or its “anomaly from the climatic mean”) is certainly greater than +/-0.5K — how much? Don’t know. The failure is believing that statistical definitions can change physical facts — something an engineer understands is incorrect and dangerous.

• Paramenter says: ‘Another one that bothers me is the violation of the Nyquist Sampling Theorem. To accurately recreate the original signal without aliasing, then the signal must be sampled at a frequency greater than twice the highest frequency component of the signal. The daily temperature signal is not a purely sinusoidal 1Hz signal. There are lots of higher frequency components. Therefore, it must be sampled much more than twice a day – perhaps every 5 minutes or so.’ I reckon that matters not, at least for the purpose of ‘global averages’. They don’t want to actually reconstruct the signal – in this sense, the variation of the daily temperature as a function of time. All they do is simply take a median between the highest and lowest recorded temperatures per day. Whether that’s good enough – that’s a different question.
On the other hand quantisation inevitably introduces errors as we approximate and assign values to the discrete levels, so yes, quantisation is always lossy – some information from the original signal (in our case temperature measurement) is irreversibly lost. • William Ward says: Paramenter, I think you miss the point. You can’t calculate the average accurately unless you have all of the information to do so. Under sample and you alias. Your average is wrong. I just posted to someone else so won’t repeat it all here, but this is the summary. If you integrate/average the 5-minute NOAA reference data for any location over a 24 hour period, that average may differ from the NOAA reference data stated daily mean by 0.5C – 1.0C. Quantization error is an additional problem. As is reading error, thermal corruption, data infill, homogenization, etc. The instrument record is badly sampled temporally. The instrument record is badly sampled spatially. It is not adequate for scientific endeavor. • Paramenter says: Hey William, I reckon I’ve got the point quite nicely. I can’t see just yet how this point applies to the problem we discuss. But – I’m slow thinker. Will get there, eventually. ‘Under sample and you alias. Your average is wrong.’ For older temperature records you even don’t sample. All you’ve got is a midpoint between minimal and maximal amplitude . You don’t even know how many times and when the signal crossed this value. When we capture daily median eventually we build another signal consisting of several datapoints in the form of daily medians. This signal has obviously very little common with underlying temperature signal due to massive undersampling and quite substantial quantisation errors. Therefore uncertainties must be huge and daily, monthly and yearly average series are utterly unreliable, at least for the orders of magnitude we’re concerned. Is it what you try to say? 
‘If you integrate/average the 5-minute NOAA reference data for any location over a 24 hour period, that average may differ from the NOAA reference data stated daily mean by 0.5C – 1.0C.’ I know it is easy to check but would you mind provide some examples? • William Ward says: Paramenter, (and Kip) Thanks for the opportunity to expand on this. This is a vitally important issue, because it undercuts the entire measurement record used in climate alarmism. I doubt this will be easy to digest if you have not studied it but here is a basic link: https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem I recommend you don’t give up if this is confusing. It takes some formal study for it to make sense. It is not esoteric – it is not some abstract thing that is wrong to apply here. This is what underlies so much of the technology you enjoy today. Digital music, digital movies and digital instrumentation would not be possible without the application of Nyquist. This is not math that can be ignored – but somehow it is ignored by climate science. Here are some fundamentals. This has to do with basic signal analysis. It’s an “engineering thing” – it is a “science thing” and a “math thing”. Engineers use it because engineers make things that work in the real world. Scientists and mathematicians may choose to ignore it because what they do never comes back to reality. Mathematicians can take numbers and do operations on them. They never stop to realize that the numbers they take are COMPLETELY INVALID from the start. I’ll explain. First: Any signal that is continuous and varies with time has frequency components. The term “signal” may not be a familiar one to many, but temperature measured at a point in space is continuous – it doesn’t stop and start – and it varies with time. It is a signal, even if that vocabulary is not widely understood. Second: Any measurement of that signal is a “sampling”. This too may be an unfamiliar vocabulary term. 
Whether the sampling is done by a human or a piece of electronic equipment is not important. Whether it was done today or in 1850 it was sampling. Third: The sampling must not violate Nyquist, or the data is corrupted with aliasing. The sampling must be performed at a frequency (rate) that is at least twice the highest frequency component in the signal. If the temperature, measured at a point in space, varies exactly as a 1Hz pure sine wave then you can sample 2 or more times a day and not violate Nyquist. However, the temperature does not vary as a pure 1Hz sine wave, so the twice per day measurement/sampling of the signal by definition violates Nyquist. Summary: The twice a day measurements: 1) Violate Nyquist 2) Cannot be averaged to get an accurate mean 3) Cannot be used to understand the thermal energy present with any accuracy 4) Are not valid for scientific use. When people started recording temperature in the mid 1800’s they didn’t understand this. Measuring the high and low of the day served them as it related to planting crops and living life in the mid 19th century. It was not intended to be used for science. Now, we hear “that is the data we have – so we have to use it”. You know, we don’t have to accept that position! The correct response to that is “no we don’t – that data is corrupted and not valid for scientific or mathematical work!” The Nyquist requirements have been known since 1928. There is no good reason why we have continued to use aliased (NYQUIST VIOLATED) data since then. Even our satellite data does a 2x/day measurement. How can so-called scientists allow this? Now for the example you request. Go to the NOAA climate reference network data. This is supposed to be some of the best data available. Start with this data: DAILY (2017/Cordova Alaska/August 17, 2017): https://www1.ncdc.noaa.gov/pub/data/uscrn/products/daily01/2017/CRND0103-2017-AK_Cordova_14_ESE.txt You can import this into Excel. 
In columns 39-45 you see “DAILY MAX TEMP”: 13.7C In columns 47-53 you see “DAILY MIN TEMP”: 6.1C In columns 55-61 you see “DAILY MEAN TEMP”: 9.9C 9.9C is the stated DAILY MEAN temperature. You can calculate it yourself (13.7+6.1)/2 = 9.9C. Now, compare that to the NOAA climate reference data for the same location (Cordova Alaska), on the same date (8/17/2017) – but use the “SUB-HOURLY” data (taken every 5 minutes): https://www1.ncdc.noaa.gov/pub/data/uscrn/products/subhourly01/2017/CRNS0101-05-2017-AK_Cordova_14_ESE.txt Import this data into an Excel file. In columns 58-64 you see “AIR TEMP”. You can integrate (average) this data (sampled every 5 minutes) over the 24-hour period. You will see the Max and min temps (13.7 and 6.1) used in the DAILY data above. HOWEVER – if you average the data using the higher sample rate you get an average of 9.524C which can be rounded to 9.5C. Note, “quantization error” (due to the limits of the ADC) is already included in these measurements. The quantization error is small and not worth discussing at this point. But the error introduced due to ALIASING – VIOLATING NYQUIST – equates to 0.4C. IMPORTANT: Without the 5-minute sampling, which is used for the daily data, it is unlikely that the actual MIN and MAX will be captured. This will introduce even more error. Run more examples and you will find the error can be even greater. The twice a day sampling of temperature is mathematically invalid. All of the data, whether taken in 1850 by a farmer or in 2018 by a satellite is invalid/wrong/flawed. Sure, you can take the measurement and do all kinds of math on those numbers, but you are dead from the start. The fact that this is not well known and beat like a drum is simply confounding. I hope this helps and inspires more discussion. • Kip Hansen says: William==> I am out of the office for the rest of the day — I will try to get you a complete reply in the morning. • David Dirkse says: Mr. 
Ward says: “so the twice per day measurement/sampling of the signal by definition violates Nyquist.” and Mr. Ward says: “The twice a day sampling of temperature is mathematically invalid.” But what Mr. Ward does not understand is that Tmin and Tmax are not two “samples.” The instrument that collects the data samples continuously throughout the day and records two distinct values, namely Tmin and Tmax for a given 24 hour period. • William Ward says: Mr. Dirkse, Tmax and Tmin are 2 samples. There is no need to place “” quotes around the word samples. Most older instruments did not accurately capture the actual max or min temperature for the day, if for example the instruments required a person to be reading it at the time the max or min occurred. But this is not relevant to the core point. Even if you assume you arrive at 2 accurate samples of the highest and lowest recorded temperature for a 24-hour period, 2 samples are not enough to satisfy Nyquist. No one will stop you from taking those 2 numbers, adding them together and dividing the result by 2. But this will not yield the same result that sampling according to Nyquist will yield. Your (Max+Min)/2 result may be more accurate than a scenario where the highest or lowest numbers are not accurately obtained, but your number will be highly aliased and full of error – wrong. But if it makes you feel any better, you will be in the company of countless climate scientists who likewise are content with using mathematically flawed data. If you care about accurate data, being scientifically and mathematically correct, then digest this point about Nyquist. It is a huge gaping hole in climate science. • David Dirkse says: Tmin and Tmax are not two “samples”. You are inappropriately applying Nyquist in a place it should not be applied. The “climate” at any given location can be wholly specified by the daily Tmin and the daily Tmax as an anomaly with respect to the 30 year average at said location. 
When you collect 30 years’ worth of daily data, the Tmax-Tmin number exactly represents the sinusoidal waveform of daily averages that are exhibited on a yearly (seasonal) cycle. …. Now if you wish to disprove the appropriateness of the existing methodology, then you would need to collect 30 years’ worth of data at a given location with a higher “sampling” rate, then compare the derived averages with the Tmin/Tmax methodology. Good idea for a study/paper that you can get published, shouldn’t be too hard for someone like you. I’m sure that someone like you with a wealth of experience in signal processing can overturn decades of how meteorology has been done. • David Dirkse says: Another reason, Mr. Ward, that the application of Nyquist in this situation is invalid is that both Tmin and Tmax are auto correlated with the prior day. To illustrate this consider a monitoring station located 50 miles north of New York City on January 20th. If Jan 20 Tmin is 15 degrees F, and Jan 20 Tmax is 42 degrees F, then Jan 21 most likely will not have a Tmin of 52 degrees F and a Tmax of 78 Degrees F. This phenomenon is known as “Winter.” Nyquist doesn’t work in this scenario. • William Ward says: Mr. Dirkse, Of course they are samples. You have 2 discrete numbers measured from a continuous time varying signal. By definition they are samples. Of course Nyquist applies. The signal is continuous, time varying, band-limited and is being sampled. I don’t need to collect data, write a paper or prove anything. Nyquist did that (and Shannon). But what people like you and I can do, and others on this forum can do is first, understand that Nyquist exists, second see that it applies, third see that it results in big measurement error and fourth start to cry foul and refuse to accept a body of alarmism and manipulation built upon a foundation of violating mathematics principles. I’m just one person. 
I present this here to help spread the knowledge with the hope our voices against manipulation can be stronger. • David Dirkse says: The “signal” you are applying Nyquist is auto-correlated. How does Nyquist handle an auto-correlated “signal” ?? What modifications to your application of the theory must you make because the input signal is technically not random because of the auto-correlation ? • David Dirkse says: Ward says: “they are samples. You have 2 discrete numbers measured from a continuous time varying signal.” Not exactly. The values have been selected from a set of samples. Suppose you have a station that collects hourly data. You have 24 samples. You then select the highest value and the lowest value from the 24, then discard the other 22 samples. You technically do not have 2 samples, because you are ignoring the 22 you’ve discarded. You can’t apply Nyquist in this situation. • William Ward says: Mr. Dirkse, Regarding your comments about auto-correlation all I can say is that you are speaking mumbo-jumbo. I don’t mean to be disrespectful. You are resisting a new concept, apparently. When you are sampling a signal, it doesn’t matter what you had for breakfast or what is going on in New York. If you want to make an analog signal digital, then Nyquist is your man. The max and min would be of value if the day was a constant max temp and then at a certain hour the temp snapped to the min temp – AND if the time of high and low were exactly equal then a simple average would yield the beloved mean temperature. The fact is that temperature is varying up and down all day and night. The goal is to capture all of the energy in those “nooks and crannies” of the changes. If you follow Nyquist, you get the right answer – the real Mean temp. If you violate Nyquist, you get a number – but not an accurate one. Why are you fighting this? 
I’m glad to offer the knowledge I have on this if it is helpful, but if you want to throw mumbo-jumbo because the concepts I bring up rattle your cage then I can’t help you. Further mumbo-jumbo will not be responded to. Peace to you. • David Dirkse says: Dear Mr. Ward, please review the following “mumbo-jumbo” before you post your next comment. You obviously do not understand that time series data and a “signal” in your Shannon-world are not orthogonal. .. .. https://en.wikipedia.org/wiki/Autocorrelation .. What I like about the wiki post is that it talks in something you’ll “understand,” namely signals. • David Dirkse says: You see Mr. Ward, I’m not ignorant of your Shannon/Nyquist worldview. Suppose you have a communications channel that is one byte wide. If you receive a byte of value X at time t, then in your world, the value of the byte you get at time t+1 has a probability of 1/256. In an auto-correlated time series of climate data, this is not true. the value at time t+1 is dependent/correlated on/to the value at time t. In the real world what this means is that on January 20th, if the Tmax is 20 degrees F, the probability of the Tmax on January 21st being 25 degrees F and the probability of Tmax on January 21st being 83 degrees F are NOT THE SAME . This is why you can’t apply Shannon’s theory. In your world the probabilities would be identical, but auto-correlation (WINTER) says this is not the case. In your “Shannon”/signal processing world, the temp on January 21st could be 73 degrees F with equal probability as it being 22 degrees F. • William Ward says: Mr. Dirkse, I assume you know a great deal about doing digital signal processing in the field of digital communications, so I mean no disrespect. I’m really not sure how to address what you say. We seem to be on a different wavelength. Nyquist is not a worldview. Probability and auto-correlation have nothing to do with sampling. 
Digital communication protocols are already in the digital domain and don’t involve Nyquist. You can’t just arbitrarily throw away samples or cherry pick 2 (Max and Min). Maybe we should try to engage again later on another subject. Peace to you. • William Ward says: Kip, I added some good data in replies to Phil and Paramenter. In the latest I provide NOAA supplied data – from their REFERENCE network, showing the extent of the error in the (Tmax+Tmin)/2. NOAA already provides a mean that appears to comply with Nyquist, but they still use the old method. Take a look at the amount of error. It should turn your head. Help me figure out how to upload a chart or Excel file and I’ll show a graph of the error over a 140 day record from NOAA. You are probably drinking from the fire hose on this post – so reply when you get a chance to first take in the info. • Kip Hansen says: William ==> see my other reply — email the whole kit… • Phil. says: Well the use of a Max-Min thermometer means that the temperature is monitored continuously with a time constant of 20sec so I would expect that Nyquist is adequately covered. • William Ward says: Phil, You said a time constant of 20 seconds. That might mean it samples every 20 seconds, but time constant can mean other things. We really need to know the frequency content of the temperature signal. Some days might have much more high frequency content than others and therefore would require a higher sample rate. I’ll make a generalization that if we sample every 5 minutes then the result is probably good – but that is a guess. So, a device that samples every 20 seconds would likely satisfy Nyquist – HOWEVER: While this 20 second sample rate may give you an accurate Max and Min Temperature, just getting 2 samples (Max and Min) from this system does NOT satisfy Nyquist. You must use all of the samples that the Nyquist rate would deliver if you are to do any processing or calculations. 
Once you have all of the data required by Nyquist then you are free to do any mathematical operation on that information and do so without the added error of aliasing. You may not want to reconstruct the original analog signal, but if Nyquist is followed you could. With just a Max and Min you cannot – even if the Max and Min are accurate. If you want an accurate Mean Temperature from that day, then you can use the samples provided by following Nyquist and get an accurate Mean Temperature. Averaging Max and Min alone does not give you an accurate Mean Temperature. This would be a lot easier if I could find a good picture to illustrate and help increase the intuitive factor. Let me try to create an example that illustrates. If you have an electric range with settings from 0-10 and set it to 10 for 30 seconds and 0 for 30 seconds. The average or mean for that 1-minute period is 5. The amount of energy the burner put out for that 1 minute could be exactly duplicated if you ran the burner at 5 for 1 minute. If you have the correct Mean, then you understand the amount of energy delivered over the time of interest. Now what happens if you operate the electric burner with a more complex pattern? What if it changes every second and goes 0, 10, 7, 3, 5, 2, 9, 3, 2, 1, 0, 8, 6, etc.? Can you just take the Max of 10 and the Min of 0 and average it to 5 and know you have captured the Mean correctly? Let me make it easier. It operates for a minute, changes every second, starts at 0 for 1 second, goes to 10 for 1 second and then goes to 2 and oscillates between 2 and 3 for the remainder of the minute. You intuitively know that the average/mean is somewhere between 2 and 3 (probably close to 2.5). If you just use the Max and Min, you get 5. That is not correct. In this example the highest frequency component is 1Hz – the signal changes once every second. 
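The burner example above can be checked directly. A small sketch, assuming one reading per second for 60 seconds and that the burner alternates 2, 3, 2, 3, ... for the 58 seconds after the initial 0 and 10 (the exact alternation order is an assumption):

```python
# One burner setting per second for one minute:
# 0 for 1 s, 10 for 1 s, then alternating 2 and 3 for the remaining 58 s.
settings = [0, 10] + [2, 3] * 29

true_mean = sum(settings) / len(settings)             # uses every second
max_min_mean = (max(settings) + min(settings)) / 2    # uses only two values

print(f"seconds recorded:   {len(settings)}")
print(f"mean of all values: {true_mean:.3f}")
print(f"(max + min) / 2:    {max_min_mean:.1f}")
```

The full average works out to 155/60, a little under 2.6, while the max/min shortcut gives 5.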
Sampling must be done fast enough to capture all of the transitions and you have to use at least the minimum required samples for math operations on that data to be correct. One more thing about the word accuracy: There are a lot of things that can reduce accuracy. Quantization was discussed in other posts. Aliasing can reduce accuracy. But even with a high-resolution instrument sampled at the right rate, we still need to know that all components in the signal chain are not introducing error. Assuming a digital system, we need a calibrated instrument that takes into consideration the thermocouple, thermal mass of the front end, amplifier chain, power supply and converters. Noise or error can come from any of these things. Noise can also come from external factors related to siting, Stevenson screen, external thermal corruption, etc. It can also come from making up and manipulating data after the fact. All of these sources of noise reduce the accuracy of the measured signal. (Max+Min)/2 only works if the temperature signal is a pure sine wave that changes once per day (which is never). [Note I incorrectly referred to the 1 cycle per day as 1Hz in an earlier post and that is not correct. 1 cycle per day is actually 0.000012 cycles/second or 0.000012Hz. I just mention it to be correct – not that it matters to the points made.] • William Ward says: Phil, One more example to try to bring in the intuition. Imagine (or better yet draw on scrap paper) a rectangle with a base of 10 and a height of 2. Now draw an isosceles triangle with a base of 10 and a height of 2. They both have the same height (Max+Min) and average height (Max+Min)/2, but which shape encloses more area? The rectangle. We can’t know this from (Max+Min)/2 no matter how carefully we calculate it. When the temperature changes throughout the day, the signal is not a nice smooth curve. It will go up and down a degree or so every 5 or 10 minutes if you have clouds moving overhead and breezes blowing. 
So, the curve becomes irregularly shaped. These irregularities to the shape are caused mathematically by higher frequency components of the signal. Nyquist helps you collect up all of the thermal energy in all of the irregular spaces of the signal. We know intuitively that if it is 100F for 1 minute in a day that it isn’t as “hot” as another day where the temperature is 100F for 180 minutes. Nyquist allows us to know not only what temperature but effectively how long at that temperature. 35. Here’s a biological anomaly – a beluga whale in the Thames: https://www.bbc.com/news/uk-england-kent-45649502 A tad further south than normal, one would think. That CO2 global warming is negatively warming the north Atlantic and making whale populations move negatively north. • Kip Hansen says: Phil ==> Interesting …is it a small anomaly as beluga are small whales? • Yes – the Beluga is like an oversized white dolphin. In fact they sometimes travel together with dolphins or porpoises. It is thought that this is how the beluga got into the Thames. It’s not lost or distressed apparently – just happily eating fish and other stuff in the central deeper colder more saline parts of the river. https://binged.it/2zxIY5p 36. hunter says: Kip, This is interesting and speaks to the long term concern many have had about how anomalies are presented and used regarding climate. • Kip Hansen says: Hunter ==> Yes, that is the problem – how they are used. They may have some utility for something (though I doubt that the “global anomaly” has any physical meaning). 37. Steve O says: ” So, while the mean is quite precise, the actual past temperatures are still uncertain to +/-0.5.” — I’m not sure this is true. If you take one measurement and get 72 degrees +/- 0.5, then that is the range of your uncertainty. If you take 1,000 measurements, and the result is 72 degrees 51% of the time and 71 degrees 49% of the time, your average is 71.5, but the error bars are no longer +/- 0.5. 
If your mean is precise, then so is the estimate of actual past temperature. • Analog Design Engineer says: Your comment is incorrect. If you take 1000 measurements of which 500 are 72 and 500 are 71 the mean is indeed 71.5 but the standard deviation is 0.5 (try it out in excel). To calculate uncertainty you need first to identify all uncertainties in the measurement process, categorize them as Type A or Type B, determine the std uncertainty for each uncertainty, combine them (if uncorrelated use RMS) and then apply a coverage factor (2 is recommended for 95% confidence). The uncertainty you arrive at is then a +/- figure about the mean of the measurements, which is your best estimate of the measurement. That doesn’t mean btw that your measurement is accurate. Being precise just means that taking a no. of measurements results in a relatively low value of std deviation. It says nothing about the accuracy of the measurement. Imagine throwing darts at a dartboard but without the board present. How do you know if you are hitting the bull, even if all the darts seem very close. The mean is only the best estimate of the measurand (e.g. temperature) • Phil. says: Your comment is incorrect. If you take 1000 measurements of which 500 are 72 and 500 are 71 the mean is indeed 71.5 but the standard deviation is 0.5 (try it out in excel). And the standard error of the mean is 0.016. • Kip Hansen says: ADE and Phil ==> Therein we see the confusion of mixing measurements with statistical theory. Standard error of the mean vs Standard Deviation — both totally ignoring the original uncertainty of the temperature record which is the DATA. In temperature records we are dealing with values that have been recorded as ranges – the ranges are not errors. • Phil. says: No they’ve been recorded as lying within a range, taking a set of whole numbers as data certainly can be averaged and treated statistically. 
For example I generated 100 random numbers between 71 and 73 and the mean was 72.0367 (se=0.06); when I rounded the set of numbers to the nearest integer and then took the mean, it was 72.030 (se=0.07). Repeat it multiple times and the mean still falls within +/- 2se. There is a reason why we apply statistics when we make scientific measurements. • Steve O says: First, I took +/- 0.5 as the error range, not the standard deviation. So I have an error of 0.5 being 2 std dev from the mean. But in any case, the more measurements you take, the better will be your estimate of the mean. Using your dartboard analogy, the more darts you throw, the more likely I’ll be able to tell you exactly the position of the bullseye even if you never actually hit the bullseye. That will be true for any distribution pattern — normal, random, skewed, etc. Let’s say you have an unfair coin that doesn’t come up heads or tails evenly. It comes up heads 86.568794% of the time. Give me enough tosses and I’ll tell you exactly how unfair your coin is to as many decimal places as you wish to know. • Analog Design Engineer says: Steve O, in your experiment you know the answer, heads or tails. But without the dartboard present you don’t know where the bull is, no matter how many darts you throw. The mean is only the best estimate of the location of the bull. But throwing 1000 darts does not make this more accurate. If an instrument has a built-in systematic error you cannot remove it by repeating the measurements. If a sensor reads high by 1 deg C it will always do so. The only way to improve on this is to use a better meter or method. Why do you think we use 4.5, 6.5 and 8.5 digit multi-meters? It’s for the better specified accuracy, not the resolution. Otherwise we could just take lots of measurements with a cheap meter. And when calculating the error you must take into account the uncertainty of the meter as well as the distribution of your readings plus all other uncertainties in the measurement process. 
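The distinction drawn above between random scatter and a built-in offset can be sketched with a simulation. All numbers here are assumptions chosen for illustration: a true value of 20.0 deg C, a sensor that reads 1 deg C high, and Gaussian scatter with a standard deviation of 0.5 deg C.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
TRUE_TEMP = 20.0          # hypothetical true value, deg C
BIAS = 1.0                # assumed systematic error: sensor reads 1 deg C high
NOISE_SD = 0.5            # assumed random scatter, deg C

# 1000 repeated readings from the same biased sensor
readings = [TRUE_TEMP + BIAS + rng.gauss(0.0, NOISE_SD) for _ in range(1000)]

mean_reading = statistics.mean(readings)
# Standard error of the mean: sample standard deviation / sqrt(n)
sem = statistics.stdev(readings) / len(readings) ** 0.5

print(f"mean of readings:  {mean_reading:.3f} C")
print(f"standard error:    {sem:.3f} C")
print(f"error vs true temp: {mean_reading - TRUE_TEMP:+.3f} C")
```

Under these assumptions the standard error shrinks to a few hundredths of a degree, yet the mean still sits about one degree above the true value, because repetition averages out the scatter and leaves the offset untouched.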
• Kip Hansen says: Steve ==> I give a lot of examples in the essay and in Durable Original Measurement Uncertainty. Each of your measurement (temperature records) is a range 1 degree wide within which the real temperature existed but is totally unknown. • Steve O says: Kip, But a real temperature does exist. Pick a value between 71.0 and 73.0. Use a random number generator to add or subtract an amount to simulate 1,000 “measurements” so that the measurement accuracy is +/- 0.5. Give me the 1,000 measurements, all of which now have inaccuracy built in. How close do you think I can get to estimating the true value? Now let’s make it harder. Only give me the whole number results of the 1,000 measurements. Round all the measurements to the nearest whole number. Now how close do you think I can get to estimating the true value? You might want to try this in Excel before you bet me any money, but I can get a heck of a lot closer than +/- 0.5 • Steve O says: Also, the accuracy of my estimate is not seemingly impacted by the fact you round all the measurements. • Kip Hansen says: Steve ==> We don’t know the “real” temperature — it only existed for a moment, which has passed. What we know is that at that moment of measurement, we wrote down “72” for any temperature between 72.5 and 71.5. We can not somehow “know” the true temperature by applying either arithmetic or statistics. The actual true temperature of that moment will forever remain unknown — it is lost data. The ONLY thing we know is the range in which the true temperature would have been found. We can not ever recover the actual temperature to any value of greater precision than the original uncertainty range. • Kip “We can not ever recover the actual temperature to any value of greater precision than the original uncertainty range.” No-one claims you can. The claim is that the average of a lot of different readings is a lot less variable than is each individual one. 
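Steve O's proposed experiment can be run outside Excel as well. A minimal sketch under his stated setup, with assumptions labeled: a single fixed true value (72.3 is an arbitrary pick between 71.0 and 73.0), 1,000 measurements, and independent errors drawn uniformly from +/-0.5.

```python
import math
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_VALUE = 72.3        # hypothetical fixed true temperature
N = 1000

# Each "measurement" is the true value plus a uniform error in [-0.5, +0.5]
measurements = [TRUE_VALUE + rng.uniform(-0.5, 0.5) for _ in range(N)]

# Record only whole numbers, rounding halves up
rounded = [math.floor(m + 0.5) for m in measurements]

mean_raw = sum(measurements) / N
mean_rounded = sum(rounded) / N

print(f"true value:             {TRUE_VALUE}")
print(f"mean of raw readings:   {mean_raw:.3f}")
print(f"mean of rounded values: {mean_rounded:.3f}")
```

The rounded mean lands close to the true value here because the assumed error happens to span exactly one rounding interval and every reading targets the same fixed value. The sketch does not address whether readings taken at different times and places meet those assumptions, which is the point in dispute in this thread.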
• Kip Hansen says: Nick ==> That is just an artifact of averaging — averaging any set of numbers, related or not, gives a single answer, quite precisely. What it doesn’t do is add new information (the data is the data) nor does it reduce the uncertainty of the data. Averaging creates a “fictional data point” — with an equally fictional smaller variability. In the sense used by Wm. Briggs in “Do not smooth times series..” • Phil. says: Averaging creates a “fictional data point” — with an equally fictional smaller variability. In the sense used by Wm. Briggs in “Do not smooth times series..” And it’s an equally nonsensical statement. When I measured the velocity of a gas jet using Laser Doppler Velocimetry I would seed the flow with extremely small particles and measure the velocity of each particle as it passed through the control volume, I would then calculate the mean and standard deviation of the distribution. Nothing ‘fictional’ about the data, and it’s much more useful than knowing an individual velocity. • Kip Hansen says: Phil ==> Did you read Briggs? • Steve O says: “We can not ever recover the actual temperature to any value of greater precision than the original uncertainty range.” — I can prove this to be an untrue statement with an experiment in Excel. The actual value will be revealed by the distribution. While it is true that you will never improve the variability of any particular measurement, you can determine the underlying value. But you don’t need to take my word for it because you can see for yourself. In one column, for 1,000 rows enter =Rand() to create a random number between 0 and 1. In the next column enter any fixed value you choose. In the third column, add the two columns together. You can subtract 0.5 if you want to eliminate the net effect of the addition while keeping the randomness. These are our inaccurate “measurements.” Calculate the average of the range and see how far it is from the value you chose. Now for the magic. 
Add another column that rounds your randomized “measurement” result to the nearest whole number, and at the bottom of that column calculate the average again. You should see that it is possible to determine the exact number you chose by averaging all the inaccurate measurements, and you’ll also see that rounding each inaccurate measurement had no effect on your ability to determine the true value. • Kip Hansen says: Steve O ==> The world is full of maths tricks…. you can not recover an unknown value from an infinite set. Since the actual thermometer reading at an exact time was not recorded, and only the range was recorded (“72 +/- 0.5”) there are an infinite number of potential values for that instantaneous temperature. No amount of statistic-ing or arithmetic-ing will reveal the reading that would have been recorded had they actually recorded it. • I guess that most here know of Aesop’s fable: “The Tortoise and the Hare.” Basically it’s a story where a hare challenges a tortoise to a race. At the start of the race the tortoise begins a slow, plodding gait while the hare takes off, returns, circles the tortoise several times, chastises and ridicules the tortoise, and then takes off again for the finish line. About halfway to the finish line the hare decides to take a nap. Unfortunately for the hare, he sleeps too long and lets the tortoise gain a major lead. Although the hare easily catches up to the tortoise, he doesn’t quite make it before the tortoise crosses the finish line. The hare loses by a hair. Now what is the average speed of the race participants? The tortoise is easy to calculate–it’s the distance from start to finish divided by the duration of the race. What about the hare? Except for the width of a hair, the hare traveled the same distance in the same amount of time. So the two animals have the same average speed (minus the width of a hair). Is the average speed representative of either animal? 
It seems to apply to the tortoise, but we don’t know if the tortoise took rest breaks. In any case, the average speed of the tortoise matches pretty well with his actual speeds during the race. What about the hare? If anything, the hare’s average speed was something he was passing through to another greater or smaller speed. The hare was probably never going that average speed during the entire race. Anyone who claims averages tell all is kidding themselves. There’s no way to tell if a changing temperature is taking a tortoise-like route or a hare-like route, and an average temperature tells you nothing about the actual path of those two routes. Jim • David Dirkse says: Kip posts: “you can not recover an unknown value from an infinite set.” .. FALSE Here is an infinite set: { 3/10, 3/100, 3/1000, 3/10000, 3/100000……. } Unknown is the sum of all of the elements of this set. Put in more familiar terms: .. 0.33333…. We all know the sum of this infinite set….it is 1/3 Or in more precise terms: .. 3/10 + 3/100 + 3/1000 + 3/10000 + 3/100000+ ………. = 1/3 • Phil. says: Phil ==> Did you read Briggs? Yes. • Kip Hansen says: Phil ==> So do you now understand what “fictional data” is? • Phil. says: Phil ==> So do you now understand what “fictional data” is? Yes, irrelevant sophistry. • gnomish says: “The claim is that the average of a lot of different readings is a lot less variable than is each individual one.” if they are not taken at the same time and place, then they really are not related and dividing the sum of them is silly. or maybe averaging a tangerine from 1402 with a kiwi-fruit from 1989 is how to get a nasty salad. i wouldn’t eat it. 38. Geoff Sherrington says: A related discussion of error estimation in temperatures arises when conversions are made from F to C and the reverse. A group of us did a lot of work in this, about 5 years ago. It is not feasible to make an umbrella summary. 
However, as in a lot of climate work, the official view, when stated, was often optimistic. Our BOM admits to a possible rounding inaccuracy of about 0.15 degrees C, but that the time of metrication was when the great pacific climate shift was in progress, so attribution was equivocal and so the solution was to do nothing about it. At least we know that there is some probability that a known mechanism has created an error now forgotten. Geoff • A C Osborn says: There was a lengthy post on here by Pat Frank who has done a lot of work on the subject. There was also a lengthy reply by Mosher saying the same things about Large Numbers here. https://moyhu.blogspot.com/2016/04/averaging-temperature-data-improves.html However I do not see how Large Number theory applies when the subject being measured is ever changing where no 2 measurements are actually measuring the same thing, by the same instrument in the same location. In Mosher’s case he takes the maximum daily temperature for Melbourne using a modern instrument, which is absolutely nothing like an old Max/Min Thermometer being read by different people. One is very Accurate but reads fast transients without interpretation whereas the other is very slow and relies on the interpretation of a Human Eyeball. And yet they want to apply the same rules to both, which makes absolutely no sense whatsoever. The problems with the accuracy of modern BOM data has been completely exposed by many in Australia. • A C Osborn says: I worked in a Government Metrology Lab and the one thing about physical measurement I learned there was that a Controlled Environment is essential for both Accuracy and Repeatability. Anyone who thinks a wooden box with Screen is a “Controlled Environment” is crazy, they were not designed to measure “Global Temperatures” to tenths of a degree, just to give the very very local conditions. Even 20 yards away could be completely different conditions. 39. 
TallDave says: Interesting, but I’m not sure how much this small uncertainty really matters relative to other larger uncertainties with the data. The actual measurement error is much larger than Gavin admits, as is obvious from the much larger changes they’ve made to temperature just since he made the original claim of .1 degrees — and every adjustment adds its own new potential sources of error. https://stevengoddard.wordpress.com/2013/12/21/thirteen-years-of-nasa-data-tampering-in-six-seconds/

• Kip Hansen says: TallDave ==> It matters only to the degree that we have a whole scientific field (almost the whole field) fooling themselves with the “precision and accuracy” of this GAST(anomaly) metric — and insisting that the itty-bitty changes in GAST(anomaly) are exempt from the actual uncertainty of the metric GAST(absolute). It leads to a vast underestimation of uncertainty — and thus the field is over-confident about things that are still pretty vague. Read the paper that is discussed at Dr. Curry’s blog “The lure of incredible certitude”.

40. Mike Graebner says: I know this is a bit off topic but can someone provide a link to how satellites measure temperature? Thanks

• Kip Hansen says: Mike ==> You will probably find this answer unsatisfying — but it is accurate. See the two paragraphs on the Wiki page for “Satellite temperature measurements” under Measurements. Hint = Satellites do not measure Earth surface temperature.

41. Paramenter says: Kip, thanks for an informative post, accessible to an average human mind. For my benefit I would summarize how I understand matters (Kip it in seven points ;-):

1. As per global average temperature, ‘official’ climatology freely admits an uncertainty range (measurement error) of +/- 0.5 K (Schmidt).
2. Which means that for the last 37 years there is no detectable increase in the global average temperature, i.e. increase beyond the uncertainty range (your reanalysis chart).
3.
Official climatology tries to hide this uncertainty range in the graphs carefully prepared for public presentations where uncertainty bars are either missing or greatly reduced.
4. To reduce measurement/rounding errors many climatologists apply various statistical techniques in hope of ‘improving’ quality of the input data and removing a large part of the uncertainty.
5. Alas, no such reliable techniques exist.
6. It appears that there is widespread belief (superstition?) among many climatologists that by applying such techniques values of the calculated global average temperature will start to converge with the actual values and they can somehow disperse gloomy shadows of measurement uncertainty.
7. Accuracy of .000 C per global average temperature values is a mere artifact of mathematical operations where we can define any arbitrary number of decimals. Unfortunately, it has very little to do with the true values that may be significantly different, i.e. within a large uncertainty range.

Have I got the picture, Kip?

• Kip Hansen says: Paramenter ==> Close enough for CliSci. The problem is not limited to Climate Science (CliSci). Epidemiology is rife with nonsensical findings claiming great precision from extremely messy data. Psychology has for many years relied on studies of a handful of subjects to make population wide proclamations about the mind and psyche. Researchers in almost all fields have used pre-packaged statistical software — without any real understanding of its limits — to make all kinds of unwarranted conclusions. This is referred to in the science press as The Crisis in Science.

• Paramenter says: I’m glad to hear that uncertainty range 0.5 K means there is no detectable warming in the last almost 40 years. That’s settled science and rather good news! This chart definitely requires more attention.

42.
Kip Hansen says: EPILOGUE: Well, a nice lively discussion in which the statistically-indoctrinated defend their right to fool themselves about the true uncertainty of Global Average Surface Temperature. I know they really believe it — statistical theory says it’s true. No number of pragmatic, simple, real world examples will change their minds, so we must let them go their way — no harm, no foul. The rest of us can know that the uncertainty of the GAST(absolute) of minimally +/-0.5K means that we can not be certain whether or not the average global temperature has changed much (or how much) over the last decade, and we may be uncertain about it over the last 40 years. This is a “propagation of error problem” — but not the kind we normally think of. The error was in adopting the “anomaly” method as a way to avoid the known uncertainty of global average surface temperature. This error is being propagated into the future on the basis of “well, that’s what we all do….” despite its inappropriateness, particularly the pretense that it eliminates the acknowledged, known uncertainty. If you leave comments soon, I may still see them and respond….if I don’t you can always email me at my first name at the domain i4 decimal net. Thanks for reading.

# # # # #

• David Dirkse says: Kip, the uncertainty of the GAST is not +/- 0.5 degrees. .. The Standard error is equal to “s” (in your case 0.5 representing the thermometer SD) divided by the square root of the number of obs. …. Please stop confusing instrument error with sampling error.

• Kip Hansen says: Mark ==> Yes, “Pat Frank that deals with the inherent uncertainty of temperature measurement, establishing a new minimum uncertainty value of ±0.46 C for the instrumental surface temperature record” — pretty close to Gavin’s +/- 0.5K.

43. Kip, nice work.
Sorry I made it late to the game, but from my engineering education and from over 30 years experience working with air quality and meteorological measurements, I remember a lot of references to “precision” and “accuracy”. Precision is much more easy to determine than accuracy and most of the discussion here seems to be involved on what I understand to be “precision”. From what I recall, precision has more to do with repeatability than anything else. Good repeatability does not necessarily mean good accuracy. Instruments that are assumed to have a linear response may not have a linear response, for example. Calibrations often drift over time and affect the overall “accuracy”. Accuracy in my mind must also include “representativeness” relative to what you are trying to measure. Awhile back I wrote down some more detailed thoughts about accuracy of global temperature and temperature anomaly estimates here: https://oz4caster.wordpress.com/2015/02/16/uncertainty-in-global-temperature-assessments/ For sake of brevity, I won’t copy that long post here, but please take a read to consider some of the many additional issues I have not seen mentioned in the discussions here. I’m sure there are some I missed as well. • Kip Hansen says: Bryan ==> Thanks for the informative link — I too believe that Gavin’s 0.5K is the minimum uncertainty…lots to add to it. • Kip Hansen says: vandoren ==> Well, at least you end up with an approximately true Known Uncertainty for GAST = 0.5K. I was speaking though — and this whole essay is about — the real uncertainty (We Really Just Don’t Know) that results from recording the original temperature measurements as 1-degree-F ranges. 44. Frank says: Kip: Consider trying this experiment for yourself. Find a random number generator. In my case, it is part of the free add-on for Excel for MAC (2010) and was built in the previous version. Generated 10,000 random numbers with five digits after the decimal point. 
I requested a mean of 60 degF (and got 59.96342) and a standard deviation of 20 degF (and got 19.88341). Round all of the numbers to the nearest unit using “=round(A1,0)”. Calculate the mean (59.96510) and standard deviation (19.88491). Does rounding make any difference in the mean? Then I added about 12 degF of GW and generated 10,000 new temperatures with a mean of 71.76008 and std of 20.07188. Then rounded as before: 71.76070 and 20.07334. Did rounding ruin the precision of the mean warming between the two data sets, which was 11.79666? Yep. The difference in means after rounding was only 11.79560, an error of 0.00106 degF caused by rounding. Not arguing.

45. Kip Hansen says: Frank ==> We are not really rounding — we are dropping the uncertainty altogether and using ranges. It is NOT about the VALUE of the MEAN — it is about the UNCERTAINTY RANGE that devolves correctly to the MEAN when using values that are RANGES.

• Phil. says: We are not using ranges, we are rounding a value to the nearest mark on the thermometer. As I showed above, if you take for example 100 random numbers between 71 and 73 to be the actual values with a uniform distribution, for that set the mean was 72.0367 (se=0.06). When I rounded the set of numbers to the nearest integer and then took the mean, it was 72.030 (se=0.07). Repeat it multiple times and the mean still falls within +/- 2se. So when taking readings from within a uniform range to the nearest integer value the uncertainty range that is appropriate is derived from the standard error of the mean of that set. As shown, the mean and standard error of the rounded set will converge on that of the original set; in the sets of 100 values that I studied they’re virtually indistinguishable.

• Kip Hansen says: Phil ==> I am talking about the REAL WORLD — not an idealized example. In REAL meteorology, the recorded value for a single temperature reading is a RANGE — in Fahrenheit — 1 degree wide. Always has been. Still is.

• Phil.
says: It’s not an idealized example, it’s what happens. If the temperature lies between two limits, say 71 and 72, and fluctuates within those two bounds, it will be recorded as one or the other of these values. As I have shown, if sufficient readings are sampled you end up with the same mean as if you’d sampled the original temperatures with higher precision and with a standard error that will depend on the square root of the number of samples.

• Kip Hansen says: Phil ==> One more time — I don’t give a hoot about the fictitious value “the Mean” and its Standard Error — both are statistical animals — unrelated to our real world uncertainty about the item of our concern, the temperature. We are not dealing with millions of fictitious values, we are dealing with real world measurements recorded as ranges. You are using the Gambler’s Fallacy in a sense — “I would have beat the house if I had made a billion bets — the odds were in my favor.” (He may be right — but he made one big bet and lost). You may end up with the same mean — but you will be just as uncertain in the real world (statisticians will be certain as long as they have enough numbers and a big enough computer — but you wouldn’t want to blast off to the Moon in a rocket built by statisticians — you’d want one built by cynical hard-boiled engineers.)

• Phil. says: Phil ==> One more time — I don’t give a hoot about the fictitious value “the Mean” and its Standard Error — both are statistical animals — unrelated to our real world uncertainty about the item of our concern, the temperature.

Rubbish, they are essential to understanding the uncertainty of the measured quantity in the real world.

You may end up with the same mean — but you will be just as uncertain in the real world

No you will not.

but you wouldn’t want to blast off to the Moon in a rocket built by statisticians — you’d want one built by cynical hard-boiled engineers.)
Built by some of the engineers I’ve taught, no problem, but they understood the underlying statistics of measurements; one of them even flew on the Space Shuttle as a payload specialist a couple of times. Also, over 40 years ago I was the co-inventor of a measurement technique/instrument which revolutionized the field and is still the pre-eminent technique; knowledge of the statistics of measurement was essential.

• Frank says: Kip: Respectfully, the question – as best I understand it – is whether we can accurately measure a change (such as 0.85+/-0.12 K) between two states that we know quite inaccurately (such as 287.36+/-0.73 and 288.21+/-0.69). The answer depends on what causes the error:

1) Obviously, I’ve proven that our inability to read a thermometer more accurately than +/-0.5 degF doesn’t limit our ability to measure a CHANGE IN MEAN to within +/-0.001 degF in 10,000 measurements. (I’m disappointed to see you bring up this red-herring.)

2) Constant systematic error: If a thermometer always reads 5 degF too high, we can still take 10,000 daily measurements starting at noon of 1/1/2019 and 10,000 measurements starting at noon of 1/1/2119 and measure change over a century between these two 27.4-year means to within +/-0.001 degF. A constant systematic bias causes no problems. Andy’s station project surveyed potential station bias, but only CHANGING bias will bias a trend.

3) Varying systematic error: Based on what we know about time-of-observation bias with min-max thermometers, a systematic error of this type can introduce a bias of 0.4 degF in a change.

4) Another varying systematic bias: Suppose Stevenson screens GRADUALLY get dirty (decreasing albedo) and clogged with leaves (reducing circulation). The average temperature reading will gradually increase. After ten years or so, the station is cleaned, removing the upward bias in the trend AND introducing a break point.
The data is homogenized, removing the break point that restored initial unbiased measurement conditions – keeping the biased trend and removing a needed correction.

5) Calculating local temperature anomalies for a grid cell helps remove some systematic errors. At the end of the Soviet era, a hundred or so stations in cold Siberia stopped reporting.

If Schmidt is talking about a CONSTANT systematic uncertainty of 0.5 K, then we can measure change to within 0.1 degK. CHANGING uncertainty this size is a disaster. All of the data adjustments add 0.2 K to 20th century warming. Maybe half those adjustments weren’t caused by systematic errors that should be corrected (station moves, TOB, new equipment).

• Kip Hansen says: Frank ==> You can read what Gavin Schmidt says: I give the link above and here once more — there is no misunderstanding him. Here is the quote again:

But think about what happens when we try and estimate the absolute global mean temperature for, say, 2016. The climatology for 1981-2010 is 287.4±0.5K, and the anomaly for 2016 is (from GISTEMP w.r.t. that baseline) 0.56±0.05ºC. So our estimate for the absolute value is (using the first rule shown above) is 287.96±0.502K, and then using the second, that reduces to 288.0±0.5K. The same approach for 2015 gives 287.8±0.5K, and for 2014 it is 287.7±0.5K. All of which appear to be the same within the uncertainty. Thus we lose the ability to judge which year was the warmest if we only look at the absolute numbers.

We don’t need any “stinkin’ anomalies” to see the CHANGE in the GAST(absolute) — we just look at the values:

2016 288.0±0.5K
2015 287.8±0.5K
2014 287.7±0.5K

The total change between 2014 AND 2016 is 0.3K. As the known uncertainty for each value is +/- 0.5K — all the values “appear to be the same within the uncertainty”. There — that’s it — we are done. It is KNOWN UNCERTAINTY in the GAST(absolute) — we can not UNKNOW that through statistical manipulation and definition.
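For readers following the arithmetic in the quoted passage: the ±0.502K figure is what one gets by adding the two stated uncertainties in quadrature, the usual rule for a sum of independent quantities. That the two uncertainties are independent is an assumption of the rule, not something demonstrated in this thread.

```latex
\begin{align*}
T_{2016} &= 287.4 + 0.56 = 287.96\ \mathrm{K} \\
\sigma_{T} &= \sqrt{0.5^{2} + 0.05^{2}} = \sqrt{0.2525} \approx 0.502\ \mathrm{K} \\
T_{2016} &\approx 288.0 \pm 0.5\ \mathrm{K} \quad \text{(after rounding to the precision of the uncertainty)}
\end{align*}
```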
Current figures from NOAA for THE ANOMALY say the difference between 2014 and 2016 is 0.21K, CRU says the same difference is 0.218K, NASA/GISS says 0.26 — all pretty close to Gavin’s figures for which the difference is 0.3K. So to say that the Anomaly of the Annual GAST is more certain than the actual GAST(absolute) — all figured from the same uncertain data — is a self-deluding statistical trick. We already KNOW the real uncertainty of GAST ….. it is ten to one hundred times the uncertainty claimed for the “anomaly” of the same metric. That’s a lot of “fooling oneself” — the justification for that “fooling” does not matter at all. Being able to justify it doesn’t change the facts.

Frank — an aside: I am a father of four, and was a Cubmaster and Scoutmaster for ten years — I have been a Marriage Counselor — I have been a Personal Behavior Counselor — I have a LOT of experience with justification — I have listened to hundreds and hundreds of hours of kids and adults justifying their own behavior and justifying their actions and justifying their opinions. Nearly everyone (at least at first) is absolutely sure that their justification makes their actions right. Seeing scientists justifying their actions to make their results look more certain saddens me — if they want more certain results they have to have more certain data, not more justification.

• Kip, “So to say that the Anomaly of the Annual GAST is more certain than the actual GAST(absolute)…” For Heaven’s sake, don’t you ever learn anything? No one says that about the Anomaly of the Annual GAST. They say that about the mean of the anomalies, not the anomaly of the mean. Gavin is saying that GAST is unusable because of its large error, and they don’t use it. GISS long ago explained why.

• Kip Hansen says: Nick ==> You are still fooling yourself. I know they don’t use the GAST(absolute) because of its uncertainty.
So they try an end run around the uncertainty by looking at a smoothing of smoothing of tiny numbers — that can only have a tiny variance — and claim that that tiny variance is the real world uncertainty about changes to Global Average Surface Temperature. It still isn’t — we know the real uncertainty — we can not unknow it using statistics on anomalies.

• Frank says: Kip: Thank you for the respectful reply and for pointing out the link to Schmidt again. Above I discussed the difference between a CONSTANT systematic error and other kinds of errors. I explained why a thermometer that consistently read 5 degF too high could perfectly accurately measure a CHANGE in MEAN from 10,000 measurements to within less than 0.01 degF and proved this with some normally distributed artificial data. Were you convinced by this?

As I understand it, Dr. Schmidt is discussing constant systematic errors between different methods of calculating global mean temperature. These methods use different criteria for collecting temperature records, processing that data to get a single value for each grid cell, different sets of grid cells, different methods of dealing with grid cells with no data (interpolation?), different methods for dealing with breakpoints, and probably a dozen other things I know nothing about. Nick Stokes has constructed TWO of his own temperature indices, which differ. He could speak more authoritatively about this. (Nick and I probably disagree about some aspects of homogenization.)
By analogy, Schmidt is talking about 5 different thermometers (processing methods) that measure the Earth’s temperature. The NCEP1 “thermometer” reads the lowest at all times and the JRA55 “thermometer” reads the highest all the time. They all go up and down in parallel – they all show a 97/98 El Nino followed by a La Nina. To some extent, they all have a CONSTANT systematic error that doesn’t interfere with our ability to quantify change (which is exactly analogous to my thermometer that always reads exactly 5 degF too high). What Schmidt doesn’t show us is the difference between each of these records. Is it really constant? If these systematic errors (differences in processing) are not constant, then there is greater uncertainty in our assessment of change. As usual, in blog posts Schmidt leaves out the critical information needed to judge whether there is a constant systematic error between these records or not. There is no point in providing fodder for the skeptics. He is preaching to the choir.

As a scientist, I don’t listen to justifications. What is the data? How was it processed? How much difference is there between processing methods? Which method(s) do I judge more reliable for correcting systematic errors?

You may not be aware, but the global mean temperature (not temperature anomaly) rises and falls 3.5 degC every year, peaking during summer in the Northern Hemisphere. (FWIW, Willis has written about this. The cause is the smaller heat capacity in the NH, more land and shallower mixed layer due to weaker winds.) Taking anomalies removes this seasonal signal and lets us see changes of 0.1 or 0.2 degK/decade. There are vastly more thermometers in the NH than the SH and the set of available thermometers varies with time. Temperature anomalies are merely a practical way of dealing with these systematic problems in the data that interfere with properly measuring change.
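The seasonal-signal point can be sketched with made-up numbers. The series below is purely synthetic: a noise-free sinusoid with a 3.5 degC peak-to-peak annual cycle plus an assumed linear trend of 0.015 degC per year. The function names, the 14 degC base value and the 30-year baseline are illustrative choices, not any agency's actual method.

```python
import math
import statistics


def synthetic_monthly_series(years=40, trend_per_year=0.015,
                             seasonal_amplitude=1.75, base=14.0):
    """Build a noise-free monthly series: base + annual cycle + linear trend.

    All values are assumptions chosen for illustration only.
    """
    series = []
    for month_index in range(years * 12):
        seasonal = seasonal_amplitude * math.sin(2 * math.pi * (month_index % 12) / 12)
        trend = trend_per_year * month_index / 12
        series.append(base + seasonal + trend)
    return series


def monthly_anomalies(series, baseline_years=30):
    """Subtract each calendar month's baseline-period mean from every value."""
    baseline = series[:baseline_years * 12]
    # One climatological mean per calendar month (all Januaries, all Februaries, ...).
    climatology = [statistics.fmean(baseline[month::12]) for month in range(12)]
    return [value - climatology[i % 12] for i, value in enumerate(series)]


if __name__ == "__main__":
    raw = synthetic_monthly_series()
    anomalies = monthly_anomalies(raw)
    print(f"swing within first year, raw:       {max(raw[:12]) - min(raw[:12]):.3f}")
    print(f"swing within first year, anomalies: {max(anomalies[:12]) - min(anomalies[:12]):.3f}")
    print(f"anomaly change, first to last month: {anomalies[-1] - anomalies[0]:.3f}")
```

In this idealized case the annual cycle cancels exactly and only the trend remains in the anomalies. Real station data have noise, gaps and changing coverage, none of which this sketch attempts to model.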
One can calculate that anomaly by subtracting a 1951-1980 mean or a 1961-1990 mean as determined by any of Schmidt’s five “thermometers” discussed above. If the warming calculated by any of these ten methods differs from another by 0.5K, climate science would be in big trouble. The 0.5 K Schmidt is referring to is not this difference.

If I were your Counselor, I would be saying you and Schmidt are failing to communicate what is meant by an error of 0.5 K, and it may not be possible to bridge that gap because confirmation bias (about Schmidt’s motivations, which also make me deeply suspicious) makes it difficult to process new information.

• Kip Hansen says: Frank ==> I have had no communication with Gavin Schmidt, and he isn’t involved here today. Gavin’s +/-0.5K is the uncertainty of the metric GAST(absolute). Freely admitted, acknowledged by all, including Mr. Stokes. What this means is that the Global Average Surface Temperature (GAST) is an uncertain metric — uncertain to within a range 1 degree C (or 1K) wide. This is why they don’t use it — because the entire CHANGE in GAST since 1980 is LESS THAN the uncertainty.

Now, just follow the logic here — the GAST is uncertain — everyone admits it and complains about it and refuses to use it because of its uncertainty. Despite that KNOWN UNCERTAINTY, they wish to claim that they can statistically derive an anomaly that represents the change in GAST (see https://www.ncdc.noaa.gov/cag/global/time-series/globe/land_ocean/p12/12/1880-2018.csv ) to an “error-less” one hundredth of a degree C. The title of the data file linked is “Global Land and Ocean Temperature Anomalies”. It does not say “The mean of the world’s individual station monthly or annual anomalies”. It gives the GLOBAL anomaly from the “Base Period: 1901-2000”. This is the oft-graphed GAST Anomaly (2018 = 0.74°C). The referring page says: “What can the mean global temperature anomaly be used for?
This product is a global-scale climate diagnostic tool and provides a big picture overview of average global temperatures compared to a reference value.”

So, regardless of Mr. Stokes’ rambling insistence about the vast difference between the mean of anomalies and anomalies of means, the data is presented to the rest of the world as “average global temperatures compared to a reference value” and presented without any statement of uncertainty. We know that the GAST(absolute) is subject to +/-0.5K uncertainty. Yet the 2018 anomaly from 1901-2000 is presented as 0.74°C (no uncertainty admitted at all). You see the disconnect….. It is not that I misunderstand…it is that I don’t agree that the method is valid — particularly as it is specifically carried out to rid the true metric of its real uncertainty. It is a trick that allows Climate Science to ignore the known uncertainty about Global Temperature and its change.

• Frank says: Kip: No scientist calculates warming by subtracting the global mean temperature anomaly in, say, January 1901 from the global mean temperature anomaly in January 2001. That is the only way one can get an answer accurate to 0.01 degC. Someone might average all of the months between 1890 and 1910 and then between 1990 and 2010. There is a statistical method for properly calculating the “confidence interval around the difference in two means” with confidence intervals. The confidence intervals add “in quadrature”. Google the phrases in quotes if they aren’t familiar. The method used by the IPCC is to do a least squares linear fit to the data, which gives a slope and a 95% confidence interval for that slope. The slope might be 0.0083+/-0.0007 degC/y and then multiply by 100 years – or 90 years or 118 years.

The right way to think about this is to say we have 5 different thermometers for measuring the temperature of the planet named MERRA2, CFSR, NCEP1, JRA55 and ERAi.
Each thermometer is slightly different and gives slightly different temperatures for the globe, say because the diameter of the tube the liquid rises through varies slightly (a CONSTANT systematic error). The JRA thermometer almost always reads the highest and the NCEP1 thermometer reads the lowest. They always move in the same direction and nearly the same amount from year to year and over the entire period. http://www.realclimate.org/images/rean_abs.png

Click on the Figure (here or at RC) so that it opens full screen in a new window. Print the graph. Measure the amount of warming from the first year to the last. The differences are all between about 0.7 and 0.8 degC. NCEP1 shows the most warming, MERRA2 the least. That is the error in the amount of warming. Not 0.5 degC. Now look at the year to year shifts. How consistent is the gap between the lowest two lines, which averages about 0.3 degC (0.25-0.35 degC?). You are getting an idea of how consistent (or inconsistent) the systematic error is between these two records. Good luck accepting the analogy with five different thermometers. I’ll try to leave you the last word.

• Kip Hansen says: Frank ==> You have correctly determined the measurement difference between your five thermometers — there is some systematic difference between the measurements apparently, which may be caused by the thermometers themselves, and if so, and evenly spread across the quantitative values, can be identified as systemic. What we haven’t identified yet is the real uncertainty of the temperature of MERRA2 as measured by the five thermometer method — which may (and almost always is) much different than the simple variation between measurements.

• Frank says: Kip wrote: “What we haven’t identified yet is the real uncertainty of the temperature of MERRA2 as measured by the five thermometer method — which may (and almost always is) much different than the simple variation between measurements.” What does “real uncertainty” mean here?
We have real uncertainty in the absolute measurement of temperature and we have uncertainty in the CHANGE in temperature – warming. The uncertainty in warming is not +/-0.5 K. To understand why there are two different kinds of uncertainty, let’s look at how a thermometer is made? You start with a piece of glass tubing with a very uniform internal diameter and attach a large bulb on the bottom that is filled with specified amount of mercury (or some other working fluid), evacuate and seal. You put the bulb in a ice-water bath and mark 0 degC. The distance the mercury rises per degree depends on how much mercury is in the bulb and the accuracy and uniformity of the inner tube. If your ice bath is mostly melted or you didn’t let the thermometer equilibrate in the ice bath for long enough or parallax interfered with proper placement of the 0 degC line, all of your measurements could be 1 or even a few degC off. This is a constant systematic error that won’t interfere with your ability to quantify warming, if the amount of mercury and the diameter of the tube through which it rises are correct. The five “thermometers” that Schmidt is referring to are actually re-analyses that process a variety of temperature and other data from around the world and reconstruct the temperature everywhere in the atmosphere. He is showing you their output for the surface. Each reanalysis does things slightly differently. HadCRUT4, GISS, NOAA, etc create surface temperature records using only surface temperature data. There are a lot of mathematical techniques available for combining (incomplete) data from a non-uniformly spaced set of surface stations involving grids, and separate grids for land and ocean. The Climategate email scandal forced the release of a lot of previous secret data and methods, and a bunch of technically competent bloggers constructed their own temperature indices. Below is a link to some of their earlier results. 
You might recognize Jeff_Id (skeptical owner of The Air Vent), RomanM (statistician? and then frequent commenter at ClimateAudit), Nick (supporter of consensus), Steve (probably Mosher, then author of a scathing book on Climategate). Each independently created their own temperature records from the same data set used by Phil Jones. Many thought that kriging was a better statistical method for processing this data, but was challenging to apply. When Muller got funding from the Koch brothers to develop kriging, Mosher and Zeke joined that project, which became BEST. https://moyhu.blogspot.com/2010/03/comparison-of-ghcn-results.html As a scientist who has watched some scientifically sound skeptical attacks on the consensus succeed (the hockey stick) and others fail (the amount of warming) for more than a decade, I find it distressing to read naive posts questioning the validity of means reported to tenths of a degree from data to the nearest degree, calling temperature anomalies “tricks”, and asserting we don’t know warming to better than +/-0.5 degC. IMO, the real controversy today should be with homogenization: correcting abrupt breaks in station data without understanding their cause. Some are caused by a documented change in TOB, some by a documented station move, some by new instrumentation, but most are not. Some could be caused by maintenance that restores earlier measurement conditions. Correcting such breaks introduces bias into the record. BEST splits records at breakpoints, effectively homogenizing them. One estimate of the size of the homogenization problem can be found here: https://moyhu.blogspot.com/2015/02/breakdown-of-effects-of-ghcn-adjustments.html Despite his biases (we all have them), Nick is enough of a scientist to publish warming with and without adjustments, even though he believes in those adjustments. He has been refining his methodology over nearly a decade and it is absurd to ignore his expertise in technical aspects of this subject. 
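Frank's thermometer argument above (a misplaced 0 degC mark shifts every reading by the same amount, so it drops out of any difference) can be illustrated with a few lines of arithmetic. This is only a sketch: the temperatures and the three offsets are made-up values, not taken from the reanalyses in Schmidt's graph, and it assumes each offset is perfectly constant over the whole period.

```python
# Hypothetical "true" annual temperatures in degC (illustrative values only).
true_temps = [14.0, 14.1, 14.3, 14.2, 14.6]

# Hypothetical constant calibration offsets, one per thermometer.
offsets = {"A": -0.4, "B": 0.0, "C": 0.5}

# Each thermometer reports the true value plus its own fixed offset.
readings = {name: [t + off for t in true_temps] for name, off in offsets.items()}

# Absolute readings disagree by the spread of the offsets...
first_year = [series[0] for series in readings.values()]
absolute_spread = max(first_year) - min(first_year)

# ...but the first-to-last change is the same for every thermometer,
# because the constant offset cancels in the subtraction.
warming = {name: series[-1] - series[0] for name, series in readings.items()}

print(f"spread in absolute readings: {absolute_spread:.2f} degC")
for name, w in warming.items():
    print(f"thermometer {name}: warming = {w:.2f} degC")
```

The cancellation holds only for the constant part of an error. An offset that drifts over time, or a scale error in the degrees-per-millimetre of the tube, would not cancel, which is the part of the disagreement this sketch does not settle.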
• Kip Hansen says: Frank==> Let your mind out of narrow box of statistical and scientific definitions for a moment. What is “real uncertainty”? It is how uncertain you really are about something. It has nothing to do with how many thermometers or measurements or accuracy of solid state thermistors or whatever. It is your answer to “How uncertain am I?” or “How certain am I?” This shifts into the scientific realm when we look at the scope of the problem, the tools for investigating it, the accuracy and precision we can expect, the logistics of getting those measurements, the methods of recording the measurements, the Murphy’s Law factors in our field, etc etc. I would suggest that we also have to take into account the questions raised in my essay “What Are They Really Counting?”. The acknowledged problems of CliSci with GAST(absolute) should inform us about our real uncertainty in understanding the changes in that metric. It is simple common sense, often abandoned in the trench wars of science, to realize that if we can not figure/calculate/discover the GAST(absolute) any closer than +/-0.5K (or in simple English, “to within a single degree”) that we must realize that the claim of knowing the annual “change” in Global Temperature to hundredths of a degree is a highly unlikely prospect — no matter how much jargon is used to support the claim. • Frank says: Kip writes: “… we must realize that the claim of knowing the annual “change” in Global Temperature to hundredths of a degree is a highly unlikely prospect — no matter how much jargon is used to support the claim” However, I didn’t use jargon to support my claim. I explained how a thermometer works and how an inaccurate 0 degC mark logically could make absolute temperature readings less accurate than temperature change. 
And I showed you how to demonstrate for YOURSELF how thousands of measurements rounded or not rounded to the nearest degree nevertheless produce the same mean temperature to one hundredth of a degree. No one claims hundredths of a degree accuracy, but if they did, relying on intuition isn’t good enough. Doesn’t your intuition tell you that it is impossible for GPS to calculate your location to within a few feet using the time difference in signals from four satellites (and the speed of light)? When you analyze Schmidt’s 5 temperature records, it is absolutely true that they disagree by more than 0.5K in absolute temperature, but agree on the total amount of warming for every possible period within about 0.1 K. (There are about 25 one-year periods on the graph, 24 two-year periods, 23 three-year periods … See how consistent the differences are.) I suggest you think about confirmation bias. Wikipedia: “Confirmation bias … is the tendency to search for, interpret, favor, and recall information in a way that confirms one’s preexisting beliefs or hypotheses. It is a type of cognitive bias and a systematic error of inductive reasoning. People display this bias when they gather or remember information selectively, or when they interpret it in a biased way. The effect is stronger for EMOTIONALLY CHARGED ISSUES AND FOR DEEPLY HELD BELIEFS.” Kip writes: “Let your mind out of narrow box of statistical and scientific definitions for a moment.” Are you kidding? Precise definitions are essential to discussing any scientific subject. Statistics has been referred to as “obtaining meaning from data”. This scientific discipline is what helps scientists – or used to help scientists – avoid confirmation bias. Since I’ve read Climateaudit for more than a decade, I’m probably a skeptic at heart. However, science isn’t about selecting a few observations or ideas that appear to support what you want to believe, it’s about what experiments and the data say. 
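The rounding check Frank refers to (and that Steve O suggests doing in Excel) can also be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not anyone's actual method: the readings are synthetic draws from a normal distribution with a hypothetical mean of 15 degC and spread of 5 degC, and the function name `mean_shift_from_rounding` is invented for this example.

```python
import random
import statistics


def mean_shift_from_rounding(n_readings=10_000, seed=0):
    """Return (mean of unrounded readings, mean of readings rounded to whole degrees)."""
    rng = random.Random(seed)
    # Hypothetical "true" readings: they vary by several degrees, so the
    # rounding errors land more or less evenly between -0.5 and +0.5.
    true_temps = [rng.gauss(15.0, 5.0) for _ in range(n_readings)]
    rounded = [round(t) for t in true_temps]
    return statistics.fmean(true_temps), statistics.fmean(rounded)


true_mean, rounded_mean = mean_shift_from_rounding()
print(f"mean of unrounded readings: {true_mean:.3f}")
print(f"mean of rounded readings:   {rounded_mean:.3f}")
print(f"difference:                 {rounded_mean - true_mean:+.4f}")
```

What the sketch shows is narrow: when individual rounding errors are independent and the readings range over many degrees, rounding barely moves the mean of a large sample. It says nothing about errors shared by all readings (siting, calibration, time of observation), which averaging does not remove.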
• Jack of all trades Hansen was a Cubmaster, Scoutmaster, Marriage Counselor, Personal Behavior Counselor Just add magician, and your background will be perfect for “modern” climate science, run by those pesky government bureaucrats with science degrees, or so they claim: ( “when we need data, we’ll just pull the numbers out of a hat” ) • Kip Hansen says: Richard ==> Honestly, that’s the short list….. Climate scientists claim more magic than I’ve ever been able to master. 46. Steve O says: “Steve 0 ==> The world is full of maths tricks…. you can not recover an unknown value from an infinite set. Since the actual thermometer reading at an exact time was not recorded, and only the range was recorded (“72 +/- 0.5”) there are an infinite number of potential values for that instantaneous temperature. No amount of statistic-ing or arithmetic-ing will reveal the reading that would have been recorded had they actually recorded it.” — Kip, it’s not a trick. It’s just arithmetic. We don’t have to go back and forth anymore of yes you can, no you can’t, yes you can. I did describe a simple way you can prove it to yourself in five minutes using Excel. You absolutely can determine the true underlying value very precisely, even though you don’t have a single accurate measurement. • Kip Hansen says: Steve ==> I used to do stage magic too….. 47. Kip In the past month I had been concerned that articles on this website were not as good as they used to be. Then your article showed up, and lifted this website to a higher level. Your many contributions to the comments were equally good. So, who did you hire as your ghostwriter? Just kidding, sometimes I can’t help myself • Kip Hansen says: Richard ==> My wife is my best editor and critic — she keeps me as honest as she can — it is an uphill battle but she has stuck it out for 44 years so far. • “she has stuck it out for 44 years so far”. Cruel and unusual punishment? 
• Kip Hansen says: Richard ==> It looks like it will be a life sentence…. 48. Kristi Silber says: Kip, I’m very sorry – I didn’t go through all the comments, and I may be repeating what others have said. There are so many comments discrediting scientists, I don’t have patience to go through it all. I’ve forgotten much of what I learned about calculating error, but for my purposes it doesn’t matter. (I hope this is coherent, I sort of cobbled it together.) I have a hard time believing that you don’t know why anomalies are used, but just to make sure, this is a reasonable explanation: “Absolute estimates of global average surface temperature are difficult to compile for several reasons. Some regions have few temperature measurement stations (e.g., the Sahara Desert) and interpolation must be made over large, data-sparse regions. In mountainous areas, most observations come from the inhabited valleys, so the effect of elevation on a region’s average temperature must be considered as well. For example, a summer month over an area may be cooler than average, both at a mountain top and in a nearby valley, but the absolute temperatures will be quite different at the two locations. The use of anomalies in this case will show that temperatures for both locations were below average. “Using reference values computed on smaller [more local] scales over the same time period establishes a baseline from which anomalies are calculated. This effectively normalizes the data so they can be compared and combined to more accurately represent temperature patterns with respect to what is normal for different places within a region. “For these reasons, large-area summaries incorporate anomalies, not the temperature itself. 
Anomalies more accurately describe climate variability over larger areas than absolute temperatures do, and they give a frame of reference that allows more meaningful comparisons between locations and more accurate calculations of temperature trends.” https://data.giss.nasa.gov/gistemp/faq/abs_temp.html ……………………………. ” In REAL meteorology, the recorded value for a single temperature reading is a RANGE” Why would you record a range? Don’t you mean it represents a range? Are you assuming that the measuring instruments used are marked only by whole degrees? “As Schmidt kindly points out, the correct notation for a GAST in degrees is something along the lines of 288.0±0.5K — that is a number of degrees to tenths of a degree and the uncertainty range ±0.5K. When a number is expressed in that manner, with that notation, it means that the actual value is not known exactly, but is known to be within the range expressed by the plus/minus amount.” This has little to do with rounding error, imprecision, inaccuracy, etc., it’s a function of the spread of the data in the base period. “The rest of us can know that the uncertainty of the GAST(absolute) of minimally +/-0.5K means that we can not be certain whether or not the average global temperature has changed much (or how much) over the last decade, and we may be uncertain about it over the last 40 years.” NOT AS LONG AS YOU USE ANOMALIES! The CRU and GISS series from 2010 to 2017 in your post look different, but the graphs of the data are almost exactly the same (apart from the offset). This is an illustration of why the chosen base period is (grrr, can’t think of the word. Starts with “a.” I hate that! Something like “irrelevant”) when looking at trends. One could use this as an illustration of what might happen if you imagine the two datasets were actually absolute temperature from a mountain and a valley in the same area. One set of temperatures is lower, but the trends are almost exactly the same. 
If you subtracted the baseline from each (say, 1.7 and 1.56, respectively), your trends would overlap almost perfectly. The reanalysis products are somewhat different, though with the same trends. Why did you only post the reanalyses of the absolute temperatures, and not the anomalies? Why did you not post this about the reanalyses: “In contrast, the uncertainty in the station-based anomaly products [reanalyses] are around 0.05ºC for recent years, going up to about 0.1ºC for years earlier in the 20th century. Those uncertainties are based on issues of interpolation, homogenization (for non-climatic changes in location/measurements) etc. and have been evaluated multiple ways – including totally independent homogenization schemes, non-overlapping data subsets etc.” >>>”evaluated multiple ways”<<< In other words, we don't know exactly how errors are determined from looking at this post alone. And the BEST reanalysis is not alone in its calculation of error, which changes over time; it is 0.05 only during part of the period, when there were more (and more reliable) sources of observation. Intuitively, you would expect error to decrease when you have readings from ASOS, satellites, radiosondes, as well as correlation among neighboring sites that all support a given measurement. (I don't know how many reanalyses include this variety of information for land measurements) (Reanalyses in brief: http://www.realclimate.org/index.php/archives/2011/07/reanalyses-r-us/) What about the overlap in their errors? Schmidt doesn't address the error when you use absolute temperatures as they are recorded (without "reverse engineering" them from the anomalies) to do a reanalysis, which is what most of Schmidt's post is about. 
This would be a whole different process from using anomalies to calculate absolutes, using an extremely large and variable dataset – in the original, it would contain biases, errors, different instruments… This is why people rely on adjusted data and reanalyses: all the work of figuring out the myriad measurement effects has already been done – including the error. Kip: "It does not reduce the uncertainty of the values of the data set — nor of the real world uncertainty surrounding the Global Average Surface Temperature — in any of its forms and permutations." Perhaps it would be helpful to conceptualize the issue by distinguishing between the actual variance in the data, and the uncertainty associated with measurements. The difference between a station in Zimbabwe and one in Iceland is a source of variance, but not a source of uncertainty. Likewise the mean of August temps vs January. What anomalies do is remove this variance. When discussing trends in global temperature, the error estimate of absolute temperatures is largely an artifact of the wide range of climate norms. This source of error is not attributable to measuring accuracy or precision. (NH and SH temperatures would also tend to cancel each other out.) I agree that there should be error estimates on some graphs, but the fact that there aren't doesn't mean they don't know what they are doing. In the graphs of running means, error bars might not even be appropriate – they would just be confusing, since the data that the running mean is based on are already represented in the graph – the error would be simply a measure of year-to-year variability that is already obvious from the plot. Graphing the error of each year shown in the plot, as in the BEST graph, might be more informative – but that could be supplied in a different graph or in the text, rather than cluttering the original with too much data. It depends on what one wants to represent, and also the audience. 
Those who want more detail can read the literature. The graphs presenting reanalyses in Schmidt's essay are for illustration only. ………………………….. Anomalies are appropriate for looking at global (or regional) trends, absolute temperatures are not. It is not a trick. Once you know how error is calculated in the reanalyses, THEN you can make comments about whether it's appropriate or not. It is not right to make assumptions that scientists are trying to hide something without knowing what they are doing. Here, maybe this will help: https://www.ncdc.noaa.gov/monitoring-references/docs/smith-et-al-2008.pdf • Kip Hansen says: Kristi ==> Sorry I missed this from days ago … weekends are busy for me. You have handicapped yourself by not reading all the links and background information…thus not realizing such things as the reanalysis graph is from Gavin Schmidt’s post at RealClimate….. I assure you, I know all of the justifications for using anomalies …. Gavin makes it very clear, and I have no reason to doubt him. The real calculation of GASTabsolute is just too uncertain to prove the AGW hypothesis, therefore, we must use something (anything) else that we can justifiably claim is less uncertain. I don’t think they are trying to hide something, I think they are falling back on statistical theory to allow them to fool themselves into believing that they can know that the Earth’s climate system is retaining more energy from the Sun due to CO2 — thus raising the annual average temperature by 100ths of a degree. 49. Paramenter says: Kip, real life field examples are always good as the antidote for purely mental exercises from behind the desk! In another comment Mr Frank argues however, that known uncertainty matters not so much in detecting the change. When we’ve got sufficiently large sample of the readings with known and large constant uncertainty (readings randomly distributed I presume) we can figure out mean very precisely. 
Subsequently, we can detect variations in the mean orders of magnitude smaller than order of actual readings. From the other side you argue that we can detect tiny variations in the mean indeed, that however tells us nothing about the change because those variations still will be within wide range of the original known uncertainty. Right? • Kip Hansen says: Paramenter ==> Yes, that is basically correct. • Kip Hansen says: Paramenter ==> Sorry, hit “send” too soon. The tiny variations in the mean tell us something….it tells us that there have been tiny variations in the mean — but if the original data was quantitatively uncertain — in our case +/-0.5K, then variations in the mean on the same scale or smaller still fall within the uncertainty and can not be distinguished from one another. When we know the uncertainty of the metric itself (our +/-0.5K) — and we know the values of the metric (three years given by Gavin Schmidt, for example) we can see the anomaly for ourselves — about 0.3K — and that anomaly falls within the uncertainty known to exist for that metric. A lot of fancy footwork with statistical definitions and statistical theory can’t defeat the basic scientific facts — we know the values and their uncertainty — the difference between those values (either from one another or from some base) is the anomaly — and the uncertainty of the metric devolves to the anomaly. You can do this visually by putting the metric with its uncertainty (our three values) on a graph, put the base period value on the graph (as a straight horizontal line), the anomaly is the distance between the base line and the metric. Notice that the uncertainty didn’t disappear when you measured the distance to each data point from the base line — your measurement is just as uncertain — base line to value, baseline to bottom of the uncertainty bar, baseline to the top of the uncertainty bar. No uncertainty is lost in the process. 50. 
Crispin in Waterloo says: Everyone: There is a new PDF from Agilent Technologies presented as part of their free webinars. It has a very accessible presentation of the basics of measurements and accuracy. https://www.keysight.com/upload/cmc_upload/All/Accuracy_Matters2018.pdf You may have to provide a contact email to access the document, but it is free. • Kip Hansen says: Crispin ==> Thanks for the link — downloads for me with no further effort. 51. Steve O says: Okay, so you don’t trust math… Let’s say you throw darts at a dartboard that’s projected on a wall. You’re terrible at accuracy, but at least you don’t have any systematic problem like pulling to the left. So, your inaccuracy is evenly distributed. About half the time you get the dart within 3 feet of the outside of the board. Like I said, you’re terrible. Now turn off the projector. If I come into the room after the first dart, I will have no idea where the dartboard was. It can be anywhere. But give me enough dart holes and I’ll tell you almost exactly where the board was, and I’ll even come fairly close to saying where the bullseye used to be, even though all I have is a wall full of holes, half of which aren’t even within 3 feet of the board. I can’t give you the absolute exact location, but the more dart holes I have the closer I’ll get. And I will certainly be able to get closer than +/- 3 feet. If you have a systematic error, then that’s a different story. But it is not magic, and it’s not a trick that I can give you a more accurate location based on inaccurate data. • Paramenter says: “Let’s say you throw darts at a dartboard that’s projected on a wall.” I reckon good chunk of the problems with misunderstanding comes from the fact that we perform too often gedankenexperiment instead of some real one. Everyone imagines something and then discussion around imaginary constructs goes on and on, proving only that everyone can imagine different things. Not very enlightening, is it? 
There has to be a better way. OK – my turn for gedankenexperiment 🙂 Actually, just for the settings, not a result. Experiment can be physical or virtual. Team A builds a device that outputs some precisely controlled signal. Team A knows the output value very accurately. Team B can measure output with a less accurate detector, say with the resolution +/-0.5 and then rounds readouts to the nearest integer. Team A may decide to induce slight upward or downward trend (or none) in the outputted data; trend still within known uncertainty range. After collecting a sufficient number of samples Team B may torture the numbers as they wish. But in the end of the day the Team A will ask the question: tell me, honey, have your numbers confessed anything about a trend? • Steve O says: You’ll never convince Kip that you can mathematically deduce something to a decimal point if the inputs are rounded. He’ll object that Team B used tricky math, and ignore the fact that they actually did determine the precise value. I and others have described how to use Excel to prove that what he believes is incorrect but he is not interested. He “knows” it can’t be done. It degrades the credibility of this website that this article is here. 52. Paramenter says: Hey William, thank you for sharing, really appreciated. I cheerfully embrace sampling theorem; in the fields of telecom, electronics or RF engineering this is a hard, settled science, as some would say. What I’m not entirely sure of (just yet) is its applicability to the world of temperature averages, and if so to what extent. But that may be a promising avenue to explore. ‘The twice a day sampling of temperature is mathematically invalid.’ For older records it is not even twice per day, it was just once. As Kip explained in his article in the past thermometers recorded max and min temperature per day but only median was captured in the written logs. 
So all we know from the older records is something like that: 14 C = (Tmin + Tmax) / 2 But except for telling us that sum of Tmin and Tmax was 28 it won’t tell us much more. I reckon we can securely assume that for older records daily temperature variations cannot be reconstructed due to sampling limitations. All we have is a brand new signal constructed from daily medians. Now, is it sufficient for purposes of tracking temperatures per station or region? Maybe yes, we don’t actually need reconstruct the original underlying signal, maybe not. That should be easy to verify though; I’m sure bright guys did it many times already. We don’t even need 30 years. Few recent years and a station with a good quality temperature record, sampled reasonably often round the clock. If integration of those records yields significantly different results from timeseries of daily medians that at very least suggests that your hypothesis holds the water and daily average timeseries is massively distorted compared with original temperatures. Thanks for providing examples, I shall certainly have a shot on that myself. • ” As Kip explained in his article in the past thermometers recorded max and min temperature per day but only median was captured in the written logs.” Well, if he explained that, he’s wrong (again). The written logs almost invariably reported the max and min, and not very often their mean. For the US, you can find facsimiles of the originals, many hand-written, here. • Kip Hansen says: Nick ==> Most records in the hand written logbook days contain the Tmax and the Tmin and often the Tobs (T at time of observation). No one needed nor cared about Tavg — max and min was enough for human needs. 
Weather reporting today follows the same pattern (at least on radio) reporting “yesterday’s High of… and a Low of…” I’ve never heard a weather report that gave a day’s average temperature… Not sure who misquotes me above, but I was speaking, as you know, of what was written down for either the Max or the Min — any reading from 71.5 to 72.5 was written down as “72”. • Paramenter says: ‘Not sure who misquotes me above’ That was my bad boy. I wrongly interpreted one passage from your article (‘when Climate Science and meteorology present us with the Daily Average temperature from any weather station’). I assumed that for older historical records all we have is a written log of already calculated medians. As Nick pointed out this is not correct and such logs contain Tmax and Tmin. Apologies then. Well, it looks like Mr Ward was right – for historical records we have not just one but just two samples per day. • Kip Hansen says: Paramenter ==> We may have an unhelpful third sample — the temperature at time of observation. • William Ward says: Paramenter and Kip, I have some important data to add. If someone can help me to learn how I can upload a chart image I can share a graph of some NOAA data that adds support to my assertion that (Tmax+Tmin)/2 gives a highly erroneous mean. NOAA actually provides the information for us. I didn’t realize this earlier. While doing some calculations with NOAA’s USCRN data I realized that NOAA already provides what appears to be a Nyquist compliant mean in their daily data. As you might know the USCRN is NOAA’s “REFERENCE” network – and from what I can tell, the instrumentation in this network is science-worthy. A NOAA footnote from the Daily data: “The daily values reported in this dataset are calculated using multiple independent measurements for temperature and precipitation. 
USCRN/USRCRN stations have multiple co-located temperature sensors that make 10-second independent measurements which are used to produce max/min/avg temperature values at 5-minute intervals.” NOAA still reports (Tmax+Tmin)/2 as the mean, but also calculates a mean value from the 5-minute samples (which are in fact averaged from samples taken every 10 seconds). Subtracting (Tmax+Tmin)/2 from the 5-minute derived mean gives you the error. I think the amount of error is stunning. I used data from Cordova Alaska from late July 2017 through the end of December 2017. https://www1.ncdc.noaa.gov/pub/data/uscrn/products/daily01/2017/CRND0103-2017-AK_Cordova_14_ESE.txt On some days the data was missing, so I deleted those days from the record. A side issue is what NOAA does when data is missing but I’ll put that aside for later. There are 140 days in the record. Here is a summary of the error in this record: There are only 11 days without error. The average daily error (absolute value) is 0.6C. There are 34 days with an error of greater than 0.9C. There are 2 days with an error of over 2.5C. Max error swing over 12 days: 5C. 2-3C swings over 3-4 days are common. Can I get you to take this in and advise your thoughts? Ps (Paramenter): There is not 1 example of where Nyquist does not apply if you are sampling a continuous band-limited signal. It’s as fundamental as “you don’t divide by zero”. The point in calculating a mean is to understand the equivalent constant temperature that would give you the same daily thermal energy as the complex real-world temperature signal. We talk about the electricity in our home as being “120V”. Actually the signal is about 170V-peak or 340V peak-to-peak sine wave. Saying 120V is actually 120Vrms (root-mean-squared). RMS is the correct way to calculate the mean for a sine wave. A 120V DC battery would give you the same energy as the 340V peak-to-peak sine wave. 
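The comparison William describes (a mean built from 5-minute samples versus (Tmax+Tmin)/2) can be sketched with a synthetic day. Everything in the example is an assumption made for illustration: the temperature curve is an invented asymmetric wave (a daily cycle plus a second harmonic), not USCRN data, and the coefficients were chosen only so that the day is not symmetric about its midpoint.

```python
import math

SAMPLES_PER_DAY = 288  # one sample every 5 minutes


def synthetic_day(base=10.0, daily_amp=5.0, harmonic_amp=2.0, peak_hour=15.0):
    """Hypothetical asymmetric diurnal temperature curve, sampled every 5 minutes."""
    temps = []
    for i in range(SAMPLES_PER_DAY):
        hour = i * 24.0 / SAMPLES_PER_DAY
        x = 2.0 * math.pi * (hour - peak_hour) / 24.0
        # The second harmonic sharpens the afternoon peak and flattens the
        # night-time trough, so max and min are not equidistant from the mean.
        temps.append(base + daily_amp * math.cos(x) + harmonic_amp * math.cos(2.0 * x))
    return temps


temps = synthetic_day()
sampled_mean = sum(temps) / len(temps)        # mean of all 5-minute samples
midrange = (max(temps) + min(temps)) / 2.0    # the historical (Tmax+Tmin)/2
error = sampled_mean - midrange               # sign convention used in the comment

print(f"mean of 5-minute samples: {sampled_mean:.2f} C")
print(f"(Tmax+Tmin)/2:            {midrange:.2f} C")
print(f"difference:               {error:+.2f} C")
```

For a perfectly symmetric sine wave the two numbers coincide. The difference appears only when the daily curve is lopsided, so how large it is at a real station depends on the actual shape of each day's record, which this sketch does not attempt to reproduce.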
The mean needs to be calculated with respect to the complexity of the signal or it does not accurately represent that signal. If Nyquist is followed you can calculate an accurate mean. It matters in climate science if you really care about the thermal energy delivered daily. • William Ward says: See image at this link: https://i.imgur.com/Tcr5CNo.png Y-axis is temperature in degrees C. X-axis is days. Chart shows (Tmax+Tmin)/2 error as compared to mean calculated using signal sampled above Nyquist rate. Data from NOAA USCRN. Calculations are done by NOAA. • Clyde Spencer says: William Ward, What does it look like if you create a scatter plot of the T-average versus the T-median? What is the R^2 value? • William Ward says: Clyde, See here: https://i.imgur.com/BDdaBLF.png A scatter without lines was not helpful in my opinion. Even this is not so helpful as the scale does not allow you to easily see the magnitude of the sampling error. The previous chart shows that well. I should mention “Tavg” is NOAA’s label and represents the mean calculated with the much higher sample rate. Tmean is the historical method of (Tmax+Tmin)/2. I assume you are asking for the coefficient of determination (R^2). Can you explain why you think statistics would be helpful here? I see a lot of people so eager to jump in and crunch the data – but the data is bad. Processing it is just polishing a turd. I think it will be difficult to get a full grip on the potential for just how much error is in the historically sampled data. It really depends upon how much higher frequency content is present in any daily signal. NOAA has a lot of data, I could run data for more stations to see if the error profile seems consistent. The station I randomly selected has significant error relative to the 1C/century panic we are fed. I see you have a video background. I have done work there too. Have you done much with data conversion? 
I think the issue I’m bringing up here should have significant merit, but the responses I have gotten so far show that it is not a well known concept. I’d like to get this shot at to see if it is bullet proof. If I made any mistakes in logic or calculations I’d like to know. • Clyde Spencer says: William Ward, Your most recent graph is not a scatter plot. It appears to just be a plot of both temp’s versus days. What I was asking about was a plot where the x-axis is one temperature and the y-axis is the other temperature. Yes, I was asking about the coefficient of determination. No, I don’t have a background in video. My degrees are in geology. I picked up the anecdote about NTSC while attending a SIGGRAPH lecture on color when I was involved in multi-spectral remote sensing. • William Ward says: This is the NOAA USCRN for Cullman AL for every day of 2017. (I selected Cullman because I got a speeding ticket there long ago. I was rushing to meet an important customer, but the customer’s flight was cancelled and the ticket was all for nothing. I gave up speeding that day.) https://i.imgur.com/aoUX30R.png • William Ward says: This is the NOAA USCRN data for Boulder CO for 363 days of 2017. Y-axis is temperature in degrees C. X-axis is days. Chart shows (Tmax+Tmin)/2 error as compared to mean calculated using signal sampled above Nyquist rate. https://i.imgur.com/MTXBz05.png • William Ward says: You are right Clyde – sorry to be slow on the uptake. Here (hopefully) is what you request. https://i.imgur.com/WKJ6a4A.png The ideal scatter plot would be a straight, diagonal line (if the 2 means were identical), correct? The wider the scatter, the greater the difference between the 2 methods of calculating the mean. I assume this is the point. It’s another way to take in the extent of the error from the properly sampled signal. Yes? I can send you the raw data (or you can download from NOAA) if you want to run calculations based upon your expertise. 
Thanks for sharing your background. There are so many knowledgeable people on WUWT. There is a lot of diversity with the expertise. I’m an electrical engineer and have worked as a board and chip designer, applications engineer and business manager. I have worked in the process control industry and the semiconductor industry. If you have a cable modem, cable or satellite TV box then these are likely using my chips. Likewise with many car audio systems. Data conversion, signal analysis, audio and video, and communications were at the core. I also do audio engineering and a lot of work around signal integrity through audio amplifiers and data conversion. Temperature is just another signal to be sampled and analyzed. Statistics was in the curriculum but not used regularly so I’ll leave that work to those better versed.

• Kip Hansen says:

William ==> Visually, that looks skewed to the upper right quad.

• Clyde Spencer says:

Kip,

You said, “Visually, that looks skewed to the upper right quad.” I think that the correct interpretation of that is that warm/hot temperatures occurred more frequently than cold temperatures. However, there were some cold nights(?).

• Clyde Spencer says:

William Ward,

The scatter doesn’t actually look too bad to me. But, it does make the point that the two are NOT exactly the same. Therefore, it again raises the questions of accuracy and precision. The reason I asked about the R^2 value is because that tells how much of the variance in the dependent variable can be explained by the independent variable. That is, it tells us how good a predictor the independent variable is for the dependent variable. I’m not sure which is which on your plot. However, to apply to the question of whether the median of T-max and T-min is a good predictor of the average daily temperature, the median temperatures should be plotted on the x-axis.
The regression formula might also be helpful because if the y-intercept isn’t zero, then it would suggest that there is also some kind of systematic bias.

Since you seem to be new here, you might want to peruse these submissions of mine:

Kip also has written on the topics previously, and gives a slightly different, but complementary analysis.

• William Ward says:

Clyde,

I’m thinking about the scatter plot and R^2 value… If I understand correctly, that analysis tells you how close the data falls to an ideal trendline. If my understanding is correct, this approach will not yield valuable information. What is accurate is the highly sampled dataset, and the error is the deviation of the sub-sampled dataset from that reference. Using a trendline says that the correct value is somewhere between the sets. It uses the trendline as the standard but the standard is the properly sampled dataset. The error graphs I have provided show how far off the sub-sampled data is from the correct value. I think that tells the story. Thoughts?

• Kip Hansen says:

William ==> Congrats on getting the images to appear in comments! Still, send me your whole bit and I’ll see if I can turn it into an essay here for you. My first name at the domain i4 decimal net.

• Clyde Spencer says:

Kip,

I’m not seeing any graphics on my end, like I used to. All I see are the links. Is there something I have to add to my browser to get the images automatically?

• Kip Hansen says:

Clyde ==> You ought to for most of them. This functionality dropped out for a while after the server crash a few weeks back, but appears to be working again for me. Go to the TEST page and enter the url for an image, with no specifications, only the url ending in .jpg or .png, starting and ending on its own line. See if that works.

• Clyde Spencer says:

Kip,

The last entry on the test page is from William and shows his link, but no image there either.
• Kip Hansen says:

Clyde ==> Odd – I’ll have to do some digging on that….

• Kip Hansen says:

William ==> You can email the graphic to me, with all your links and links of data sources – I will either post the graph or write a short essay to report on your finding. I have done this before for other stations – which is why I keep telling Stokes and others that the Daily Average is a Median and should not be scientifically conflated with an actual “average” (the mean) for daily known temperatures. Email to my first name at the domain i4 decimal net.

PS: Make sure to include the raw numerical data, not just graphs of it – I will want to do some calculations based on the numerical differences between Tavg and Tmean.

• William Ward says:

Thanks Kip. I’ll get a high quality package to you this weekend. I’ll send you a test email to confirm we have a link. If you can reply to confirm the connection I’d appreciate it.

• Kip Hansen says:

William ==> No test email ..my first name, kip, at i4.net

• Paramenter says:

Hey Kip, really glad to hear that you will have a shot at this subject. Please do not constrain yourself only to the short essay. If the topic is important I’m sure everyone would love to see, well, a good “novella” 😉

On a different note: I believe the consequences of undersampling of the historical records are worth a closer inspection [shy look towards Mr Ward].
A good study, beefed up with conceptual background, seasoned with comparison of actual records and sprinkled with examples – that would certainly make a treat!

• Paramenter says:

‘Ps (Paramenter): There is not 1 example of where Nyquist does not apply if you are sampling a continuous band-limited signal.’

Copy that. Of course it does apply here indeed. It applies as well, for instance, to such phenomena as accelerations and velocities, where recorded values have to meet the expectations of Mr Shannon and Nyquist if we want to reliably derive a distance or location. Temperature series are no exception here. As I said I’m a slow thinker but will get there, eventually 😉

Playing devil’s advocate, I would imagine your opponent would argue slightly differently now: granted, there is not the slightest chance that a temperature signal can be reliably reconstructed from the historical records. That means a significant error margin. But – we don’t really need that. What Mr Shannon is saying is that you need to obey his rules if you want your signal to be reconstructed exactly. If what you need is just some kind of approximation you may relax your sampling rules and still have some sensible results. As per me – that sounds reasonable for tracking temperature patterns or larger scale changes but we are talking about tiny variations over long periods of time. Not good. Not good at all.

Thanks for your calculations and charts. Interesting. I’m a bit surprised seeing that error drift is quite substantial. Error variations for Boulder are a bit terrifying. What is the error mean for that?

• William Ward says:

Hi Paramenter,

Your replies really stand out for their friendliness and humbleness. That is meant as a compliment to you. And I would not say you are a slow thinker.

You are right, opponents might say that we don’t need to completely reconstruct the original signal. Here is the reply to that.
Complying with Nyquist allows you the *ability* to fully go back to the analog domain and reconstruct the signal after sampling. The importance of this is that you know you actually have fully captured the signal. It doesn’t *require* you to convert back to the analog domain. Here is the important point. Doing any digital signal processing requires you to have fully captured the signal and use that minimum set of data – or your processing has error. Digital signal processing (DSP) sounds fancy and you may not think of calculating the mean as DSP. But it is. The math swat team will not rappel on ropes and crash through your windows and stop you from doing math on badly sampled data – but the results will be wrong. If you want to calculate the mean with any accuracy, then you need the full Nyquist compliant data set. A lot of very knowledgeable mathematicians apparently don’t know much about signal processing and are far too eager to crunch any data put in front of them.

At one point in history, just getting a general approximation of what was happening with local temperature was sufficient. So, working with an improperly sampled signal and getting an erroneous mean was not a problem. But now we have a situation where Climate Alarmism is being pushed down our throats. We are told it’s all about “science” – “scientists agree” and the “science is settled”. We are told to stop “denying science” – to stop resisting, comply with the smart people and fork over our money and change our way of living, because… well, because “science”.

As an engineer (applied scientist – someone who actually uses science in the tangible world) and as a person who loves our American Experiment in Liberty – and someone who is a sworn enemy of manipulation and institutionalized victimhood – I call bullshi+ on their “science”. It’s still-born. It’s dead-on-arrival. The Alarmists can’t quote us temperature records of one thousandth of a degree C without me shoving Nyquist up their backside.
They can’t have it both ways. If they want accuracy and precision, then they need to actually have it.

I hope my rant wasn’t too off-putting.

• Clyde Spencer says:

William Ward,

+1 Right on mark! Your next hurdle will be to convince Nick Stokes that you know more about sampling, and computing a correct mean of the appropriate precision, than he does.

• William Ward says:

Thanks Clyde,

I have seen Nick’s name and I know I have read some of his posts but have not logged anything about his positions. I have not communicated with him but look forward to the opportunity to do so. Perhaps, with Kip’s help, we can put together a post to dig into this.

• “Complying with Nyquist allows you the *ability* to fully go back to the analog domain and reconstruct the signal after sampling.”

Here is my own plot, derived from three years of hourly observation at Boulder, Colorado. The article is here, and describes how the running annual mean, shown in the plot, varies as you choose different cut-off periods for reading the max-min thermometer. The black curve is the 24 hour average.

There is quite a lot of variation of the min-max depending on reading time. That is the basis of the TOBS correction; a consistent offset matters little, as it corresponds to the effect of just having the thermometer in a different location, say. But it creates a spurious trend if the time is changed, as in the US volunteer system it often was.

But the focus on Nyquist is misplaced. Nyquist gives an upper limit on the frequencies that can be resolved at a given regular sampling rate. No-one here is trying to resolve high frequencies (daily or below). For climate purposes, the shortest unit is generally a one month average. The sampling frequency is very high relative to the signal sought. The situation is not well suited for Nyquist analysis anyway, because there is one dominant and very fixed frequency (diurnal) in that range, and the sampling is twice a day.
No-one is trying to measure the diurnal frequency.

• “The article is here, and describes…”

The article is on that page, but the direct link is here.

• William Ward says:

Nick, it’s nice to be speaking with you. Thanks for your excellent information. I can use it to beautifully drive home my points. Any suggestion that Nyquist doesn’t apply to climate measurements or is somehow ill applied is a losing position. This subject runs the risk of being scattered with discussions of what happens once you have a large pool of data and what can be done statistically, etc. So many are so eager to crunch data and don’t realize that they have failed the basics in obtaining the data. So, my approach here will be to go slow and try to address the fundamental factors to see where we align.

Why do we measure temperature? Because it is helpful to know? No. We don’t measure temperature because it is the primary goal. We measure temperature because it is the easiest way to approximate thermal energy in the system. The goal is to understand the amount of thermal energy and its changes over time.

What is the point of knowing Tmax, Tmin and Tmean? Tmax and Tmin are actually quite useless. They don’t tell us how much energy is in the system. The thermal energy in the system can only be found by integrating the temperature signal. By measuring the “area under the temperature curve”. This is the temperature at any given time *and* how long that temperature exists. Different days have different temperature profiles – different signals. You can have 200 days on record that have a Tmax of 100F and a Tmin of 60F and they all have different amounts of thermal energy. To know this, you need the information about the full signal. Tmean is an attempt to know what single constant value of temperature provides the *equivalent* thermal energy for the day that the actual more complex signal provides.
Analogy 1: Someone moves the flow control on a faucet all day long, moving smoothly and continuously, similar to how temperature would vary throughout a day. Water is flowing at different rates between, say, 1 liter/min and 10 liters/min. You can come along at several points in the day and measure the rate of flow. You can look at your data and determine the max and min flow for the day. If you have sampled properly according to Nyquist, you can integrate and determine the volume of water captured in the tub. If you had a graduated tub, you could just measure the amount of water in the tub. That is the goal – to measure the total water delivered, not the max or min. The mean simply tells you what constant flow over the same amount of time would deliver the equivalent amount of water to the tub.

Analogy 2: Think of the electric meter at your house. Its purpose is not to tell you your maximum kW draw – its job is to integrate and tell you how much energy (kWh) you used.

Nyquist is not about focusing on any particular frequencies. **It’s about capturing all of the information available**. You would have to look at the frequency content to determine how significant the content is at each frequency. Ignoring higher frequencies just means you are allowing aliasing (error) as a result of under-sampling those frequencies. With modern converters and the cost of memory, there is *absolutely no good reason to alias or lose any critical information*. Sample the signal properly, store that information and then the world of DSP is available to you – *AND* the calculations have an accurate relationship to what actually happened thermally – in the place where the signal was measured. Sampling at Nyquist is about getting an accurate digital copy of the analog signal so you can process it mathematically (without error).

Time of Observation Bias (TOB): If you are sampling with automated instruments, then there is no TOB. Someone selects the end points and keeps it consistent.
From there it can be automated. Your graph “Running annual, hourly and Tavg temperature measures, Boulder 2009-2001” perfectly illustrates my point. Yes, changing the start and end points of the day will give a different mean. Does that surprise you? You are changing the curve/signal – changing the shape of what is being integrated. Of course, there will be an offset! You are *not* measuring the same thing if you change the end points.

Now to the heart of the matter: why does climate science insist upon doing a daily mean? Well, because that is the way we used to do it. Understanding climate requires study over long periods of time and we really need to be consistent over time. Yes. But here is the rub. We used to do it wrong. So, we keep doing it wrong in order to have a record long enough to study climate. But we are still wrong! We used to poop our pants as infants. Why would anyone want to keep pooping their pants as an adult? Climate science is still pooping its pants. We are either wrong because we adhere to bad practices or we adopt good practices but have to surrender the past data which has no real value.

Calculating Tmean, whether hourly, daily, weekly, monthly or yearly, is not necessary! No other discipline of signal analysis does this. Ok, so what should we do instead?! We run an integrator in a digital signal processor (DSP). We start sampling at Nyquist and we never stop sampling. We start the integrator and we never stop the integrator. (Window and filter parameters must be specified and then they move continuously.) Long ago we could see the readout on a chart recorder. Today, we stash away the digital data and we can record the output signal of the integrator. We get a continuous, never-ending integration – a continuous running mean (as defined by the digital filter parameters). If we want to know how much the energy changed over a day or over months, years or between any 2 hours you select, we just grab the data and find the difference.
It’s the stubborn determination to calculate daily means that gets us into this trouble. There is no need for it.

In broadcast, there are program loudness requirements. The program material must maintain a certain loudness and maintain a certain dynamic range. Filters are defined, and integration takes place on the program material. This information can be used to set gain and other parameters to obtain the desired loudness range of the mastered output. These are everyday tools used in the radio, record and TV broadcast industry. A video program may last 2 hours, but there is no reason that the integrators can’t run forever (with occasional calibration).

In industrial process control, PID (Proportional, Integral, Derivative) controllers run all of the time. If you need to maintain the flow in a pipe, the fluid level in a tank or the pressure of a process, this is done by PID loops – handled in the analog or digital domain. Newer controllers are digital. You must sample according to Nyquist – you sample continuously and run your integrators continuously. The data is available, and if it is not aliased you can confidently do any DSP algorithm you want on that data. You don’t get the endpoint problems associated with the artificial daily mean requirement (TOB). You don’t get the error associated with simplistic (Tmax+Tmin)/2. If you screw up the control, then all of the liquid polymers flowing through the pipes in your billion-dollar factory solidify. You have to shut down the factory for a month and change out the pipes. That might cost hundreds of millions of dollars of lost revenue. If you screw up the control of the chemical factory you get a chlorine gas leak and kill 10,000 people in the nearby town. Proper sampling and signal analysis works in the real world.

If climate science is committed to pooping its pants, then I just wish they would do it in private and stop trying to panic the world because of it.
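The earlier claim in this comment, that days sharing a Tmax of 100F and a Tmin of 60F can still integrate to different daily means, can be illustrated with a minimal Python sketch. The two hourly profiles and the helper name `time_weighted_mean` are made up for illustration and are not station data; the integral is approximated with the trapezoidal rule.

```python
def time_weighted_mean(times_h, temps):
    """Trapezoidal integral of temperature over time, divided by elapsed time."""
    area = 0.0
    for i in range(1, len(times_h)):
        area += 0.5 * (temps[i] + temps[i - 1]) * (times_h[i] - times_h[i - 1])
    return area / (times_h[-1] - times_h[0])


def minmax_mean(temps):
    """The historical (Tmax+Tmin)/2 value."""
    return (max(temps) + min(temps)) / 2


# Hypothetical profiles (degrees F) at 0, 6, 12, 18 and 24 hours.
hours = [0, 6, 12, 18, 24]
long_warm_day = [60, 100, 100, 100, 60]   # warm for most of the day
short_warm_day = [60, 60, 100, 60, 60]    # brief midday peak only

for name, profile in (("long", long_warm_day), ("short", short_warm_day)):
    print(name, minmax_mean(profile), time_weighted_mean(hours, profile))
```

Both profiles give the same (Tmax+Tmin)/2, while the area under the curve, and so the time-weighted mean, differs between them.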

• Paramenter says:

Hey William, thanks for the kind words. Your ‘rant’ wasn’t off-putting at all, quite the contrary, please keep it going!

You are right, opponents might say that we don’t need to completely reconstruct the original signal.

I reckon Mr Stokes already verbalized that. As far as I understand his argument, Nyquist does apply here indeed and in fact should be applied. However, now our baseline temperature signal is not daily temperature but monthly averages built from daily medians. Therefore we can ignore the much higher frequency daily temperature signal without compromising the quality of records required for climate science. Also, by doing that we comply with Nyquist because the basic unit is monthly averages based on daily medians, which satisfies the sampling theorem.

Again, that may be perfectly valid for all practical applications. But if we’re talking about fine-grained analysis, losing information from higher frequencies through this crude sampling may mean that we’re well off the target.

• Clyde Spencer says:

Paramenter,

The historical baseline temperature is really of little concern. One can subtract Pi or any number you wish to obtain an ‘anomaly.’ It is more meaningful if the baseline represents some temperature with which one wants to make a comparison. However, currently, there are two time periods that are commonly used as baselines, which actually adds to the confusion. An alternative would be to take an average of the two extant baselines and DEFINE it as the climatological baseline, with an exact value, perhaps rounded off to the nearest integer. Another alternative approach would be to take a guess as to what the pre-industrial average temperature was and DEFINE it as the baseline, again as an exact value. What we are looking for is a way to clearly demonstrate small changes over time.

Now, the poor historical sampling clearly deprives us of information on the high-frequency component of temperature change, which may or may not be important when trying to understand natural variance.

I fail to see why anomalies are necessary for infilling where stations are missing. Interpolating, based on the inversely-weighted distance from the nearest stations, should work reasonably well, even using actual temperatures. Although, one might want to account for the lapse rate by taking the elevation of the stations into account as a first-order correction. We now have digital elevation models of the entire Earth, thanks to the Defense Mapping Agency. However, all interpolated readings should be flagged as being less certain than actual measurements, and probably downgraded in precision by an order of magnitude.
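As a rough illustration of the interpolation described here, the following Python sketch weights nearby stations by inverse distance and applies a lapse-rate correction for elevation. It is only a sketch under stated assumptions: flat x/y coordinates in km, a fixed standard-atmosphere lapse rate of 6.5 C per km, and the hypothetical names `idw_temperature` and `LAPSE_RATE_C_PER_M`. It is not a description of how any agency actually infills data.

```python
import math

LAPSE_RATE_C_PER_M = 0.0065  # assumed standard-atmosphere value (6.5 C per km)


def idw_temperature(target, stations, power=2.0):
    """Inverse-distance-weighted temperature at `target`.

    target:   (x_km, y_km, elevation_m)
    stations: list of (x_km, y_km, elevation_m, temp_c)
    Readings are reduced to sea level with the assumed lapse rate,
    interpolated, then adjusted back to the target elevation.
    """
    tx, ty, tz = target
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, sz, temp in stations:
        sea_level_temp = temp + LAPSE_RATE_C_PER_M * sz
        distance = math.hypot(sx - tx, sy - ty)
        if distance == 0.0:
            # Target sits on a station: use that reading directly.
            return sea_level_temp - LAPSE_RATE_C_PER_M * tz
        weight = 1.0 / distance ** power
        weighted_sum += weight * sea_level_temp
        weight_total += weight
    return weighted_sum / weight_total - LAPSE_RATE_C_PER_M * tz


# Two hypothetical stations, target midway between them at 300 m.
stations = [(0.0, 0.0, 100.0, 15.0), (10.0, 0.0, 500.0, 12.0)]
estimate = idw_temperature((5.0, 0.0, 300.0), stations)
print(f"interpolated: {estimate:.2f} C (flag as less certain than a measurement)")
```

A real implementation would also carry the flag and the reduced precision along with each interpolated value, as the comment suggests.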

• Kip Hansen says:

Clyde ==> I am afraid that while the year-span of the “climatic baseline” shifts, if I’ve understood Stokes properly, they calculate each station Monthly Mean against its own station 30-year Monthly Mean climatic baseline.

• Clyde Spencer says:

Kip,
You said, “… they calculate each station Monthly Mean against its own station 30-year Monthly Mean climatic baseline.” Actually, I would hope that it would be done that way, as it makes more sense as a way to account for micro-climates and elevation differences, and make the anomalies more suitable for interpolation of missing station data. However, I don’t have a lot of faith that is what is being done because of the amount of work involved. In any event, it still doesn’t prevent a baseline being quasi-real. That is, some integer number close to the station baseline mean could be defined as the climatic exact mean for purposes of calculating the station anomaly.
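For readers who want the arithmetic spelled out, here is a minimal Python sketch of a station anomaly computed against that station’s own calendar-month baseline. The record layout, the function names and the sample numbers are assumptions for illustration only; the thread does not confirm how any particular index actually does this.

```python
from statistics import mean


def monthly_baseline(records, start_year, end_year):
    """Mean of each calendar month over the baseline years (inclusive).

    records: dict mapping (year, month) -> station monthly mean in degrees C.
    """
    baseline = {}
    for month in range(1, 13):
        values = [
            temp
            for (year, m), temp in records.items()
            if m == month and start_year <= year <= end_year
        ]
        if values:
            baseline[month] = mean(values)
    return baseline


def anomalies(records, baseline):
    """Each monthly mean minus the baseline for the same calendar month."""
    return {
        (year, month): temp - baseline[month]
        for (year, month), temp in records.items()
        if month in baseline
    }


# Hypothetical January values for one station.
records = {(1981, 1): 2.0, (1982, 1): 4.0, (2017, 1): 5.0}
baseline = monthly_baseline(records, 1981, 1982)
print(anomalies(records, baseline)[(2017, 1)])
```

Replacing `baseline[month]` with a defined constant, as suggested above, changes only the offset of the anomaly series and leaves its differences over time unchanged.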

• Kip Hansen says:

Clyde ==> I bring it up only because I have confirmed where monthly Tavg is calculated. Stokes starts with station-level Tavg monthly values.

I agree it is a bit wonky.

• Clyde Spencer says:

Stokes,
I read your article where you claim, “But the [temperature] rise is consistent with AGW.” The rise is also consistent with the end of the last Ice Age. The trick is to untangle the two, which you gloss over.

• William Ward says:

Clyde, Kip, Paramenter, Nick and All,

I think/hope you will find the following to be good info to further the understanding. Covered here:

1) Using NOAA USCRN data, I experimented with reducing sample rates from the 288/day down to 2/day to find out how Tmean degrades. Interesting findings.
2) FFT analysis of USCRN data over 24-hour period to see how frequency content compares to results in #1. They align.
3) Jitter – the technical term for noise on the sample clock. Another explanation for why (Tmax+Tmin)/2 yields terrible results.

Point 1) It is important to understand that proper sampling requires that the clock signal used to drive the conversion process must not vary. Said another way, if we are sampling every hour then the sample must fire at exactly 60 minutes, 0 seconds. If the clock signal happens sometimes at 50 minutes and sometimes at 75 minutes, this is called clock jitter. It adds error to your sample results. The greater the jitter the greater the error. I’ll come back to this as I cover point #3, but it also applies to point #1. For this experiment I selected some data from NOAA USCRN. I provided the link previously, but here it is again.

https://www1.ncdc.noaa.gov/pub/data/uscrn/products/subhourly01/2017/CRNS0101-05-2017-AK_Cordova_14_ESE.txt

For Cordova Alaska, I selected 24 hours of data for various days and tried the following. I started with the 5-minute sample data provided. This equates to 288 samples per day (24 hr period). I then did a sample rate reduction so as to compare to 72 samples/day, 36 samples/day, 24 samples/day, 12 samples/day, 6 samples/day, 4 samples/day and 2 samples/day. The proper way to reduce samples is to discard the samples that would not have been there if the sample clock were divided by 4, 8, 12, 24, 48, 72, 144 respectively. The samples I kept have a stable clock (the discarded samples do not introduce jitter). This is the data from 11/11/2017. Table function in WUWT/Wordpress is not working – so see image at this link for the results of the comparison:

https://i.imgur.com/vtx3iII.png

After looking at several examples, it appears to me that 24 samples/day (one sample per hour) seems to yield a Tmean that is within +/-0.1C of the Tmean that results from sampling at a rate of 288 samples/day. In some cases, 12 samples/day delivers this +/-0.1C result. 4 samples/day and 2 samples/day universally yield much greater error. While there is clearly sub-hourly content in the signal, if we assume +/- 0.1C is acceptable error then 24 samples/day should work, and this suggests frequency content of 12-cycles/day. This is an estimate only based upon limited work. With 288 samples/day available I suggest it be the standard. Especially in a world where 0.1C can determine the fate of humanity.
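The sample-rate reduction described above can be sketched in a few lines of Python. This is not the spreadsheet work behind the linked table: it uses a synthetic 24-hour trace (the `COMPONENTS` list is invented for illustration) in place of the Cordova file, and it keeps every Nth sample so that the retained samples stay on a stable clock.

```python
import math

SAMPLES_PER_DAY = 288  # 5-minute spacing, matching the USCRN sub-hourly files

# Hypothetical signal components: (cycles per day, amplitude in C, phase in radians).
COMPONENTS = [(1, 5.0, -2.5), (2, 1.5, 1.0), (4, 0.4, 0.5), (12, 0.1, 0.3)]


def synthetic_day(offset_c=-3.0):
    """Build one made-up 24-hour temperature trace (not real station data)."""
    day = []
    for i in range(SAMPLES_PER_DAY):
        t = i / SAMPLES_PER_DAY  # fraction of the day elapsed
        value = offset_c
        for cycles, amplitude, phase in COMPONENTS:
            value += amplitude * math.sin(2 * math.pi * cycles * t + phase)
        day.append(value)
    return day


def decimated_mean(temps, divisor):
    """Mean of every `divisor`-th sample, as if the sample clock were divided down."""
    kept = temps[::divisor]
    return sum(kept) / len(kept)


def minmax_mean(temps):
    """The historical (Tmax+Tmin)/2 estimate."""
    return (max(temps) + min(temps)) / 2


day = synthetic_day()
reference = decimated_mean(day, 1)  # mean of all 288 samples
for divisor in (4, 8, 12, 24, 48, 72, 144):
    rate = SAMPLES_PER_DAY // divisor
    error = decimated_mean(day, divisor) - reference
    print(f"{rate:3d} samples/day: error {error:+.3f} C")
print(f"(Tmax+Tmin)/2: error {minmax_mean(day) - reference:+.3f} C")
```

With real data, `synthetic_day()` would be replaced by one day of the 5-minute temperature column from the USCRN file. The size of the errors depends entirely on the frequency content of the day in question, so the numbers from this toy signal should not be read as a confirmation of the table.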

Point 2) I did some limited FFT analysis and it supports the findings in #1 above. See image here:

https://i.imgur.com/fePXj9r.png

It shows after 30 samples/day the content is small.
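The comment does not say how the FFT was run. As one possible way to look at the frequency content of a single day, this Python sketch computes a single-sided amplitude spectrum with a direct DFT using only the standard library. The test signal is hypothetical (a diurnal cycle plus a weaker 3 cycles/day term), chosen so the expected peaks are known in advance.

```python
import cmath
import math

N = 288  # samples per day at 5-minute spacing

# Hypothetical test signal, not station data.
signal = [
    4.0 * math.sin(2 * math.pi * n / N) + 1.0 * math.cos(2 * math.pi * 3 * n / N)
    for n in range(N)
]


def amplitude_spectrum(x):
    """Single-sided amplitude spectrum via a direct DFT (O(N^2), fine for N=288)."""
    n_samples = len(x)
    amps = []
    for k in range(n_samples // 2 + 1):
        coeff = sum(
            x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n in range(n_samples)
        )
        # DC and Nyquist bins are not doubled; all others are.
        scale = 1.0 if k in (0, n_samples // 2) else 2.0
        amps.append(scale * abs(coeff) / n_samples)
    return amps


spectrum = amplitude_spectrum(signal)
for cycles_per_day, amp in enumerate(spectrum[:6]):
    print(f"{cycles_per_day} cycles/day: amplitude {amp:.3f}")
```

Bin `k` of a one-day record corresponds to `k` cycles/day, so with 288 samples the spectrum extends to 144 cycles/day. A library FFT would give the same amplitudes faster.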

Point 3) Back to (Tmax+Tmin)/2. This has (at least) 2 problems. First, the number of samples is low. From what we have seen in #1 and #2 above, we really need to sample at a minimum of 24 samples per day to capture most of the signal and get an accurate mean. Tmax + Tmin is only 2 samples and that yields a very bad result. Next, selecting Tmax and Tmin will almost certainly guarantee you will add *significant jitter* to the sampling. This is because Tmax and Tmin do not happen according to a sample clock with a 12-hour period. They happen when they happen. This is important because jitter is noise – which translates to an inaccurate sample. This leads to an inaccurate mean. What is really shocking to see, is that in many examples, just sampling the signal 2 times per day according to the 12-hour sample clock actually yields results closer to the 288 sample/day mean than if the Tmax and Tmin are used! Wow.

Summary: It’s time to start the grieving process. Climate science has to get past the denial and anger. Get through the bargaining. Go through the depression. And then get on with accepting that the entire instrument temperature record is dead. It’s time to bury the body.

My deepest sympathies are offered.

Ps – Don’t forget the sampling related error gets added to the calibration error, reading error, thermal corruption error, TOB error, siting error, UHI, change of instrument technology error, data infill and data manipulation.

• Kip Hansen says:

• William Ward says:

Hi Paramenter,

You said: “I reckon Mr Stokes already verbalized that. As far as I understand his argumentation Nyquist does apply here indeed and in fact should be applied. However, now our baseline temperature signal is not daily temperature but monthly averages built from daily medians. Therefore we can ignore much higher frequency daily temperature signal without compromising quality of records required for climate science. Also, by doing that we comply with Nyquist because basic unit is monthly averages based on daily medians what satisfies sampling theorems.”

Nick would be wrong on all of those points. If the daily mean is wrong then so is the monthly. Cr@p + Cr@p + Cr@p = A Lot of Cr@p. I just provided some info in another reply where I show we need 24 samples/day min for reasonable accuracy.

• “If the daily mean is wrong then so is the monthly.”
How do you know? Try it and see.

The obvious point here is that you are just on the wrong time scale. Climate scientists want to track the monthly average, say. So how often should they sample? Something substantially faster than monthly, but daily should be more than adequate.

In about 1969, when we were experimenting with digital transmission of sound and bandwidth was scarce, we reckoned 8 kHz was adequate for voice reproduction, where the frequency content peaked below 1 kHz.

You keep thinking in terms of reproducing the whole signal. That isn’t what they want. They want the very low frequencies. In the Boulder, Colo, example I showed, I did a running year average. There was no apparent deviance from the family of min/maxes.

“Summary: It’s time to start the grieving process.”
That is a sweeping conclusion to base on a few hours scratching around. But if you really want to prove them wrong, find something they actually do, not that you imagine, and show how it goes wrong.

• Kip Hansen says:

Nick ==> Define what it is you wish to find out about. Meteorologists are interested in temperature across time as it informs them of past and future weather conditions. Climate Science is interested in long-term conditions and changes. AGW science/IPCC science is interested in showing that CO2 levels cause the Earth climate system to retain more energy as heat.

(Tmax+Tmin)/2 does not satisfy the needs of AGW Science but is being used for that purpose.

• William,
“You don’t seem to understand this point: if you under-sample then the aliased signal gives you error in your measurement.”

It depends on what your measurement is. In this case it is a monthly average. Aliased high frequencies still generally integrate to something very small. You might object that it is still possible to get a very low beat frequency. But that would in general be a small part of the power, and in this case exceptionally so. The reason is that the main component, diurnal, exactly coincides with the sampling frequency. There is no reason to think there are significant frequencies that are near diurnal but not quite.

“Is 0.4C significant over 1 year?”
Your Cullman example corresponds to what I was looking at with Boulder. Min/max introduces a bias, as my plot showed, and it can easily exceed 0.4°C. But that is nothing to do with Nyquist. It is the regular TOBS issue of overcounting maxima if you read in the afternoon (or similar with minima in the morning).

• William Ward says:

It’s the last entry in the table… maybe you missed it? If I’m not understanding, can you clarify?

The table I provided:

https://i.imgur.com/vtx3iII.png

It shows: using 288 samples/day gives us a Tmean of -3.3C. As we reduce sample rate and come down to 72, 36 and 24 samples/day we get a result that is +/-0.1C around that value (so, -3.4 and -3.2C). 2 samples (with proper clock) gives us -4.0C. Taking the high quality Tmax and Tmin that 288 samples/day gives us and computing (Tmax+Tmin)/2 yields a mean of -4.7C. Note in this example, just using 2 samples/day (no special samples, just 2 that comply with the clocking requirements) you get a more accurate result than using the max and min of the day. That should be eye opening.

I selected this example as 1) it showed a typical roll-off of performance with lower sample resolution and 2) it explored one of the higher variations between (Tmax+Tmin)/2 and 288 sample/day mean. This is because of aliasing. Later today I’ll reply back to Nick with more on that. I’ll explain what aliasing is and use some graphics. Aliasing can give large error.

• Kip Hansen says:

William ==> Sorry, distracted this morning….

• William Ward says:

Nick,

I’m breaking my reply to your latest response up into 2 replies.

1) Nyquist and Aliasing: I’ll give a basic overview of aliasing.
2) Time-of-Observation Bias: I think what you call TOB is not TOB.

This post is #1 on Aliasing.

Aliasing is not a “very low beat frequency” as you state in your post.

Aliasing is not a function of integrating and it isn’t guaranteed to be “very small”.

First some basics. The image at the link below shows a 1kHz sine wave both in the time domain and in the frequency domain. The frequency domain shows that all of the energy of a 1kHz sine wave exists at exactly 1kHz in the frequency domain. It’s intuitive.

https://i.imgur.com/tSiGl0t.png

If we mix 2 sine waves together (500Hz and 800Hz), the time domain signal looks more complex. See image here:

https://i.imgur.com/PMrnjhx.gif

Green signal is time domain. You can see the 500Hz and 800Hz energy in the frequency domain as red spectral lines.

An even more complex signal consisting of many (but a limited number of) sine waves of various amplitudes can be generically represented in the frequency domain by this graph.

https://i.imgur.com/kItGEjK.png

The spectral lines are contained inside of an envelope. “B” shows the “bandwidth” of the signal – where the frequency content ends.

When sampling, to comply with Nyquist the sample frequency must be >= (2)x(B) or 2B.

A basic fact of sampling is that sampling *creates spectral images of the signal in the frequency domain at multiples of the sample frequency*.

The graph at this link shows the original band-limited signal in the frequency domain (top image in blue) and the spectral images generated in the frequency domain when sampling occurs. The spectral images are shown in green.

https://i.imgur.com/L7Wc393.png

This shows a properly sampled signal – sampled at or above the Nyquist rate: Fs >= 2B. The higher the sample frequency, the farther out the spectral images are pushed. There is no overlap of spectral content.

If, however, the sample rate falls below the Nyquist rate, then the spectral images are not pushed out far enough and you get overlap of spectral content. This is called “aliasing”. The amount of overlap is determined by the sample frequency. The lower the sample frequency is relative to the Nyquist rate the greater the overlap. See image here:

https://i.imgur.com/hPgub33.jpg

The dashed lines show the overlapping spectral images. When sampling occurs, you read the spectrum of the signal of interest *and* the additional aliased spectral content. They add together. This results in sampled signal with error. This error cannot be removed after the fact!
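The folding described above can be checked numerically. The sketch below is only an illustration with made-up tones: `alias_frequency` is a helper name introduced here, and the frequencies are arbitrary examples in cycles per day, not measured spectral content.

```python
import numpy as np

def alias_frequency(f, fs):
    """Frequency (same units as f) at which a tone at f appears after sampling at fs."""
    return abs(f - fs * np.round(f / fs))

fs = 2.0                                   # samples per day
for f in (0.5, 1.5, 2.0, 3.0):             # tone frequencies in cycles per day
    print(f"{f} cycles/day sampled at {fs}/day appears at {alias_frequency(f, fs)} cycles/day")

# Direct check: at the sample instants, a 1.5 cycles/day cosine is
# indistinguishable from a 0.5 cycles/day cosine.
t = np.arange(20) / fs                     # sample times in days
high = np.cos(2 * np.pi * 1.5 * t)
low = np.cos(2 * np.pi * 0.5 * t)
print(np.allclose(high, low))
```

Note that a tone at exactly the sample rate (2 cycles/day here) folds to 0 cycles/day, which means it shows up as an offset in the mean of the samples.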

2 samples/day is fine for 1 cycle per day, but there is significant signal content above 1 cycle per day! Aliasing (and jitter) explains why the daily (Tmax+Tmin)/2 has significant error as compared to a signal sampled with 288 samples/day. As per what I have shown, the daily error is many tenths of degrees C up to 2.5C per day (Boulder example)!

https://i.imgur.com/Tcr5CNo.png

Does this move your thinking on the subject?

• William Ward says:

Reply #2 of 2, regarding TOB. Note: Reply #1 was released a while ago but has not appeared yet. These might land out of order… Be sure to read both – thanks.

Nick,

I’d like to understand what you are calling time-of-observation bias (TOB). I understand TOB to be what happens when, for example, the postal worker who reads and records the temperature, reads it at 5:00 PM because that is when his shift ends, and he reads it before he goes home for the day. The actual high for the day may come at 6:17 PM. Either the high is missed completely, or if the instrument is a “max/min” thermometer, then the max may end up being read as the high for the following day.

In the situation in your Boulder CO study, that data is captured automatically by a high-quality instrument, sampled every 5 minutes. A day is defined as starting at 12:00:00 (midnight) and ending just before midnight the next day. The first automated sample for say, August 1st, is taken at 12:00:00 on August 1st. Samples are taken every 5 minutes thereafter until 11:55:00, which would be the last sample of August 1st. The next sample after that would be at midnight on August 2nd. All of the data is there for August 1st. A computer can automatically sort the temperatures and determine the max and min recorded for the day. As pointed out previously, this high quality Tmax and Tmin cannot be used to accurately generate a mean. But that aside, how can there be a TOB issue here? What would be the reason to then decide to define a day as occurring between 2AM-2AM or 5AM-5AM? Sure, this would generate a different (Tmax+Tmin)/2 mean. But this should be called a “Redefining the Definition of a Day” bias. Or maybe called a self-inflicted wound. Either way, this is just more evidence that the (Tmax+Tmin)/2 method is flawed. If you sample properly, above Nyquist, then you can’t possibly encounter a TOB. It is TOB proof. You just start sampling and never stop. Sampling properly, you can go back and run any algorithm you want on the data. You could choose to calculate a daily mean – and this would not require knowing the max or min for the day. The max and min are “in there” even if not captured overtly. The mean is just a real average using every sample. If you are more concerned with the monthly mean, then why bother to calculate it daily. Just average all of the samples over the month. No fuss. No muss. No TOB. No error. Just complying with real math laws and joining the rest of the world for how data is sampled and processed.

What am I missing? Do you see the inherent benefits of getting away from the mathematically wrong methods that are based upon the bad way we used to do it?

• William,
“Aliasing is not a “very low beat frequency” as you state in your post.”
It can appear as a range of frequencies. But it will need to be a low beat frequency to intrude on whatever survives the one month averaging, which is a bit attenuator.

I do understand Nyquist and aliasing well. I wrote about it here and on earlier linked threads.

“The dashed lines show the overlapping spectral images.”
But again, you are far away from the frequencies of interest. You are looking at sidebands around diurnal, which is 30x the monthly or below frequency band that is actually sought.

“there is significant signal content above 1 cycle per day!”
Again, it isn’t significant if you integrate over a month. Those Boulder discrepancies are HF noise. See what remains after smoothing with a 1 month boxcar filter!
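As a rough way to put numbers on the boxcar argument, the sketch below evaluates the textbook magnitude response of a moving average, |sin(πfT)/(πfT)|, for a window of T = 30 days. The function name `boxcar_gain` and the list of frequencies are choices made for this illustration, and the formula is for an idealized continuous average rather than any particular dataset.

```python
import numpy as np

def boxcar_gain(f_cycles_per_day, window_days=30.0):
    """Magnitude response of a continuous moving average of the given length."""
    # np.sinc(x) is sin(pi*x)/(pi*x), so this is |sin(pi f T) / (pi f T)|.
    return abs(np.sinc(f_cycles_per_day * window_days))

for f in (1 / 365, 1 / 30, 0.05, 0.95, 1.0, 1.05, 2.0):
    print(f"{f:7.4f} cycles/day -> gain {boxcar_gain(f):.4f}")
```

The response only describes what the averaging does to frequencies present in the series being averaged. Any component that has already folded down to near 0 cycles/day at the sampling step sits where the gain is close to 1 and passes through the average.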

• William Ward says:

Nick,

You said: “But again, you are far away from the frequencies of interest. You are looking at sidebands around diurnal, which is 30x the monthly or below frequency band that is actually sought.”

What you say is not correct.

If you sample at a rate of 2 samples/day, where the samples adhere to a proper sample clock, then you will not alias a 1 cycle per day signal. However, there is significant content above 1 cycle per day. See here the time domain and frequency plot for Cordova Alaska, 11/11/2017, from the USCRN.

Time Domain:

https://i.imgur.com/5OZrC32.png

As you know, vertical transitions, and signals that look like square waves have A LOT of energy above the fundamental frequency. As you would expect, we see this in the FFT frequency domain plot:

https://i.imgur.com/fePXj9r.png
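To get a feel for how much power sits above the fundamental in a signal with sharp transitions, here is a small sketch using an idealized square wave. It is not the Cordova data: the square wave, the 288-point record, and the helper `fraction_above_fundamental` are assumptions made for illustration, and a real day's trace will give a different figure.

```python
import numpy as np

def fraction_above_fundamental(signal):
    """Share of non-DC spectral power in harmonics above 1 cycle per record."""
    spectrum = np.fft.rfft(signal - np.mean(signal))
    power = np.abs(spectrum) ** 2
    power[1:] *= 2                      # count the mirrored negative frequencies too
    if len(signal) % 2 == 0:
        power[-1] /= 2                  # the Nyquist bin has no mirror image
    return power[2:].sum() / power[1:].sum()

n = 288                                 # one day at 5-minute spacing
t = np.arange(n) / n
square = np.where(t < 0.5, 1.0, -1.0)   # idealized square wave, 1 cycle/day
sine = np.sin(2 * np.pi * t)            # pure 1 cycle/day tone for comparison

print(f"square wave: {fraction_above_fundamental(square):.3f}")
print(f"pure sine  : {fraction_above_fundamental(sine):.3f}")
```

The same function can be applied to a day of 5-minute readings to see what share of that day's variance lies above 1 cycle/day.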

Now back to sampling laws. The spectral overlap will start after the frequency represented by the second sample on this graph. So, the long tail of energy in the spectral image lands on the fundamental of the sampled signal! Translation: HUGE OVERLAP – down near or on top of the diurnal. (Not “far away from the frequencies of interest.”)

The figure on the left side of the following image shows overlap down to the fundamental, but if the sample frequency is lower, then the image tail can cross over into the negative frequency content. At 2 samples per day your image is practically on top of the signal you are sampling!

This is proven out by calculating the error for that day between a properly sampled signal and the 2 sample/day case.

What is **even worse** is that the Tmax and Tmin do not comply with clocking requirements for the samples. This further adds jitter.

The error for this day (11/11/2017) is called out with the red arrow on this graph:

https://i.imgur.com/lI22Vd4.png

You can see that there is error every day and most days this is significant.

Nick, why do you refer to “the frequencies of interest”? If you want to know what is going on at a monthly level, then you can certainly do that with properly sampled data. But there is no way to get there with improperly sampled data. Why do you think that just looking at the 1 cycle/day component of the signal has any value? Depending upon the day, 30%-60% of the energy may be in the signal above that point. Go back and look at the spectral plot for Nov 11, 2017. You cannot get a meaningful or mathematically correct monthly average by using erroneous daily readings over that month. The only thing that is giving you any support is the averaging effect of the error. But this error does not have a Gaussian distribution and will not zero out.

I don’t see the point of using a boxcar filter on bad data. I did something more meaningful. Using more USCRN data, I selected Cullman AL, 2015. I calculated the annual mean 2 ways. 1) I took the stated monthly (Tmax+Tmin)/2 mean for each month, added them and divided by 12. 2) I took the daily samples over the entire year, added them and divided by the number of samples.

Annual (Tmax+Tmin)/2 mean: 15.8C
Annual Nyquist compliant mean: 16.1C

If we look at January 2015 and do the same exercise:

January (Tmax+Tmin)/2 mean: 3.6C
January Nyquist compliant mean: 4.1C

Doing the same for Boulder CO (2015)

Annual (Tmax+Tmin)/2 mean: 2.5C
Annual Nyquist compliant mean: 2.7C

Remember, we are using USCRN (very good) data to calculate your incorrect (Tmax+Tmin)/2 mean. In reality, the data record consists of uncalibrated max/min instruments with reading error, quantization error, sampling error (Nyquist + Jitter violations) and TOB error. The deviation from a properly sampled signal will be much greater. Sampling properly eliminates TOB, reading, and sampling error – and quantization error is reduced by a factor of 10 or more with good converters.

Following Nyquist is not optional. There are no special cases where you can get around it. The instrument record is corrupted because it fails to follow the mathematical laws of sampling.

• “bit attenuator”
I meant big ..

• William,
“I understand TOB to be what happens when, for example, the postal worker who reads and records the temperature, reads it at 5:00 PM because that is when his shift ends, and he reads it before he goes home for the day. The actual high for the day may come at 6:17 PM.”

Not really. I’ve tried to explain it in threads linked from my TOBS pictured post. The point is that when you record a max/min, it could be any time in the last 24 hrs. If you read at 5pm, there is really only one candidate for minimum (that morning). But there are two possibilities for max; that afternoon or late on the previous, and it takes the highest. If you read at 9am, the positions are reversed. The first gives a warm bias, the second cool. Another way of seeing the warm bias is that warm afternoons tend to get counted twice (at 5pm) but not cool mornings.

Anyway, you have hourly or better data, you can try for yourself. Just average the min/max for the period before 5pm, and average over some period – a year is convenient, to avoid seasonality. Then try 9 am. This is what I did with Boulder.
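A minimal sketch of that experiment is below. It runs on a synthetic hourly series rather than the Boulder record: the diurnal shape, the size of the day-to-day swings, the random seed, and the helper `minmax_mean` are all assumptions made here, so the size of the differences it prints should not be read as an estimate for any real station.

```python
import numpy as np

rng = np.random.default_rng(0)
N_DAYS = 365
hours = np.arange(N_DAYS * 24)

# Hypothetical hourly series: a diurnal cycle peaking at 15:00 plus
# day-to-day weather swings (synthetic, not the Boulder record).
diurnal = 5.0 * np.cos(2 * np.pi * ((hours % 24) - 15) / 24)
daily_offsets = rng.normal(0.0, 4.0, N_DAYS)
weather = np.interp(hours, np.arange(N_DAYS) * 24 + 12, daily_offsets)
temps = 10.0 + diurnal + weather

def minmax_mean(series, reset_hour):
    """Average (Tmax+Tmin)/2 over 24-hour windows that start at reset_hour."""
    n_windows = len(series) // 24 - 1
    windows = series[reset_hour:reset_hour + 24 * n_windows].reshape(n_windows, 24)
    return ((windows.max(axis=1) + windows.min(axis=1)) / 2).mean()

for label, hour in (("midnight", 0), ("9 am", 9), ("5 pm", 17)):
    print(f"reset at {label:8s}: {minmax_mean(temps, hour):.2f} C")
print(f"mean of all hourly values: {temps.mean():.2f} C")
```

Each window stands in for one reading of a max/min thermometer that is reset at the given hour. Swapping `temps` for a real hourly record gives the station-specific version of the comparison.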

“If you sample properly, above Nyquist,”
So what would be above Nyquist? On your thinking, nothing is ever enough. They want monthly averages. How often should they sample?

Back to TOB, it isn’t an issue if you keep it constant. But it is if you change. And the thing is, you can work out how much it will change, by analyses like I did for Boulder.

• William Ward says:

Regarding TOB.

You said: “If you read at 5pm, there is really only one candidate for minimum (that morning). But there are two possibilities for max; that afternoon or late on the previous, and it takes the highest. If you read at 9am, the positions are reversed.”

As I read what you said I think we agree. Perhaps my explanation was not well written, but I don’t think we have a quarrel over what can go wrong with reading a max/min instrument. I also agree with you that TOB is not an issue if reading times are standardized. However, you could make a point that there ought to be a universal clock that triggers readings at the same time all around the globe – not local time. The point is to know what the energy is on the planet at any given time. Using local time adds yet another source of jitter and skew, but I won’t labor this point.

The problem is that the instrument record is full of TOB and attempts to compensate for it. I’m pretty sure that the readings we have on file don’t include a time-stamp, so it is not really possible to adjust for TOB. The corruption is in the record.

Compare this to a properly sampled signal. You can’t get TOB out of it if you try.

You said: “So what would be above Nyquist? On your thinking, nothing is ever enough. They want monthly averages. How often should they sample?”

I think the USCRN specified 288 sample/day is probably good. The 5-minute rate was probably selected for a reason. There is no technology-based reason to select lower. Converters can run up into the tens of gigahertz. 288 samples/day is glacial for a converter. A basic MacBook Pro could easily handle 50,000 USCRN stations, all sampling at 288 samples/day, and run DSP algorithms to process the data, and not use more than 5% of its processing power. It would be a decade before the HDD would be full. That machine can handle audio processing of tens of channels of stereo sampled at 196,000 samples/second, 24-bit depth and do all kinds of mixing, filtering and processing in 64-bit floating point math. The only reason to keep doing it the way we do it is that we have to admit it is scientifically and mathematically flawed and that use of the data yields results loaded with error.
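The data-volume side of that estimate is easy to check with back-of-envelope arithmetic. The station count and sample rate below come from the comment; the 4 bytes per reading is an assumption made here (one 32-bit value, with no timestamps or metadata), so actual storage would be larger by some factor.

```python
STATIONS = 50_000
SAMPLES_PER_DAY = 288          # one reading every 5 minutes
BYTES_PER_SAMPLE = 4           # assumption: one 32-bit value per reading, no timestamps or metadata

samples_per_day = STATIONS * SAMPLES_PER_DAY
bytes_per_year = samples_per_day * BYTES_PER_SAMPLE * 365

print(f"samples per second, all stations: {samples_per_day / 86_400:.0f}")
print(f"storage per year: {bytes_per_year / 1e9:.1f} GB")
print(f"storage per decade: {10 * bytes_per_year / 1e9:.0f} GB")
```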

• William Ward says:

Nick,

You said: “In about 1969, when we were experimenting with digital transmission of sound and bandwidth was scarce, we reckoned 8 kHz was adequate for voice reproduction, where the frequency content peaked below 1 kHz.”

WW–> Analog (“land-line”) phones had/have a bandwidth from 300-3,300Hz. Analog filters in the phone rolled off frequencies below 300Hz and above 3,300Hz. The filters are not “brick-wall” so they roll off slowly and there is content above 3,300Hz. So when the signal was digitized it was sampled at 8k for the 8k channel you mention. The sample rate complies with Nyquist: Sample rate of 8kHz allows for frequencies of up to 4kHz without aliasing. (Sample frequency = 2x bandwidth of signal). Aliasing down that low where human hearing is strong means you will hear it. That’s why it is designed to not happen.

You said: “You keep thinking in terms of reproducing the whole signal. That isn’t what they want. They want the very low frequencies. In the Boulder, Colo, example I showed, I did a running year average. There was no apparent deviance from the family of min/maxes.”

WW –> This point about “reproducing the whole signal” keeps getting brought up – but it doesn’t apply. I have pointed out (more than once) that following Nyquist *allows* the original signal to be reproduced – not that this is the goal in and of itself. Following Nyquist is required if you want to do math on the actual signal that was there (with no aliasing error). You don’t seem to understand this point: if you under-sample then the aliased signal gives you error in your measurement. You don’t have the luxury of just ignoring the information above certain frequency because you are not interested in it. If you are not interested in it then you must filter it out in the analog domain before you sample it. That is hard to do with a mercury thermometer. With a thermocouple output you can do it. But better yet, sample it correctly and then digitally filter out what you don’t want. Now, I agree that some of the error will get averaged out. But not all. In my opinion that is a hell of a way to do science (measure poorly and pray to the averaging gods). Per your recommendation, I just tried Cullman, AL for 2017. If you take each daily (Tmax+Tmin)/2 and add them up for the year and divide by 365 you get 16.2C. If you take the 5-minute samples over the entire year and average them you get 16.6C. (That is 105,108 samples). Is 0.4C significant over 1 year? I don’t know. Certainly not to me. It’s important to note: I’m ignoring all of the other error that goes into temperature measurement – which increases the error. And don’t forget we are dealing with the USCRN where the max and min are arrived at through the 5-minute sample process – so they are accurate. I would expect far more difference if we are dealing with some of the more “average” (low grade) stations out there. 1C in a century gets headlines, so 0.4C in a year at one station might be a good thing to have fixed. I don’t really care about these measurements except for the manipulation that comes from the alarmism. 
When we get quoted records of 0.01C then I have a hard time seeing how Alarmists can have it both ways. I’m not for a second referring to you as an Alarmist Nick because I have not really read much of what you have written – I’m referring to the general Alarmism that you are aware of I’m sure.

You said: “The obvious point here is that you are just on the wrong time scale. Climate scientists want to track the monthly average, say. So how often should they sample? Something substantially faster than monthly, but daily should be more than adequate.”

WW –> You really just don’t seem to want to accept the math on this. If you want to only focus on monthly data that is fine, but the only way you get there accurately is if you sample above Nyquist — *OR* — if the averaging of these errors just happens to cancel them out. It’s just a bizarre approach to knowingly do something that violates principles of mathematics and hope that all of the errors just go away.

You said: “That is a sweeping conclusion to base on a few hours scratching around. But if you really want to prove them wrong, find something they actually do, not that you imagine, and show how it goes wrong.”

WW–> I admit I added some color – that is my finger in the eye of Alarmism. The scratching around has been more like 3+ decades – and doing so where millions/billions of dollars were at stake as well as human life. That might explain why I look cross-eyed at climate science – where people play in their playpen with numbers and nothing ever grounds in reality. Just peer reviewed papers (most of which contradict other peer reviewed papers), theories, failed predictions and panicked headlines. I think I have shown in the various posts of this discussion that (Tmax+Tmin)/2 is wrong and the process of getting it is wrong. I think you are counting on the error averaging out enough that you are comfortable with the data over months and years. If the monthly global averages shift up or down by just 0.1-0.3 this is enough to completely change a warming trend to a cooling trend or vice versa – to make or erase records. I have seen enough to be further convinced that the output of the climate science industry is not congruent with the data that underpins it.

• Clyde Spencer says:

William Ward,

You said, “Is 0.4C significant over 1 year?” Well, to put it in context, it is commonly claimed that the Earth has been warming at a rate of about 0.1 degree per decade for at least the last century. So, that error amounts to about 4 decades of warming. I’d say that was significant error.

Nick produces colorful temperature maps. However, from everything I have seen, he appears to ignore the error or uncertainty in the data and plots his maps as though the temperatures are exact and unassailable — even in sub-Sahara Africa. He is a bright and knowledgeable guy. However, if there is ever any question about the interpretation or validity of a fact related to global warming, you can bet your paycheck that he will come down on the side supporting the alarmist claims. Sometimes he is most inventive in trying to rationalize his position.

• Anthony Banton says:

“However, if there is ever any question about the interpretation or validity of a fact related to global warming, you can bet your paycheck that he will come down on the side supporting the alarmist claims. Sometimes he is most inventive in trying to rationalize his position.”

Maybe, just maybe … and you could even have considered this Clyde.
Nick (and the IPCC – not “alarmist”, as that is not the consensus), especially re data collection and statistical analysis, are correct.
I mean, stranger things have happened that a consensus of experts is neither incompetent nor scammers. (sarc)
But then that’s just my common-sense.

• Clyde Spencer says:

Anthony Banton,

Your “common sense” doesn’t seem to be helping you much in getting a reply directed to the person you want it to be seen by.

Who with any “common sense” would assert “There is no Milankovitch forcing driving it.”?

• William Ward says:

Clyde,

Thanks for the comments regarding the 0.4C error vs the 0.1C/decade warming. I appreciate and agree with what you said. Sometimes my writing doesn’t match my thinking as clearly as I’d like. My point to Nick in asking “is 0.4C significant over a year?” was, I don’t stay awake at night with any concern about a 0.4C average temperature rise. All of the evidence I see suggests this is well within the range of normal variability. I infer that you agree. But if 0.1C/decade is cause for alarm then we sure better get the data right, and that can’t happen if each station reports 0.4C or more in error just from improper sampling. From your comments, I think we agree.

Regarding those colorful maps, it boggles my mind how the hot spot can show up in sub-Saharan Africa, even though there isn’t a temperature monitoring station for a thousand miles and the surrounding temperatures are lower than the hot spot. But “ya gotta believe!” “Praise be to climate science!” “Hallelujah!” Antarctica is another head-scratcher: A landmass larger than the US and Mexico combined, with stored thermal energy on par with the oceans, and 1000 times more than the troposphere, yet the datasets include readings from about 8 monitoring stations – mostly on the coast, where it is warmer.

I just explained aliasing in a different post. I’m waiting to see if Nick’s inventiveness to rationalize his position can continue in the light of the new information.

53. Anthony Banton says:

“The rise is also consistent with the end of the last Ice Age. The trick is to untangle the two, which you gloss over.”

One problem….
There is no Milankovitch forcing driving it.
So duly untangled.

• Clyde Spencer says:

Anthony Banton,
Surely you aren’t suggesting that the astronomical relationships have ceased to exist! The alternative is that they are still in play. What is more appropriate is to say that the alignments that cause an ice age have not yet recurred. Therefore, the entanglement persists.

• Anthony Banton says:

“Surely you aren’t suggesting that the astronomical relationships have ceased to exist!”

Clyde:

I hope you are being tongue-in-cheek:

Obviously in play – but in no way contributing a forcing during this century.
I thought that would have been obvious (if not in-cheek).

• Clyde Spencer says:

Anthony Banton,

Of course it is tongue in cheek! How else would one respond to an absurd remark.

Imagine a room being cooled by an air conditioning unit. The air conditioner is removing heat originally from the sun. That is, the air conditioner, like the Milankovitch NEGATIVE forcings, is causing the room to be cooler than it would otherwise be. If the AC stops working, the room will heat up to be in equilibrium with the energy supplied by the sun. That is, we are moving in the direction of the Holocene Optimum, which existed prior to human civilization. There is no necessity to appeal to anthropogenic CO2 forcing to explain the warmer periods, so why would you demand there be exogenous forcings to explain the current warming? Milankovitch is alive and well. It is just not in the cooling mode.

• Anthony Banton says:

“It is just not in the cooling mode.”

Insolation at 65deg N is increasing Spring/early Summer, but not July/Aug etc.
Let’s call it a draw.

“How else would one respond to an absurd remark.”

Then it wasn’t “absurd” as you kenned my meaning!

54. So, rather than finding the mean by adding the hourly temperatures and dividing by 24, we get the result of Daily High plus Daily Low divided by 2.

Warmists believe their so-called GHGE is predominantly felt at night, winter, and in the Arctic. Their (high + low)/2 trick gives greatest weight to night temperatures. Their anomalies do not represent real temperatures.

• Anthony Banton says:

“Their (high + low)/2 trick gives greatest weight to night temperatures. Their anomalies do not represent real temperatures.”
Not a “trick” at all.
How else can an average temp for the historical global series be arrived at?
24 hr records are short and max/min long.

And how does it give “greatest weight” to night temperatures?

55. Paramenter says:

“Your Cullman example corresponds to what I was looking at with Boulder. Min/max introduces a bias, as my plot showed, and it can easily exceed 0.4°C. But that is nothing to do with Nyquist. It is the regular TOBS issue of overcounting maxima if you read in the afternoon (or similar with minima in the morning).”

But that would just make an original error (due to sampling rate) bigger. Making an existing error larger does not make an original error disappear.

I’m eagerly waiting for William’s findings to be presented. Let’s see how the situation looks with respect to a much larger sample and longer periods.

56. Paramenter says:

Hey Clyde:

“However, currently, there are two time periods that are commonly used as baselines, which actually adds to the confusion.”

It is confusing. I reckon one of the reasons why a large part of the public is highly skeptical about the AGW message is the confusing way the menu is presented. It’s a strange mix of arguments from authority (consensus), anger (‘just believe us, you moronic crowds!’) and unclear evidence. Even relatively straightforward topics such as measurement errors and averaging are hotly discussed, as this thread confirms.

• Kip Hansen says:

Clyde & Paramenter ==> There are more than two — some groups use the entire 20th century as the base –“compared to the 20th century average”.