Guest Essay by Kip Hansen —10 December 2022

“In mathematics, the ± sign [or more easily, +/-] is used when we have to show the two possibilities of the desired value, one that can be obtained by addition and the other by subtraction. [It] means there are two possible answers of the initial value. In science it is significantly used to show the standard deviation, experimental errors and measurement errors.” [ source ] While this is a good explanation, it is not entirely correct. It isn’t that there are two possible answers; it is that the answer could be as much as, or as little as, the “two possible values of the initial value” – anywhere between the one with the absolute uncertainty added and the one with the absolute uncertainty subtracted.
[ Long Essay Warning: This is 3300 words – you might save it for when you have time to read it in its entirety – with a comforting beverage in your favorite chair in front of the fireplace or heater.]
When it appears as “2.5 +/- 0.5 cm”, it is used to indicate that the central value “2.5” is not necessarily the actual value, but rather that the value (the true or correct value) lies between the values “2.5 + 0.5” and “2.5 – 0.5”, or, fully calculated, “The value lies between 3 cm and 2 cm”. This is often noted to be true to a certain percentage of probability, such as 90% or 95% (90% or 95% confidence intervals). The rub is that the actual, accurate, precise value is not known; it is uncertain. We can only correctly state that the value lies somewhere in that range — but only “most of the time”. If the answer is given to 95% probability, then 1 out of 20 times the value might not lie within the upper and lower limits of the range, and if to 90%, then 1 out of 10 times the true value may well lie outside the range.
This is important. When dealing with measurements in the physical world, the moment the word “uncertainty” is used, and especially in science, a vast topic has been condensed into a single word. And, a lot of confusion.
Many of the metrics presented in many scientific fields are offered as averages – arithmetic or probabilistic averages (usually ‘means’). And thus, when any indication of uncertainty or error is included, it is often not the uncertainty of the mean value of the metric (the mean carrying the original measurement uncertainty), but rather the “uncertainty of the mean” of the values. This oddity alone is responsible for a lot of the confusion in science.
That sounds funny, doesn’t it? But there is a difference that becomes important. The mean value of a set of measurements is given by the formula:
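In symbols, for n observations x₁ through xₙ, the arithmetic mean is simply:

x̄ = (x₁ + x₂ + … + xₙ) / n

That is, add up the observations and divide the total by how many there are.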

So, the average—the arithmetic mean—by that formula itself carries with it the uncertainty of the original measurements (observations). If the original observations look like this: 2 cm +/- 0.5 cm then the value of the mean will have the same form: 1.7 cm +/- the uncertainty. We’ll see how this is properly calculated below.
In modern science, there has developed a tendency to substitute for that the “uncertainty of the mean” – with a different definition that is something like “how certain are we that that value IS the mean?”. Again, more on this later.
Example: Measurements of high school football fields, made rather roughly to the nearest foot or two (0.3 to 0.6 meters), say by counting the yard-line tick marks on the field’s edge, give a real measurement uncertainty of +/- 24 inches. By some, the measurements of many high school football fields, made by a similar process, could be averaged to produce a mean, with the uncertainty of the mean reportedly reduced to a few inches. This may seem trivial but it is not. And it is not rare, but more often the standard. The pretense that the measurement uncertainty (sometimes stated as original measurement error) can be reduced by an entire order of magnitude by restating it as the “uncertainty of the mean” is a poor excuse for science. If one needs to know how certain we are about the sizes of those football fields, then we need to know the real original measurement uncertainty.
The trick here is switching from stating the mean with its actual original measurement uncertainty (original measurement error) to stating it with the uncertainty of the mean. The new, much smaller uncertainty of the mean is a result of one of two things: 1) it is the product of division, or 2) probability (the Central Limit Theorem).
Case #1, the football field example, is an instance of a product of division. In this case, the uncertainty is no longer about the length of any (or all) of the football fields. It is only about how certain we are of the arithmetic mean, which is usually only a function of how many football fields were included in the calculation. The original measurement uncertainty has been divided by the number of fields measured, in a mockery of the Central Limit Theorem.
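A minimal sketch of the two treatments in Python (the field lengths below are hypothetical, and the arithmetic simply follows the description above; the +/- 2 feet is the 24-inch uncertainty from the example):

```python
lengths = [360, 358, 361, 359, 360]   # feet, five hypothetical measured fields
u_each = 2.0                          # feet (+/- 24 inches) per measurement

n = len(lengths)
mean_length = sum(lengths) / n

# Essay's rule: the uncertainties add when the measurements are summed, and
# dividing the total AND the summed uncertainty by n leaves the original +/- 2 ft.
u_propagated = (u_each * n) / n       # still 2.0 ft

# "Uncertainty of the mean" as described in Case #1: the original uncertainty
# simply divided by the number of fields (a conventional standard error would
# divide by sqrt(n) instead) -- shrinking it to "a few inches".
u_of_the_mean = u_each / n            # 0.4 ft, about 5 inches

print(mean_length, u_propagated, u_of_the_mean)
```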
Case #2: Probability and the Central Limit Theorem. I’ll have to leave that topic for another part in this series – so, have patience and stay tuned.
Now, maybe arithmetical means are all you are concerned about – you are not doing anything practical and just want to know, in general, how long and wide high school football fields are. You aren’t going to actually order astro-turf to cover the field at the local high school; you just want a ball-park figure (sorry…). In that case, you can go with the mean of field sizes, which is about 57,600 sq. ft (about 5,351 sq. meters), unconcerned with the original measurement uncertainty. And then on to the mean of the cost of Astro-turfing a field. But, since “Installation of an artificial turf football field costs between $750,000 to $1,350,000” [ source ], it is obvious that you’d better get out there with surveying-quality measurement tools and measure your desired field’s exact dimensions, including all the area around the playing field itself you need to cover. As you can see, the cost estimates have a range of over half a million dollars.
We’d write that cost estimate as a mean with an absolute uncertainty — $1,050,000 (+/- $300,000). How much your real cost would be depends on a lot of factors. At the moment, with no further information and details, that’s what we have….the best estimate of cost is in there somewhere —> between $750,000 and $1,350,000 – but we don’t know where. The mean, $1,050,000, is not “more accurate” or “less uncertain”. The correct answer, with available data, is the RANGE.
Visually, this idea is easily illustrated with regards to GISTEMPv4:

The absolute uncertainty in GISTEMPv4 was supplied by Gavin Schmidt. The black trace, which is a mean value, is not the real value. The real value for the year 1880 is a range—about 287.25° +/- 0.5°. Spelled out properly, the GISTEMP in 1880 was somewhere between 286.75°C and 287.75°C. That’s all we can say. GISTEMPv4 mean for 1980, one hundred years later, still fits inside that range with the uncertainty ranges of both years overlapping by about 0.3°C; meaning it is possible that the mean temperature had not risen at all. In fact, uncertainty ranges for Global Temperature overlap until about 2014/2015.
(Correction: Embarrassingly, I inadvertently used degrees C in the above paragraph when it should be K — which in proper notation doesn’t require a degree symbol. The values are eyeballed from the graph. In mitigation, Gavin uses both K and °C in his original quote below. The graph should also be mentally adjusted to K. h/t to oldcocky! )
The quote from Gavin Schmidt on this exact point:
“But think about what happens when we try and estimate the absolute global mean temperature for, say, 2016. The climatology for 1981-2010 is 287.4±0.5K, and the anomaly for 2016 is (from GISTEMP w.r.t. that baseline) 0.56±0.05ºC. So our estimate for the absolute value is (using the first rule shown above) is 287.96±0.502K, and then using the second, that reduces to 288.0±0.5K. The same approach for 2015 gives 287.8±0.5K, and for 2014 it is 287.7±0.5K. All of which appear to be the same within the uncertainty. Thus we lose the ability to judge which year was the warmest if we only look at the absolute numbers.” [ source – repeating the link ]
To be absolutely correct, the global annual mean temperatures have far more uncertainty than is shown or admitted by Gavin Schmidt, but at least he included the known original measurement error (uncertainty) of the thermometer-based temperature record. Why is the real uncertainty greater than that? …. because the uncertainty of a value is the accumulation of the uncertainties of all the factors that have gone into calculating it, as we will see below (and +/- 0.5°C is only one of them).
Averaging Values that have Absolute Uncertainties
Absolute uncertainty. The uncertainty in a measured quantity is due to inherent variations in the measurement process itself. The uncertainty in a result is due to the combined and accumulated effects of these measurement uncertainties which were used in the calculation of that result. When these uncertainties are expressed in the same units as the quantity itself they are called absolute uncertainties. Uncertainty values are usually attached to the quoted value of an experimental measurement or result, one common format being: (quantity) ± (absolute uncertainty in that quantity). [ source ]
Per the formula for calculating an arithmetic mean above, first we add all the observations (measurements) and then we divide the total by the number of observations.
How do we then ADD two or more uncertain values, each with its own absolute uncertainty?
The rule is:
When you add or subtract the two (or more) values to get a final value, the absolute uncertainty [given as “+/- a numerical value”] attached to the final value is the sum of the uncertainties. [ many sources: here or here]
For example:
5.0 ± 0.1 mm + 2.0 ± 0.1 mm = 7.0 ± 0.2 mm
5.0 ± 0.1 mm – 2.0 ± 0.1 mm = 3.0 ± 0.2 mm
You see, it doesn’t matter whether you add or subtract them, the absolute uncertainties are added. This applies no matter how many items are being added or subtracted. In the above example, if 100 items (say, sea level rise at various locations), each with its own absolute measurement uncertainty of 0.1 mm, were added, then the final value would have an uncertainty of +/- 10 mm (or 1 cm).
This principle is easily illustrated in a graphic:

In words: ten plus or minus one PLUS twelve plus or minus one EQUALS twenty-two plus or minus two. Ten plus or minus one really signifies the range eleven down to nine, and twelve plus or minus one signifies the range thirteen down to eleven. Adding the two higher values of the ranges, eleven and thirteen, gives twenty-four, which is twenty-two (the sum of ten and twelve on the left) plus two, and adding the two lower values of the ranges, nine and eleven, gives twenty, which is twenty-two minus two. Thus our correct sum is twenty-two plus or minus two, shown at the top right.
Somewhat counter-intuitively, the same is true if one subtracts one uncertain number from another: the uncertainties (the +/-es) are added, not subtracted, giving a result (the difference) that is more uncertain than either the minuend (the top number) or the subtrahend (the number being subtracted from it). If you are not convinced, sketch out your own diagram, as above, for a subtraction example.
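A minimal sketch of this rule in Python (the numbers are those from the graphic; the helper names are just for illustration, and this is the essay's straight-addition rule, not the quadrature rule debated in the comments below):

```python
def add_uncertain(a, u_a, b, u_b):
    """Add two values with absolute uncertainties: the uncertainties add."""
    return a + b, u_a + u_b

def subtract_uncertain(a, u_a, b, u_b):
    """Subtract one value from another: the uncertainties still add."""
    return a - b, u_a + u_b

print(add_uncertain(10, 1, 12, 1))        # (22, 2)  ->  22 +/- 2
print(subtract_uncertain(12, 1, 10, 1))   # (2, 2)   ->   2 +/- 2
```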
What are the implications of this simple mathematical fact?
When one adds (or subtracts) two values with uncertainty, one adds (or subtracts) the main values and, in either case, adds the two uncertainties (the +/-es) – so the uncertainty of the total (or difference) is always higher than the uncertainty of either original value.
How about if we multiply? And what if we divide?
If you multiply one value with absolute uncertainty by a constant (a number with no uncertainty)
The absolute uncertainty is also multiplied by the same constant.
e.g. 2 × (5.0 ± 0.1 mm) = 10.0 ± 0.2 mm
Likewise, if you wish to divide a value that has an absolute uncertainty by a constant (a number with no uncertainty), the absolute uncertainty is divided by the same amount. [ source ]
So, 10.0 mm +/- 0.2 mm divided by 2 = 5.0 mm +/- 0.1 mm.
Thus we see that the uncertainty of the arithmetical mean of the two added measurements (here we multiplied, but it is the same as adding two – or two hundred – measurements of 5.0 +/- 0.1 mm and dividing) is the same as the uncertainty of the original values, because, in this case, the uncertainty of all (both) of the measurements is the same (+/- 0.1 mm). We need this to evaluate averaging – the finding of an arithmetical mean.
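Putting the addition rule and the divide-by-a-constant rule together gives the averaging procedure just described. A short sketch in Python (the helper name is illustrative only):

```python
def mean_with_absolute_uncertainty(values, uncertainties):
    """Sum the values and their absolute uncertainties, then divide both by n."""
    n = len(values)
    return sum(values) / n, sum(uncertainties) / n

# Two -- or two hundred -- measurements of 5.0 +/- 0.1 mm:
print(mean_with_absolute_uncertainty([5.0, 5.0], [0.1, 0.1]))    # (5.0, 0.1)
print(mean_with_absolute_uncertainty([5.0] * 200, [0.1] * 200))  # (5.0, 0.1) up to float rounding
```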
So, now let’s see what happens when we find a mean value of some metric. I’ll use a tide gauge record, as tide gauge measurements are given in meters – they are addable (extensive property) quantities. As of October 2022, the Mean Sea Level at The Battery was 0.182 meters (182 mm, relative to the most recent Mean Sea Level datum established by NOAA CO-OPS). Notice that there is no uncertainty attached to the value. Yet even mean sea levels relative to the Sea Level datum must be uncertain to some degree. Tide gauge individual measurements have a specified uncertainty of +/- 2 cm (20 mm). (Yes, really. Feel free to read the specifications at the link.)
And yet the same specifications claim an uncertainty of only +/- 0.005 m (5 mm) for monthly means. How can this be? We just showed that adding all of the individual measurements for the month would add all the uncertainties (all the 2 cms) and then the total AND the combined uncertainty would both be divided by the number of measurements – leaving again the same 2 cm as the uncertainty attached to the mean value.
The uncertainty of the mean would not and could not be mathematically less than the uncertainty of the measurements of which it is comprised.
How have they managed to reduce the uncertainty to 25% of its real value? The clue is in the definition: they correctly label it the “uncertainty of the mean” — as in “how certain are we about the value of the arithmetical mean?” Here’s how they calculate it: [same source]
“181 one-second water level samples centered on each tenth of an hour are averaged, a three standard deviation outlier rejection test applied, the mean and standard deviation are recalculated and reported along with the number of outliers. (3 minute water level average)”
Now you see, they have ‘moved the goalposts’ and are now giving not the uncertainty of the value of the mean at all, but the “standard deviation of the mean”, where “Standard deviation is a measure of spread of numbers in a set of data from its mean value.” [ source or here ]. It is not the uncertainty of the mean. In the formula given for the arithmetic mean (a bit above), the mean is determined by a simple addition and division process. The numerical result of the formula for the absolute value (the numerical part not including the +/-) is certain — addition and division produce definite numeric values — there is no uncertainty about that value. Neither is there any uncertainty about the numeric value of the summed uncertainties divided by the number of observations.
Let me be clear here: When one finds the mean of measurements with known absolute uncertainties, there is no uncertainty about the mean value or its absolute uncertainty. It is a simple arithmetic process.
The mean is certain. The value of the absolute uncertainty is certain. We get a result such as:
3 mm +/- 0.5 mm
Which tells us that the numeric value of the mean is a range from 3 mm plus 0.5 mm to 3 mm minus 0.5 mm or the 1 mm range: 3.5 mm to 2.5 mm.
The range cannot be further reduced to a single value with less uncertainty.
And it really is no more complex than that.
# # # # #
Author’s Comment:
I heard some sputtering and protest…But…but…but…what about the (absolutely universally applicable) Central Limit Theorem? Yes, what about it? Have you been taught that it can be applied every time one is seeking a mean and its uncertainty? Do you think that is true?
In simple pragmatic terms, I have shown above the rules for determining the mean of a value with absolute uncertainty — and shown that the correct method produces certain (not uncertain) values for both the overall value and its absolute uncertainty. And that these results represent a range.
Further along in this series, I will discuss why and under what circumstances the Central Limit Theorem shouldn’t be used at all.
Next, in Part 2, we’ll look at the cascading uncertainties of uncertainties expressed as probabilities, such as “40% chance of”.
Remember to say “to whom you are speaking”, starting your comment with their commenting handle, when addressing another commenter (or, myself). Use something like “OldDude – I think you are right….”.
Thanks for reading.
# # # # #
Epilogue and Post Script:
Readers who have tortured themselves by following all the 400+ comments below — and I assure you, I have read every single one and replied to many — can see that there has been a lot of pushback to this simplest of concepts, examples, and simple illustrations.
Much of the problem stems from what I classify as “hammer-ticians”. Folks with a fine array of hammers and a hammer for every situation. And, by gosh, if we had needed a hammer, we’d have gotten not just a hammer but a specialized hammer.
But we didn’t need a hammer for this simple work. Just a pencil, paper and a ruler.
The hammer-ticians have argued among themselves as to which hammer should have been applied to this job and how exactly to apply it.
Others have fought back against the hammer-ticians — challenging definitions taken only from the hammer-tician’s “Dictionary of Hammer-stitics” and instead suggesting using normal definitions of daily language, arithmetic and mathematics.
But this task requires no specialist definitions not given in the essay — they might as well have been arguing over what some word used in the essay would mean in Klingon and then using the Klingon definition to refute the essay’s premise.
In the end, I can’t blame them — in my youth I was indoctrinated in a very specialist professional field with very narrow views and specialized approaches to nearly everything (intelligence, threat assessment and security, if you must ask) — ruined me for life (ask my wife). It is hard for me even now to break out of that mind-set.
So, those who have heavily invested in learning formal statistics, statistical terminology and statistical procedures might be unable to break out of that narrow canyon of thinking and unable to look at a simple presentation of a pragmatic truth.
But hope waxes eternal….
Many have clung to the Central Limit Theorem — hoping that it will recover unknown information from the past and resolve uncertainties — I will address that in the next part of this series.
Don’t hold your breath — it may take a while.
# # # # #
There is no need for averages, a notion of certainty, absolutes, or abstraction. Only reality.
The data are clear that there is no such thing as the outgoing radiation diminishing or remaining constant while the surface warms up due to some incorrect greenhouse gas enhancement hypothesis, or because of the outcomes of CO2 doubling experiments conducted with never-validated models.
The faux environmentalists ignore the fact that the outgoing radiation is governed by the upper tropospheric humidity field which cannot be modeled by any (deterministic) global climate model.
The nature of environmental systems, down to miniscule anomalies, will not be determined by finite computation or mechanistic interpretations. Rather, the system can only be understood at such a level by consciousness and intuition.
The perceived problems are rooted in nature, and it is in nature – with our innate connection thereto – that the solutions will be found.
It is for certain that Earth system intervention will not be successful by our puny technological interventions, statistical sophistry, or ideological standpoints.
The notion of absolute uncertainty when it comes to earth system spatial sampling is boundless and unknowable. Whatever one thinks it is, multiply by a factor n based on one’s philosophy. Common methods do not apply to the infinitesimal anomalies which most of these debates involve. Furthermore, for typical environmental variables, a geometric mean is more appropriate. But even then, it is still nonsense when it comes to infinitesimal residuals subtracted from a baseline. It’s wacky stats – subject to judgement by the analyst. There is no rule or convention that will give the answer. It comes down to a reasonable sense of the phenomenon; quantification is a subject which cannot be adequately regulated by rule.
Whatever the downvoters are thinking is moot. The absolute uncertainty of temperature anomaly is infinite (which is what this is all about). The bounds one chooses to impose, and the implications drawn thereupon, are judgements. The limit of infinity is up to you. You will not find the answers in methodical tricks. One can only establish a lower bound – full stop.
Exactly. As discovered by Dr. F Miskolczi, the opacity of the atmosphere remains nearly constant. Water vapor and CO2 work together in a complex manner to achieve this balance. When CO2 is increased much of the DWIR leads to increased evaporation. As Dr. William Gray pointed out, this accelerates convection which pushes the air higher into a colder part of the atmosphere. The result is more condensation reducing the high atmosphere water vapor. The two main components of upward radiation through the atmosphere work together to keep the energy flow consistent.
Yes, we get more rainfall as CO2 increases. Exactly what is needed to balance the growing of plants and arable land.
There is no warming effect.
“When it appears as “2.5 +/- 0.5 cm”, it is used to indicate that the central value “2.5” is not necessarily the actually the value, but rather that the value (the true or correct value) lies between the values “2.5 + 0.5” and “2.5 – 0.5”, or fully stated calculated “The value lies between 3 cm and 2 cm”.”
It actually doesn’t mean that in statistical usage. It means there is a distribution of values about 2.5 with standard deviation 0.5. That means that there is about a 2/3 probability that the value lies in the range. It is rare that you can be certain that a value lies in a range like 2 to 3.
So the essay drifts into nonsense about absolute uncertainty. The notion of absolute temperature (as opposed to anomaly) is a red herring here.
I’d strongly recommend, if you want to talk about “absolute uncertainty”, find a reference to where some more authoritative person talks about it, and be rather careful in quoting what they say. Otherwise you’ll only have a straw man.
“ find a reference to where some more authoritative person talks about it”
I spoke too hastily here; I see that you have done that. But I don’t think your references are very authoritative. The definition of “absolute uncertainty” that you quote is just distinguishing from relative uncertainty, given as a % of the mean. But the arithmetic that you quote comes from dubious sources.
In fact error should add in quadrature. So adding 5.0 ± 0.1 mm + 2.0 ± 0.1 mm should give
7.0 ± 0.1414 mm, i.e. sqrt(0.01 + 0.01).
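A quick sketch of that quadrature rule in Python (assuming the two uncertainties are independent standard uncertainties; the function name is just for illustration):

```python
import math

def combine_in_quadrature(*uncertainties):
    """Root-sum-square combination of independent standard uncertainties."""
    return math.sqrt(sum(u * u for u in uncertainties))

# 5.0 +/- 0.1 mm plus 2.0 +/- 0.1 mm
print(combine_in_quadrature(0.1, 0.1))   # 0.1414... mm
```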
I should follow my own prescription there and give authority. One commonly quoted, here and elsewhere, is the GUM; “Guide to the expression of uncertainty in measurement”
In sec 2, it says:
“2.3.1 standard uncertainty
uncertainty of the result of a measurement expressed as a standard deviation “
“2.3.4 combined standard uncertainty
standard uncertainty of the result of a measurement when that result is obtained from the values of a number of other quantities, equal to the positive square root of a sum of terms, the terms being the variances or covariances of these” (ie quadrature)
You *always* ignore the fact that the GUM assumes multiple measurements of the same thing which generates a normal curve where the random errors cancel. Thus the mean becomes the average of the stated values whose uncertainty is the standard deviation of the stated values.
Note carefully the phrase “the positive square root of a sum of terms, the terms being the variances or covariances of these”. A single measurement cannot have a variance since there is only one data point. What that single measurement can have is an uncertainty interval which one can treat as a variance.
If the result of a measurement is a result of the value of a number of other quantities (i.e. multiple single measurements of multiple different things) then the standard uncertainty is the SUM of the uncertainty intervals for each single measurement (done using the root-sum-square addition method). This is *NOT* the same thing as the standard deviation of the stated values, it is a sum of the uncertainties of the stated values.
Stokes is just another GUM cherry picker.
Incorrect. You first evaluate the distribution shape of the uncertainties in the calculation.
Often, but not always, they will have approximately normal distributions.
When that is the case, the uncertainties add IN QUADRATURE.
For example, (25.30+/- 0.20) + (25.10 +/- 0.30)
= 50.40 +/- SQRT(0.20^2 + 0.30^2) = +/-0.36
You would report the result as 50.40 +/- 0.36
KB,
First, you didn’t assume any specific distribution for the uncertainties you give. +/- 0.2 and +/- 0.3 are intervals and you have not assigned any specific probabilities to any specific values in those intervals.
Individual, single measurements of different things have *NO* distribution of uncertainties – e.g. Tmax and Tmin.
Thus there is no distribution shape to evaluate.
You are, like most of the climate astrologers on here, only trained in distributions that are random, i.e. multiple measurements of the same thing which CAN, but not always, result in a random measurement distribution.
How do you evaluate a distribution for one single value of temperature, e.g. Tmax? The only thing you can assume is that there is ONE value in the uncertainty interval which is the true value, meaning the probability of that value being the true value is 1. All the other values have a probability of zero of being the true value.
That is, in essence, what you have followed in your example. If there was a distribution involved with the uncertainty interval values then what is it? Would 0.1 be a higher probability than 0.2 for the first measurement?
I said that you assign the distribution shapes at the outset. You are correct that the rest of my example assumes an approximately normal distribution for both uncertainties.
But they don’t need to be. There are established ways to obtain the standard deviation of a rectangular distribution, a triangular distribution or even a U-shaped distribution.
Having done that, they are added in quadrature just like the normally distributed uncertainties.
Assigning an uncertainty to one individual thermometer reading is easy. Look at its calibration certificate and it should tell you the uncertainty. It should also tell you the effective degrees of freedom.
That calibration uncertainty will take into account the uncertainty on the standard against which it was measured and a number of other individual uncertainties. Because it is a combination of several individual uncertainties it is usually safe to assume the uncertainty is approximately normally distributed.
A does not imply B here. A single air temperature measurement has no distribution because the sample size is always and exactly equal to one. Tim raised this point and you ignored it.
A single air temperature measurement does have a distribution. Every measurement has an associated uncertainty as we well know.
I’ve repeatedly said the same thing in other posts so I have not ignored the point at all.
And it is still nonsense.
It is generally recommended that the “.40” be truncated to “.4” and the “0.36” be rounded up to “0.4” when the uncertainty is so large as to impact the “4” in all cases. Thus, despite starting with 4 significant figures in the addends, the sum is best characterized with 3 significant figures because of the impact of the large uncertainty.
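A rough sketch of that rounding convention in Python (conventions vary between style guides; this version simply rounds the uncertainty up to one significant figure and matches the value to it):

```python
import math

def round_result(value, uncertainty):
    """Round the uncertainty up to one significant figure, then match the value."""
    exponent = math.floor(math.log10(abs(uncertainty)))
    step = 10.0 ** exponent
    u_rounded = math.ceil(uncertainty / step) * step
    decimals = max(0, -exponent)
    return round(value, decimals), round(u_rounded, decimals)

print(round_result(50.40, 0.36))   # (50.4, 0.4)
```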
It can be impossible to “evaluate the distribution shape of the uncertainties” for a small number of samples, despite being able to calculate the SD. However, at the least, one should note that one is assuming the distribution is close enough to being normal that one is justified in the assumption. There is a theorem for calculating a lower-bound of the SD even when the samples are not normally distributed.
Yes there are rounding conventions also, but I left out that discussion in the interests of brevity.
Frankly there is so much wrong with this article on even basic levels that I thought it best to leave that out for the time being.
Which, by convention, is displayed with the last significant figure implying an uncertainty of ±5 in the next position to the right.
Nick ==> Yes, the statistical animal “standard uncertainty” is different — it is not the same as “absolute uncertainty” which is a known uncertainty of a value. Temperature records, rounded to the nearest whole degree, have a known absolute uncertainty of +/- 0.5°C – we do not know what the measured temperature was any closer than that one degree range.
Kip,
You’re misreading the notion of absolute uncertainty, at least in the definition you quote. It isn’t a definition of a different kind of uncertainty. It is just making the distinction between the ± quantification of uncertainty, and relative uncertainty, which is expressed as a % or fraction.
“Temperature records, rounded to the nearest whole degree, have a known absolute uncertainty of +/- 0.5°C.”
That is not a useful notion of uncertainty, which is actually greater. If you see a temperature quoted as 13°C, then certainly whoever wrote it down thought 13 was the closest. But you can’t be certain that it wasn’t, as a matter of measurement accuracy, actually closer to 14, or even 15. That is why meaningful physical uncertainty is expressed via a distribution, with a standard deviation.
I’m not sure where you got these definitions from. In my real-world experience, absolute uncertainty is that which you can define for any measurement of a physical attribute and does not depend on the size of the measurand. Relative uncertainty is that uncertainty which depends on the size of the measurand and is not typically measured directly.
The uncertainty in the length of a board is directly x +/- u. If the board was 10′ long the uncertainty would still be the same -> u.
The relative uncertainty of the volume of a cylinder is (2 × u(R)/R) + (u(H)/H), where u(R)/R and u(H)/H are “relative uncertainties”. The larger the volume of the cylinder, the larger the uncertainty will be.
Nick ==> It is called an “absolute uncertainty” because we know the range — it must be at least 0.5°. The value was rounded to a whole degree. That is the known measurement uncertainty.
That it could be higher is true. What it cannot be is LOWER.
Kip Hansen said: “It is called an “absolute uncertainty” because we know the range — it must be at least 0.5°”
It is called an “absolute uncertainty” at least in so far as the definition you provided because the units are in degrees (ie. K or C) and not %.
If the 0.5 uncertainty you speak of here is bounded at -0.5 and +0.5, it is called a uniform or rectangular distribution. It can be converted into a standard uncertainty via the canonical 1/sqrt(3) multiplier, thus yielding 0.289 K. You can then plug this standard uncertainty u = 0.289 K into the law of propagation of uncertainty formula or one of its convenient derivations.
bdgwx,
How many true values are there inside an uncertainty interval? If there is systematic uncertainty how does that impact the probability distribution inside the uncertainty interval?
bdgwx ==> Why oh Why would we want to convert a known numerical, perfectly correct and accurate absolute uncertainty of a measurement to the statistical animal called “standard uncertainty”?
KH said: “Why oh Why would we want to convert a known numerical, perfectly correct and accurate absolute uncertainty of a measurement to the statistical animal called “standard uncertainty”?”
Because it is required to propagate it through measurement models (functions or equations) that depend on it.
And more nonsense here.
Kip,
In regard to higher uncertainties:
This may be outside the considerations of the topic, but there are one or more other uncertainties in many measurements, not generally considered as far as I can see. A simple example is a measuring tape for length of objects. I believe a thermometer would be similar. Perhaps, for practical purposes, these uncertainties are considered too small to be relevant, but they must be real?
As example, one measures a board of lumber with a tape measure. Perhaps it is reasonable to read the measurement to 1/8″, thus it would be quoted as x and y/8 inches +/- 1/16 inch. There are three assumptions in such a quote, none of which is true.
No real board is a perfect geometric figure. Just where one places the tape on the board can determine what one reads. If one is measuring the length of a 6″ wide board, how many different readings might one get that depend on just where along the 6 inch ends one places the tape? Further, from any of those possible placements, what is the variation in angle along the board necessary to change the length read by 1/8″?
The tape itself cannot be perfect. There is some difference, probably too small to see on a good tape, but still there, between every 1/8″, 1/4″, … 1″. On a long enough board those might mean 1/8″ error or more. Measured with another tape the reading might be 1/8″ or more different. For lumber this might be irrelevant nonsense, but what about sea level quoted in mm to three decimal places?
I used “error” in the last paragraph because it seems to me the instrument itself (tape measure, thermometer) is in error rather than “uncertain” relative to the standard, which is itself most likely not absolute to the quantum uncertainty level, but may be the best currently possible.
Andy ==> Interesting questions — and I’ll offer some insights.
IT DEPENDS. In that, I have sons who have all been in the building trades, house carpenters, at least in their youth. In house carpentry, framing stick houses, measurements made to 1/8″ are considered plenty accurate — stretch tape, strike a line with a carpenter’s pencil or even a nail, cut with a circular or chop saw. Nail the board in place. All the sloppiness of that process kind of works itself out in the end.
But for finish work, mouldings and cabinetry, much more precision is needed. And cabinets, for instance, have techniques that hide the smallest errors, but only up to a point.
Measuring instruments ALL HAVE uncertainties — PLUS the uncertainty of the measurer — the human eye and hand.
For wood inlay work, precision required is incredible.
But for measurements, when we say uncertainty, it means just that. We are not certain of the actual “length” — we meant to measure and cut to 8″ — but we won’t know what we got unless we measure the resulting board after the fact to the desired precision (which may be 1/8″ in carpentry but not cabinetry).
Kip, why do we need to have absolute uncertainty when we are dealing with relative changes in temp over time?
Thinking ==> We don’t need it, but we got it because that’s how it was measured and recorded.
While what you say is true, the accepted procedure is to record what it appears to be closest to, and the implication being that unless a gross error was made, it will have an uncertainty of ±0.5 units. Of course we can never be absolutely certain that the numbers weren’t transposed, or invented because the measurer didn’t want to go out in the rain, snow, dust storm, heat, you name it. Thus, all uncertainties should be considered a lower bound, with the small, but finite probability the uncertainties could possibly be larger. However, under best practices, a temperature recorded to the nearest degree implies an uncertainty of ±0.5 degree.
A single measurement does not have a standard deviation! Remember the old saying: “You only have one chance to make a first good impression.” Similarly, you only have one chance to record a temperature in a time-series. After the first reading, it is a different parcel of air that gets measured. Or, as Heraclitus said, “You cannot step into the same river twice, for other waters are continually flowing on.”
Clyde:
“However, under best practices, a temperature recorded to the nearest degree implies an uncertainty of ±0.5 degree.”
I assume that by this you mean any systematic bias has been reduced to a value smaller than the resolution of the measurement.
This is *not* a good assumption for field sited temperature measuring stations where the systematic bias (e.g. calibration drift, etc) is highly likely to be greater than the actual resolution of the instrument itself.
“A single measurement does not have a standard deviation! “
Hallelujah!
Strictly speaking, you are right. However, the unstated assumption for those claiming they can improve precision with many measurements is that those systematic biases don’t exist.
This is a rectangular distribution and there is a standard way of dealing with it. Imagine we have a temperature reading of 280K, reported to a precision of 1K.
We say that any value between 279.5 and 280.5K is equally likely. It is a rectangular distribution, not a normal distribution.
The standard uncertainty is the semi-range divided by SQRT(3). In this case it would be 0.5/SQRT(3) = 0.289.
Let’s say we want to find the difference between that and another temperature reading of 270K, again reported with a precision of 1K.
The estimate of the difference in temperature is of course 280-270 = 10K
The uncertainty is SQRT(0.289^2 + 0.289^2) = +/- 0.408.
That 0.408 is the one-standard deviation uncertainty. To obtain the approximate 95% confidence range you would multiply by 2.
0.408 x 2 = 0.816
You should then round this value and obtain the result
10.0 +/- 0.8 K
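A sketch of that calculation in Python, following the steps above (the sqrt(3) divisor is the standard conversion for a rectangular distribution):

```python
import math

def rectangular_standard_uncertainty(half_width):
    """Standard uncertainty of a rectangular distribution with the given half-width."""
    return half_width / math.sqrt(3)

u1 = rectangular_standard_uncertainty(0.5)        # ~0.289 K, reading rounded to 1 K
u2 = rectangular_standard_uncertainty(0.5)        # ~0.289 K, second reading

difference = 280 - 270                            # 10 K
u_difference = math.sqrt(u1 ** 2 + u2 ** 2)       # ~0.408 K, one standard deviation
u_95 = 2 * u_difference                           # ~0.82 K, approximate 95% coverage

print(f"{difference} +/- {u_95:.1f} K")           # 10 +/- 0.8 K
```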
Uncertainty is not typically a rectangular distribution. Not all values are equally probable for being the “true value”, especially when you have only one measurement to look at. There will be one value that has a probability of 1 and all the others will have a probability of zero. The issue is that you don’t know which value has a probability of 1.
You are still stuck in the box of assuming you have multiple values in a distribution from which you can evaluate what the distribution might be. With just one measurement and one uncertainty interval you do not have multiple values upon which to assess a probability distribution for the uncertainty. All you can do is assume that it is unknown.
I didn’t say uncertainty is typically a rectangular distribution. I said in this particular case of digitally rounding a thermometer reading it could be treated like that.
If the reading is (say) 50 degrees, the distribution of the uncertainty component due to digital rounding is rectangular. It can be anywhere between 49.5 and 50.5 with equal probability across the range.
This is a separate uncertainty component to the bias on the thermometer.
KB ==> If we were looking for a standard deviation or standard uncertainty, you would be perfectly correct.
However, we are looking for an arithmetic mean which is found using simple arithmetic. It does not involve statistical processes or any such — its results are precise.
If you want something “less uncertain” than the precise arithmetic mean and its precise absolute uncertainty — you will be hard pressed and will only end up with a statistical WAG — (wild ass guess).
The Central Limit Theorem will not give you a physically less uncertain answer.
KH said: “However, we are looking for an arithmetic mean which is found using simple arithmetic. It does not involve statistical processes or any such — its results are precise.”
The measurement model Y is found using simple arithmetic.
The uncertainty of the measurement model u(Y) is found using statistical processes or established propagation procedures like what is found in JCGM 100:2008 section 5.
This is true even when the measurement model Y is but a trivial sum or average.
Did you even read the main article? Apparently not.
bdgwx:
“The uncertainty of the measurement model u(Y) is found using statistical processes”
What do you statistically analyze when the data set consists of one value?
Even the GUM, see annex H, assumes that you have an SEM – which requires multiple measurements of the same thing which can generate a standard deviation.
If you have one measurement then there is no standard deviation or SEM, only a u_c, an estimate of the uncertainty. When you have multiple single measurements those u_c’s add, either directly or by root-sum-square.
This is the most important idea that neither Nick nor KB have attempted to address.
They can’t address it. It’s outside their little box that all measurements have only random error that cancels and all measurements are of the same thing.
I’ve posted multiple times to address this very issue.
How do you manufacture a sampling population from exactly one measurement?
What you are describing is a simple digital rounding uncertainty. There is an established way of dealing with those, but it is nowhere near what you describe.
This is only one component of combined uncertainty, there will be others. It does not tell you the “error”.
Correct, you should have a full uncertainty budget for the temperature measurement. The digital rounding uncertainty is but one component of this uncertainty budget.
However I am trying to deal with things one step at a time.
Kip is using combined uncertainties that are already known, not doing a UA.
KB ==> Do try to be pragmatic! With the rounding they did (it is mostly in the past) we cannot know the true values — we cannot recover the true values using any method whatever (unless you have a time machine). If +/- 0.5° is too uncertain for your application, say you want to produce accurate and precise means to the x.xxx ° — then you are out of luck. Any result you propose/calculate/statistical-ate will be nothing more than a Wild Ass Guess.
Because, we have no idea better than X +/- 0.5° for the values of the original measurements.
Kip,
It doesn’t matter that much or at least as much as you think. The reason is because the focus is on the monthly global average temperature and the uncertainty on that value. The formula for the propagation of uncertainty based on the 0.5 rounding uncertainty through a measurement model that computes an average is u(avg) = u(x)/(sqrt(N)*sqrt(3)). A typical global monthly average can incorporate 50,000 values or more. If those values are uncorrelated then that 0.5 C rounding uncertainty propagates as 0.5/(sqrt(3) * sqrt(50000)) = 0.001 C when averaging them. This is just the way the math works out whether some people refuse to accept it or not. My sources here are JCGM 100:2008, NIST TN 1297, and others and the NIST uncertainty machine.
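A sketch of that arithmetic in Python (N = 50,000 is the hypothetical count used above, and the calculation assumes the rounding uncertainties are uncorrelated):

```python
import math

half_width = 0.5   # K, rounding half-width of each recorded value
N = 50_000         # hypothetical number of values in the monthly average

u_standard = half_width / math.sqrt(3)   # rectangular -> standard uncertainty, ~0.289 K
u_average = u_standard / math.sqrt(N)    # propagated through a simple average

print(f"{u_average:.4f} K")              # ~0.0013 K
```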
And don’t hear what hasn’t been said. No one is saying the uncertainty of individual observations doesn’t matter. They do. No one is saying the individual observations themselves doesn’t matter. They do. No one is saying that the 0.5 rounding uncertainty is the only uncertainty that needs to be considered. It’s not. Nor is anyone saying the monthly global average temperature is computed with a trivial average of individual temperature observations. It’s not. Nor is anyone saying that the uncertainties are uncorrelated. They aren’t. Nor is anyone saying that the total combined uncertainty is 0.001 C or millikelvin or any of the other nonsense the contrarians are creating as strawmen. It’s not. So when the contrarians start trolling this post, and they will, I want you know what wasn’t said or claimed.
And another spin of the trendology hamster wheel…
Only in certain circumstances, when you are measuring the same thing with the same instrument.
Seem Nick skips some courses on basic statistics. !
Where do people here come up with that? There is no rule I’ve ever been shown that says there is a difference in propagating uncertainty when measuring the same thing rather than different things. The rules for propagating uncertainties in quadrature are normally used when you are adding or subtracting different things.
You didn’t even bother to read what Stokes posted did you?
“2.3.4 combined standard uncertainty
standard uncertainty of the result of a measurement when that result is obtained from the values of a number of other quantities, equal to the positive square root of a sum of terms, the terms being the variances or covariances of these” (ie quadrature) (bolding mine, tpg)
When you have single measurements of different things none of those measurements have a variance since there is only one data point. You need to have multiple data points in order to have a variance!
In this case the uncertainty of those single measurements becomes the variance. Those single measurements become the “number of other quantities” and their uncertainties (variances) add by root-sum-square.
You add by root-sum-square because there will probably be *some* cancellation. If this cannot be justified then a direct addition of the uncertainties should be done.
Please note that 2.25 does *NOT* say the standard deviation of the sample means becomes the uncertainty but, instead, the root-sum-square of the uncertainty of the terms becomes the standard uncertainty.
What do you think “other quantities” means? The rules for propagating uncertainties or errors are the same regardless of whether the different quantities are different measurements of the same thing or measurements of different things.
What do *YOU* think “ sum of terms” means? If there are multiple terms, i.e. other quantities, you *still* add the uncertainties by root-sum-square!
I’m getting lost in your argument. My point is it doesn’t matter if the things being added are different measurements of the same thing or multiple different things. I’m really not sure what point you think you are making.
“My point is it doesn’t matter if the things being added are different measurements of the same thing or multiple different things. I’m really not sure what point you think you are making.”
This statement only highlights that you haven’t learned anything about measurements no matter how much you cherry pick bits from different places.
Multiple measurements of the same thing *usually* (but not always) forms a probability distribution that approaches a Gaussian. Thus you can assume that positive errors are cancelled by negative errors. This allows you to estimate the true value by averaging the stated values and calculate an estimate of the uncertainty using the standard deviation of stated values.
Multiple measurements of different things probably do *NOT* generate a distribution amenable to statistical analysis where uncertainties cancel. I’ll give you the old example of two boards, one 2′ +/- 1″ and one 10′ +/- 0.5″. Exactly what kind of distribution do those two measurements represent? What does their mean represent? It certainly isn’t a “true value” of anything!
“Multiple measurements of the same thing *usually* (but not always) forms a probability distribution that approaches a Gaussian.”
I think you have that back to front. All the measurements are coming from a probability distribution that may or may not be Gaussian. If you took a large number of measurements you may be able to estimate what that distribution is, but the measurements are not what forms the distribution. Even if you only take one or two measurements, they are still coming from the distribution, but you won’t be able to tell what it is just by looking at them.
“Thus you can assume that positive errors are cancelled by negative errors.”
Once again, this has nothing to do with the distribution being Gaussian, or even symmetric.
“This allows you to estimate the true value by averaging the stated values and calculate an estimate of the uncertainty using the standard deviation of stated values.”
That’s one way, but in the discussions about propagating the measurement uncertainties, the assumption is you already know the uncertainties, and can estimate the uncertainties of the mean from them. If you don’t, you are going to have to take a large number of measurements just to get a realistic estimate of the standard deviation.
Continued.
“Multiple measurements of different things probably do *NOT* generate a distribution amenable to statistical analyse where uncertainties cancel.”
You just keep saying this as if that makes it true. Multiple measurements of different things will still form a probability distribution which for the most part will be the same as the distribution of the population. The measurement errors just add a bit of fuzziness to this and increase the variance a little.
However, the real problem here is that you keep conflating these two things. The estimate of the measurement uncertainty with the variation of the population. If, as we have been discussing, you only want the measurement uncertainty then the distribution you want is the distribution of the errors around each thing. E.g.
“I’ll give you the old example of two boards, one 2′ +/- 1″ and one 10′ +/- 0.5″.”
Rather than mess about with antique units I’ll convert that to 20 ± 1cm and 100 ± 0.5cm.
Here you have one measurement with an error taken from the ±1cm distribution and another taken from the ±0.5cm distribution. The actual error for each could be anything within that range, but you won’t know what it is because you don’t know the actual size of the boards. Maybe your first board has a reading of 20.8cm, and the other 99.6cm. Your rather pointless average is 60.2cm. There’s little point in worrying about the standard error of the mean given the tiny sample size, and the fact I’ve no idea what population you are trying to measure.
But we do know the measurement uncertainties, and with the usual caveats about independence, we can say the measurement uncertainty is sqrt(1^2 + 0.5^2) / 2 ~= 0.56cm. So the actual average of the two planks could be stated as 60.2 ± 0.6 cm.
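That calculation, sketched in Python (the 20.8 cm and 99.6 cm readings are the hypothetical ones above, with the stated +/- 1 cm and +/- 0.5 cm uncertainties treated as independent):

```python
import math

readings = [20.8, 99.6]   # cm, hypothetical readings of the two boards
u = [1.0, 0.5]            # cm, stated measurement uncertainties

mean = sum(readings) / len(readings)                          # 60.2 cm
u_mean = math.sqrt(sum(x ** 2 for x in u)) / len(readings)    # ~0.56 cm

print(f"{mean:.1f} +/- {u_mean:.2f} cm")                      # 60.2 +/- 0.56 cm
```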
“Exactly what kind of distribution do those two measurements represent?”
Impossible to say without an explanation of the population you are taking them from. Two values are just not enough to tell.
“What does their mean represent?”
Again depends on what you are trying to do. If all you want to know is what is the exact average of those two boards, that’s what it represents. If you are trying to make a statement about a population it represents the best estimate of that populations mean, but with a sample size of two the uncertainties are enormous.
“It certainly isn’t a “true value” of anything!”
It’s an estimate of the true value of the mean, whichever one you are looking for. If you want an exact average, the use you could put it to is to estimate the total length of the two boards joined together. It will be the average times two, and the uncertainty will be the uncertainty of the average times two: 120 ± 1.2 cm.
“ The actual error for each could be anything within that range, but you won’t know what it is because you don’t know the actual size of the boards. “
The error has nothing to do with the length of the boards, it has to do with the uncertainty of the measurement device, at least if we assume the same measuring environment.
An uncertainty interval has no distribution since the probabilities for each value in the interval are not known.
“ So the actual average of the two planks could be stated as 60.2 ± 0.6 cm.”
This is EXACTLY what we’ve been trying to teach you for two years. Welcome to the right side.
“Impossible to say without an explanation of the population you are taking them from. Two values are just not enough to tell.”
The uncertainties and the measurements are two different things. How do you tell the distribution inside the uncertainty interval?
You didn’t answer the question about the mean. All you basically said in the word salad was: the mean is the mean! So I’ll ask again – “What does their mean represent?”
“It’s an estimate of the true value of the mean”
More word salad basically saying: the true value of the mean is the mean! But even that can’t be true if you say there is an SEM or standard deviation around the mean!
“ If you want an exact average”
I want to know what use can be made of that *exact average” when you have multiple different things making up your data set.
And you just keep avoiding answering by saying the mean is the mean. So what?
“The error has nothing to do with the length of the boards,”
That wasn’t my point. I’m assuming the uncertainties are independent (though they might not be, in which case you have another source of bias). My point is that if you only have one measurement you cannot know what the true value is and therefore don’t know what the error is.
“An uncertainty interval has no distribution since the probabilities for each value in the interval are not known.”
Just because you don’t know what it is doesn’t mean it doesn’t exist. All uncertainty intervals have to have some form of a probability distribution. How else could you have a “standard uncertainty” if there was no distribution. You need a distribution to have a standard deviation, you need a standard deviation to have a standard uncertainty.
“This is EXACTLY what we’ve been trying to teach you for two years. Welcome to the right side.”
If you didn’t spend all your time ignoring everything I say and arguing against your own strawmen, maybe it wouldn’t come as a surprise when you finally see something you agree with.
“The uncertainties and the measurements are two different things. How do you tell the distribution inside the uncertainty interval?”
That’s my point. You can’t tell what the uncertainties are just from two measurements.
“So I’ll ask again – “What does their mean represent?””
This style of argument is so tedious. All you keep doing is coming up with these pointless examples and then demanding I explain what the point of your example is. I don’t know what you want me to tell you about the mean of a sample of two planks of wood. You’re the one who wanted to know what the mean is, you say what the purpose was.
All you are saying is “I can’t figure out why I’d ever want to know what the average of two very different planks of wood is, therefore that proves that all averages must be useless.”
“More word salad”
Sorry. Is “It’s an estimate of the true value of the mean” too complicated for you? I’m not sure how to make it any simpler. You keep claiming that means are meaningless because they don’t have a true value, by which I assume you think there has to be one specific example that is the same as the mean. And I’m saying that is wrong. The mean is the mean; the true value of the mean is its true value.
“But even that can’t be true if you say there is an SEM or standard deviation around the mean!”
[Takes off glasses. Pinches nose]. The standard error is around the sample mean. It is telling you that the sample mean may not be the same as the population mean (i.e. the true mean). It is telling you how much of a difference there is likely to be between your sample and the population.
(And yes, before we go down another rabbit hole – that’s dependent on a lot of caveats about biases, systematic errors, calculating mistakes, badly defined populations, and numerous other real world issues.)
“I want to know what use can be made of that *exact average” when you have multiple different things making up your data set.”
Then take a course in statistics or read any of the numerous posts here which use the global averages to make claims about pauses or whatnot.
You don’t understand any of this, and I’m not going to waste any more time explaining what you refuse to learn.
“All the measurements are coming from a probability distribution that may or may not be Gaussian.”
A probability distribution provides an expectation for the next possible value.
How do you get an expectation for the next value of a different thing? Tmin is *NOT* a predictor of Tmax so how do you get an expectation for Tmax from a probability distribution?
“the measurements are not what forms the distribution.”
Jeesh! Did one of those new “chatbots” write this for you?
“Even if you only take one or two measurements, they are still coming from the distribution, but you won;t be able to tell what it is just by looking at them”
Again, what distribution does the boards used to build a house come from?
“Once again, this has nothing to do with the distribution being Gaussian, or even symmetric.”
Gaussian can be assumed to provide cancellation. You must *PROVE* that a skewed or multi-modal distribution provides for cancellation. How do you do that?
“the assumption is you already know the uncertainties, and can estimate the uncertainties of the mean from them.”
Are you finally coming around to understanding that the standard deviation of the mean is *NOT* the accuracy of the mean? That the uncertainty of the mean depends on the propagation of the individual member uncertainties?
“Tmin is *NOT* a predictor of Tmax so how do you get an expectation for Tmax from a probability distribution?”
You’re really obsessed with this max/min business, aren’t you? As always, it depends on what you are trying to do. If you want to know the distribution around TMax, don’t use the distribution around TMin.
“Again, what distribution do the boards used to build a house come from?”
I’ve no idea because, as usual, you are getting lost in the woods. What boards? What house? What distribution? If you have a big room full of assorted boards and you pull one out at random, it will come from the distribution of all boards in that room. If you find your boards in a ditch, they will come from the distribution of all boards that are dumped in ditches.
“Gaussian can be assumed to provide cancellation.”
As can any random distribution with a mean of zero.
“You must *PROVE* that a skewed or multi-nodal distribution provides for cancellation.”
It’s already been done for me.
“Are you finally coming around to understanding that the standard deviation of the mean is *NOT* the accuracy of the mean?”
Do you mean, is the thing I’ve been telling you I don’t believe in still the thing I don’t believe in? Yes.
The standard error of the mean, or whatever you want to call it, is not necessarily the accuracy of the mean. That’s because there can be biases or systematic errors that will affect the mean. In metrology terms, the standard error of the mean is akin to the precision of the mean, but not its trueness.
But I’m not sure what this has got to do with the comment you were replying to, which was about whether you are deriving the measurement uncertainties from an experimental distribution, or assuming you already know them.
“That the uncertainty of the mean depends on the propagation of the individual member uncertainties?”
No I don’t agree with that. The measurement uncertainty is only a part, and usually very small part, of the uncertainty of the mean.
Bellman,
To use statistics, you need to be able to show that your numbers are samples from a population. A population is best considered to arise when extraneous variables are absent or minimal.
When discussing thermometers at weather stations, two different stations will produce two different statistical populations because of different types and intensities of extraneous variables.
Just as two people have different variables – one should not take the body temperature of person A by inserting a thermometer in person B.
Geoff S
100% correct.
And right on cue, bellcurveman (Stokes disciple) pops in with his usual and tedious nonsense.
And right on cue the troll pops up with an insult that has nothing to do with the discussion.
Note that bellcurveman is unable to refute anything Kip wrote in a coherent fashion.
Of course he can’t. Neither can most on here because all they know is what they learned in Statistics 101 where *NONE* of the examples have anything but stated values with the uncertainties of the values totally ignored.
It’s amazing to me that none of them have ever measured crankshaft journals to size the bearings that need to be ordered. None of them have ever apparently designed trusses for a hip roof and ordered the proper lumber so that wastage is minimized.
All they know is random distributions of stated values and that the average of those stated values somehow helps in figuring out the bearing size and stud length.
I’m not sure how much more coherent my points can be. Rather than making snide remarks you could actually ask for more clarity.
I’ve only made three points relating to this article. One was agreeing with Kip that you have to divide the total uncertainty by the number of measurements when taking an average. The two points I disagreed with: one was the claim that a mean has no uncertainty, which I disputed for the case where the mean of a sample is being used to estimate a population. The other was to point out that you can add uncertainties in quadrature when they are random and independent. I provided a source for that which I think is taken from your favorite authority, Taylor.
So I’ll ask again, which points do you want me to be more coherent about?
And take another spin on the bellcurveman hamster wheel? I’ll pass.
“ One was agreeing with Kip that you have to divide the total uncertainty by the number of measurements when taking an average. “
That is the AVERAGE UNCERTAINTY. It is *NOT* the UNCERTAINTY OF THE AVERAGE!
Like the two boards of 2′ +/- 1″ and 10′ +/- 0.5″. The average uncertainty is 0.75″. Now nail those two boards together. What is the uncertainty interval you can see? It is *NOT* +/- 0.75″!
“The two points I disagreed with were claiming that a mean has no uncertainty. I disagreed when the mean is of a sample being used to estimate a population.”
If the measurements themselves have uncertainty then even if you have the entire population that you can use to calculate the average of the population (i.e. zero standard deviation) that average will still have uncertainty propagated from the individual measurements. That’s why the SEM only tells you how close you are to the population mean but it does *NOT* tell you anything about the accuracy of that average.
“And the other was to point out that you can add uncertainties in quadrature when they are random and independent. I provided a source for that which I think is taken from your favorite authority, Taylor.”
Which, as usual, you cherry picked with no understanding of what you were posting.
Not all uncertainties can or should be added in quadrature. That carries with it the assumption that some of the random error in the uncertainties can cancel. That is *NOT* always true.
Again, with the two board example there is not very much likelihood that the uncertainties of the two boards have any cancellation at all. Direct addition of the two uncertainties is probably a better estimate of what you will find in the real world.
“Like the two boards of 2′ +/- 1″ and 10′ +/- 0.5″. The average uncertainty is 0.75″. Now nail those two boards together. What is the uncertainty interval you can see? It is *NOT* +/- 0.75″!”
How many more times before you get it? The uncertainty of two planks nailed together is the uncertainty of the sum of the two boards, not the average. If you know the average length and the uncertainty of the average, you can multiply them by 2 to get the total length of the two boards and the uncertainty of that length. If the uncertainty of the average length is ±0.75″, the uncertainty of the two nailed together will be ±1.5″.
“That’s why the SEM only tells you how close you are to the population mean but it does *NOT* tell you anything about the accuracy of that average. ”
I’m not the one saying the mean has no uncertainty. But if the SEM tells you how close to the mean you are, that is what is meant by uncertainty. And yes, there can always be systematic biases either caused by the measurements or the sampling that can change how true and hence accurate the result is, but that is true about any measuring method.
“Which, as usual, you cherry picked with no understanding of what you were posting. ”
What part do you think I’m cherry picking? I just supplied a link to the entire document. It starts off giving the simple approach that Kip uses, and then goes on to say that if you can assume random measurement uncertainties, you can use adding in quadrature. What context do you think I am ignoring?
“Not all uncertainties can or should be added in quadrature.”
Which is why I keep adding the qualification of random and independent.
“That carries with it the assumption that some of the random error in the uncertainties can cancel. That is *NOT* always true. ”
Under what circumstances would you assume that random errors will not cancel? The only requirement is that the mean of any uncertainty is zero, otherwise you have a systematic error.
“Again, with the two board example there is not very much likelihood that the uncertainties of the two boards have any cancellation at all.”
The argument that random uncertainties will cancel is probabilistic. It isn’t saying this will always happen, just that the average will reduce. With just two boards it’s possible that both will have the same maximum error in the same direction, but it’s less likely.
“Direct addition of the two uncertainties is probably a better estimate of what you will find in the real world. ”
It isn’t if you assume the uncertainties are random. Suppose your thought experiment says that an uncertainty of ±1cm means that every measurement is either 1cm too long or 1 cm too short, with a 50% chance of either. There is only a 25% chance that both will be +1, and a 25% chance that they will both be -1. But there’s a 50% chance that one is +1 and the other – 1, with an aggregate error of 0cm. With a different error distribution it’s even less likely that you would see both errors being the maximum possible. And of course, this becomes less likely the more samples you take.
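For readers who want to check that arithmetic, the four equally likely cases in the thought experiment can simply be enumerated. A minimal Python sketch, assuming (as in the example above) that each error is exactly +1 cm or -1 cm with equal probability:

# Enumerate the four equally likely outcomes for two measurements whose
# errors are each either +1 cm or -1 cm (the thought experiment above).
from itertools import product

cases = list(product([+1, -1], repeat=2))   # (+1,+1), (+1,-1), (-1,+1), (-1,-1)
for e1, e2 in cases:
    print(f"errors {e1:+d}, {e2:+d}  ->  total error {e1 + e2:+d} cm")

# Probability the combined error reaches an extreme (+2 or -2 cm):
worst = sum(1 for e1, e2 in cases if abs(e1 + e2) == 2) / len(cases)
print("chance both errors hit the same extreme:", worst)   # 0.5, i.e. 25% + 25%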
Nonsense! You STILL don’t even know what uncertainty is! Not even the vaguest idea!
Despite the inference one might draw from the handle he has chosen, he has acknowledged that he has no particular expertise in statistics.
I believe you are correct here, yet he seems to always circle around and lecture people as if he is the expert in statistics as well as uncertainty.
I correct people when I think they are wrong. I try to back this up with evidence, but am always prepared to accept when I’m shown to be wrong.
But when people claim that the uncertainty of an average increases with sample size, use a clearly incorrect interpretation of the maths as evidence, and then refuse to even consider the possibility they might be wrong, there’s little I can do but keep explaining why they are wrong. This doesn’t require any expertise, just the ability to read an equation.
You still cannot comprehend that error is not uncertainty, regardless of what you write about “the maths”.
And you still ignore that a single temperature measurement cannot be “cured” with averaging.
And you ignore that a different analysis gives a different answer to your sacred averaging formula that you plug into the maths that you don’t understand.
“You still cannot comprehend that error is not uncertainty, regardless of what you write about “the maths”.”
Uncertainty isn’t error? Why didn’t you mention this earlier? It changes everything. Rather than using the equations from sources like Taylor, which is all about error propagation, I’ll instead use the ones from the GUM, which only talks about uncertainty. Oh wait, they’re identical.
“And you still ignore that a single temperature measurement cannot be “cured” with averaging.”
No idea how you would average a single temperature measurement.
“And you ignore that a different analysis gives a different answer to your sacred averaging formula that you plug into the maths that you don’t understand.”
What different analysis? I’ve gone through this using the standard rules for error/uncertainty propagation, I’ve used the general equation using partial differential equations, I’ve used the rules for combining random variables, and I’ve made many common sense logical arguments for why what is being claimed cannot happen. You cannot get a bigger uncertainty in the average than the biggest uncertainty of a single element. In the worst case you assume all uncertainties are systematic, and the uncertainty of the mean is just the uncertainty of the individual element, or you assume some randomness in the uncertainties, in which case the uncertainty of the mean is reduced.
Duh! How many times have you been told this?
Innumerable.
Duh #2!
1) There is your beloved GUM eq. 10 (which you misapply to averaging).
2) Apply GUM 10 separately to the sum and N, then to the ratio.
3) Kip’s analysis, in which N cancels, and which you obviously have not read.
You pick the one that gets you the tiny numbers.
This is what passes for science in climastrology.
“Duh! How many times have you been told this?”
I think you need to check the battery in your sarcasm detector.
“1) There is your beloved GUM eq. 10 (which you misapply to averaging).”
This is the equation you spent ages insisting I had to follow in order to get the uncertainty of the average. I have no special feeling for it, but you can derive all the other rules from it. Applying it to an average, where the partial derivative with respect to each input is 1/N, gives u(average)^2 = (u(x1)/N)^2 + (u(x2)/N)^2 + … + (u(xn)/N)^2, or
u(average) = sqrt[u(x1)^2 + u(x2)^2 + … + u(xn)^2] / N = u(sum) / N
If you think it’s misapplying it to use for the uncertainty of an average then why bring it up?
“2) Apply GUM 10 separately to the sum and N, then to the ratio.”
Same thing.
For the sum all the partial derivatives are 1, so
u(sum)^2 = u(x1)^2 + u(x2)^2 + … + u(xn)^2
For N the uncertainty is zero
u(N)^2 = 0
For the ratio, mean = sum / N, the partial derivative of mean with respect to sum is 1/N and with respect to N is -sum/N^2, so
u(mean)^2 = (u(sum)/N)^2 + (sum * u(N) / N^2)^2 = (u(sum)/N)^2 + 0
so u(mean) = u(sum) / N
“3) Kip’s analysis, in which N cancels, and which you obviously have not read.”
Kip gets a different result for the uncertainty of the sum because he’s not adding in quadrature. But the principle he states is still the same.
So for an average he is saying
u(mean) = u(sum) / N
If you assume all uncertainties are equal to u(x), then the first two methods resolve to
u(mean) = u(x) / sqrt(N)
whilst Kip gets
u(mean) = u(x)
The difference being in the assumptions made about the uncertainties. Equation 10 is for independent random uncertainties, whilst Kip’s is for non-independent uncertainties. But the calculation of the average is the same (divide the uncertainty of the sum by N) and in neither case do you get an increase in uncertainty when taking an average.
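Those two limiting cases can be illustrated numerically. A minimal Python sketch, with made-up values for the sample size and per-measurement uncertainty; it assumes the errors are either fully independent or fully shared, which are exactly the two assumptions contrasted above:

# Monte Carlo comparison of the two limiting cases discussed above.
import numpy as np

rng = np.random.default_rng(0)
N, trials, u_x = 10, 100_000, 0.5      # assumed sample size, repetitions, per-measurement uncertainty

# Case 1: independent random errors (the assumption behind equation 10).
indep = rng.normal(0.0, u_x, size=(trials, N))
print(indep.mean(axis=1).std(), u_x / np.sqrt(N))   # both ~0.158

# Case 2: one shared error applied to every measurement (fully non-independent).
shared = rng.normal(0.0, u_x, size=(trials, 1)) * np.ones((1, N))
print(shared.mean(axis=1).std(), u_x)               # both ~0.5

In neither simulated case does the spread of the mean exceed the per-measurement uncertainty, which is the point being made above.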
And here is YASHW**, usual trendology nonphysical nonsense ignored…
**yet another spin of the hamster wheel
And to no ones surprise, Carlo ignores the answer and just responds with another troll-ish insult.
And I see that since this comment, Mr “How dare you call me a Troll” has posted 15 other one line insults directed at others.
I’m not sure what inference you want to draw from my pseudonym. It was meant to be a self-derogatory nod to the Bellman from the Hunting of the Snark. Maybe I should add “the” to it as some seem to think it’s a surname.
Bellman ==> Alternately, you could just simply use your real name?
Sorry Kip, some of us on both sides of the debate have good reason to choose pseudonyms. When I retire I will switch to using my real name but in the meantime like many others I have good reason related to my employer why I do not.
I could, and I’ve considered it. But I don’t see what purpose it would serve, beyond self promotion. Why should my real name have any bearing on the arguments. If I was claiming any sort of expertise, fair enough, I would need to provide my name in order to justify it – but I really don’t and my name isn’t going to have any meaning to anyone.
And then there’s the issue that I tend to provoke very strong reactions here. Many really seem to hate me, and at present I don’t mind that because they only hate the Bellman, not the real me.
I’ll try again—stop posting nonphysical nonsense.
Oh the irony. A pseudonymous troll barges in to a discussion about whether you should use your real name.
You mistake “hate” for pushback against the nonsense you post.
Egotist. I wasn’t talking about you when I said “hate”. I think you’re just desperate for attention.
But in either case, if you want to “pushback” you need to engage in the argument and not simply post 50 variations of “you’re an idiot”.
Projection time again?
Bellman ==> It is just my personal ethics view. If one is going to make claims, state opinions, and argue vigorously with others, one ought to do it under one’s real name.
More skin in the game. No hiding behind an internet handle, right up front. It is just more honest.
This is me and I say this!
I have been writing here for a decade or so, every essay and every comment in my own name.
Each to his own — and hiding behind internet handles is certainly common and generally accepted — but not by me.
Those of us who are confident in our knowledge and unashamed of our opinions use our real names.
(There are cases for some, say still in professional employment whose jobs would be threatened by publicly exposing their contrarian views here, where a ‘net handle may be justified.)
I learned a long time ago that I should not use my full name after my email was completely blocked by trolls. For a week my students and colleagues were unable to communicate with me. I’m sure others have similar reasons.
bdgwx lets me take the easy ones. He’s provided it more than once. And I have provided my Oklahoma Professional Petroleum Engineering Registration number, which would take 45 seconds to trace back to my name.
Noms de WUWT, and just about everywhere else, are common. Abraham Lincoln used one.
I think Kip is the Bellman here:
He had bought a large map representing the sea,
Without the least vestige of land:
And the crew were much pleased when they found it to be
A map they could all understand.
“What’s the good of Mercator’s North Poles and Equators,
Tropics, Zones, and Meridian Lines?”
So the Bellman would cry: and the crew would reply
“They are merely conventional signs!
“Other maps are such shapes, with their islands and capes!
But we’ve got our brave Captain to thank:
(So the crew would protest) “that he’s bought us the best —
A perfect and absolute blank!”
For the record, I don’t consider myself an expert in statistics or uncertainty either. That is why I cite so much literature.
Yet you beat people (“contrarians”) over the head with your nonsense if they don’t hew to your propaganda.
That isn’t a defense to dismiss bnice2000.
You probably also are unaware that the purpose for calculating a mean, and whether the data are stationary or non-stationary determine the appropriate treatment.
If you are only interested in an approximate value of a bounded time-series, then a simple arithmetic mean or even a mid-range value may suffice. In any event, high precision is hardly warranted for that purpose.
A bigger problem is that non-stationary data (a time-series where the mean and variance change with time) does not have a normal distribution. It will be strongly skewed if there is a trend. The requirement of the same thing (e.g. the diameter of a ball bearing) being measured multiple times with the same instrument is not met. Otherwise, we might as well be averaging the diameters of blackberries and watermelons. One can calculate an average, but of what utility is it? All one can say is that the average diameter of those two varieties of fruit is X.
The rationale behind taking multiple readings of something with the same instrument is that most of the random errors tend to cancel, improving the precision and hence the estimate of the mean. One only has a high probability of that happening with a fixed value (stationarity) and the same instrument, thereby creating a symmetric probability distribution of all the sample measurements.
CS said: “That isn’t a defense to dismiss bnice2000.”
This “measuring the same thing with the same instrument” rule for propagating uncertainty doesn’t actually exist. You can prove this out for yourself by reading JCGM 100:2008, JCGM 6:2008, NIST 1297, and NIST 1900 and seeing that many (maybe even most) of the examples actually have different things measured with different instruments.
How do you know that Clyde has not done so?
Are you psychic?
bdgwx:
“This “measuring the same thing with the same instrument” rule for propagating uncertainty doesn’t actually exist. “
Malarky. Total and utter BS. Go look at annex H in the GUM. Specifically H.6.3.2 where you have two different machines making measurements. The total uncertainty is the sum of the SEM for each machine. If you only have one measurement per machine then the SEM is actually the uncertainty of that single measurement. Since you only have one measurement the uncertainty of each measurement gets divided by 1.
Thus for a multiplicity of measurements of different machines measuring different things you wind up with a root-sum-square of all the uncertainties.
Just because it isn’t explicitly mentioned in the documents you list doesn’t mean it can be ignored. If calibration of the instruments reveals an uncertainty or bias that is non-negligible, then it shouldn’t be ignored. Using the same instrument eliminates the possibility of variations in the uncertainty and bias. I doubt that any of the data processing uses a rigorous propagation of error where every instrument has its unique precision and bias taken into account.
CS said: “Just because it isn’t explicitly mentioned in the documents you list doesn’t mean it can be ignored.”
Yes it does. And just to be clear, I’m talking about the mythical “measuring the same thing with the same instrument” rule for JCGM 100:2008 (and others) as implemented by the NIST uncertainty machine, which isn’t a thing. If someone makes up a fake rule you can and should ignore it.
CS said: “If calibration of the instruments reveals an uncertainty or bias that is non-negligible, then it shouldn’t be ignored.”
Nobody is ignoring uncertainty or bias here. The discussion isn’t whether the uncertainty or bias should be ignored. The discussion is about ignoring fake rules for its propagation that don’t actually exist.
CS said: “Using the same instrument eliminates the possibility of variations in the uncertainty and bias.”
You can’t always use the same instrument such as the case of section 7 in the NIST uncertainty machine manual in which the measurement model is A = (L1 − L0 ) / (L0 * (T1 − T0)). You might be able to measure T1/T0 or L1/L0 with the same instrument, but you can’t measure all 4 with the same instrument.
CS said: ” I doubt that any of the data processing uses a rigorous propagation of error where every instrument has its unique precision and bias taken into account.”
The law of propagation of uncertainty documented in the JCGM 100:2008 (and others) does just that. The NIST uncertainty machine will perform both the deterministic and monte carlo methods for you as a convenience. You are certainly free to do the calculations by hand if you wish though.
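For illustration, a propagation like the one described can also be run as a plain Monte Carlo. Here is a minimal Python sketch for the measurement model quoted above, A = (L1 − L0) / (L0 * (T1 − T0)); the input values and uncertainties are made-up numbers for illustration, not the values from the NIST manual:

# Monte Carlo propagation for the measurement model quoted above.
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
L0 = rng.normal(1.4999, 0.0001, n)    # length at T0 (m), assumed value and uncertainty
L1 = rng.normal(1.5021, 0.0001, n)    # length at T1 (m)
T0 = rng.normal(288.15, 0.02, n)      # temperature T0 (K)
T1 = rng.normal(373.10, 0.05, n)      # temperature T1 (K)

A = (L1 - L0) / (L0 * (T1 - T0))      # four different measured inputs
print("A =", A.mean(), " u(A) =", A.std())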
Someone hit the Big Red Switch on the bgwxyz bot, it is stuck in another infinite loop.
“The discussion is about ignoring fake rules for its propagation that don’t actually exist.”
There are no fake rules. There is only a fake assumption that all measurement error cancels in every case and that the standard deviation of the stated values is the uncertainty.
The *ONLY* way to get uncertainties in the hundredths digit from measuring different things is to assume the uncertainties of the individual measurements always cancel. It is truly just that simple – and incorrect.
That cuts both ways! That also means if something is left out, and a case can be made that it shouldn’t have been, then you are obligated to do the right thing if you are intellectually honest.
You are full of crap. Do you need more references than I gave you above?
Several relevant definitions from JCGM 100:2008. I can give you several more from Dr. Taylor book if you like.
“successive measurements of the same measurand” from B.2.15 and
“closeness of the agreement between the results of measurements of the same measurand” from B.2.16 and
“infinite number of measurements of the same measurand carried out under repeatability conditions” from B.2.21.
You obviously have no training or experience dealing with these issues in sufficient detail to be making wild assertions like not needing multiple measurements of the same thing with the same device.
Yep. And I agree with JCGM 100:2008.
Repeatability is for the same measurand.
Reproducibility is for the same measurand.
Random error is for the same measurand.
But, just because those terms apply to the same measurand does not invalidate anything in section 4 or 5 including the measurement model Y which has “input quantities X1, X2, …, XN upon which the output quantity Y depends may themselves be viewed as measurands and may themselves depend on other quantities, including corrections and correction factors for systematic effects, thereby leading to a complicated functional relationship f that may never be written down explicitly”
It is unequivocal. The measurement model Y may depend on other measurands. It says so in no uncertain terms.
Clyde,
The example H.6 from JCGM 100:2008 that Tim mentions below proves my point. The measurement model is Y = f(d, Δc, Δb, Δs) where d, Δc, Δb, and Δs are all different things. Furthermore Δc = y’s – y’ where y’s is from one instrument and y’ is from another. H.6 is an example where the uncertainty was propagated from different things measured by different instruments.
Furthermore, the manual for the NIST uncertainty machine makes no mention of the mythic “measuring the same thing with the same instrument” rule. In fact, every single example (bar none) is of using the tool with different things measured by different instruments.
This “measuring the same thing with the same instrument” rule is completely bogus. It is absolutely defensible to dismiss bnice2000, Tim Gorman, Jim Gorman, and karlomonte on this point.
The expert speaks, y’all better listen up!
“3-3 Repeated Measurements. It is possible to increase the accuracy of a measurement by making repeated measurements of the same quantity and taking the average of the results. The method of repeated measurements is used in all cases when precision of the instrument is lower than the prescribed accuracy. … In those instances when the readings cannot be accumulated in repeated measurements, the prerequisite condition for improving the accuracy is that measurements must be of such an order of precision that there will be some variations in the recorded values. … In this connection it must be pointed out that the increased accuracy of the mean value of the repeated single measurements is possible only if the discrepancies in measurements are entirely due to so-called accidental errors … In other words, at a low order of precision no increase in accuracy will result from repeated measurements.”
[Smirnoff, Michael V., (1961), Measurements for engineering and other surveys, Prentice Hall, p. 29]
It should be obvious that the description above applies to a single instrument (particularly accumulations), not a conflation of measurements from many instruments because the uncertainty will grow and the precision will decrease when the uncertainty from many instruments is propagated rigorously.
Yes. It does apply to a single measurand and presumably from the same instrument. No one (even contrarians) is challenging the fact that the uncertainty of an average of measurements of the same thing will improve (with limits) as the sample size increases. But notice what wasn’t said there. It never said that the uncertainty of the average of different measurands cannot also be improved (with limits) as the number of measurands increases. That is the salient point. We aren’t discussing repeated measurements of the same thing here, so the Smirnoff 1961 verbiage, which is consistent with all of the other sources I’ve cited in the comments here and which I happily accept, is irrelevant to the discussion at hand.
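For what it is worth, the part neither side disputes is easy to check numerically. A minimal Python sketch, assuming one fixed true value and purely random, zero-mean measurement error (the made-up numbers are for illustration only); it says nothing about the disputed case of different measurands:

# Repeated measurements of ONE fixed quantity with zero-mean random error:
# the spread of the sample mean shrinks roughly as 1/sqrt(N).
import numpy as np

rng = np.random.default_rng(2)
true_value, sigma, trials = 20.0, 0.5, 50_000   # assumed quantity and random error

for N in (1, 4, 16, 64):
    sample_means = rng.normal(true_value, sigma, size=(trials, N)).mean(axis=1)
    print(N, round(sample_means.std(), 3), round(sigma / np.sqrt(N), 3))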
Idiot.
“But notice what wasn’t said there. It never said that the uncertainty of the average of different measurands cannot also be improved (with limits) as the number of measurands increases.”
In the case of multiple measurements of the same thing you are identifying a TRUE VALUE.
Multiple measurements of different things do *NOT* identify a TRUE VALUE.
One average is useful and the other is not.
The average of 1 unit, 5 units, 7 units and 9 units is 5.5 units. That is *NOT* a TRUE VALUE of anything. If you are measuring different things you will *never* find a true value, only an average. So of what use is an “improved” average in the real world?
Do you build a roof truss using the average length of all the boards the lumber yard delivered, ranging from 3′ to 20′? Do you just assume that all the boards you take from the pile are of average length? You might be able to figure out the board feet you are charged for by the lumberyard, assuming all of the boards are of the same width and height, but how many of the boards will meet that restriction?
They will never acknowledge the truth here because it threatens their entire worldview.
It is those “limits” that are the gotcha that you and others conveniently ignore.
To paraphrase Einstein: “There are only two things that are infinite, the universe and the ability of humans to rationalize. And, I’m not sure about the universe.”
CS said: “It is those “limits” that are the gotcha that you and others conveniently ignore.”
Then you are ignoring the posts from Bellman, Nick, bigoilbob, me, etc., because we talk about those limits all of the time.
You ignore the implications of H.6! I’ve shown you how the equation applies when you have single measurements of multiple objects.
In the example, you have two machines making 5 readings each of the same thing. Thus you can get a standard deviation for each machine. And you know the number of measurements made.
When you have single measurements of different things then you have no standard deviation of stated values and must use the uncertainty interval as the standard deviation. The number of measurements equals 1. So each term in the equation becomes nothing more than u(i)^2 and the equation defaults to the typical root-sum-square of the individual uncertainties.
“Furthermore, the manual for the NIST uncertainty machine makes no mention of the mythic “measuring the same thing with the same instrument” rule.”
It’s not mythic. As I point out GUM H.6 shows how to handle measurement uncertainty when you have multiple measurements of different things.
If you have only ONE measurement then each term in Eq H.36 ( s_av^2(x_i)/ N) becomes nothing more than
u^2(x_i)/ 1. Which means you get a root-sum-square addition of the uncertainties associated with the individual measurements.
THERE IS NOTHING MYTHIC ABOUT THAT.
All this demonstrates that you simply can’t think outside the box you are stuck in where you assume all measurement uncertainty cancels.
The NIST machine makes *NO* attempt to handle single measurements of different things. It just calculates the standard deviations of the stated values.
Break out of your box and join the rest of us in the real world!
You did not read this example very well did you? There are FIVE REPEATED INDENTATIONS IN THE SAME TRANSFER BLOCK. The hardness is derived from the measured depths of penetration. Additional uncertainty is derived from the difference between the sample machine and the national standard machine.
Do you even know the purpose of the transfer block? I’ll bet not if you’ve no experience in a machine shop.
Strictly speaking, 4 significant figures in the uncertainty are not justified. The rule for adding (or subtracting) two or more numbers is to retain no more significant figures than are in the least precise number(s), 7.0. That is, it should give 7.0 ± 0.1, which is larger than the implied precision of ± 0.05, for the number 7.0.
You really need to define what you are trying to calculate.
Root Mean Squared as a method for calculating tolerance stack is valid in practical terms. The idea is while a single part may be anywhere within the allowed tolerance, the odds of two parts both being at the extremes are low.
So RMS when checking tolerance stack for manufacturing/design is an acceptable method in an pragmatic environment, but still might stuff you up.
Tolerance and statistical analysis are different beasts.
We also have the discussion on the theory of trailing zeros.
If you subscribe to the practice of No Trailing Zeros then 5mm is exactly the same as 5.00000mm.
If you don’t then 5mm is 100,000 times less precise than 5.00000mm.
Personally I am a No Trailing Zeros type of guy. If you need to define a dimension then you apply tolerance directly to it.
But yes, we are seriously starting to mix our disciplines here me thinks.
If I remember right, if the first decimal digit of the uncertainty is a “1” then you state the second digit, so ±0.14.
You are making a mistake immediately Nick, you are assuming the errors have a Gaussian distribution! Now why do you think that is true for temperature measurements made with many instruments in many various places? You are simply saying that is a starting assumption, but it cannot be as there are several uncorrelated sources of the errors!
No, in fact the additivity of variances is not dependent on Gaussian distribution. Nor of course is the existence of a standard deviation.
WRONG !
Nope, Nick Stokes is actually correct.
The additivity of variances is *NOT* the standard deviation of the mean.
The standard deviation of the mean is *only* useful if you have a Gaussian distribution of error around a mean. In this case the average becomes the true value and the standard deviation of the mean applies.
When you have measurements of different things you do *NOT* have a true value, only a calculated value, and the uncertainties (variances) add by root-sum-square.
Why, in climate science, everyone seems to assume that mid-range values represent some kind of true value is beyond me, let alone the idea that the individual measurements form some kind of a Gaussian distribution where measurement error cancels. Daily temperatures follow a sine wave growth and an exponential decay, neither of which are Gaussian nor is the mid-range value an actual average.
“Daily temperatures follow a sine wave growth and an exponential decay, neither of which are Gaussian nor is the mid-range value an actual average.”
Yup. A key point.
That is not what is actually measured historically. Only min and max daily temps are generally recorded. So your answer is irrelevant.
davezawadi said: “Nick, you are assuming the errors have a Gaussian distribution!”
No, he isn’t. The errors do not have to be gaussian. You can prove this out for yourself using the NIST uncertainty machine which allows you to enter any distribution you want.
bdgwx ==> Oh boy, we will see about that! I will be writing about when and if the CLT et al are universally applicable! Stay Tuned!
There’s no need to wait. Using the NIST uncertainty machine select two input quantities (x0 and x1) and assign them rectangular (or any non-gaussian) distribution. Then for the output enter (x0 + x1) / 2 and see what you get.
The NIST machine is ONLY applicable for multiple measurements of the same thing. Uncertainty of a single measurement simply can’t be assigned a distribution since you don’t have multiple values with which to define a probability distribution.
Explain how the errors cancel without a residual if the distribution isn’t at least symmetrical.
Not only do the distributions not have to be gaussian, but they don’t even have to be symmetrical. The only requirement is that they be random. The reason is the CLT.
You can prove this out using the NIST uncertainty machine. Select 3 input quantities each being a gamma distribution (non-symmetrical). Then enter (x0+x1+x2) / 3 as the output quantity. Notice that despite the input quantities having non-symmetrical distribution the output quantity forms into a normal distribution with a standard deviation scaled per 1/sqrt(N).
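For readers without access to the NIST tool, the same exercise can be run as a plain Monte Carlo. A minimal Python sketch with assumed gamma-distributed inputs (shape 2, scale 1 are made-up parameters); it illustrates the averaging behavior only, it is not the NIST uncertainty machine itself:

# Averaging three independent, skewed (gamma-distributed) inputs.
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
skew = lambda a: ((a - a.mean()) ** 3).mean() / a.std() ** 3

x0, x1, x2 = (rng.gamma(shape=2.0, scale=1.0, size=n) for _ in range(3))
avg = (x0 + x1 + x2) / 3

print("input sd :", round(x0.std(), 3))        # ~1.414
print("avg sd   :", round(avg.std(), 3))       # ~1.414 / sqrt(3) ~ 0.816
print("skewness :", round(skew(x0), 2), "->", round(skew(avg), 2))   # ~1.41 -> ~0.82

The spread of the average scales roughly as 1/sqrt(N) and the skewness shrinks, even though each input is non-symmetrical.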
And what happens if they aren’t random? That is the “component of uncertainty arising from a systematic effect” that NIST TN 1297 talks about, which can be removed by exploiting the additive identity of algebra when doing anomaly analysis.
Here a little homework problem for you:
1) stuff X+Y into your NIST machine
2) write down the result as A
3) stuff 2 into machine
4) write down result as B
5) stuff A / B into NIST machine
6) write result as C
Come back and tell everyone C.
Are you willing to expect an answer?
He did try, sort of (below), with a million Monte Carlo steps. He got back “u(y) = 0.707” which is supposed to be one of the inputs!
You didn’t specify the distribution of x or y so I made them gaussian with a mean of 0 and std of 1.
Here is the configuration.
And here is the result.
===== RESULTS ==============================
Monte Carlo Method
Summary statistics for sample of size 1000000
ave = 3e-04   sd = 0.708   median = -3e-04   mad = 0.71
Coverage intervals
99%  ( -1.8, 1.8 )    k = 2.5
95%  ( -1.4, 1.4 )    k = 2
90%  ( -1.2, 1.2 )    k = 1.7
68%  ( -0.71, 0.71 )  k = 1
ANOVA (% Contributions)
           w/out Residual   w/ Residual
x          100              50.03
Residual   NA               49.97
--------------------------------------------
Gauss’s Formula (GUM’s Linear Approximation)
y = 0   u(y) = 0.707
     SensitivityCoeffs   Percent.u2
x    0.5                 50
y    0.5                 50
Correlations   NA   0
============================================
Distribution!?!??? From where did you pull this nonsense?
Duh #3
X has combined uncertainty u(X).
Y has combined uncertainty u(Y).
The CLT only applies when determining how close the sample means are to the population mean. It tells you nothing about the accuracy of that mean.
If there is *any* systematic bias in the measurements the population mean can be very inaccurate.
How does your meme that all individual uncertainties cancel allow for that?
Are these other functions you mention representative of natural phenomena such as temperatures? If not, then it is mathematically interesting, but not germane to the discussion.
Can you explain how it is that if random events are expected to cancel each other that it will happen if there are more measurements above the mean than below?
It is true for all distribution sources whether natural or otherwise.
If the errors do not average to 0 then there is a systematic effect or bias. That bias will drop out when working with anomalies via the additive identity of algebra.
Bullshit, total BS.
It’s right back to assuming that all measurement uncertainty cancels in all situations!
Its all they have to hang their hats on.
Really?
You get 10 crankshaft journals from a jobber to use in your engine. They are from different manufacturers with different lot numbers. I.e. different things! The boxes are all marked 2.46″ (i.e. .01 over standard).
Why would you assume the distribution of measurement errors would average to zero?
Why would you assume a systematic bias in their measurement?
Why would you assume that the average value of the journals (2.46″) would be 100% accurate?
What if one of the journal bearings was actually 2.47″? Is that a systematic measurement error or did the wrong bearing get put in the box?
“That will bias will drop out when working with anomalies via the additive identity of algebra.”
Huh? How does uncertainty drop out for anomalies? Uncertainty ADDS whether you are adding stated values or subtracting them. They either add directly or by root-sum-square.
The absolute average will have propagated uncertainty and the absolute temperature will have uncertainty. When you subtract the two stated values to form an anomaly their uncertainties ADD, they don’t subtract.
As usual you are just assuming that all measurement uncertainty cancels and the stated values are 100% accurate.
You would *never* be able to make a living as a machinist.
And, because historical measurements are likely to have greater uncertainty than recent measurements, the act of subtraction may increase the absolute uncertainty. Because the anomaly is smaller than the original measurements, the relative uncertainty will be greater for the anomaly than the original.
Thanks, Clyde. I hadn’t considered the relative uncertainty impacts.
You have it backwards, the anomaly is the red herring. How can we even have a measurable anomaly if we don’t have a base point to work from? And if this anomaly is within the area of uncertainty, then is it even an anomaly? Which I believe is ultimately the purpose of the essay: that we really can’t say how much the world has warmed in the last 150 years.
But you got lost in the technical words and definitions of the post again. And you were in such a hurry to point out mistakes or misinterpretations that you had to correct yourself multiple times.
You hit the nail on the head. The assumption in the climate science world is that the long term average is 100% accurate with no uncertainty and that current temps are 100% accurate with no uncertainty so any anomaly is 100% accurate.
The accepted uncertainty of most measuring devices, even today, is between +/- 0.5C and +/- 0.3C. No amount of averaging can reduce that uncertainty. It simply doesn’t allow resolution down to the hundredths digit. In fact, since the temperatures in the database are separate, individual, single measurements of different things, their uncertainty should add, either directly or in root-sum-square. Just like Nick quoted in GUM 2.3.4.
Wrong. In nearly all cases of formal uncertainty analysis the actual distribution is unknown.
The expert on absolutely everything hath spake, so mote it be.
Nick ==> Of course “It actually doesn’t mean that in statistical usage.”. It is clear from the start that this essay is about the correct method of adding, subtracting, multiplying and dividing values with stated “absolute uncertainty” (which is also clearly defined).
In other words, we are doing arithmetic. Which means we are not doing statistics, thus the statistical meanings of things are not germane and do not apply.
I’m sure you learned arithmetic — and I give the arithmetical rules for adding, subtracting, multiplying and dividing values that have a given absolute uncertainty.
You aren’t saying that the arithmetic is incorrect are you?
Kip,
“values with stated “absolute uncertainty” (which is also clearly defined)”
You’ve misunderstood the definition. It isn’t talking about a different kind of uncertainty. It’s really just about the units. Here is the definition they give of the underlying uncertainty, which is standard:
“Uncertainty. Synonym: error. A measure of the inherent variability of repeated measurements of a quantity. A prediction of the probable variability of a result, based on the inherent uncertainties in the data, found from a mathematical calculation of how the data uncertainties would, in combination, lead to uncertainty in the result. This calculation or process by which one predicts the size of the uncertainty in results from the uncertainties in data and procedure is called error analysis.
See: absolute uncertainty and relative uncertainty. Uncertainties are always present; the experimenter’s job is to keep them as small as required for a useful result. We recognize two kinds of uncertainties: indeterminate and determinate. Indeterminate uncertainties are those whose size and sign are unknown, and are sometimes (misleadingly) called random. Determinate uncertainties are those of definite sign, often referring to uncertainties due to instrument miscalibration, bias in reading scales, or some unknown influence on the measurement.”
Nick ==> Your source has the full: (when reading this, think of the examples I give, temperature recorded to the nearest full degree)
“Absolute uncertainty. The uncertainty in a measured quantity is due to inherent variations in the measurement process itself. The uncertainty in a result is due to the combined and accumulated effects of these measurement uncertainties which were used in the calculation of that result. When these uncertainties are expressed in the same units as the quantity itself they are called absolute uncertainties. Uncertainty values are usually attached to the quoted value of an experimental measurement or result, one common format being: (quantity) ± (absolute uncertainty in that quantity).”
Kip, I think you have misinterpreted that definition. The difference between absolute and relative uncertainty is that absolute retains the units whereas relative is unitless. The formula for converting between the two is R = A/X where R is the relative or fractional uncertainty, A is the absolute or measured uncertainty, and X is the quantity itself. Absolute and relative uncertainties are not different types of uncertainties. They are just different ways of expressing it. See Taylor section 2 for details.
I think the source of confusion arises from Gavin Schmidt’s statement. When used the word “absolute” he was talking about the actual temperature (~288 K) and not anomalies. Do not conflate the discussion of absolute temperature with absolute uncertainty.
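For clarity, the conversion being referred to is just R = A/X and back again. A trivial Python sketch with made-up numbers:

def to_relative(absolute_u, quantity):
    # relative (fractional) uncertainty R = A / X, unitless
    return absolute_u / quantity

def to_absolute(relative_u, quantity):
    # back to absolute uncertainty, in the units of the quantity
    return relative_u * quantity

print(to_relative(0.5, 25.0))    # 25 units +/- 0.5  ->  0.02 (2%)
print(to_absolute(0.02, 25.0))   # back to +/- 0.5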
bdgwx ==> I am not comparing absolute to relative. I am using the definition as given in the essay: a known uncertainty resulting from the measurement methodology or instruments. In the case of temperature, the example is when temperatures are recorded as whole degrees only (either originally or later by rounding). While it is possible to have an actual temperature at a whole degree, when we know that the recording was done by nearest degree, then we know only that the true value was between, say, 13.5 and 12.5 but recorded as 13. That is correctly written as 13° +/- 0.5°. The +/- 0.5° is the absolute measurement uncertainty.
One could, of course, convert that absolute measurement uncertainty (which is the true measurement uncertainty of our temperature record, when we see it recorded as whole degrees) to a relative uncertainty. When I say “different type of uncertainty” I mean that original measurement uncertainty that derives from the instruments and methods of measurement is an absolute uncertainty when we KNOW the uncertainty as a numerical value of the units being used. This is different than the estimates of random errors or variance.
The point I’m making is this.
The uncertainty in an absolute temperature has one set of contributing factors. The uncertainty in an anomaly temperature has another, albeit overlapping, set of contributing factors. Those sets are different and thus the two metrics have different uncertainties. It is important not to conflate the uncertainty of the absolute temperature with “absolute uncertainty”. Those are two different concepts.
I think the terminology you are actually wanting to use in the context of absolute temperature which anomaly temperature does not have is “component of uncertainty arising from a systematic effect” as defined in NIST TN 1297. (see caveat below).
We can model the uncertainty of absolute temperature as Ta = Tm + Urm + Us where Ta is the actual temperature, Tm is the measured temperature, Urm is the component of uncertainty arising from a random effect for the measurement, and Us is the component of uncertainty arising from a systematic effect. Then if we want to convert absolute temperature measurements (Tm) into anomaly measurements (Am) we do so using Am = Tm – Tb where Tb is the baseline temperature. But Tm and Tb have uncertainty so Aa = (Tm + Urm + Us) – (Tb + Urb + Us). Notice that the Us terms cancel here leaving us with Aa = (Tm – Tb) + (Urm – Urb). Note that because Us is caused by a systematic effect it is included in both Tm and Tb.
This is the power of anomalies and why GISTEMP anomaly temperatures have a much lower uncertainty than the absolute temperatures. The reason… the algebraic identity x – x = 0. The component of uncertainty arising from a systematic effect cancels out. That is what Gavin Schmidt is discussing.
caveat: We can actually further subdivide the component of uncertainty arising from a systematic effect into a time variant portion and a time invariant portion. It is only the time invariant portion that will cancel when doing anomalies.
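A minimal Python sketch of the algebra being claimed there, using made-up numbers; it assumes the systematic offset Us really is identical and time-invariant in both the measurement and the baseline, which is exactly the assumption disputed in the replies below:

# Toy model of the claimed cancellation: a constant station offset Us appears
# in both the monthly value and its baseline, so it drops out of the anomaly.
import numpy as np

rng = np.random.default_rng(4)
trials = 100_000
Us  = 0.8                                         # assumed constant systematic offset (degC)
Urm = rng.normal(0.0, 0.5, trials)                # random component of the monthly value
Urb = rng.normal(0.0, 0.5 / np.sqrt(30), trials)  # random component of a 30-value baseline

T_true, B_true = 15.0, 14.0                       # assumed monthly value and baseline
Tm = T_true + Us + Urm
Tb = B_true + Us + Urb
anomaly = Tm - Tb

print(round(anomaly.mean(), 3))   # ~1.0: the constant Us has dropped out
print(round(anomaly.std(), 3))    # ~0.51: the random components still combine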
Total word salad.
Note also that the uncertainty for historical data is commonly larger than current measurements. Thus, they do not completely cancel.
Oh, dear.
There are a lot of assumptions in that equation.
“Notice that the Us terms cancel here leaving us with Aa = (Tm – Tb) + (Urm – Urb).”
You keep making this same mistake over and over and over again.
The uncertainties only cancel if they are equal. The systematic bias uncertainties will only cancel totally if you are using the same instrument to make all measurements. The random uncertainties will only cancel if you are measuring the same thing for all measurements.
You simply can *NOT* just assume that the uncertainties will cancel unless both restrictions are met – multiple measurements of the same thing using the same measurement device.
Neither restriction is met for temperature measurements.
And the uncertainty of the absolute temps carry over to the anomalies.
If T_avg has the propagated uncertainties associated with all the individual, single temperature measurements and the single, individual daily temperatures are subtracted, then the total uncertainty of the anomaly will be at least the root-sum-square uncertainty of T_avg and Tm.
bdgwx ==> Well, I agree that it is the intended purpose of the use of anomalies to make the uncertainty about global temperature seem smaller than the reality.
That’s actually not the primary purpose of using anomalies. But it is a convenient consequence that is obviously exploited. I do take issue with the statement “seems smaller than reality”. On the contrary, it is the reality. The reality is that anomalies have lower uncertainty than the absolutes because a large portion of the systematic effect cancels out.
Hey! You forgot to yammer on and on about the uncertainty machine in this post.
HTH
bdgwx,
Sometimes when I am about to spend some money, I am uncertain if my account has enough credit money. So, I would be happy to reduce this uncertainty by use of a proven statistical formula. I lack confidence that useful formulae exist.
Many formulae, out of no more than laziness and ignorance, express uncertainty symmetrically as in +/-5 or whatever. If I have but $1 in my account, I might have +5/-0 unless I have loan arrangements and the money counting is inaccurate. Likewise, water temperature near the freezing phase change cannot be shown as 1 +/-2 degrees. Better is 1 +2/-1.
Look, I could go on for pages about the clash of reality and theory in measurement, but I did that on WUWT earlier this year.
I still miss a theoretical statistical method to minimise my bank account uncertainty.
Geoff S
Commonly expressed as a percentage.
CS said: “Commonly expressed as a percentage.”
Sure, though you could use the Taylor 2.21 approach and leave it as fractional without multiplying by 100. This has the advantage of being used in propagation formulas.
How many formal UAs have you actually performed?
Relative uncertainties are typically applicable when the uncertainties depend on the size of the measurand.
And they are totally useless for temperature unless working only in Kelvin.
Note that if you calculate the variance of a time-series with a trend, one will get a larger value than if the variance were calculated from the same time-series that has been de-trended. The implication of that is one can expect variance to grow with the number of samples, rather than decrease.
“Note that if you calculate the variance of a time-series with a trend, one will get a larger value than if the variance were calculated from the same time-series that has been de-trended”
Not true. The standard error – and hence the variance – of the zero expected value of detrended data is the same as for the original set. Since this is your claim, please supply the data set and its detrended equivalent that shows this. Maybe use wft.
I’ve done this, detrending the data myself. I did it again just now to be sure. I’m open to the rebut that my detrending method is faulty, but I don’t think so.
The main goal of trendology—make the numbers as small as possible, then make a lot of noise if anyone dares to point out they are nonphysical nonsense.
I was responding to a specifically false claim. I actually provided the rebut data in another post. As opposed to fact free whining – are your eyes burning?
Are you watching CNN this morning, blob?
And no, you didn’t “rebut”. You don’t comprehend the implications of a time-series measurement.
“The standard error – and hence the variance “
The term “standard error” is typically used to describe the uncertainty of the mean calculated from a set of samples. That is *NOT* the variance of the population or even of the samples themselves.
Around and around the hamster wheel spins, there is no getting off…how many times has this one been told?
Bigoilbob, I disagree with you and Stokes on almost everything here, but on statistics I agree with most of what has been said by both of you here.
Still a sceptic, but sticking to the science I know.
It’s not fair to stick Stokes and bdgwx with my posts. I have neither the breadth nor depths of their training and experience.
All petroleum engineers end up about half statistician. We take a couple of extra courses in our undergrad and grad lessons, and then use what we’ve learned to apply statistical software commercially. Workover costs, stochastic inputs for reservoir simulations (which, in spite of posters here claiming that it is impossible, use many different sources for ranged geologic and rheologic inputs), economic evaluations of development campaigns – like that.
But I don’t have nearly the ability to use the fundamentals in my posts that Stokes and bdgwx do. Which does give me the humility to avoid wanting to disregard centuries of accumulated knowledge with prejudged intuition, per Pat Frank, the Gormans and others.
Statistics über alles!
Go right ahead and believe nonphysical nonsense if it floats your boat, blob. You are free to do so.
I’m saying your arithmetic is indeed incorrect.
Show your work!
KB ==> Care to show where? (Arithmetic is pretty easy….)
Your method of propagating uncertainty is clearly incorrect. They add in quadrature, not simple addition like you have done.
KB ==> Can you diagram your point of view? Use two simple values with known absolute (numerical) uncertain values. Say 7 inches +/- 1 inch added to 3 inches +/- 1 inch. Mark it out on a ruler or use counting blocks or strips of measured paper. Show how the values (7 and 3) add together, and how the uncertainty of the two values affect the sum.
I could do that …. but you’ll learn more if you do it yourself.
(7 +/- 1) + (3 +/- 1) =
10 +/- SQRT(1^2 + 1^2) = 10.0 +/-1.4
Assuming both uncertainties are normally distributed (for example they are random measurement uncertainties).
If the uncertainties are actually tolerances, they will have a rectangular distribution. In which case you would divide each by SQRT(3) to obtain their standard uncertainties first.
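A minimal Python sketch of both cases just described (quadrature addition for random, independent uncertainties, and the rectangular-tolerance conversion), using the numbers from the example above:

import math

def combine_quadrature(*u):
    # root-sum-square of standard uncertainties (random, independent case)
    return math.sqrt(sum(x * x for x in u))

# (7 +/- 1) + (3 +/- 1), treating +/-1 as normally distributed standard uncertainties:
print(7 + 3, "+/-", round(combine_quadrature(1.0, 1.0), 1))          # 10 +/- 1.4

# Same sum, treating +/-1 as a rectangular tolerance: divide by sqrt(3) first.
u_std = 1.0 / math.sqrt(3)                                           # ~0.577
print(7 + 3, "+/-", round(combine_quadrature(u_std, u_std), 2))      # 10 +/- 0.82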
“Assuming both uncertainties are normally distributed (for example they are random measurement uncertainties).”
And if they’re not? Or at least there is no evidence they should be?
So let’s apply this to a hypothetical progressive calculation such as output from a GCM.
We have a starting temp of 10 +/- 1 “measured”
Uncertainty is then 10 +/- 1.4 when we apply quadrature rule.
That gets fed into the next calculation which returns say 10.1 +/- 1.6
Uncertainty was larger to begin with.
That gets fed into the next calculation which returns 10.1 +/- 1.6
That gets fed into the next calculation which returns 10.1 +/- 1.6
And the uncertainty is slowly increasing unbounded.
After millions of iterations it’s huge.
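A toy version of that iteration, as a minimal Python sketch; it assumes each step contributes an independent ±1 added in quadrature, which is only one possible model of what a calculation step does:

import math

u_step, u_total = 1.0, 1.0      # starting value 10 +/- 1; each step assumed to add +/- 1
for step in range(1, 11):
    u_total = math.sqrt(u_total ** 2 + u_step ** 2)   # quadrature accumulation
    print(step, round(u_total, 2))
# step 1: 1.41 ... step 10: 3.32; after a million such steps it would be ~1000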
Yup!
It’s not easy, but I will show you where. You start with the law of propagation of uncertainty given as equation 10 in the JCGM 100:2008 (the GUM).
Let Y = f(x_1, x_2) = x_1 + x_2.
It follows that
∂f/∂x_1 = 1
∂f/∂x_2 = 1
Then using GUM equation 10
u(Y)^2 = Σ[(∂f/∂x_n)^2 * u(x_n)^2, 1, N]
u(Y)^2 = (∂f/∂x_1)^2 * u(x_1)^2 + (∂f/∂x_2)^2 * u(x_2)^2
u(Y)^2 = (1)^2 * u(x_1)^2 + (1)^2 * u(x_2)^2
u(Y)^2 = u(x_1)^2 + u(x_2)^2
u(Y) = sqrt[u(x_1)^2 + u(x_2)^2]
This is the well known root sum square or summation in quadrature rule.
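For anyone who wants to check that algebra numerically, here is a minimal Python sketch that approximates the partial derivatives by central differences and combines them in the same first-order way (uncorrelated inputs assumed); it is an illustration, not the GUM itself:

import math

def propagate(f, x, u, h=1e-6):
    # first-order propagation in the style of GUM equation 10 (uncorrelated inputs),
    # with partial derivatives approximated by central differences
    partials = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        partials.append((f(*xp) - f(*xm)) / (2 * h))
    return math.sqrt(sum((p * ui) ** 2 for p, ui in zip(partials, u)))

f_sum  = lambda x1, x2: x1 + x2
f_mean = lambda x1, x2: (x1 + x2) / 2
print(propagate(f_sum,  [7.0, 3.0], [1.0, 1.0]))   # ~1.414 = sqrt(1^2 + 1^2)
print(propagate(f_mean, [7.0, 3.0], [1.0, 1.0]))   # ~0.707 = 1.414 / 2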
Did you even read Kip’s article? Obviously not.
I agree. Why can’t they see this?
Tell everyone the assumptions behind both straight addition and using quadrature. They have different assumptions for their use!
For u(x+y) = sqrt[u(x)^2 + u(y)^2] the correlation must be r(x, y) = 0.
For u(x+y) = u(x) + u(y) the correlation must be r(x, y) = 1.
See JCGM 100:2008 equation 15 and 16 for details.
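Those two conditions can be checked by simulation. A minimal Python sketch, using assumed standard-normal errors:

# r = 0 gives quadrature addition; r = 1 gives straight addition.
import numpy as np

rng = np.random.default_rng(5)
n, ux, uy = 1_000_000, 1.0, 1.0

ex = rng.normal(0.0, ux, n)
ey_uncorrelated = rng.normal(0.0, uy, n)    # r(x, y) = 0
ey_correlated   = ex * (uy / ux)            # r(x, y) = 1 (same error, rescaled)

print(round((ex + ey_uncorrelated).std(), 2))   # ~1.41 = sqrt(ux^2 + uy^2)
print(round((ex + ey_correlated).std(), 2))     # ~2.00 = ux + uy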
You are truly an ignorant person Nick. When I tell you a ruler can measure to +/-1 inch, nothing about the distribution is assured at all. You learned a little statistics and have embarrassed yourself by parading your overconfidence and ignorance daily.
Actually Nick MAY be correct. It depends on what he is trying to do.
Using RMS as an analysis method for discussing tolerance stack is completely valid. If you add an item with a tolerance to a second (or third… fourth…) item (say a pile of discs you cut from a cylinder) then you need to be able to calculate the possible variations in the final stack size.
If I ask you to cut each disc 5 mm thick and assign a tolerance limit of 6 mm / 4 mm, then any single disc can be cut within that range and still be accepted.
The theory is that while a single disc could be 6 mm thick, the odds of getting two cut at that thickness are rather low. Hence using RMS as a method is valid under most circumstances in an engineering world.
Tolerances cost money. If you can get away with it, you design the need for tolerances on parts out of the design.
It is probably not valid for discussing individual measurements. Probably. Not my skill set.
Except Nick doesn’t use root-sum-square of the uncertainties with temperatures. He assumes all uncertainties of the individual measurement cancel and the standard deviation of the sample means is the accuracy of the average.
It’s like putting 5 shots into the 4-ring of a pistol target and they all hit the same hole. Nick would say that implies that the standard deviation of the sample mean of those 5 shots is very small which makes them very accurate – when in actuality all the shots missed the 10 ring (the bullseye) by a large margin implying that his standard deviation of the sample means is *NOT* the actual accuracy of his shots. His accuracy is very low.
And you know the statistical distribution how?
Plus or minus 57 Varieties
“”I’m asking my people to give me a better download on exactly what the emissions implications are going to be””
– J Kerry on a Cumbrian coal mine…
https://www.theguardian.com/environment/2022/dec/10/john-kerry-examining-likely-impact-of-new-uk-coalmine
Download???
Spoken like a true, clueless politician!
oh good lord … they mix water temperatures with air temperatures … pure apples and oranges … there is no global temperature data set that is fit for purpose … this haggling about “uncertainty” is worrying about the napkins in the Titanic dining room …
While I agree with your overall point, the climate activists need to be called out on every point. There’s no way we can pinpoint the average temperature of the earth in 1880 to +/- half a degree.
Kip’s Gavin Schmidt source cites P. D. Jones, et al. (1999) Surface Air Temperature and its Changes Over the Past 150 Years. Rev Geophys. 37(2), 173-199, who quote global air temperature anomalies with an uncertainty of ±0.1 C in 1880 declining to ±0.05 C in 1995.
These uncertainties are smaller than the lower limit of resolution of the meteorological instruments. That should inform one of the level of competence we’re dealing with here.
Workers in the field also don’t seem to understand that the uncertainty in an air temperature anomaly must be larger than the uncertainty in the monthly measurement mean (minimally ±0.5 C) and the normal (also minimally ±0.5 C) used to calculate it.
The minimal uncertainty in any given 20th century GMST anomaly is then sqrt[(0.5)²+(0.5)²] = ±0.7 C.
The whole global warming thing is mortally subject to the scientific rigor learned at the level of the sophomore science or engineering major.
Pat ==> Thank you for this comment — always nice to have someone who understands the simple pragmatic truth.
Jones 1999 is, of course, talking anomalies (the magical anomalies that are less uncertain than the measurements of which they are comprised…).
Where do you think so many smart guys go wrong on this issue?
Because they learned their statistics in STAT 101, which never addresses uncertainty and how to handle it. It’s like my youngest son when he was a microbiology major – his advisor told him not to bother with any statistics or engineering classes, just find a math major to analyze his collections of data in the lab.
It just boils down to the blind leading the blind. Most math majors and climate science majors have never done any real world work with metrology where either customer relations or monetary/criminal liability can ensue from ignoring proper propagation of uncertainty. So they’ve never learned.
Tim ==> I was trained in my twenties in intelligence and security methods and practice — seriously trained (brainwashed?) I am well aware that it has tainted not only my world view, but my everyday life (ask my wife if you have a few hours….). So, I understand the problem that many stats and numbers people have — they can only see things a certain way — and because of my personal experience, I have sympathy for that sort of thing. That’s why I write this essay in such simplistic terms — to see if I can undercut, get through by passing under the trained-in-response wire.
I didn’t have much success.
Kip,
That seems to be the crux of it. Everybody’s education trains them to see the world and approach problems in a particular way.
The deeper the knowledge, the more this applies.
I tend to argue that it is actually the uneducated who are more preoccupied with rules and procedures, and keeping up appearances. You may notice that children are totally obsessed with rule structures, whereas the elderly often couldn’t care less. A new graduate or wannabe will stop at nothing to share new things which they have learned, often lacking the proper context to understand what it means.
It takes wisdom to understand the array of tools, when it’s appropriate to apply them, and what they are telling you. These are tools to aid understanding for a research scientist, quite different than the way the rules are applied to lab standards organizations and technicians. The scientist’s job is to take in information and to make an informed judgement, not to just regurgitate computer output and proclaim it as meaningful. The first thought for any good scientist should be, “does this seem reasonable?”.
I should add that the job of the scientist is to inform – not to deceive. There seems to be a race to the bottom when it comes to finding data manipulation tricks to indicate certainty today. This has come about since the advent of desktop number-crunching software: p-hacking routines and such. It may be a cultural thing, with the influx of hordes of students and handing out advanced degrees like candy. It is very competitive, and everyone wants to be seen. What is lost in the fold is the duty to inform. It should be understood that often the most useful insights are drawn from an admission of uncertainty, which allows new hypotheses to evolve. Hiding this uncertainty is actually a hindrance to advancement. It is the hardest part of any research to determine how to report the uncertainty, and to understand the subsequent consequences of these active cognitive decisions. What is for certain is that often the most useful information comes about from an appreciation of what is not known. This concept should be embraced.
In my more cynical moments, I think that programs such as SPSS did a great disservice to the world by allowing statistical analysis by those with insufficient understanding of the concepts, and knowing where particular approaches are and are not applicable.
Those are all good points, but I wasn’t thinking of rules and procedures or showing off newly acquired knowledge.
There will always be unquestionable axioms in any field – those things which you know so well that you don’t even know you know them.
We’re seeing that here with the slightly incongruent approaches of the mathematicians and metrologists.
I see. I would recommend to start thinking like a scientist then. There is a risk of getting lost in the recesses of our datasheets and conceptual axioms, and to lose sight of reality.
Recognising and questioning those axioms seems to be extremely difficult, even for scientists.
Those who can do so often seem to achieve great things. Marshall & Warren and their work on H. pylori should be an inspiration to all.
Agreed. We can put lipstick on a pig, but at the end of the day it’s still a pig. If we forget this we all miss out on the good eatin’.
Pat, do you mean the accuracy of the instruments or the precision of the instruments?
Pat,
Nice comment. Short, sweet, and to the point.
Somehow these folks have missed the fact that resolution is a method of displaying available information. Quoting calculations with a higher resolution than was measured is adding information from nowhere and that is fantasy.
It is the reason for Significant Digit rules. Too many here simply dismiss SigFigs as a childish endeavor not worthy of real scientific consideration.
The two issues combined are tragic.
No kidding!
If the air in a 1 m^3 box from Churchill, Manitoba has a temperature of exactly -25 C (measurement error is zero) and the air in a 1 m^3 box from Honolulu has a temperature of +29 C (again assuming no measurement error), the (mathematical) average temperature is precisely 2 C. So what? If I join the boxes together while holding constant the physical properties of the two air masses and then remove the boundary between them to create a 2 m^3 box, the temperature of the combined air mass is unlikely to be 2 C after a new equilibrium is reached. Without additional information, such as pressure and humidity, about the initial state of the air in the two boxes, we don’t know what the equilibrium temperature of the joined system will be. Average energy or enthalpy would be more meaningful than average temperature, though I expect the uncertainties would be vastly larger because of yet more uncertainties in yet more measurements. Of course, no global average is of much use on a local level because there is no global “climate.”
Randy ==> Quite right, Slim. That is another topic altogether, but yes, temperature does not measure heat (enthalpy) and temperature is an intensive property….(that leads to a whole discussion.…)
Bravo!
Other than that, these uncertainties also blow many of the predictions out of the water. Do you ever wonder why you never ever see either a standard deviation or error bars on GMT?
So for length the uncertainty is additive. As Nick Stokes points out, if the uncertainty is expressed as standard deviation then they add in quadrature because variances are additive.
However, the uncertainty goes down from a daily measure to a monthly measure because the variation in monthly values is clearly less than daily values. In this case there is a change of scale (known as support in geostatistics) and the central limit theorem applies. Again, the variance is the linear quantity.
Note also that in change of scale there is an equality related to variance. For example for a set of a year of daily temperature measures the total variance (of the daily measures) equals the within monthly variance + the between monthly variance. So when changing scale the variance is redistributed but always preserved.
The change of support is why it is wrong to splice modern temperature series at, e.g., monthly resolution onto paleo-temperature series where the resolution might be several hundred years. The modern temps should be shown as a single value with the reduced variance.
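A quick numerical check of the “redistributed but preserved” identity, using invented daily values (the monthly means and spread below are made up purely for illustration, not real temperatures):

import random
import statistics as st

random.seed(0)
# Twelve "months" of thirty made-up daily values each.
months = [[random.gauss(mu, 3.0) for _ in range(30)]
          for mu in (5, 7, 12, 16, 20, 24, 26, 25, 21, 16, 10, 6)]

days = [d for m in months for d in m]
total_var = st.pvariance(days)                          # variance of all daily values

within = st.fmean(st.pvariance(m) for m in months)      # mean within-month variance
between = st.pvariance([st.fmean(m) for m in months])   # variance of the monthly means

print(round(total_var, 3), round(within + between, 3))  # the two figures agree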
Thinking ==> Stokes instantly shifts from original measurement error to “standard deviations” as if they could be considered the same.
I am talking measurement here – not statistics.
Stokes should explain to Gavin Schmidt, head of NASA GISS, that he is wrong about calculating the error of GISTEMPv4 — how can Gavin be so stupid, huh?
But, Gavin is not wrong (on this point). Despite the millions of measurements that go into the global GISTEMP annual average, Gavin points out, when using absolute temperatures, by which he means temperatures as numbers of degrees C, the uncertainty is +/- 0.5°C. This is because records are kept in whole degrees, which were eyeballed (in the early days) and rounded in more recent days. THAT +/-0.5°C uncertainty has to carry all the way forward to the final global figure.
The rules for addition and division of values with absolute uncertainty (stated as a +/- value with the same units) are given correctly above and are not open to question — that is simple arithmetic. Arithmetic is not subject to uncertainty – the answers are sure. 1 + 2 ALWAYS equals 3. There is no doubt. Division of values written “2 cm +/- 0.5 cm” produces equally certain results….just follow the arithmetic rule.
Stokes and the stats guys are pulling the same nonsense, shifting a problem of simple arithmetic to an unnecessary statistical approach. Arithmetic trumps statistics every time.
“Stokes should explain to Gavin Schmidt”
No, you are misreading your Gavin quote. In fact he sets out the requirement to add in quadrature explicitly, and exactly as I did:
All he is doing in the part you quote is saying that you should add in quadrature (hence 0.502) and then round the uncertainty to 0.5 (because of uncertainty about the uncertainty).
Kip,
In fact, if Gavin had followed your rules, the uncertainty would have been 0.5+0.05=0.55°C.
But he used quadrature
sqrt(0.25+0.0025)=0.50249
which he then rounds to 0.5 (his rule 2)
Nick ==> Where did he get the 0.5°? From the original measurement uncertainty, carried forward, as I describe, through multiple steps of finding arithmetic means to finally arrive at the global values, with the same +/- 0.5°. He gets that little extra bit from the uncertainty of the anomaly of the baseline temperature. If they had simply calculated the temperature of 2016 (like they did the climatic baseline) they would have arrived at the figure with the +/- 0.5° without the little extra bit from the anomaly. The uncertainty of the anomaly is not an “absolute uncertainty” (caused by known limitations of the measurement process) but a statistical uncertainty — thus adding the two requires the statistical approach in that particular case.
Kip,
“Where did he get the 0.5°? From the original measurement uncertainty”
There was no “original measurement”. Jones formed an average of many global temperatures over all times. The 0.5 is a statistical estimate. I think his global absolute mean temperature is based on data too inhomogeneous to average usefully, and should be paid little attention. Oddly, the GISS item that Gavin is modifying was saying just that, but somehow he is getting sucked in.
The fact is, no useful information is gained by adding that 14 to the anomaly average.
I suspect that the anomalies will be correlated with the raw data, thus disqualifying the use of addition in quadrature.
Adding in quadrature is justified if the measurements are uncorrelated. If there is autocorrelation, as in a time series with a trend, or it is not known whether or not there is correlation, then the more conservative simple addition is preferable.
Dr. Schmidt has not proven that there is cancelation in the combined uncertainties of different temperature measurements. He also uses 0.5 which disagrees with the documentation from the NWS and NOAA. They show ±1°F for both ASOS and MMTS. I suspect older LIG readings are even larger.
In any case, one can not dismiss the direct addition of uncertainty intervals of different temperatures. This provides an upper bound on the uncertainty interval.
From Dr. Taylor’s book, An Introduction to Error Analysis. Page 59.
“That is, our old expression (3.14) for δq is actually an upper bound that holds in all cases. If we have any reason to suspect the errors in x and y are not independent and random (as in the example of the steel tape measure), we are not justified in using the quadratic sum (3.13) for δq. On the other hand, the bound (3.15) guarantees that δq is certainly no worse than δx + δy, and our SAFEST course is to use the old rule.”
3.13. δq = √[(δx)^2 + (δy)^2]
3.14. δq ~ δx + δy
3.15. δq ≤ δx + δy
The ±0.5 is basically caused by rounding error. It is not random and one does not know the exact temperatures that were rounded. One would have to assume that not only the signs cancel but every value between +0.5 and -0.5 of each temperature would cancel in the two daily values, the 30/31 monthly values, and 12 annual values.
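For reference, here are bounds (3.13) and (3.15) side by side in Python, using ±0.5 per term as an assumed example value:

import math

dx, dy = 0.5, 0.5                       # example component uncertainties (assumed)

quadrature = math.sqrt(dx**2 + dy**2)   # eq. 3.13: independent and random errors
upper_bound = dx + dy                   # eq. 3.15: always a safe upper bound

print(f"quadrature:  {quadrature:.2f}")   # 0.71
print(f"upper bound: {upper_bound:.2f}")  # 1.00
# If independence cannot be assumed, the straight sum is the safer figure.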
Kip, you are exactly correct. The NWS and NOAA describe what is basically the combined standard uncertainty for ASOS and MMTS as ±1°F. This number applies to each and every reading taken by these systems. I read somewhere that prior to 1900, the uncertainty was assumed to be ±4°F.
One only has to do simple math to confirm this. Find the mean of two numbers with ±1. The mean will also vary by ±1. Now try with more numbers.
81/82->81.5
80/81->80.5. 80.5 ±1
79/80->79.5
79/80/81/82/83/84 -> 81.5
80/81/82/83/84/85 -> 82.5. 82.5 ± 1
81/82/83/84/85/86 -> 83.5
You can use all the statistics you want but this doesn’t lie. With single temperature readings averaged together with all having the same combined uncertainty, the mean will carry that uncertainty also.
Singular readings have no distribution to cancel error, either within a given reading or across other single readings. This is where Tim Gorman has tried to make folks aware that a mean is being calculated using exact numbers, rather than considering that each and every data point has an uncertainty that moves through each statistical calculation.
In essence much of the argument about uncertainty is really moot. There are published documents from the NWS and NOAA that tell what the intervals should be. Climate scientists should be using these intervals and not replacing them with some statistical calculations where averages and dividing by “n” reduces the intervals.
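The worked example above in Python, showing only the limiting case where every reading errs by the full +1 or the full -1 (whether that limiting case is the right model for real readings is exactly what this thread is arguing about):

readings = [79, 80, 81, 82, 83, 84]                     # example values from the comment

mean = sum(readings) / len(readings)
mean_hi = sum(r + 1 for r in readings) / len(readings)  # every reading high by 1
mean_lo = sum(r - 1 for r in readings) / len(readings)  # every reading low by 1

print(mean_lo, mean, mean_hi)                           # 80.5  81.5  82.5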
“However, the uncertainty goes down from a daily measure to a monthly measure because the variation in monthly values is clearly less than daily values.”
The uncertainty does *NOT* go down in reality. This only comes from the unjustified assumption that all error of individual measurements cancel (i.e. they form a Gaussian distribution) and the standard deviation of the stated values is the uncertainty.
Let me state again: THAT IS AN UNJUSTIFIED ASSUMPTION.
In geostatistics you are measuring the same thing multiple times and any errors should form a Gaussian distribution with as many plus measurement errors as minus measurements. You can, in this case, use the standard deviation of the mean as an estimate of the uncertainty of the mean (which would be considered a true value +/- uncertainty).
When combining minimum and maximum temperatures during the day you are *NOT* measuring the same thing multiple times. They are separate, individual, single measurements with no other data points which would form a Gaussian distribution. When you form a daily mid-range value the uncertainty of the mid-range value is just like what Nick quoted in GUM 2.3.4 – the root-sum-square of the individual temperatures (GUM’s “sum of terms”). Thus the uncertainty of the mid-range value becomes somewhere between +/- 0.7 (+/- 0.5 C uncertainty per term) and +/- 0.4 (+/- 0.3 C uncertainty per term).
Since the mid-range value of each day is a separate, individual, single value, when they are averaged the uncertainties for each term should add by root-sum-square. Assuming 30 days in a month, the monthly uncertainty would be +/- 3.8 C to +/- 2.2 C.
The uncertainty would grow even larger when combining 12 months into an annual average.
Doing anomalies doesn’t help. You still wind up subtracting an uncertain term from an uncertain term, each of which represent a different thing. Thus the uncertainties for the final answer is a root-sum-square of the two uncertainties. Thus if your base uncertainty is +/- 0.5C you wind up with a +/- 3.9C uncertainty for the monthly anomaly.
As an example let’s assume you have a minimum/max value for one day of 70 +/- 0.5 and 90 +/- 0.5. The standard deviation is 10 and the sample size is 2. That gives a standard deviation of the mean of 7. That is a much bigger uncertainty of the mid-range temperature than merely just adding root-sum-square the individual uncertainties.
I pulled 24 min/max records from my weather station for Nov. The SEM for each mid-range value ranges from 4.6 to 16.6, with an average of 10.5. The total uncertainty for the combination of the 24 mid-range values is 3.4.
How the climate scientists can possibly think they can identify uncertainties down to the hundredths digit is just beyond me.
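A sketch of that propagation scheme in Python (root-sum-square at every stage, with no assumed cancellation); the ±0.5 C per reading is the assumed input, and whether this is the correct scheme is precisely what is in dispute in this thread:

import math

def rss(uncertainties):
    # Root-sum-square combination of a list of uncertainties.
    return math.sqrt(sum(u * u for u in uncertainties))

u_reading = 0.5                           # assumed per-reading uncertainty, deg C

# Daily mid-range value from Tmin and Tmax, treated as a sum of two terms:
u_daily = rss([u_reading, u_reading])     # ~0.71

# Thirty daily mid-range values combined into a monthly figure, same rule:
u_monthly = rss([u_daily] * 30)           # ~3.9 (about 3.8 if 0.71 is rounded to 0.7 first)

print(round(u_daily, 2), round(u_monthly, 1))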
There is so much wrong with your post it’s difficult to know where to begin. I really have not got the time.
and never will
You mean you can’t refute it at all.
Thanks for your useless reply.
I would call your remark a ‘cheap shot.’ You make an assertion about many problems, but don’t even feel a responsibility to address the one or two most egregious, in your estimation.
“the variation in monthly values is clearly less than daily values”
Only if measurement error is random. It’s not.
Field calibration experiments invariably indicate large non-normal systematic errors.
Thanks, Pat. The requirement is actually a bit more general. The measurement error doesn’t need to be “random”. The error only needs to be symmetrical … however, as you point out, that’s not often the case.
w.
“Only if measurement error is random.”
The key issue is cancellation, which doesn’t have to be caused by randomness.
But the reduction is clearly seen in real data. I have on hand a file of daily maximum temperatures in Melbourne, 159 years up to 2013. If I calculate the standard deviation of all days for each month, I get (in °C)
6.175 5.783 5.236 4.109 2.965 2.181 2.109 2.649 3.529 4.537 5.303 5.806
But if I calculate the mean of each month first, and then the sd of those monthly averages, I get
1.652 1.569 1.420 1.323 1.054 0.966 0.844 0.926 1.121 1.248 1.392 1.639
Clearly a lot less, in fact down by a factor of 2-4. The reduction that would apply if the days were independent is about 5.5 (sqrt(31)). So autocorrelation brings it back a bit, but clearly averaging reduces the variability.
The ratio of sd reduction for each month is
3.74 3.69 3.69 3.11 2.81 2.26 2.50 2.86 3.15 3.64 3.81 3.54
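For anyone who wants to reproduce the shape of that comparison, here is a synthetic version in Python; the numbers are invented, the days are treated as independent, and no measurement uncertainty is attached to any of them, so it only illustrates the arithmetic of the two standard deviations:

import random
import statistics as st

random.seed(1)
# 159 made-up "years" of 31 daily values for one calendar month.
years = [[random.gauss(20.0, 5.0) for _ in range(31)] for _ in range(159)]

daily_sd = st.pstdev([d for y in years for d in y])   # sd of all the daily values
monthly_sd = st.pstdev([st.fmean(y) for y in years])  # sd of the monthly means

print(round(daily_sd, 2), round(monthly_sd, 2))
# With independent days the second figure is smaller by about sqrt(31);
# autocorrelation in real data makes the reduction smaller, as noted above.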
Your problem is that you have ignored and swept under the carpet the standard deviations of the individual monthly averages.
Averaging averages does not reduce variance!
“you have ignored and sweep under the carpet the standard deviations of the individual monthly averages”
No, it’s the second row of figures I gave
1.652 1.569 1.420 1.323 1.054 0.966 0.844 0.926 1.121 1.248 1.392 1.639
Since the underlying uncertainty of the monthly averages is undoubtedly higher than these values due to the uncertainty in the daily mid-range values there is no way these are proper descriptive statistics.
See my post above.
ThinkingScientist said
“the variation in monthly values is clearly less than daily values”
Pat Frank said
“Only if measurement error is random. It’s not.”
I showed by simple enumeration that
“the variation in monthly values is clearly less than daily values”
What you showed is something with NO propagation of measurement uncertainty. You simply assumed that all stated values of the measurements are 100% accurate.
Of course you found what you did. But it is meaningless.
No, you have shown that SAMPLES of a population will have a smaller standard deviation than the population. The CLT is used for this purpose. The sample means distribution will of course be narrower since you get a sharper and sharper peak as you increase the sample size and number of samples.
I’m not sure what you think this proves!
Here is a not very sophisticated image explaining this.
You are in essence sampling the population by groups (months). What you have done is create a sample means distribution.
Now if you find the standard deviation of that distribution of monthly means you will have the Standard Error of the Sample Mean (SEM).
The individual standard deviations of the 12 monthly samples have little meaning other than to show your samples are not IID and that the SEM is probably not correct.
The formula for finding the Standard Deviation of the population is
SD = SEM • √n
SD = 1.404 • √30.4 = 7.74
Have you studied sampling?
I also noticed you have 4 decimal places. Does your actual measurement resolution support that many decimal places?
I just pulled 24 Tmax/Tmin pairs for 11/22.
Avg mid-range value = 60
Avg standard deviation = 21
Avg SEM = 10
total uncertainty = 6F (assumes T +/- 0.9F)
Anyone that thinks this data can have a smaller uncertainty through averaging is only kidding themselves.
“Anyone that thinks this data can have a smaller uncertainty through averaging is only kidding themselves.”
From GUM 3.2.2
“Although it is not possible to compensate for the random error of a measurement result, it can usually be reduced by increasing the number of observations;”
Not for a time-series, you silly person.
Nor can it when you have multiple different things.
I am quite positive that Stokes et al truly think that the daily temperature curve is a Gaussian distribution. It isn’t!
And anything that isn’t Gaussian magically goes away with the incantation of the baseline subtraction. Voila! The Hockey Stick.
This seems to me to be a very uncertain statement. The meaning of daily values might be guessed at, as measurements are often recorded daily, e.g. the min for the day and the max for the day, both measured temperatures, which are then often used to calculate a daily average (which has very low meaningfulness in my opinion). However, I don’t believe there is any way to take any monthly measurements, so monthly values must refer to some calculation result that is another factor removed from measurement.
Monthly averages are simply averages of the daily measures. They must exhibit lower variance than the daily values simply because of averaging. This is basic stats 101.
So what? How does this remove the instrumental measurement uncertainty which Stokes & Co ignore?
Averaging doesn’t reduce uncertainty. The average uncertainty is not the same thing as the uncertainty of the average. When you combine random variables you add their variances, you don’t find the average variance. *That* is basic stats 101.
Each individual, single measurement has an uncertainty interval that is equivalent to a variance so you wind up with pseudo-random variables with only one value. When you combine these individual random variables the uncertainties (variances) add.
When you divide the sum of uncertainties by the number of terms all you get is the average uncertainty of the random variables, i.e. you just spread the total uncertainty equally over all the members of the data set. That is *not* finding the uncertainty of the average.
Using standard propagation rules, if you have q = (x +/- u1)/(y +/- u2) the relative uncertainty squared is (u(q)/q)^2 = (u1/x)^2 + (u2/y)^2. If y is a constant then u2 = 0 and the result is u(q)/q = u1/x. If the numerator has multiple terms it still winds up the same: (u(q)/q)^2 = (u1/x1)^2 + (u2/x2)^2 + … (un/x_n)^2 + (uy/y)^2 and the total relative uncertainty still becomes the sum of the uncertainty terms in the numerator.
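A small sketch of that quotient rule in Python, assuming the relative uncertainties combine in quadrature and using made-up numbers:

import math

def rel_u_quotient(x, ux, y, uy=0.0):
    # Relative uncertainty of q = x / y from the relative uncertainties
    # of the numerator and denominator, combined in quadrature.
    return math.sqrt((ux / x) ** 2 + (uy / y) ** 2)

# Numerator 10 +/- 0.5 divided by an exact constant 2 (so uy = 0):
x, ux, n = 10.0, 0.5, 2.0
rel = rel_u_quotient(x, ux, n)    # equals ux / x = 0.05
q = x / n
print(q, "+/-", q * rel)          # 5.0 +/- 0.25
# Dividing by an exact constant scales the absolute uncertainty by that
# constant; the relative uncertainty is unchanged.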
So perhaps you would like to explain the improvement in signal-to-noise ratio as a function of sqrt(n) samples?
How do you get an improvement when N is always and exactly one?
You can’t.
And Stokes even claimed that one little single temperature measurement has a distribution attached to it! Gaussian, of course.
Different ball game, different rules, different results.
Remember, you are dealing with ONE signal. Try sampling TWO or THREE different signals and using the additive combination result to get a better SNR.
You didn’t have the time to respond to Tim, but made the time to respond to AndyHce. Your initial plea was clearly just a poor excuse.
You can take a monthly measurement in many ways. The measurement model could be Y = Σ[(Tmin_d + Tmax_d) / 2, 1, D] / D where Tmin and Tmax are daily min and max values and D is the number of days in a month, or Y = Σ[Th, 1, H] / H where Th is an hourly value and H is the number of hours in a month. There are others. For example, ERA integrates every ~12 minutes. The interesting thing is that all of these measurement models yield the same answer (within a reasonable margin of error) and have less variability than the input quantities upon which they are based.
HAHAHAHAHAHAH
He said “reasonable” while pushing the nonphysical nonsense of trendology!
If all the methods are ignoring the propagation of uncertainty then why wouldn’t they come up with the same answer? That doesn’t make ignoring uncertainty right!
“Y = Σ[(Tmin_d + Tmax_d) / 2, 1, D] / D
Where in this equation is the measurement uncertainty included? This is nothing more than the old, incorrect assumption that all measurement error cancels leaving the average 100% accurate!
You claim to show a way to “take” a monthly measurement. Instead, what you show is a method to calculate a monthly average based on two daily measurements. The Tmin and Tmax are examples of actually ‘taking’ a measurement, which could also be calculated from high temporal-resolution measurements.
I’m using the term “measurement” as it is used in JCGM 100:2008. That is, all measurements are based on a model. The model is a function that maps input quantities into an output quantity. Even something as simple as Tmin or Tmax actually requires a complex model that calculates an output based on several inputs. Remember, individual temperature observations are (usually) calculated from underlying electromagnetic phenomena. The various examples of Y above are “measurement models”.
No, please, just stop…
This is a meaningless word salad.
Each individual component has its own uncertainty. Those uncertainties propagate.
The model doesn’t tell you the measurements, only the functional relationship of the measurements.
For instance, the MODEL for the volume of a cylinder is V = πR^2H.
The model isn’t the measurement just like the map isn’t the territory. Radius and height are the measurements.
The “measurand”, volume, is not measured directly but through measurements of other quantities.
“4.1.1 In most cases, a measurand Y is not measured directly, but is determined from N other quantities X1, X2, …, XN through a functional relationship f :”
You are trying to confuse the issue by suggesting that the measurand is always what is physically measured. It isn’t.
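A sketch of that distinction in Python: the model V = πR^2H maps the measured radius and height into the measurand, and their uncertainties propagate through the sensitivity coefficients (the numbers below are hypothetical):

import math

def u_volume(r, ur, h, uh):
    # Standard uncertainty of V = pi * r^2 * h from the measured radius and
    # height, using the usual partial-derivative propagation for independent inputs.
    dV_dr = 2 * math.pi * r * h    # sensitivity to radius
    dV_dh = math.pi * r ** 2       # sensitivity to height
    return math.sqrt((dV_dr * ur) ** 2 + (dV_dh * uh) ** 2)

# Hypothetical measurements: r = 10 +/- 0.1 mm, h = 50 +/- 0.5 mm
r, ur, h, uh = 10.0, 0.1, 50.0, 0.5
V = math.pi * r ** 2 * h
print(f"V = {V:.0f} +/- {u_volume(r, ur, h, uh):.0f} mm^3")   # V = 15708 +/- 351 mm^3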
Your point being?
Computing an output quantity from multiple input quantities whether it is averaging or otherwise is “taking a measurement”.
How do you measure an average?
Being a measurand does *NOT* mean that you can “take a measurement” of it.
If you have two boards, one 2′ long and a second one 10′ long how do you “take a measurement” of the average of 6′?
The 2′ and the 10′ are measurements that can be “taken”. The value of 6′ may be a measurand calculated from the measurements but there will be no “taking a measurement” of it.
Averaging does not provide you with a physical value for a measurand.
Go to the A and B appendices in the GUM and try to get an appreciation for what a MEASUREMENT capable of being repeated actually is.
If it isn’t a measurand, why have you been going on for the last two years about the measurement uncertainty? How can something that isn’t a measurement have a measurement uncertainty?
It’s been you, Tim, and Carlo who have been insisting all this time that you have to follow the rules of metrology, that you have to propagate the measurement uncertainty to the mean.
The GUM is based on physical measurements of physical attributes. It recognizes that some physical attributes can’t be directly measured and a FUNCTION is used to derive the value. Volume and density are two physical properties that require interim measurements.
You are discussing something entirely different. You are trying to make some kind of a monthly index into a value that represents a physical attribute. You can’t do that without showing it is useful and consistent.
Tell us what a monthly signal of a time varying continuous signal looks like and how you would sample it to achieve a reasonable depiction of what it looks like. You are getting further and further afield. Why not use the new methods of integration that are used for heating and cooling degree days?
JG said: “Why not use the new methods of integration that are used for heating and cooling degree days?”
It was only but a few months ago you were defending Kip’s position that doing any kind of arithmetic with temperature was useless and meaningless. And now you feign incredulity if someone doesn’t use a metric that requires performing arithmetic operations on temperature?
Why don’t you answer the question posed.
Ultimately, I don’t think global temperature is a good proxy for the heat energy of the atmosphere since things like humidity (latent heat), different albedos, and other variables are not measured by temperature. I’ve posted this before and I still believe it.
That doesn’t change the fact that climate science will continue to use temperature as a proxy. Doing so, since we have better data available, there is no reason to not perform better analysis.
JG said: “Why don’t you answer the question posed.”
You already know the answer because we discussed it a few months ago. I use HDD and CDD all of the time. And I have no issue with either the integration or legacy methods.
I’m just pointing out that it was only but a few months ago that you were telling me I shouldn’t be using HDD or CDD at all since regardless of method it requires performing arithmetic on temperatures. Kip devoted a whole article to it.
No one has told you this. What you *have* been told is that the area under the temp curve is an area, and not a linear value. It’s why the mid-range value of temp is such a poor thing to use. Its uncertainty is large. And a graph of temperature vs time is not a probability distribution. Therefore the mid-range value is not an average of a probability distribution. The average of a daytime sine wave and a nighttime exponential is not the same thing as a mid-range value.
We aren’t living in 1980. Climate science and its measurements need to move into the 21st Century.
TG said: “No one has told you this.”
Kip said “Multiplying temperatures as numbers can be done, but gives nonsensical results”
Kip said “Similarly, temperature, an Intensive Property, cannot be added.”
Kip said “A sum over intensive variables carries no physical meaning – adding the numerical values of two intensive variables, such as temperature, has no physical meaning, it is nonsensical.”
Kip said “Dividing meaningless totals by the number of components – in other words, averaging or finding the mean — cannot reverse this outcome, the average or mean is still meaningless.”
Frank, are you and Kip discussing error (ie deviation from a true value) or precision?
A measurement system can be in error (ie biased) but still give precise relative changes in an observation over time.
I still think there is a lot of discussion at cross purposes here on this thread.
Uncertainty is not error!
This inconvenient little fact is something the Stokes crowd cannot understand.
Precision is related to uncertainty but they are *not* the same thing.
Micrometer 1 can have the same precision as Micrometer 2. But they can each have different uncertainties. If M1 has a spring loaded anvil that makes the pressure on the measurand equal for all measurements it will have less uncertainty than M2 which depends on the “feel” of the operator to judge correct pressure.
Thinking ==> “are you and Kip discussing error (ie deviation from a true value) or precision?”
Good heavens! Did you read the essay at all? I write about “absolute uncertainty” of a measurement, with a careful definition — it is nothing like “deviation” and is not really about precision. It is about the real known, can-be-measured uncertainty surrounding a measurement.
So why does absolute uncertainty matter when deciding for example, if temperature has increased over the last 120 years?
If I didn’t understand it, maybe you didn’t explain it as well as you thought. I am not a numpty. I also sit in the sceptic camp.
Much of what has been written on this thread is either at cross purposes or a red herring.
At the end of the day absolute temp measures are not required to observe a warming or cooling trend over time. I work with relative estimates in Geoscience all the time – seismic relative impedance. Most paleo estimates are relative. And trying to put a wide absolute uncertainty range on temps is an open goal for modellers, as it makes their target so much easier to hit.
Resolution and precision are closely tied together. You can’t achieve higher precision without also having higher resolution. Accuracy is a different animal. They all have uncertainty intervals. I wish it wasn’t that way but it is.
The NWS and NOAA have stated the intervals they expect different measurement stations to meet. I’ll list them if you need them. They encompass the entire gamut of uncertainty. Accuracy, precision, etc.
One way to reduce the global average temperature is to use geometric mean instead of arithmetic mean. Simples
I am reminded of the 1978 film ‘Jubilee’ where the crime rate was reduced to zero by abolishing all laws…
A great pity that the author has no understanding of mathematics, where, e.g., the correct answer to ‘what is the square root of four?’ is “±2”.
A shorthand way of indicating that there are two correct and exact answers.
The fact that it’s used in a scientific or engineering context to mean ‘the range of likely or allowable inaccuracy’ is another matter entirely.
Same with the quadratic equation formula: the ± gives two numbers only, not a range of numbers.
Leo ==> We are talking addition and division, not square roots, applicable to simple physical measurements with physical tools. These produce an uncertainty known as “absolute measurement error — or absolute measurement uncertainty” constrained by the limitations of the measurement tool.
+/- can be used the way you described, but that is not the only way it is used, and not the way it is used in the case of measurement. Like many words and notation systems, things can have more than one valid meaning — which depend on the context.
For your example to be meaningful, we would have to be taking the square root of 4 +/-.
A lot wrong here, but thanks for this statement.
“We just showed that adding all of the individual measurements for the month would add all the uncertainties (all the 2 cms) and then the total AND the combined uncertainty would both be divided by the number of measurements”
You have to divide the uncertainty of the sum by the number of measurements to get the uncertainty of the mean. It’s the point I’ve been making for the past two years, yet some just refuse to accept it.
More whining from da bellcurveman, who doesn’t get the respect he thinks he deserves.
Quoting Tim Gorman:
And Carlo Monte misses the point again. I am not discussing in that comment whether you can reduce uncertainty by taking multiple samples. I am pointing out that Tim Gorman is wrong when he claimed you do not divide the uncertainty of the sum by the number of observations.
As I said, all you do when dividing the total uncertainty by the number of terms is find the AVERAGE UNCERTAINTY – not the uncertainty of the average.
You *STILL* can’t tell the difference after being shown that over and over and over.
Write it out several times. Average Uncertainty is NOT Uncertainty of the Average!
If you have a Gaussian distribution of measurements then the standard deviation of the stated values gives an estimate of the uncertainty of the average – and the standard deviation is *NOT* the average uncertainty. If you have a non-Gaussian distribution of measurements then the individual uncertainties have to be propagated onto the average in order to determine the uncertainty of the average using root-sum-square – and root-sum-square is *NOT* the average uncertainty!
Do you ever bother reading what I said? All you do is keep ignoring it and repeating your mistakes. Dividing the uncertainty of the sum by the number of elements is not giving you the average uncertainty.
For example, to use your original example, if you have 100 temperature readings each with a random independent uncertainty of ±0.5°C, then the uncertainty of the sum of those 100 temperatures will be 0.5 * sqrt(100) = ±5.0°C. The average uncertainty is 0.5°C, obviously. The uncertainty of the average is 5.0 / 100 = 0.05°C. They are not the same.
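A Monte Carlo check of that arithmetic in Python, run under its own stated assumption of random, independent errors (which is the very assumption being contested in this thread):

import random
import statistics as st

random.seed(2)
N, u = 100, 0.5
sums, means = [], []
for _ in range(20000):
    # 100 readings, each with an independent random error of standard deviation 0.5.
    errors = [random.gauss(0.0, u) for _ in range(N)]
    sums.append(sum(errors))
    means.append(sum(errors) / N)

print(round(st.pstdev(sums), 2))    # ~5.0  = 0.5 * sqrt(100)
print(round(st.pstdev(means), 3))   # ~0.05 = 5.0 / 100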
Have you never seen the simple equation for the mean?
Again I ask, what color is the sky in bellcurveman world?
Read what I said. Dividing the uncertainty of the sum by the number of elements gives you the uncertainty of the average, not the average uncertainty.
“Dividing the uncertainty of the sum by the number of elements is not giving you the average uncertainty.”
Did you actually read this before you posted it?
If I have uncertainties .1, .2, .3, .4, and .5, giving me a sum of 1.5, and I divide by 5, I get an average of .3.
That is the same thing as saying you have an uncertainty set of .3, .3, .3, .3, and .3 = 1.5!
When you divide the sum by the number of elements you get the AVERAGE uncertainty.
If I lay 5 boards end-to-end with the uncertainties .1, .2, .3, .4, and .5, the uncertainty in the total length will be sqrt(.1^2 + .2^2 + .3^2 + .4^2 + .5^2) = sqrt(0.55) = .74. If I use the average uncertainty of .3 I get sqrt(.3^2 * 5) = sqrt(.45) = .67. They are *not* the same value.
Bellman==> Well, at least you agreed with something…..
Adding all the uncertainties and dividing by the number of uncertainties gives you the AVERAGE OF THE INDIVIDUAL UNCERTAINTIES. All that does is give you a number you can spread evenly across all the stated values – you still wind up with the same total uncertainty.
It tells you NOTHING about the accuracy of the mean!
I wish everyone would figure out the difference. The average uncertainty is *NOT* the accuracy of the mean. Uncertainty is a number that should tell you the accuracy of the mean and that is *NOT* the average uncertainty of the data set individual members.
All the standard deviation of the mean (inaccurately called the uncertainty of the mean) can tell you is the interval in which the population mean can lie. Whether or not that mean is accurate can only be determined by propagating the uncertainties of the individual stated values of measurement.
The ONLY time you can use the SEM as an estimate of the accuracy of the mean is if you have a symmetric, Gaussian distribution. Then the spread of the individual stated values can be used as an indication of the accuracy of the mean. You simply cannot assume that multiple, individual, single measurements of temperature at multiple locations with different elevations, humidity, pressure, terrain, and geography generates a symmetric, Gaussian distribution.
When you are talking about combining summer and winter month temperatures (i.e. NH and SH), each of which has a different variance, you will *NOT* get a symmetric, Gaussian distribution and therefore the standard deviation is not a good estimate of the uncertainty of the mean and neither is an SEM. Using anomalies doesn’t help at all! The anomalies will carry the same variance (i.e. summer/winter) as the absolute temps.
What you wind up with when combining all these individual, separate measurements is an average whose uncertainty is vastly larger than the values trying to be discerned!
“Adding all the uncertainties and dividing by the number of uncertainties gives you the AVERAGE OF THE INDIVIDUAL UNCERTAINTIES.”
We’ve been through all this many times before. The point here is to divide the uncertainty of the sum by the number of measurements to get the uncertainty of the average. If, and only if, you obtain the uncertainty of the sum by adding all the uncertainties will the result be the same as the average uncertainty. But that isn’t the purpose of the exercise. The purpose is to get the uncertainty of the average, not the average uncertainty.
If you get the uncertainty of the total by adding in quadrature then divide by the number of measurements the result will be less than the average uncertainty.
Or you could obtain the total by laying all your items end to end and measuring the total length. Then the uncertainty of the average will be the uncertainty of that one measurement divided by the number of items.
Tim Gorman must hate the old adage of “measure twice and cut once” since it flies in the face of everything he believes about measurement uncertainty.
And you *truly* don’t understand the adage at all.
You do *NOT* assume the average of the first measurement and the second measurement is the true value. You do *NOT* assume the average uncertainty is the uncertainty of the average either.
You measure twice to make sure you didn’t make a mistake in your measuring. That does *NOT* remove any calibration error in the measuring instrument. It does *NOT* minimize measuring uncertainties since with only two measurements it is highly unlikely you will see any cancellation of uncertainty.
For field temperature measuring devices there will ALWAYS be systematic bias (i.e. calibration offsets). u_total = u_systematic + u_random. If you can’t identify u_systematic in all field instruments then how do you assume a Gaussian distribution of measurements?
That adage is offered as a check on what you remembered or wrote down, not as a way of improving the precision.
Exactly! It’s a check to minimize the risk of wasting a whole board. Has nothing to do with precision
You don’t even recognize the problem with what is being discussed. It is not measuring the same board twice, it is measuring one board, then going and measuring a different board and saying that you know how much uncertainty there is when you cut each board.
Each temperature measurement reading, even at the same station, is a SINGLE MEASUREMENT and allows no “true value” to be computed from multiple measurements of the same thing. Any error associated with that single reading is unable to be recognized by a statistical analysis.
The trendologists will never understand, mainly because they don’t comprehend that uncertainty is not error.
“The point here is to divide the uncertainty of the sum by the number of measurements to get the uncertainty of the average.”
The average of the uncertainties is *never* the uncertainty of the average.
The standard deviation is *NOT* an average uncertainty of a Gaussian distribution. The root-sum-square of the individual uncertainties is *NOT* an average uncertainty of a non-Gaussian distribution.
It’s just that simple.
And Kip deftly demonstrated with simple arithmetic and a very simple graphic that you are completely wrong.
And you still can’t comprehend this simple concept.
Around and around the hamster wheel spins…
“You see, it doesn’t matter if you add or subtract them, the absolute uncertainties are added. ”
Except that, if the uncertainties are independent, you take the square root of the sum of the squares of the uncertainties.
What if some are independent and some aren’t?
He doesn’t care as long as he gets to calculate uncertainty as sigma/root(N).
Then it’s more complicated. But the uncertainty will be somewhere between the two. Equation 11 of the GUM (if memory serves) gives the correct procedure based on the correlation of the uncertainties.
Exactly. And I would add that, since we largely look at trends here, the standard error of a trend is reduced when the uncertainties of some or all of the data points are positively correlated.
blob weighs in with his own brand of nonsense.
Your comments are mostly fact- and tech-free remarks about the commenter, so they are mostly ignored. Even here. But this is not “nonsense”. Such correlation would tighten up the effective error bands of the data being evaluated. So, if that data is being trended, the standard error of that trend would be less than if the same distributed data was uncorrelated.
Yes, multiple systemic errors versus time would yo-yo the bands up and down. But, as with Bigfoot, and significant US election fraud, no one in this forum has ever produced any evidence of it.
Word salad blob returns!
Still with his terminal case of TDS.
You are dealing with single readings of different things. There is no correlation of data uncertainties whatsoever.
You can not expect any cancellation of uncertainty. You can not statistically analyze each reading because you only have the one each time you record a temperature. That means you have to use the accepted value which the NWS and NOAA have provided. For MMTS stations the accepted value is +/- 1.0 F. At a minimum, the mean will carry this same value.
Take yesterday’s Tmax and Tmin and calculate the standard deviation. There are various calculators online so it won’t take much time. Now translate that to 30 days’ worth of data, each with a pretty much equal SD or variance, and what do you get? What is the standard deviation of a mean of a NH summer and SH winter average temperature?
Why do you think NO ONE, and I mean you or the group of uncertainty deniers, ever quotes a combined variance or SD of temperatures?
As a mathematician you can not deny that a “mean” must have a variance in order to have any meaningful description of a distribution. Why do you never add one to your calculation of a mean?
From the GUM:
Note the references to Standard Deviations.
Now lastly, look at NIST Technical Note 1900, Example E2
NIST.TN.1900.pdf
I think it is hilarious that this was pointed out by an uncertainty expert of your clan, bdgwx. The expert, Dr. Possolo, found that the 95% confidence interval for one month’s collection of temperature readings to the one-hundredths of a degree came out with a mean of 25.6 +/- 1.8 F.
How do you keep justifying anomalies to the one-thousandths of a degree?
“You are dealing with single readings of different things. There is no correlation of data uncertainties whatsoever.”
I never said that there was. I was discussing a hypothetical raisied by someone else.
And ironically NIST TN 1900 E2 does a type A evaluation of uncertainty and gets 4.1 C (or 8.2 for 2σ) and then uses that in the SEM to get an uncertainty of the average of 0.9 C (1.8 for 2σ). In other words, the example Jim mentions above proves our point.
Again?? AHHHHHHHHHHH!
Actually, a positive correlation would mean the uncertainties should be added because there is no chance that one offsets the other.
What you need to look at is the covariance which everyone simply ignores as a part of uncertainty.
“Actually, a positive correlation would mean the uncertainties should be added because there is no chance that one offsets the other.”
Simply wrong, just because there is an artificially reduced chance that “one offsets the other”. That’s what positive correlation does.
The standard error of a trend can be bootstrapped or found from the variances of the residuals and of the data error bands. And since the effective variance of positively correlated data is smaller than it would otherwise be, it will have the effect of reducing the standard error of that trend.
If you have access to @RISK or similar software you can show this yourself. But I don’t, and I can’t find a step-by-step method to artificially correlate uncorrelated data, a la @RISK.
Another trendologist who doesn’t understand that error is not uncertainty.
You use the more general law of propagation of uncertainty in this case. See JCGM 100:2008 equation 13. Or more conveniently you could use the NIST uncertainty machine which will do it for you.
You are abusing these resources and do not understand what you are doing.
Go to the very beginning of “5”. Read what the first section says.
Now let’s go through what a measurement of a measurand truly is.
B.2.5
measurement
set of operations having the object of determining a value of a quantity
B.2.6
principle of measurement
scientific basis of a measurement
B.2.9
measurand
particular quantity subject to measurement
B.2.11
result of a measurement
value attributed to a measurand, obtained by measurement
B.2.14
accuracy of measurement
closeness of the agreement between the result of a measurement and a true value of the measurand
B.2.15
repeatability (of results of measurements)
closeness of the agreement between the results of successive measurements of the same measurand carried out under the same conditions of measurement
Please read through these and define how an average of measurements of different things applies to “a single measurand“.
Trying to use the equations and uncertainty calculations of a single measurand and apply it to a collection of single measurements of different measurands is simply misusing what you are reading.
Averages (means) of a distribution of single measurements of different things have nothing to do with determining the value of a single measurand. Statistical processing of a distribution of individual measurements should be shown with statistical descriptors and not any other mathematical calculations.
The first thing you need to do is decide whether station data are samples, or if they define a population. No one has ever stood up and done this.
It does affect what statistical processing is done once you’ve done that.
The second thing is to define how the variances of daily averages are propagated throughout the next stage of combining daily averages into monthly and annual averages. You can’t simply ignore the variances from succeeding distributions.
At that point you can begin to address how Significant Figure Rules are used at each stage to insure that the resolution of the individual readings is not incorrectly increased.
And, how to handle measurements that are seasonally correlated?
Then you need the partial correlation coefficient. Some of us actually do some of this stuff for a living…
Do you mean the co-variance?
Folks ==> I write very simple examples — and give the very simple rules. All apply correctly in the circumstances of the examples.
It is silly to say “But if we were counting pigs and piglets and ducks — and calculating combined protein needs — it would be different.”
Yes, so if we were not dealing with “absolute uncertainties” (as clearly and correctly defined in the essay) it would be different, which equals the trivial statement “but if things were different they’d be different.”
It’s nothing to do with dealing with “absolute uncertainties’. It’s whether the uncertainties are random and independent. Absolute in this case just means the uncertainties are not expressed relative to the measured value.
NO! Go reread with a small scrap of comprehension.
I think the problem is that you are using “very simple rules”. Both sources you quote for adding uncertainties are very basic, and avoid mentioning adding in quadrature.
Here’s a more advanced source which starts with just adding uncertainties before showing how this can be reduced when uncertainties are random and independent.
https://www.google.co.uk/url?sa=t&source=web&rct=j&url=http://web.mit.edu/fluids-modules/www/exper_techniques/2.Propagation_of_Uncertaint.pdf&ved=2ahUKEwip6_HB4O_7AhUQSMAKHR-SClEQFnoECAwQAQ&usg=AOvVaw0pOTN8mnexE-EVFj4lmoFM
The uncertainty expert is now looking down his nose at the unlearned Kip Hansen.
I am not an expert and I’ve yet to see any evidence that Kip Hansen is either. In any event I have no respect for argument from authority – if an expert is wrong they are wrong.
Kip,
“Yes, so if we were not dealing with “absolute uncertainties” (as clearly and correctly defined in the essay)”
Again, you are just reading something into that definition that isn’t there. But what you need to do is to demonstrate that your peculiar notion of uncertainty is actually useful for something. The “statistical” concept that you scorn is what is universally used in science. Your version is effectively what is called in the source of that definition, scale-limited.
“Scale-limited. A measuring instrument is said to be scale-limited if the experimental uncertainty in that instrument is smaller than the smallest division readable on its scale. Therefore the experimental uncertainty is taken to be half the smallest readable increment on the scale.”
This is a superficial measure; the answer is to get a better scale, or to estimate. But it isn’t the concept of uncertainty proposed by your source, which is:
“Uncertainty. Synonym: error. A measure of the inherent variability of repeated measurements of a quantity.”
Suppose you had a thermometer accurate to 0.1°C, but with a scale in degrees. Suppose the temperature was actually 13.1°C. Then repeated measures on your scale limited instrument will give you 13°C every time. The measure of inherent variability will be 0. That is not what you wanted. Nor is the scale estimate of ±0.5°C useful. It’s always 13.
Nick ==> If one knows the uncertainty in the recorded value of measurements is ALWAYS only to the nearest whole degree, then one has a clear “absolute uncertainty” of +/- 0.5°.
You have almost stated the case correctly, but offer no remedy. If our weatherman marks down 13° for every reading near to 13°, then we have to accept “13°” as a real value — unless we know that he has done so. If we know, then we can have a more accurate value of 13° +/- 0.5°.
We are not, with temperature, taking “repeated measurements of a quantity.” We are taking a series of measurements of different quantities measured at different times — a time series of measurements of a constantly varying quantity.
“then we have to accept “13°” as a real value”
Yes, you do. But if you accept 0.5 as a real uncertainty, you will be wrong. The temperature does not have that much variability.
I know there is a school of thought here that says uncertainty can never be overstated. But it can.
“We are not, with temperature, taking “repeated measurements of a quantity.” “
It is certainly possible to take repeated measurements at the same place and nearly the same time. But anyway the quantification of uncertainty (see GUM etc) is the range of values you would get if you did that. Usually you have not done that, so you have to estimate the range by other means. See all their type A, type B methods stuff.
Nick ==> We don’t know either the real temperature or the real uncertainty — but we can know the real uncertainty of the temperature records — each and every one of them — which have intentionally been recorded to the nearest whole degree.
and yes, of course, “It is certainly possible to take repeated measurements at the same place and nearly the same time.” But we don’t and we didn’t.
Again, you are still talking statistical uncertainty — which, if that’s what you want, go for it, man.
But if you want to know what we know about the temperature record and its mean value, then you must include at least the known absolute uncertainty of the original measurement record.
Using a statistical approach is just a method of trying to improve our guesses of the mean of a set of numbers — which is not a measurement at all. Just a concept. It doesn’t tell us what the temperature was….
“But we don’t and we didn’t.”
Since most daily Tmax and Tmin temps are taken at vastly different times of the day you better hope there is some variability otherwise it would be hard to distinguish day from night!
Not for temperature! You get one measurement and then the moment is gone forever.
The GUM “quantification of uncertainty” does not generate a “range of values”!
“does not generate a “range of values”!”
Really? Quoth Kip:
” the value (the true or correct value) lies between the values “2.5 + 0.5” and “2.5 – 0.5””
Sounds like a range to me. It’s true that the more scientific version characterises the range by probability markers.
You read this as some kind of statistical population—it is not.
And you ran away from the time-series problem, just like all your disciples.
It is a single value with an uncertainty interval. That is *NOT* a range of values. Only one value in that interval is the true value (at least hopefully). One value does not a range make!
That is why I refuse to accept uncertainty as a rectangular distribution. There is only ONE true value and it has a probability of 1. All the other values have a zero probability of being the true value. The problem is that you don’t know and cannot know which value has a probability of 1.
“The temperature does not have that much variability.”
Temperature is a time function. You can certainly have +/- 0.5 variability in temperatures taken at different times.
You are still stuck in the box that all temperature measurements measure the same thing and form a Gaussian distribution!
Kip Hansen December 10, 2022 3:57 pm
Suppose we want to know the height of the average American. We select 100 of them and measure their height. We get an answer, along with a standard error of the mean.
Then we select another 100 of them, and get of course a different answer, along with a new SEM.
Your claim is:
w.r.t. temperature … and because of that, you say the law of large numbers doesn’t apply.
… but the same is true with my example. We’re not repeatedly measuring the same people. We’re measuring different people at a new and different time, some people have grown, some have died … and despite that, the law of large numbers applies.
w.
w. ==> One can always create a non-physical population — “people” and then run stats against it. Quite popular in some fields. Useful if one wants to set a standard for the length of a new line of mattresses –you’d need your standard deviation and maybe 2 SD — but non-physical in a medical study as the population is not real in a sense. It is unreal in that it is entirely a very small random sample of a very large data set (heights of all the people in the USA). Do we measure infants? children? only adults? skip midgets and dwarves? Asians? Old white men?
It is almost as silly as “average SAT in Texas.”
There is no true value to get closer to…only your statistically derived central value, which may or may not be useful.
Thanks, Kip. I have no idea what you mean by a “non-physical” population. Your selection of a given population (e.g. kids between 6 and 12 with cystic fibrosis) may not be appropriate for say setting mattress lengths, but AFAIK it is indeed a “physical population”.
And why is the average SAT in Texas “silly”? We use stats like that all the time, say “Average murders per month in Chicago vs Abilene”, and they have a real physical meaning.
Finally, there absolutely IS a true value of the average height of people in the US, or any subset thereof—it’s just very hard to measure.
w.
What does this average height tell you?
You better not order T-shirts for the 100 people based on the average height you calculate!
“I have no idea what you mean by a “non-physical” population”
If I give you 10 boards whose lengths are 2′, 4′, 8′, 10′, 20′, 2′, 4′, 8′, 10′, 20′ what is the average?
Does a board of that length exist in any of the population you are analyzing?
While the individual boards exist the average board does not. So what does the average mean? Can I pick any 2 of the boards, nail them together, and have a board that is 29′ long?
Since the uncertainty of the length of the board can be different for each board what is the total uncertainty for the length you happen to get by nailing the two together? Will it be 2 x u_avg?
There are certain things that the standard statistical descriptions of mean and standard deviation can tell you and there are things that they can’t tell you. Temperature is not even a Gaussian curve, it is sinusoidal during the day and an exponential decay at night. You can’t find the “average” by adding Tmax and Tmin and dividing by 2. That won’t give you the average value of a sine wave or an exponential decay so you already have an in-built uncertainty of what the so-called “average” actually means.
I personally am still an advocate of using the integral method of calculating cooling and heating degree-days if you want to actually see what the climate is doing.
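A toy numerical sketch of that point, with entirely made-up parameters: a daily curve that rises roughly sinusoidally and then decays exponentially has a true time-average noticeably different from (Tmax+Tmin)/2.

import math

def temp(hour):
    # Hypothetical daily profile in deg C: sinusoidal warming from 06:00,
    # peaking at 15:00, then exponential decay back toward the minimum.
    t_min, t_max, tau = 10.0, 25.0, 5.0
    if 6 <= hour < 15:
        return t_min + (t_max - t_min) * math.sin(math.pi * (hour - 6) / 18)
    h = (hour - 15) % 24
    return t_min + (t_max - t_min) * math.exp(-h / tau)

hours = [h / 10 for h in range(240)]          # sample every 0.1 hour
temps = [temp(h) for h in hours]

true_mean = sum(temps) / len(temps)           # numerical time average
midrange = (max(temps) + min(temps)) / 2      # the usual (Tmax + Tmin) / 2

print(round(true_mean, 2), round(midrange, 2))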
You nailed it. The example does not find a true value of height.
If these were temperatures would the average be the true value for temperature?
In addition, are all these people going to be measured using the same device? If so, how does the statistical analysis provide for identifying any systematic error? *ALL* of the measurements may be off by 1″ even if the same instrument is used. So the average will have an uncertainty not captured by the average or SEM. If you use different devices with different systematic bias, resolution, etc. then the average may still not be the true value and neither the average nor the SEM will give you the uncertainty of the average.
As a rule of thumb, accuracy should be better than half the resolution. If the graduation is 1 degree, the resolution is +/- 0.5 degrees and the accuracy should be 0.25 degrees or better.
As an example, Mitutoyo’s 0.001″ micrometers and their 0.0001″ micrometers have the same screws, but the 0.0001″ version has a vernier scale marked on the thimble. The 0.001″ instrument has a resolution of 0.0005″, but an accuracy of 0.0001″
Oh, cool. Looking up specs on eBay, I found a set of bore micrometers 🙂
“As a rule of thumb, accuracy should be better than half the resolution.”
I can’t see the point of such a rule. Why wouldn’t you want the best possible accuracy and the best possible resolution?
Take a thermometer. To be accurate, you want the height of the mercury to be proportional to temperature, with calibration. To measure the height, you have a scale with resolution. Why wouldn’t you want the best resolution your eye, even with optical help, can manage? It may be that you have enough resolution to detect failings in accuracy. OK, deal with it. You don’t have to not know about it through poor resolution.
Accuracy needs to be at least as good as the resolution or you’re playing with yourself.
There’s no point in having a thermometer marked in tenths of a degree which is accurate to half a degree.
Also, if the resolution is finer-grained than the accuracy, repeatability suffers.
You are absolutely right. Liquid in glass thermometers have hysteresis, i.e. they can read differently when the temp is falling than when it is rising. That’s accuracy. How do you calibrate the scale to allow for that? Put two scales on the thermometer? It’s still a case where the accuracy is not as good as the resolution and you are only kidding yourself that the resolution is a measure of the accuracy.
Quoting Kip:
This is your raison d’etre in a nutshell, and it is wrong.
That is *only* if you can assume partial cancellation of the uncertainties, i.e. some pluses cancel some minuses.
If you have two temperatures, 45F and 75F taken by different instruments, each with an uncertainty of +/- 0.5C, why would you think that the uncertainties would cancel even partially?
Also note that a root-sum-square value of uncertainty *grows* as you add components.
Because they are random variables. I’m sure Taylor explained it. You understood this at the start when you were saying if you add 100 temperature measurements each with an uncertainty of +/- 0.5°C, the uncertainty of the sum would be +/- 5.0°C, that is the individual uncertainties times √100.
Man, you are one expert BS arteest.
Do you remember how in our last discussion I said you felt the need to respond to my every comment with an attention seeking one line insult – and you said I needed to provide the receipts to show that?
Poor bellcurveman, he doesn’t get the respect he demands.
And you forget the little word in bold: every.
My mistake. I should have said you add a one line attention seeking insult to just 99% of my comments.
Then stop posting nonphysical nonsense.
uncertainty^2 = u1^2 + u2^2 + … + u100^2 = (u^2)(100)
uncertainty = u * √100 = 0.5 * 10 = 5
So what is your point?
By uncertainty here you mean the uncertainty of the sum.
The point is you are adding the uncertainties in quadrature which is valid if they are random. You are not doing what Kip says which would be
uncertainty = u1 + u2 + … + u100 = 0.5 * 100 = 50.
The only reason you can get 5 and not 50 is because you are assuming the uncertainties cancel.
But this contradicts your argument that
“If you have two temperatures, 45F and 75F taken by different instruments, each with an uncertainty of +/- 0.5C, why would you think that the uncertainties would cancel even partially?“
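For reference, a minimal Python sketch of the two rules being argued over, under the explicit assumption that each ±0.5 behaves as an independent, normally distributed standard uncertainty (the assumption in dispute):

import math, random

u, n = 0.5, 100

linear_sum = n * u                  # 0.5 * 100 = 50
quadrature = math.sqrt(n * u**2)    # 0.5 * sqrt(100) = 5

# Monte Carlo check of the quadrature figure, assuming each error is an
# independent draw from a normal distribution with standard deviation 0.5.
random.seed(1)
trials = 20000
sums = [sum(random.gauss(0, u) for _ in range(n)) for _ in range(trials)]
m = sum(sums) / trials
spread = math.sqrt(sum((s - m) ** 2 for s in sums) / trials)

print(linear_sum, quadrature, round(spread, 2))   # 50, 5.0, ~5.0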
“By uncertainty here you mean the uncertainty of the sum.”
He’s terminally, apparently permanently blocked on this.
Take a good long look in the mirror, blob.
Of course they cancel partially !
The clue is in the sign “+/-“.
The value 45 could take any value from 44.5 to 45.5 with equal probability.
Likewise the value 75 could be any value from 74.5 to 75.5 with equal probability.
Go reread the main post.
I have and I couldn’t believe my eyes.
Actually no. Only ONE value in the interval can be the true value, it has a probability of 1 of being the true value. All the rest have a probability of 0. You just don’t know and cannot know which value has the probability of 1 of being the true value.
So where does your cancellation of uncertainty come from? The range of possible values goes from 75.5 – 44.5 = 31 to 74.5 – 45.5 = 29. The base range is 75 – 45 = 30. If the ranges are different then how can there be cancellation? Your actual uncertainty went from +/- 0.5 for each to +/- 1 (a total spread of 2) for the combination. So where is the cancellation?
Because they are probability densities.
The central expectation is indeed 30, but the uncertainty on 30 is not 2.
It is 0.707, assuming the uncertainties are approximately normally distributed.
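A small sketch of where the 0.707 figure comes from, taking each ±0.5 as the standard uncertainty of a roughly normal error (as the comment above assumes) and checking the quadrature result with a quick Monte Carlo:

import math, random

u1 = u2 = 0.5
u_diff = math.sqrt(u1**2 + u2**2)    # sqrt(0.25 + 0.25) ~ 0.707

random.seed(2)
trials = 50000
diffs = [(75 + random.gauss(0, u1)) - (45 + random.gauss(0, u2)) for _ in range(trials)]
mean_diff = sum(diffs) / trials
sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / trials)

print(round(u_diff, 3), round(mean_diff, 2), round(sd_diff, 3))   # ~0.707, ~30.0, ~0.707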
I think there is some misunderstanding here about the terms error and uncertainty (and average and precision).
For example, if a measurement is repeated then the precision will improve by averaging multiple measurements if the variation is random. This will tend to improve as the square root of the number of measures – think of it as a reduction of noise term. Uncertainty is then an interval around the measurement in which repeated measurements are expected and is a measure of precision. Note that repeating measures does reduce the uncertainty (the precision increases). In general the terms uncertainty and precision are interchangeable.
However, error is the degree to which a measurement agrees with the true value (error and accuracy are interchangeable here). No amount of repeat measurement will change this. Accuracy is related to bias in measurements.
There are good pictures on the internet relating accuracy and precision to targets. Here’s one showing from left to right inaccurate and imprecise; accurate and imprecise; inaccurate and precise; accurate and precise.
Error is the distance from a measurement to the true value.
An uncertainty interval is the range of values within which the true value is expected to lie.
The problem is that true values are unknowable, thus error is also unknowable.
Modern uncertainty analysis encompasses both precision and bias in a single interval.
In fact in statistics the uncertainty is the range or interval within which a further measurement is expected to lie. It is not a measure of error but of precision.
I recommend reading the terminology definitions in the GUM.
In fact, the GUM does give a definition very similar to that of ThinkingScientist:
2.2.3 The formal definition of the term “uncertainty of measurement” developed for use in this Guide and in the VIM [6] (VIM:1993, definition 3.9) is as follows:
uncertainty (of measurement)
parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand
“dispersion” does not imply any kind of statistical distribution, it is only a range.
No, the GUM said it is a parameter that characterises the dispersion. Usually a width property, most often sd.
Where does it say the dispersion in measurements of different things is always a standard deviation?
Equation H.36 more accurately describes the issue.
The difference between two measurements is
u^2(Δc) = [(s_av)^2(z_bar_s)]/m + [(s_av)^2(z_bar)]/n
When you have one measurement of an object you only have one data value so there isn’t a standard deviation “s”. Both “m” and “n” equal 1 in this case. You can replace the standard deviation with the uncertainty, since the squared uncertainty plays the same role as the variance of a random variable (the square of the standard deviation).
Thus you get Var1/1 + Var2/1 as the uncertainty of the difference between two measurements.
The uncertainties of the individual measurements add.
This is only true if you are measuring the same thing. If your first measurement is of a 2′ board off the rack and the second is of a 10′ board from a different lumber yard then how does the 2′ length predict the 10′ length? Would you *really* expect the first measurement to predict the second one? It’s the same for temperature. Is the Tmin value a *real* predictor of what Tmax will be? With what uncertainty? In November we had daily temps of 35/60 and 45/80 on different days. Vastly different ranges and temps. Not very good for predicting the next Tmax measurement.
Thinking ==> More pragmatism, please. A measurement taken with a rule marked only in whole inches produces measurements within a knowable uncertainty. It cannot be used to measure to the 10th of an inch. Likewise, temperature records, for instance, which were intentionally rounded to the nearest whole degree have a known — a certain and necessary — uncertainty.
That is the situation with temperature records and with Tide Gauge records, which have their own spec’d uncertainty. Neither the temperature record nor the tide gauge record is a measurement of “the same thing more than once” — each measurement is unique and must needs fall within the spec’d known absolute uncertainty.
Imagine we have an object which is known to be 6.5000 inches long with a very small uncertainty.
Get 1000 people to measure the length with a ruler and report the result to the nearest inch. That is important, they must report to the nearest inch, not try to estimate with any greater precision.
All other things being equal, you will get something like half reporting 6 inches and half reporting 7 inches. It’s unlikely to be exactly half and half though.
You might get 512 reporting 6 inches and 488 reporting 7 inches.
Their average would be [(512 x 6) + (488 x 7)]/1000 = 6.488 inches.
Very close to the known true value of 6.5000.
If you increase your number of measurements, statistically the precision is increased. This is an accepted fact.
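A quick Python version of that thought experiment, assuming (hypothetically) that each person's raw perception carries a small random error before they round to the nearest inch:

import random

random.seed(3)
true_length = 6.5
n_people = 1000

# Hypothetical: each person's raw perception is the true length plus a small
# random error, and they then report to the nearest whole inch.
reports = [round(true_length + random.gauss(0, 0.05)) for _ in range(n_people)]

print(reports.count(6), reports.count(7), sum(reports) / n_people)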
KB ==> I appreciate your dedication to the principle which I think you use (the law of large numbers).
Unfortunately, we are taking multiple measurements in a time series, in which the value of the thing measured is constantly changing. In the case of temperature, we are using a “ruler” that reports only whole number values (whole degrees, or values rounded to whole degrees).
There is no TRUE VALUE to be found….we don’t know the actual temperature that was recorded — we only have the “to the nearest whole degree” record. No amount of statistical calculations will return the real original value — it is unknowable. It is hubris to think that we can somehow recover the actual temperature behind that now-lost measurement.
We can only know that the weatherman thought it was closest to 13°C (or that a computer program rounded it to 13°C) — and that means the value (most probably) was somewhere between 13.5 and 12.5°C.
Bingo! Absolutely correct!
I don’t agree.
I am absolutely in agreement that it would be better if all temperatures were reported without rounding. But I don’t accept that there is a fundamental limit to the uncertainty of 0.5 degrees when using the rounded readings.
Consider that we have a series of temperature readings which we want to plot.
They are (plucking numbers out of the air) 19.4, 20.7, 21.6, 21.8, 22.6 degrees, recorded to one decimal place.
Rounding to whole degrees, these become:
19, 21, 22, 22, 23.
Plot both sets of numbers.
By rounding, all you have done is increase the scatter of the points. Your regression line intrinsically has the same slope and intercept, but the estimates of these parameters become more uncertain because of the rounding.
However those uncertainties are not limited fundamentally to 0.5 degrees. With sufficient number of points, the rounded data set can give slope and intercept with lower uncertainty than that. It’s simply a question of having a large enough number of points.
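KB's numbers can be checked directly. A tiny least-squares sketch (plain Python, invented x values) fits both the one-decimal readings and the rounded readings; with only five points the rounded slope is visibly noisier, which is consistent with the caveat above about needing a sufficiently large number of points:

def fit_line(xs, ys):
    # Ordinary least-squares slope and intercept.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

xs = [1, 2, 3, 4, 5]                         # invented time points
ys = [19.4, 20.7, 21.6, 21.8, 22.6]          # readings to one decimal place
ys_rounded = [round(y) for y in ys]          # 19, 21, 22, 22, 23

print(fit_line(xs, ys))          # slope and intercept from the precise readings
print(fit_line(xs, ys_rounded))  # slope and intercept from the rounded readings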
KB ==> The uncertainty of added and divided values with KNOWN ABSOLUTE UNCERTAINTIES is not limited to the original measurement error (the original measurement absolute uncertainty). It could be and is usually HIGHER. What it CANNOT BE is LOWER.
It absolutely can be lower. The law of propagation of uncertainty says that when the partial derivative of a measurement upon a higher level measurement model is < 1/sqrt(N) where N is the number of inputs into the model then the uncertainty is lower. Averages have that ability. Sums do not. In fact sums do the opposite and make the uncertainty higher.
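A Monte Carlo sketch of what the 1/√N behaviour amounts to, under the assumption that each measurement error is an independent random draw (here uniform within ±0.5):

import math, random

random.seed(4)
true_value = 20.0
half_width = 0.5

def spread_of_mean(n, trials=20000):
    # Standard deviation of the mean of n readings, over many trials,
    # when each reading is the true value plus an independent uniform error.
    means = []
    for _ in range(trials):
        readings = [true_value + random.uniform(-half_width, half_width) for _ in range(n)]
        means.append(sum(readings) / n)
    m = sum(means) / trials
    return math.sqrt(sum((x - m) ** 2 for x in means) / trials)

for n in (1, 10, 100):
    print(n, round(spread_of_mean(n), 4))   # shrinks roughly like 1 / sqrt(n)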
The usual nonphysical nonsense: as N grows, uncertainty goes to zero.
Idiocy.
It’s a result of always assuming that measurement uncertainties always cancel. To the climate crowd all data sets are Gaussian.
I’ve just given you an example of how it can be lower.
Multiple persons have posted reasons why it can be lower.
We can calculate pi to a million places, but how do we know if it is correct?
There is a procedure after estimation of uncertainty that is named validation. For example, in the chemical analysis of Moon rocks and soils for major elements, we can use methods like neutron activation, mass spectrometry, atomic absorption, optical emission spectrometry and classical wet chemistry. The methods that tend to give similar results are often quoted as validating each other, but there is abundant interesting and relevant literature too large to summarise here.
A critical episode was written up by George H Morrison for Apollo 11 soils. The best labs in the world were chosen to participate. After the event, there were red faces when often the lab results were outside prior uncertainty estimates. Unfortunate, but seldom mentioned any more as later generations have failed to learn, so the bad practices continue.
Geoff S
+100!
Using that approach, if the true length was 6.3750″, you would end up with an arithmetic mean of 6.000″
I agree. KB had to use a “true” value of 6.5″ to make his point work in the face of this artificial restriction on measurement reporting. This is why we always want measurements reported to the limit of the measuring process. If you did that, his point would be valid for both your example and his.
Not really. You can only go to half the measurement resolution, so the arithmetic mean for my 6 3/8″ is still going to be 6″ because everybody would round down.
Midpoints are usually going to come out somewhere about right if rounding.
Where it gets interesting is if you tell the measurers that they can guesstimate to a half inch. People can halve intervals quite well, so both examples would come out at 6 1/2″ (not 6.5″, which gives spurious precision).
You could probably get fairly good results telling them they could go to the 1/4″ estimate level. The mean of the 6 3/8″ example would probably be biased high. Interpolation beyond halves is more difficult, so more people would take it to the 1/2 interval rather than the 1/4.
But it was meant to be an analogy for rounding temperatures to the nearest whole degree.
True values are unknowable.
You might understand this if you understood that error is not uncertainty.
“True values are unknowable.”
Your argument to dispense with any data evaluation, ever, would have more weight, generally applied. Just restricting it to evaluations that don’t back up your prejudgments, not so much.
So you are claiming that you can determine true values of temperature and thus the absolute value of the error?
You truly are delusional, explains a lot.
You could do a similar experiment though.
You have a ruler graduated only in whole inches, and each person uses the same ruler.
Get 1000 people to measure the 6.3750″ object with this ruler. They are to report a value for the exact length, so they must estimate the 0.3750″ part by eye.
The mean of 1000 people will be closer to the true value than individual estimates.
3/8 is an interesting one, which is part of the reason for picking it. I suspect most people would report it as 1/3 if they’re allowed to make an eyeball estimate, rather than halving the invisible intervals. Experienced carpenters, mechanics and machinists will be used to halving intervals and have techniques to do so, and will probably get it right.
Maybe I should have gone with 1/3, which would probably work out the other way around.
This is measuring the same thing and generating a random, Gaussian distribution of measurements.
As Kip points out this is an unwarranted assumption for temperature measurements where each measurement is of a different thing.
For daily figures yes, but not if you use the rounded daily figures to calculate a monthly mean. Then you can estimate the monthly mean and SD very close to what you would get with full precision. Try it with random numbers in a spreadsheet.
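The "try it with random numbers in a spreadsheet" suggestion, done in a few lines of Python instead; the daily values are invented and the rounding errors are assumed to behave randomly:

import math, random

random.seed(5)
daily = [15 + random.gauss(0, 3) for _ in range(30)]   # hypothetical daily values
rounded = [round(t) for t in daily]                    # recorded to whole degrees

def mean_sd(values):
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return round(m, 3), round(sd, 3)

print(mean_sd(daily))     # monthly mean and SD at full precision
print(mean_sd(rounded))   # the same statistics from the rounded values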
Thinking ==> Of course one can calculate anything — all you need is some numbers and a formula. That does not mean that your result will be applicable in the real world….it will not tell you the temperature, it will be separate, apart from, the reality we hoped to discover something about.
You yourself provide the truth of this: it can be done with random numbers that don’t even pretend to represent anything.
Kip, I suspect in all this there is discussion at cross-purposes.
Error is how close the measurement is to the true value. Precision is the repeatability of the measurement.
For temperature you can still track relative changes over time with the precision of, say, the monthly average of daily values. The error (ie deviation from the true value) is then moot as the values are relative.
That comment assumes that errors do not vary with time – of course they may do, for example through paint deterioration on a Stevenson screen. That is a separate question of non-stationary drift.
Thinking ==> This essay is about the simplest form of uncertainty – that resulting from the measurement instrument or process itself — in the examples, a numeric value for that uncertainty that is known precisely. We are not talking “error” in your sense here. We are not talking repeatability — one unique measurement of one unique moment.
Can we track an ever changing temperature — as trends I suppose you mean — yes, if you choose. The uncertainty of the recorded values affects the precision of those trends though — if we’ve only recorded temperatures to +/- 0.5°, the trends look like that of GISTEMPv4 in the essay.
If the daily values are inaccurate then their average will be also. Look at the picture of precision and accuracy you posted. It doesn’t matter how much you average inaccurate but precise measurements. That average will still be inaccurate.
I like the attached picture better than yours. A is inaccurate and is not precise. Averaging the distances won’t give you a more accurate value. B is accurate and precise. C is precise but not accurate. No matter what the average of C is it will never be accurate. You may be able to calculate the average of the points in C with a small standard deviation of the sample means or just a small standard deviation of the population (inaccurately known as the uncertainty of the average) but that average will *never* be accurate.
If daily values of temperature are inaccurate then no amount of averaging will fix it and neither will the precision of measurement. That inaccuracy will carry on to the monthly average of the daily mid-range temperatures. Think of four different people shooting at target C. Each set of hits go very precisely along the bottom edge of the target. No amount of averaging those four sets of data (like four days of daily temps) will give you an accurate result.
“For example, if a measurement is repeated then the precision will improve by averaging multiple measurements if the variation is random.”
I don’t agree. The precision is very much related to the resolution of the measuring instrument. You can’t get more precision than the measuring instrument allows. You may get closer to the “true value” if the variation is random but you can’t know more about the true value than the precision the measuring device can tell you.
It’s a matter of significant digits. If your measurement only provides for two significant digits then your average shouldn’t have more than two significant digits, otherwise you are implying you know more about the average than you can possibly know.
This is only the case for data exhibiting stationarity, that is the mean and SD do not have a trend over time. For a time-series with a trend, the change in SD will be a function of the slope of the regression line.
Not sure that is true – the SD is a centred statistic, subtraction of the local mean along the time series effectively makes the variance stationary.
However, usually you should consider stationarity to refer to the underlying behaviour of the phenomena being measured. So it is entirely possible to have a random function with stationary mean and non-stationary variance, for example.
I think you misunderstand. If you calculate the mean and SD for a single year of a time-series that has a long-term increasing trend, the results will be smaller than the mean and SD of that year compared with, or even combined with, a subsequent year. On the other hand, calculating the mean of the first of several measurements of the diameter of a ball bearing, any changes in the mean and SD of subsequent measurements should oscillate around the initial mean, i.e. stationary data.
There is an old joke that the NTSC analog TV standard stood for, “Never Twice the Same Color.” Non-stationary temperature data could be said to be “Never Twice the Same Celsius.”
On a long term slow trend with a clear quasi-random path component like global temps the variance of annual values will be trivially affected by the trend.
Inflation of variance due to the presence of a non-stationary trend is well known. But that is not what is being computed in this case. It’s irrelevant. For example, the variance of monthly temps on a year-by-year basis is unlikely to show changing variance as a function of time.
The real issue is that if trendology were to show realistic “error bars” on their trend graphs, like the hockey sticks, all of the alarmism would disappear in a puff of greasy green smoke. The first time I reposted the UAH data with ±1.5°C uncertainty limits, bellman whined for a whole month.
And what does the GAT tell you about climate, really? Very little.
“This is only the case for data exhibiting stationarity, that is the mean and SD do not have a trend over time.”
You’re right about how that improvement proceeds. The “tendency” does indeed require identical error bars. And adding data with higher error bands might very well add to the standard error of the resulting trend. But I would include that additional data, whether it comes from the left, right, or center of the time series, as it will always increase the overall quality of the result.
“For a time-series with a trend, the change in SD will be a function of the slope of the regression line.”
Your “SD” is of the trend, right? If so, wrong. Per my earlier request: your claim, your requirement to provide the data that shows it. But here’s my rebuttal data. I just used the wft data that comes up in interactive.
https://www.woodfortrees.org/plot/
Now, here’s (1) the data, (2) the detrended data, and (3) the trend statistics for both. Leaving out the obvious sig fig violations, the standard error of both trends is 0.000505254 degC/year.
[Data listing: the wft series referenced above, monthly values from 1979 through 2021, in three columns (date, anomaly, detrended anomaly), followed by LINEST trend statistics for both series. The fitted trend of the data is 0.0177 degC/year with a standard error of 0.000505254; the detrended series has a trend of effectively zero with the same standard error, 0.000505254.]
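The point being illustrated can also be shown with a short sketch on synthetic data (not the series above): subtracting the fitted trend from a series leaves the standard error of the fitted slope unchanged, because the residuals are identical.

import math, random

random.seed(6)
n = 500
xs = [i / 12 for i in range(n)]                               # invented monthly time axis
ys = [0.018 * x - 0.3 + random.gauss(0, 0.14) for x in xs]    # invented anomalies with a trend

def ols(xs, ys):
    # Slope, intercept and standard error of the slope for a straight-line fit.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    se_slope = math.sqrt(sum(r * r for r in resid) / (n - 2) / sxx)
    return slope, intercept, se_slope

slope, intercept, se = ols(xs, ys)
detrended = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
slope_d, _, se_d = ols(xs, detrended)

print(round(slope, 5), round(se, 6))      # fitted trend and its standard error
print(round(slope_d, 5), round(se_d, 6))  # detrended: slope ~0, identical standard error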
What do you think you did here? It sure looks like nothing more than calculating the difference between two trends assuming all data points for both trends are 100% accurate. If each of the data points have an uncertainty then how do we know the trends themselves are actual trends?
This is nothing more than the typical: averaging cancels all uncertainty!
“What do you think you did here? It sure looks like nothing more than calculating the difference between two trends assuming all data points for both trends are 100% accurate.”
I rebutted an incorrect claim by Clyde Spencer. The data is as presented. If the expected values were provided to us by the Imaginary Guy In The Sky, and presented with or without our man made error bands, the standard error of its resulting trend would not be changed by detrending (actually changetrending) it.
blob is yet another trendologist who thinks that uncertainty limits are “error bands”.
“The mean is certain.”
The mean is usually the mean of a sample being used to estimate the mean of the whole population. As such the mean is certainly not certain.
Really? The equation in the article says exactly the opposite.
Bell and others ==> Can you do arithmetic? I give the mathematical rules and then an example following the rules…..where does the uncertainty arise in the arithmetic?
Addition and division of known numbers do not produce uncertainty.
As I said when the mean of a sample is being used to estimate the mean of the population, then it is an uncertain estimate of the population.
The mean of a sample might be an exact value of the mean of that sample (ignoring uncertainties in the measurements), but the sample is just a random selection from the population. The uncertainty arises from the randomness of the sample. Select a different sample and you get a different mean.
Bell ==> I am aware that many here consider that you are trolling — but you bring up a good point, which is that if one is calculating the mean of a known data set, using all the individual values, and all have a known absolute uncertainty, then the solution is simple arithmetic.
There is no sampling (we have and use the whole data set, all the data), there is no need for a statistical approach — there is only arithmetic.
and Bell ==> This is why Schmidt is correct in stating +/- 0.5°C (he adds a very tiny bit, see his quote in the essay).
I think Bellman has a point on this one. The uncertainty isn’t in the arithmetic, it is in extrapolating the results out to represent things that were never measured. This is acknowledged by all when we claim, “the sample isn’t big enough” or “climate needs at least 30 years of data.” And averaging data doesn’t always give a useful or appropriate answer. I’m still trying to figure out what the average coin toss is.
Hoyt ==> (The first name of my favorite folk singer in the 1960’s)
In the case of a temperature (or tide gauge) time series, we are using the whole data set to produce an arithmetic mean. There are no “things that were never measured” in these particular examples.
“averaging data doesn’t always give a useful or appropriate answer.” Absolutely true. See mine on averages.
Go back and look at the graphic Kip used…
Kip Hansen said: “where does the uncertainty arise in the arithmetic?”
It is a good question. Part of the answer may depend on semantics, but I’m only going to focus on the mathematics here. The answer requires an understanding of the law of propagation of uncertainty. The law of propagation of uncertainty is given in equation E.3 of JCGM 100:2008, but I want to focus on the more specific uncorrelated case in equation 10 for simplicity. It is given as:
u_c^2(y) = Σ[(∂f/∂x_i)^2 * u(x_i)^2, i = 1, N]
The important term in this equation is ∂f/∂x_i which is the partial derivative of the function f wrt to the input quantity x_i. For example, if f(x_1, x_2) = x_1 + x_2 then ∂f/∂x_1 = 1 and ∂f/x_2 = 1 because when you change either term by 1 unit the output of the function f changes by 1 unit.
This partial derivative is crucial in answering the question “where does the uncertainty arise in the arithmetic”. When ∂f/∂x >= 1/sqrt(N) then uncertainty increases as a result of the arithmetic. When ∂f/∂x < 1/sqrt(N) then uncertainty decreases as a result of the arithmetic.
The meaning is very deep and requires a lot of understanding in multivariate calculus. But that’s your answer…the uncertainty arises in the arithmetic due to the partial derivative term in the law of propagation of uncertainty.
You can actually prove this out for yourself using the NIST uncertainty machine. Create two input quantities and set the output to 1.0 * x0 + 1.0 * x1. Notice that the arithmetic causes an increase of the uncertainty in this case. Now set the output expression to 0.6 * x0 + 0.6 * x1 and notice that the arithmetic causes a decrease of the uncertainty. The reason…the partial derivative changes from 1 to 0.6. Notice that 0.6 < 1/sqrt(2).
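That description fits in a few lines of Python. This is only a sketch of the uncorrelated propagation formula quoted above, with hand-written partial derivatives, not the NIST tool itself:

import math

def propagate(partials, uncertainties):
    # Combined standard uncertainty for uncorrelated inputs:
    # u_c = sqrt( sum( (df/dx_i)^2 * u(x_i)^2 ) )
    return math.sqrt(sum((p * u) ** 2 for p, u in zip(partials, uncertainties)))

u_inputs = [0.5, 0.5]

print(propagate([1.0, 1.0], u_inputs))   # f = x0 + x1: ~0.707, larger than 0.5
print(propagate([0.6, 0.6], u_inputs))   # f = 0.6*x0 + 0.6*x1: ~0.424, smaller than 0.5
print(propagate([0.5, 0.5], u_inputs))   # f = (x0 + x1)/2, an average: ~0.354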
Oh please, not the NIST machine again. A classic case of garbage-in-garbage-out.
bdgwx ==> “Where does the uncertainty arise in the arithmetic?” There is no uncertainty in the arithmetic.
You have retreated into the world of probability and statistics. Arithmetic involves neither of those.
It is not semantics — it is addition and division.
Even simple measurement models using only addition, subtraction, multiplication, and division require complex calculus to propagate the uncertainty. And when those simple arithmetic operations result in partial derivatives >1/sqrt(N) then those operations cause additional uncertainty. It may be an annoying fact, but it’s a fact nonetheless.
No! Averaging temperature does not make uncertainty vanishingly small!
Why do you continue to push this nonphysical nonsense?
Because it lets them tell the difference in hundreths between two different years.
They simply cannot admit that the standard deviation of the sample means is not the accuracy of the mean. If all their shots on target wound up in the 4-ring with a spread of 1″ they would say that the average of their shots is very accurate since the spread is only 1″. Pete forbid that they missed the 10 ring (highly accurate) by a huge margin for all shots.
I’d like a posting on the “Compounding of Errors” (I think that’s what it’s called) at some point, if that’s possible. It’s quite relevant in a digital world, with Analogue-Digital and Digital-Analogue conversions.
It was explained to me in the following way many years ago, before radar speed guns and Gatso camera.
You’re driving your car with a speedometer rated at +/-10% in a car with brand new tyres, and the measurement is done on the speed of rotation of the wheels.
Your speedometer reads 30mph but is reading at -10%, and the new tyres add a further amount to the actual speed of your car. You are followed by a traffic cop whose speedometer reads +10% of actual and whose tyres need replacing, further adding to the error. Not to mention parallax when reading an offset speedometer.
You are stopped and given a ticket for exceeding the 30mph speed limit.
In those circumstances, what should your indicated speed be to ensure you don’t get a ticket? I still have difficulty working it out, but you think you’re doing 30mph, you could actually be doing ~35mph, and the cop thinks it’s 39-40mph.
Thanks for an interesting read.
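A rough worked sketch of the worst-case stack-up, using only the stated ±10% ratings and ignoring the unquantified tyre effects:

indicated = 30.0   # what your speedometer shows, mph
rating = 0.10      # the +/-10% instrument rating

# Worst case for you: your speedometer reads 10% low, so your actual speed
# could be as high as indicated / (1 - rating).
worst_actual = indicated / (1 - rating)       # ~33.3 mph

# Worst case on top of that: the cop's speedometer reads 10% high.
cop_reading = worst_actual * (1 + rating)     # ~36.7 mph

# To be sure the actual speed never exceeds 30 mph, the indicated speed
# would need to satisfy indicated / (1 - rating) <= 30.
safe_indicated = 30.0 * (1 - rating)          # 27 mph

print(round(worst_actual, 1), round(cop_reading, 1), round(safe_indicated, 1))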
Ben ==> Hopefully, this will be a SERIES, I have two additional essays in the works on this general topic.
The stats kids won’t like them — generally because so many stats kids have been trained to think only statistically. Some above are “arithmetic deniers”.
The stats kids are fine with that, but the metrologists might find a few nits to pick…
Before you write anything else, please read up on how uncertainties are evaluated.
I recommend this document.
https://www.ukas.com/wp-content/uploads/schedule_uploads/759162/M3003-The-Expression-of-Uncertainty-and-Confidence-in-Measurement.pdf
KB ==> I am well aware of the statistical approaches to uncertainties. In the examples given, statistical approaches are not only not needed, they are inappropriate.
“The uncertainty of the mean would not and could not be mathematically less than the uncertainty of the measurements of which it is comprised.”
If this is true then geodetic surveyors for 200 years were wasting their time reading angles 24 times, mistakenly believing this increased precision. They could have just taken the first observation and forgotten about the mean.
They must have got their wrong idea from Gauss, and from years of field experience.
Later, trilateration – distance measurements – replaced triangulation, then GPS (actually GNSS) replaced trilateration. However they kept using the mean. A GNSS observation is the mean of hundreds of observations (depending on the time-length of occupation of the station).
Engineers must also have been wrong in using the mean as the best estimate of the true value. No wonder structures fail.
Mavis ==> They took many measurements because they could not define the original measurement uncertainty.
If they knew that their instruments could do no better than +/- some known error, and they had only one unique chance to make each measurement, they would have to use the arithmetic above and report x° +/- the known error.
There are lots of scientific measurements for which there is no possibility of repeating the measurement enough times to give us a good idea of a mean value.
Surveying acreage with a surveyors chain that, unbeknownst to the surveyor, has had a few links removed can never produce an accurate measurement no matter how many times the same distance is measured.
Survey chains, the best measuring tool available in times past, had a known absolute measurement error: “for all lengths of chain a tolerance is given: 5m chain = + or – 3mm, 10m chain = + or – 3mm, 20m chain = + or – 5mm, 30m chain = + or – 8mm”. Pretty accurate, but with known uncertainty.
Nobody has argued that using an instrument with a 2 link – 0.4 metre gross error will give accurate results. That manages to be a straw man and a red herring. Instruments do have small systematic errors, but there are techniques to eliminate those too.
We are talking about random errors aren’t we? In that case do you agree that taking multiple readings will lead to a more precise measurement, expressed as a smaller standard deviation, than taking a single reading?
Mavis ==> “We are talking about random errors aren’t we?” No, I am writing about measurements with known absolute uncertainty or known absolute measurement uncertainty (to be more exact). There is no question at all about the value of this type of uncertainty of the measurements. That’s why it is called “absolute”.
There are and will be some other uncertainties in almost any measurement — but here we have the known absolute measurement uncertainty caused by the measurement tools and methodology — as is the case with whole degree temperature records, it must be at least +/- 0.5°.
The measurement error is the difference between the true value and the measured/reported value and in statistical modelling that difference is considered random, and expectation zero if the measurements are unbiased but imprecise to some degree. Read for example Diggle et al 2001 Analysis of Longitudinal Data.
Again, for the 1,024th time: error is not uncertainty; they are completely different beasts.
Yes, you are correct in modern usage, but some older readers will recall terms like the propagation of errors, which is the propagation of uncertainties in modern usage.
Steve ==> Be pragmatic! The type of uncertainty I write about in this essay is not “error” and it is not “the difference between the true value and the measured/reported value”. If you are not sure of that, re-read the essay.
I write about “absolute uncertainty” — we know for certain, we know absolutely, what the uncertainty is because of the way we measured and recorded the values.
This uncertainty is NOT random, it is precisely what is stated.
The uncertainty is in what the true value was — it was not recorded and cannot be recovered. We only know the range of the value, because the range was recorded, precisely.
You throw around the term “uncertainty” without any mathematical theory (i.e. probability theory, statistical distributions) to back up your ideas. Yes, mathematical statistics involves algebra and much more, and we are trying to have a technical discussion of uncertainty, not the layman’s “how safe is it to cross this street in this traffic stream”. To quantify uncertainty you need probability density functions, which are attributed to a mathematical model’s error terms (or stochastic elements). So you need to start with such a model of the error terms to allow quantification of uncertainty. You cannot just hang your “uncertainty” off a “sky hook”.
What you have missed is that Kip is using combined temperature uncertainties for which the propagation has already been done.
It is inappropriate to apply the propagation methods to an average.
steve ==> I am using the normal English-language definition of uncertainty (remember English? it was taught in a different building than maths and stats) — not something out of someone’s statistics textbook. Uncertain means just that — uncertain.
When a temperature measurement was taken and the record made as 13°C, the weathermen used 13°C for all temps between 12.5 and 13.5°C. Therefore, in the present, looking back at the single record, we are “uncertain” as to what the actual, original temperature was — because they did not record it more exactly.
This absolute measurement uncertainty is hung on reality.
A climate modelling team, which hopefully included some professional statisticians, would proceed in modelling the uncertainty by inferring, say, a uniform distribution for that measurement error, as has been pointed out by myself and a number of other commentators, while you and your supporters just throw your hands in the air, say it’s a maximum at +/-0.5, and go no further. The team, if doing the stats/modelling correctly, then includes that measurement error in their model’s error budget in order to account for as many significant sources of stochasticity as they can identify, along with non-stochastic terms.
That’s how the scientific and mathematical/statistical modelling components of science work. That’s what my 45+ year (and still going) career has involved. So your reality is: halt the work, we cannot infer anything beyond +/-0.5, or just add that in at the end to the support intervals with no theoretical justification, and that’s why your graph, doing just that, is WRONG.
If there is one, and only one, true value in that uncertainty interval then how can its probability be less than 1? If its probability is 1 then how can any other value have a probability greater than zero?
I know that is a hard concept to swallow but it illustrates why a rectangular probability distribution has a problem when applied to an uncertainty interval.
Again, the issue is that you simply don’t know which value has the probability of 1. It is unknown and can never be known.
Rectangular distributions work great when there is more than one possible outcome, e.g. dice rolls or the wait time at a bus stop. But with a true value there is only one possible outcome, no more and no less.
If you will, the probability distribution of an uncertainty interval is an impulse function centered at the true value.
“it must be at least +/- 0.5°” No, it could by chance be close to zero if the true value is close to the integer whole degree, or anywhere between 0.0 and 0.5, or between -0.5 and 0.0, if the observer visually rounds to the nearest integer temperature value. So there is a probability distribution for the measurement errors. For the absolute value of the measurement error, a beta distribution with abscissa limits of zero and 0.5 in this case, mode at zero, and signed errors assumed symmetrically distributed about zero, would theoretically be a good candidate distribution (see doi.org/10.1016/j.fishres.2012.07.004 for a similar approach for an integer measurement error process).
I have revised that thinking because the fish ageing error modelling describes a different measurement error process from integerizing temperatures. For rounding a continuous metric like temperature to the nearest integer, I suggest a uniform distribution for the measurement errors would be a good candidate, with a range in this case of -0.5 to 0.5.
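For what it's worth, the shape of that rounding-error distribution is easy to check with a few lines of simulation. This is only a sketch (Python with numpy); the "true" temperatures here are an assumed smooth distribution, not real data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "true" temperatures drawn from a smooth continuous distribution
true_temps = rng.normal(loc=13.2, scale=4.0, size=100_000)

# The observer records each one to the nearest whole degree
recorded = np.round(true_temps)

# Rounding error = recorded - true; roughly uniform on [-0.5, 0.5]
errors = recorded - true_temps
print(f"min {errors.min():.3f}, max {errors.max():.3f}")
print(f"mean {errors.mean():.4f}, std {errors.std():.4f}")
print(f"std of a uniform(-0.5, 0.5) for comparison: {1/np.sqrt(12):.4f}")
```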
Steve ==> I am catching up on comments, and the process of backtracking to find what you are responding to is cumbersome. But I’ll try this:
“it must be at least +/- 0.5°” “No it could be by chance close to zero”. It is the uncertainty that must be at least +/- 0.5°. That is not the same as the difference between the true value and the recorded value. The true value of the temperature at that moment could have been exactly as recorded. But our uncertainty about what it was is limited by our knowledge of the measurement — and our knowledge is restricted to the recorded value (for example, 13° +/- 0.5°). Knowing only that recorded value, the real value could have been ANYTHING between 13.5 and 12.5°. All with equal probability, by the way.
Kip, what do you think is the relevance of absolute temperature measurement errors? E.g., does it change our understanding of temperature increasing with time since 1910?
With caveats in all cases.
Is that done with all measurement stations used in calculating the global averages?
I’m not sure I understand the intent of the question. It could be similar to something you raised a long way above: “ I doubt that any of the data processing uses a rigorous propagation of error where every instrument has its unique precision and bias taken into account.” In the case of random error each different instrument will deliver a different standard deviation in its results, and that will determine the weighting given to the observations when the least squares adjustment is made.
If you are now talking about systematic error: In the case of a survey instrument one common systematic error is that the telescope is not exactly at right-angles to its axis. The effect of this can be completely eliminated by reversing the telescope and reading again on the other side.
In differential GNSS the idea is that refraction errors affect the observations at each station equally, so taking the difference between the stations cancels out that systematic error.
In the end the World Geodetic System should be free of systematic errors (but probably isn’t) and have minimized random errors.
I still wonder if that has anything to do with your question.
“In differential GNSS the idea is that refraction errors affect the observations at each station equally, so taking the difference between the stations cancels out that systematic error.”
The key here is that you are using the same instrument. When you have 1000 instruments, each with a different systematic bias, that assumption doesn’t hold true. You can’t cancel it by “reversing the telescope”. Nor do all stations have the same “refraction” bias.
There are multiple things going on here. Uncertainty is also related to resolution. Your mean can’t have a resolution finer than the resolution of the measuring instrument. Having more measurements of the same thing won’t increase the resolution.
Those surveyors were measuring the same thing, both with old methods and new methods, and hopefully generating a Gaussian distribution of measurements. Thus the mean becomes the best estimate of the true value. However, it is *still* not infinitely accurate; even with a Gaussian distribution there will be a standard deviation associated with the mean, and that must be considered as well. That applies even to GPS measurements.
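That last point is easy to illustrate. The sketch below assumes purely random, Gaussian reading errors on repeated measurements of the same angle (the numbers are invented): the spread of the mean shrinks roughly as 1/sqrt(n), but it never reaches zero, and none of this applies if each reading is of a different thing.

```python
import numpy as np

rng = np.random.default_rng(0)

true_angle = 45.0      # hypothetical "true" angle, degrees
sigma_single = 0.01    # assumed random error of a single reading, degrees

for n in (1, 24, 100):
    # Repeat the "take n readings and average them" experiment many times
    trials = rng.normal(true_angle, sigma_single, size=(50_000, n))
    means = trials.mean(axis=1)
    print(f"n={n:3d}: spread of the mean = {means.std():.5f} "
          f"(theory sigma/sqrt(n) = {sigma_single/np.sqrt(n):.5f})")
```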
Let’s face it, to the illuminated wokesters, uncertainty is boring. They know what they know cannot be wrong or uncertain. I often marvel at people, especially ones who are of a significant age, that can still see the world in black and white.
The use of a high school ‘football’ field as an example is misleading all by its lonesome! That’s because the reader does not know if the field is an American high school football field, where everything is measured in yards and inches, OR a high school soccer field, where meters and centimeters are used.
Ask the normal American what the difference is between a yard and a meter and you most likely will get a blank look. A quarterback in American football throws for X yards, not meters. Everything in common American football is yards and inches.
Then when the writer switches to the metric system, you have completely lost the American reader. In my everyday use of measurements for just about anything from cooking to construction, it is all cups and teaspoons or yards and inches. I have to get out a calculator to convert to metric. (which is not done)
My Davis weather system uses American measurements of temperature in Fahrenheit, wind speed in mph and rainfall in inches. I do not even consider switching to metric as it becomes gibberish to me.
This is not a rant between metric and other measurement systems, but if you want to reach an American audience, you have to define in American terms and keep to them.
derbrix ==> Alas, a constant problem when writing for an international audience. The units are not germane to the example, in either case.
Thank you for the feedback!
The approach taken in Gavin Schmidt’s article and this essay substantially over-estimates the uncertainty bounds by considering fixed maximum absolute value measurement errors, when in practice the measurement error process should be modelled stochastically using expectation-zero random measurement errors along with other expectation-zero random errors such as sampling error and model lack-of-fit error.
Therefore the graph in the essay labelled “The absolute uncertainty in GISTEMPv4”, which uses a fixed maximum absolute value measurement error (misnamed as “absolute uncertainty”), gives substantially over-estimated uncertainty bounds.
(see https://www.researchgate.net/profile/Steven-Candy
for my justification of the above)
In a time-series measurement, you get exactly one chance to measure the variable in question before it is gone forever.
The term absolute uncertainty should be viewed against its opposite, relative uncertainty, i.e. in what units the uncertainty is specified.
Karlomonte ==> As clearly stated, one has an absolute uncertainty when it is stated in the same units as the value. Correct absolute uncertainty stems from the inability of the measurement instrument and ‘measurer’ to record the value any closer than the given absolute uncertainty.
Sorry, a mix-up of terminology—my experience has been that an absolute uncertainty is expressed as (X ± U) (units of X), while a relative uncertainty is X (units of X) ± U%.
Steve ==> Well, I hope Gavin Schmidt reads your comment — he could save himself all the trouble caused by trying to use anomalies to confuse the world about rising temperatures.
Note that the absolute uncertainty is caused by the measurement instruments — originally High/Low glass thermometers read to the nearest degree. The records were kept as daily averages (really medians of the Hi and Lo) rounded to the nearest degree.
A special case of the median called the “mid-range.”
How do you do stochastic analysis if the underlying data is not a random distribution? Temperature is not a random variable. You don’t go from winter temps to summer temps on a random basis.
And, because of seasonal variation, temperatures tend to be auto-correlated.
One might be more accurate measuring the same thing multiple times. The issue is usually measuring different things many different times.
Lumping together multiple temperature readings is the “different things lumped together”, not multiple measurements of the same thing.
This is impossible in a time-series measurement like air temperature.
Yes, with each measurement taking a finite amount of time, every measurement is a different measurement along the time continuum.
Tom ==> Yes, of course. If we could measure the exact same cubic meter of air a hundred times in the same instant (before the temperature changed), we’d have very highly accurate temperature records. Alas, that is impossible in a practical sense. Even 6 second temperature measurements at an automatic weather station don’t measure the same air twice in a row. The only thing that is the same is the location — not the time, not the air.
As karlomonte says below.
Even with measuring the same thing multiple times, the mean of the measurements will still be associated with an uncertainty. If the measurements are strictly Gaussian, that uncertainty can be estimated by the standard deviation of the sample means. Even if you have the entire population, the uncertainty of the population mean is still subject to the propagation of uncertainty from the individual elements making up the population. The mean is the BEST ESTIMATE of the true value from a Gaussian distribution but it is *still* just the best ESTIMATE.
Can I recommend you read the following before writing anything else.
https://www.ukas.com/wp-content/uploads/schedule_uploads/759162/M3003-The-Expression-of-Uncertainty-and-Confidence-in-Measurement.pdf
That’s a good source. It is consistent with the GUM, but a little more succinct. I put it in my document archive. Thanks.
I read the first 6 pages and didn’t see anything that was new to me or Tim. Is there something in particular you would like to point out? I don’t think it is worth my time to read the whole 90 pages.
I did see something on page 6 that I think you should consider carefully: “This average, or arithmetic mean, of a number of readings can often be closer to the “true” value than any individual reading is.” Note that it doesn’t say “will be,” only that it “can often be.”
It is interesting that those citing the various authorities on measurement have not responded to one of those documents giving only conditional support to the idea of multiple measurements improving the accuracy.
Nobody as far as I know has said that uncertainty will always be reduced. The condition for it not being reduced is if all the uncertainties are caused by the same systematic error.
But that’s not the same as claiming “The uncertainty of the mean would not and could not be mathematically less than the uncertainty of the measurements of which it is comprised.”.
bellman:
“The condition for it not being reduced is if all the uncertainties are caused by the same systematic error.”
Now you are back to ASSUMING that random error always produces a Gaussian distribution. That is simply not true. If the error is associated with SINGLE measurements of different things there is no guarantee that the random uncertainties will form a Gaussian distribution.
If the distribution is skewed then the integral of the values to one side can easily be different than the integral of the values on the other side. In that case you simply won’t get complete cancellation of random error and therefore can’t assume the standard deviation of the stated values is the uncertainty of the average.
Since winter and summer temps have different variances it is almost guaranteed that you will get a skewed distribution when combining them (almost – not certainly but a very good chance). That means you can’t just ignore the propagation of the individual uncertainties so you can use the standard deviation around the mean as the uncertainty.
Why this is so hard for statisticians and climate scientists to accept is a complete mystery to me. It just tells me that they have *NO* experience in the real world of measurements of different things and are so indoctrinated into believing everything is Gaussian that they have blinders on.
“Now you are back to ASSUMING that random error always produces a Gaussian distribution.”
I’ve told you explicitly numerous times, I am not assuming that. I don’t care what the shape of the distribution of the errors is; as long as its mean is zero, the uncertainty will reduce with averaging.
I remember though the last time we had this discussion it turned out you didn’t understand what a Gaussian distribution meant, so maybe you are just confused again.
“If the error is associated with SINGLE measurements of different things there is no guarantee that the random uncertainties will form a Gaussian distribution.”
Nor is there any guarantee if you measure the same thing.
“If the distribution is skewed then the integral of the values to one side can easily be different than the integral of the values on the other side.”
Hence the requirement that the mean has to be zero. If it’s not zero you have a systematic error.
“Since winter and summer temps have different variances it is almost guaranteed that you will get a skewed distribution when combining them”
We are talking about the measurement uncertainties, not the variance in the temperatures.
“Why this is so hard for statisticians and climate scientists to accept is a complete mystery to me.”
Has it occurred to you that it might be because they understand the subject better than you, and that you are just wrong?
“I don’t care what the shape of the distribution of the errors is, as long as it’s mean is zero, the uncertainty will reduce with averaging.”
Nope. If you have a skewed distribution the mean and standard deviation are not even acceptable statistical descriptions according to the five STAT 101 textbooks I have here. You must use the 5-number statistical description (or some other descriptor) which does *not* include either the mean or the standard deviation.
The *ONLY* thing that reduces is how close you are to the population mean. That means *NOTHING* when it comes to the accuracy of the mean.
“I remember though the last time we had this discussion it turned out you didn’t understand what a Gaussian distribution meant, so maybe you are just confused again.”
I’ve *always* understood what a Gaussian distribution is. It’s what you assume all measurement distributions are!
“Nor is there any guarantee if you measure the same thing.”
How then do you get cancellation of measurement uncertainty so you can assume the standard deviation of the stated values is the uncertainty?
If you integrate the left side of a skewed distribution and you integrate the right side will you get the same quantitiy? If you don’t then how to you assume cancellation of measurement uncertainty?
“Hence the requirement that the mean has to be zero. If it’s not zero you have a systematic error.”
I assume you are talking about a “normalized” distribution. The mean can be zero in a normalized skewed distribution and still not be the same as the median – the definition of a skewed distribution.
Nor do skewed distributions have to indicate systematic bias. When you are measuring DIFFERENT THINGS there is no guarantee you won’t get a skewed distribution – all of the measuring devices may have zero systematic bias and still generate a skewed distribution.
You are back to throwing crap against the wall hoping something will stick. It won’t.
“We are talking about the measurement uncertainties, not the variance in the temperatures.”
*YOU* are the one that wants to use the stated values as the measure of uncertainty and not the measurement uncertainties! So now you have been reduced to the argumentative fallacy of Equivocation!
“Has it occured to you that it might be becasue they understand the subject better than you, and that you are just wrong?”
They don’t. I have too many references that show how they don’t. They pay no attention to any of the recognized authorities, including the GUM. Neither do you!
Where do you get this garbage from?
from all of his cherry picking of things he doesn’t understand.
Any specific garbage, or all my garbage in general?
Don’t you have a database of every one of your posts handy?
Start here:
sigma/root(N)
You’re saying sigma/root(N) is garbage? No wonder you spend most of your time posting content-free comments, if this is what happens when you try to explain yourself.
Not getting on the hamster wheel, you know what it means.
sigma/root(N) is how close the sample means are to the population mean. It does *NOT* tell you how accurate the population mean is so it can’t tell you how accurate the sample means are either.
Again, I don’t know what you mean by the accuracy of the population mean.
It’s like when you measure one of your boards. The uncertainty of the measurement tells you how close you are to the actual length of the board. It would be meaningless to say you don’t know how accurate the length of the board is. The board is the length it is.
When dealing with measurements it is recommended that in general, uncertainties, or standard deviations, not have more than one significant digit. Sometimes two under the right circumstances.
Remember, we are not doing “statistics” here, we are using statistical calculations only to determine uncertainty. That places certain restrictions on what is done.
Lastly, why do you think sigma or the standard error of the sample means should have more significant digits than the data used to calculate them? It is a fallacy propagated by studying statistics for statistics’ sake and not understanding what the statistics are being used for. You should have learned that some descriptive statistics can only be applied to the group, and not to individuals.
“When dealing with measurements it is recommended that in general, uncertainties, or standard deviations, not have more than one significant digit. Sometimes two under the right circumstances.”
Different sources have different recommendations (and a recommendation is not a rule). Taylor says one digit unless the first is a one in which case you can use two. Bevington says it should be two unless the first digit is large, in which case it should be one. The GUM says at most two, but it may be necessary to quote more to avoid rounding errors.
“Remember, we are not doing “statistics” here, we are using statistical calculations only to determine uncertainty.”
Not sure what you think the difference is between doing statistics and using statistical calculations.
“Lastly, why do you think sigma or standard error of the sample means should have more significant digits than the data used to calculate them?”
Because a mean can be more accurate than the individual measurements. Because the mean is the best estimate of what is being measured. Because excessively rounding your estimate is adding uncertainty.
Incidentally, whilst I remember to ask – I don’t think you have addressed the elephant in the room yet.
You keep insisting that you don’t divide the uncertainty of the sum by sample size to get the uncertainty of the mean. You have repeatedly told me I’m wrong to suggest such a thing, and you insist that this means that the uncertainty of the mean increases with sample size.
In this essay, Kip does just what I and others have said you must do – divide the uncertainty of the sum by sample size. You seem to agree with Kip on most things, so I just want to know, do you agree with Kip on this or do you still insist that you never divide the uncertainty?
“You keep insisting that you don’t divide the uncertainty of the sum by sample size to get the uncertainty of the mean.”
That’s not what I have said at all. I have said that the average uncertainty (sum of uncertainty divided by sample size) is *NOT* the accuracy of the mean! The term “uncertainty of the mean” is the propagated uncertainty of the individual uncertainties, not the average uncertainty or the standard deviation of the stated values.
What you class as the uncertainty of the mean is the AVERAGE UNCERTAINTY. Stop using the term “uncertainty of the mean” and just use the term “Average Uncertainty”.
It’s like the term “standard error of the mean”. It isn’t. It’s the standard deviation of the sample means. It’s a measure of how close the estimated mean is to the population mean. That has nothing to do with error or uncertainty. It is *NOT* a measure of the accuracy of the mean!
” uncertainty of the mean increases with sample size.”
See what I mean? You have now used the term “uncertainty of the mean” in two different ways. You’ve used it to mean the “average uncertainty” and as the “measurement uncertainty of the mean.”
Stop using the term “uncertainty of the mean”. The uncertainty of the mean has ONE meaning in metrology – and that is accuracy of the mean. The accuracy of the mean is *not* the average uncertainty and it is *not* how close the sample means are to the population mean. The “average uncertainty” is meaningless and the population mean isn’t automatically 100% accurate!
Bellman: “You keep insisting that you don’t divide the uncertainty of the sum by sample size to get the uncertainty of the mean.”
Tim Gorman: “That’s not what I have said at all.”
I think Tim’s memory needs refreshing.
This started with this comment from Tim
https://wattsupwiththat.com/2021/02/24/crowd-sourcing-a-crucible/#comment-3193098
To which I replied
And everything exploded from there, and it’s a bit difficult to choose the main argument, but this will do, Tim says
https://wattsupwiththat.com/2021/02/24/crowd-sourcing-a-crucible/#comment-3195160
I then try to point out that the last equation implies that
u_r = u_A / B
and Tim responds
And so on ad infinitum
“To which I replied
”
I calculated no mean. I did a root-sum-square addition of the measurement uncertainty.
“And you are left with u_r/r = sqrt( (u_A/A)^2 )”
This is a relative uncertainty, NOT AN AVERAGE.
Over and over and over again.
q = x + y
u(q) = u(x) + u(y)
q_avg = (x + y) /2
u(q_avg) = u(x) + u(y) + u(2) = u(x) + u(y)
Thus u(q_avg) = u(q)
u(avg) = [ u(x) + u(y)] /2
So u(avg) ≠ u(q) , u(avg) ≠ u(q_avg)
The average uncertainty is *NOT* the uncertainty of the average!
It truly is just that SIMPLE!
So to be clear you are now saying you don’t divide the uncertainty of the sum by sample size, despite a couple of comments ago insisting you had said no such thing?
“u(q_avg) = u(x) + u(y) + u(2) = u(x) + u(y)”
Pointless to explain this to you once again, but you cannot combine the propagation of uncertainties from adding and from division like this. They are two different equations, one involving absolute uncertainties and one involving relative uncertainties. They have to be calculated separately. Your equation would be correct if you were adding 2 to the sum, not if you are dividing by 2.
“It truly is just that SIMPLE!”
You keep saying that to no purpose, almost as if you want the obvious come back “no, you truly are that SIMPLE!”.
“So to be clear you are now saying you don’t divide the uncertainty of the sum by sample size, despite a couple of comments ago insisting you had said no such thing?”
You are TOTALLY lost in the forest.
Can you not read simple algebraic equations?
I just keep repeating the same derivations I gave you two years ago.
u(q_avg) ≠ u(avg)
How much more simple do I have to make it before you understand?
“Pointless to explain this to you once again, but you cannot combine the propagation of uncertainties from adding and from division like this. They are two different equations, one involving absolute uncertainties and one involving relative uncertainties.”
IT DOESN’T MATTER! Say all the absolute values are 1.
What do you get? you get u(x)/x = u(x), u(y)/1 = u(y)
u(constant) remains ZERO!
The point is that the average uncertainty is *NOT* the uncertainty of the average. The uncertainty of the average is the propagated uncertainties of the measured values! u(q) = u(sum)
Now, come back and tell me the absolute values can’t be 1 so I am wrong.
“u(q_avg) ≠ u(avg)
How much more simple do I have to make it before you understand?”
Well, for a start you could explain what the difference is between q_avg and avg.
u(q_avg) is the uncertainty of the average.
u(avg) is the average uncertainty.
They are *NOT* equal.
Then we agree. So what is your point, apart from showing you never listen to me.
You’ll never understand. You can’t even see that they are not equal!
“You can’t even see that they are not equal!”
Either you are even denser than I supposed, or this is one extremely long drawn-out prank.
I can see they are not equal, I keep telling you they are not equal, I have never claimed they are equal. You need to read the words I am typing, rather than argue with the voices in your head.
Then why do you keep saying the uncertainty of the mean is the average uncertainty?
This is just getting desperate. You are either knowingly lying or suffering from some serious memory issues.
/Jack Benny/
“IT DOESN’T MATTER! Say all the absolute values are 1.
What do you get? you get u(x)/x = u(x), u(y)/1 = u(y)
u(constant) remains ZERO!”
What do you mean, assume all the absolute values are 1? Are you talking about the uncertainties, or x and y? None of what you are saying here makes sense.
If x = y = 1
then u(q) is still
u(q) = u(x) + u(y), or u(q) = root[u(x)^2 + u(y)^2]
then
u(q_avg) / q_avg = u(q) / q + 0
and if x = y = 1, so q = 1 + 1 = 2, and
q_avg = (1 + 1) / 2 = 1, we have
u(q_avg) / 1 = u(q) / 2.
Been here before also, see below.
“u(q_avg) / q_avg = u(q) / q + 0″
Nope.
You do uncertainty term by term
You don’t do u(q)/q
you do u(x)/x + u(y)/y ==> u(x)/1 + u(y)/1
so u(q_avg)/q_avg = u(x) + u(y)
so u(q_avg) = q_avg * [ u(x) + u(y) ]
u(q_avg) = 1 * [ u(x) + u(y) ]
u(q_avg) = u(x) + u(y)
“so u(q_avg)/q_avg = u(x) + u(y)”
Why do you do this? This would be the equation if q_avg = x / y (assuming x = y = 1). But in reality q_avg = (x + y) / 2.
And the rest of your nonsense follows from your mistake.
Honestly, this isn’t rocket science. It’s a simple application of two simple equations. The problem is that you are so determined to reach a wrong conclusion that you have to keep dodging doing the obvious thing, and twisting yourself in knots to get to wrong result.
All you have to do is work out the uncertainty of a sum, then the uncertainty of the sum divided by N. It’s so simple even the author can do it.
“ But in reality q_avg = (x + y) / 2.”
And when you calculate the uncertainty of q_avg it is the uncertainty of a quotient.
You use relative uncertainty to calculate the uncertainty of a quotient. And that relative uncertainty is done term by term.
[u(q_avg)/q_avg]^2 = u^2(x)/x + u^2(y)/y + u^2(2)/2
If x = y =1 this reduces to
u(q_avg)/q_avg = sqrt[ u^2(x) + u^2(y) ]
If you want to use other values for x and y then go ahead.
But the uncertainty of the average is *NOT* going to be the average uncertainty!
TG said: “And when you calculate the uncertainty of q_avg it is the uncertainty of a quotient.”
Your q_avg has both a + (sum) and a / (quotient) in it. Given that your q_avg contains a + (sum), what is your justification for using Taylor 3.18, and only 3.18, in this case?
Someday people are going to name a syndrome after you.
Again: x + y is not a quotient, it’s a sum. You cannot use the uncertainty propagation rules for quotients when you are adding numbers. It doesn’t matter if they are part of an equation that includes a quotient. Dividing x + y by 2 doesn’t suddenly make x + y a quotient.
You have to take each part separately and use the correct rules for each.
Surely you figured this out when you were doing all the exercises in Taylor’s book.
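For readers trying to follow the algebra, here is a minimal sketch of the two-step propagation bellman describes, assuming independent random uncertainties added in quadrature (the function names are mine): propagate the sum first, then treat the division by N as division by an exact constant.

```python
import numpy as np

def u_sum(u_components):
    """Uncertainty of a sum of independent quantities: add in quadrature."""
    return np.sqrt(np.sum(np.square(u_components)))

def u_mean(u_components):
    """Uncertainty of the mean: propagate the sum, then divide by the exact
    constant N. Dividing by an exact constant divides the absolute uncertainty
    by that constant; the constant itself contributes no uncertainty."""
    return u_sum(u_components) / len(u_components)

ux, uy = 0.5, 0.5                                  # assumed uncertainties of x and y
print("u(sum)              =", u_sum([ux, uy]))    # ~0.71
print("u(mean)             =", u_mean([ux, uy]))   # ~0.35
print("average uncertainty =", (ux + uy) / 2)      # 0.5, a different quantity
```

Whether that propagation is appropriate for single measurements of different things is, of course, exactly what is being argued in this thread.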
“The point is that the average uncertainty is *NOT* the uncertainty of the average.”
This is getting to be such a pathetic strawman.
I agree completely, and have been agreeing ever since you hit upon this dumb argument, that the average uncertainty is *NOT* the uncertainty of the average. Why do you keep arguing it isn’t as if it is something I’ve been saying?
“The uncertainty of the average is the propagated uncertainties of the measured values! u(q) = u(sum)”
Of course u(q) = u(sum), because you defined q as the sum.
If the average uncertainty is not the uncertainty of the average then why do you keep on trying to convince everyone that it is?
Of what use is the average uncertainty if it tells you nothing about the accuracy of the average, i.e. the uncertainty of the average?
“Of course u(q) = u(sum), because you defined q as the sum”
You didn’t even bother to follow the math for meaning! I just know that tomorrow you’ll be saying that [u(x) + u(2)] / 2 is the uncertainty of the average!
I’ll guarantee it! In fact I’m going to save this post on my hard drive so I can give it back to you verbatim.
“If the average uncertainty is not the uncertainty of the average then why do you keep on trying to convince everyone that it is?”
I do not. You just keep lying that I do.
“I’ll guarantee it! In fact I’m going to save this post on my hard drive so I can give it back to you verbatim.”
I’ll save you the bother. [u(x) + u(2)] / 2 is the uncertainty of the average. (Subject to the idea that the only uncertainty of the average is measurement uncertainty, and all those uncertainties are dependent.)
Really you spend far to much time arguing with what you think I’m saying rather than what I’m actually saying, which is why you keep coming up with these strawmen arguments.
And I showed above how it is u(x) + u(y). The only way you get 1/N is to modify the averaging.
Absolutely no one calculates the mean as x/2 + y/2.
So someone else who thinks Kip and the GUM are wrong.
Do I have to trawl through all 200 of your troll posts to find the one where you actually prove something?
“Absolutely no one calculates the mean as x/2 + y/2.”
Do you have to keep demonstrating your mathematical ignorance.
(x + y) / 2 = x/2 + y/2
It makes absolutely no difference how you write it.
I guess you are referring to this comment for your proof. Let’s see:
OK so far.
I think your terminology is a bit off here. 2 isn’t a variable, so you can’t say the derivative of r with respect to 2, but the meaning is fair enough. The derivative of the constant function r is 0.
Oh dear. The partial derivative of s with respect to q_sum, is 1/r. And the partial derivative with respect to r is q_sum / r^2. The second error doesn’t matter but the first one is propagated through the rest of your argument, and your conclusion is false.
Really, I’ve been through all this a few days ago with Tim making exactly the same mistake. If you don’t understand how to calculate a partial derivative, here’s a handy online calculator which will do it for you.
https://calculator-derivative.com/partial-derivative-calculator
Here’s my result when I plug in x/r with respect to x.
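For anyone without the online calculator handy, the same check can be done with sympy. A sketch only; the symbol names are mine, and the sign of the second derivative is irrelevant once it is squared for propagation.

```python
import sympy as sp

q_sum, r = sp.symbols('q_sum r', positive=True)
s = q_sum / r   # the mean written as the sum divided by the count

print(sp.diff(s, q_sum))   # -> 1/r
print(sp.diff(s, r))       # -> -q_sum/r**2
```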
“I’ll save you the bother. [u(x) + u(2)] / 2 is the uncertainty of the average.”
The math simply doesn’t matter to you, does it? It’s an article of religious dogma for you that the average uncertainty is the uncertainty of the average. Nothing is going to shake your faith in that dogma.
You are in good company. There are lots of other climate alarmists on here that have the same belief in the same dogma and it doesn’t matter whether it is right or wrong.
TG: “It truly is just that SIMPLE!”
bellcurveman: “So to be clear you are now saying [yadda yadda yadda]?”
Just give up, this is not working for you.
How’s your tactic of adding a one-line insult to every post working out? Have you persuaded anyone that you know what you are talking about?
Don’t care—the only people whining are trendologists.
Hah! bellcurveman digs deep and pulls out whines from two years back! A lot of trips around the hamster wheel since.
squeak*squeak*squeak*squeak*squeak*squeak
And you continue to ignore the inconvenient little fact that Kip is dealing with combined uncertainties for which propagation appropriate for the particular measurement system(s) has already been done.
That’s because, for the evaluations under review here, these “conditions” have easily been met.
FYI, I would rather increase the precision of the expected value of an evaluation than its target spread. I would always use extra info that was judged to hone in on the right value, even if its target spread was larger.
Yes, I misused the term “precision” here. How’s this?
I would rather have a larger target spread around a value closer to the real one than vice versa. If there were additional, more widely error-banded data that I thought would get me closer to the real value, I would use it. Even if it increased the spread.
I would also like for Clyde Spencer to leave his ever shrinking comfort zone and respond.
Poor baby blob, he don’t get the respect he thinks he deserves.
“Lumping together multiple temperature readings is the “different things lumped together”, not multiple measurements of the same thing.”
And when those properties being measured are intensive properties, averaging them is a major no no.
This is something that climate science in general does not understand, especially when comparing proxy derived temp series to instrument derived temp series.
Excellent explanation, Kip, well done. This is vitally needed.
For the record, the NASA webpage about GISS uncertainty has a rather different version of this same graph. Notice the (claimed) uncertainties are significantly smaller than the 1°F resolution (0.6°C) of the historic thermometers, and get progressively smaller with time, reaching absurdly small values by the present day:
https://data.giss.nasa.gov/gistemp/uncertainty/
Karlo ==> They are using what is roughly a statistical/CLT approach and ignoring the actual original measurement error, which Gavin Schmidt demonstrated. Further, they are doing the uncertainty of anomalies – several times removed from measurement.
Gavin also thinks that anomalies somehow produce less uncertainty.
I think you’re still missing Schmidt’s point. GISTEMP’s anomaly uncertainty is factoring in measurement error [Lenssen et al. 2019]. It’s just that the uncertainty on the anomaly values is less than the uncertainty on the absolute values, because the absolute values have a major component of uncertainty arising from a systematic effect where the anomaly values don’t, due to the additive identity property of algebra.
bdgwx ==> Try diagramming it…see what you get.
I’m not sure what the ask is here. Can you explain more about the diagram you seek?
And you still can’t acknowledge or understand that error is not uncertainty, so you continue to place both into a blender and get a fish malt.
Not being a mathematician, it isn’t immediately obvious to me how the “additive identity” is germane to the discussion, would you be good enough to explain it in 25 words or less?
It’s the x – 0 = x or x – x = 0 rule. It’s the reason why the systematic effect term, which is included in both the single month value and baseline value, cancels out.
This requires ALL systematic biases to be the same. How, with measurements of different things with different devices can you assume that all systematic bias is the same?
You simply don’t have x-x, you would have x1 – x2 where x1 ≠ x2!
This is part of the reason for using root-sum-square addition of uncertainties. NOT ALL MEASUREMENT UNCERTAINTY CANCELS. That includes the corollary that not all systematic bias is the same!
This has been pointed out to you at least three times and yet you insist on posting the same garbage – that all systematic bias is the same so it all cancels.
It’s simply a lie since you should know better by now!
Of course. +/- is a function that produces a range of values. Unless that function is defined you have no idea how those values are distributed within that range. In the absence of knowledge of the function that produced that range you really have no idea of what is going on with the data to begin with!
swamp ==> Yes, a numeral with the notation +/- a numeral (both in the same units) SHOULD signify that the value is in reality a range. If the portion in the +/- statement is unit-less, then you might be dealing with a statistician, who is reporting something else.
However, that range is not the range of all possible values. It is more commonly the 68% (typical in climatology) or 95% (that or more in most other sciences) probability of encountering a measurement within the given 1 sigma or 2 sigma range. It does not preclude values beyond the given range. It only says that it has low probability.
Clyde ==> In the examples I give, the range is not the entirety of the possible values….but the values can not be less than that range.
There is no probability in the arithmetic solutions I provide. No 68%; yes, if any probability is mentioned, it is usually the 95% confidence interval.
I believe the 68% is some statistical creature.
Lots of words to signify that the actual bounded area under the curve / total area under the curve needs to be explicated. If it is fuzzy in communication in any scientific discipline, then it is of no use.
You will many times see an uncertainty interval defined as being a uniform distribution, a Gaussian distribution, etc. I have never looked at it that way. I have always considered an uncertainty interval as having one value with a probability of 1 of being the true value and all the values as having a probability of zero. The issue is that you don’t know which value has the probability of 1! It is an unknown. An unknown implies you don’t know the probability distribution either!
Stokes claimed above that it does give a distribution, somehow, somewhere.
So, all that being said I am left with the certainty that the temperature may, or may not, have gone up half a degree or so since the end of the coolest period of the last 10,000 years when the weather really sucked for much of the Northern Hemisphere and lots of people died of cold and starvation, and the thing that supposedly caused some of the slight warming is the reason why so many fewer people are suffering.
Remind me of why I am supposed to be terrified.
Mark ==> And we apologize for your dilemma.
Personally, in my view, the bulk of physical evidence shows the Earth climate has warmed since the end of the Little Ice Age.
Kip
I suspect you are correct. I won’t lose any sleep over it!
Excellent post, Kip. The worst abuse of this sort is the satellite altimetry SLR. I have posted about it several times previously here. Inexcusable NASA misleading of the public, since it is an article of faith that SLR is accelerating when, properly measured, it isn’t.
Worse than cooling the past?
IMO yes, because their accuracy claims are provably false simply using their own satellite documentation. A known satalt SLR uncertainty of >3 centimeters can NEVER be reduced to a claimed NASA uncertainty of a tenth of a millimeter no matter how many times the measurement is repeated.
Rud ==> Thank you…sometimes it is a real effort to have to repeat and repeat the same simple, pragmatic truths over and over, and fight with the stats guys over what is basic arithmetic.
I aced stats and probability theory (PhD level econometrics), but am at heart just a math guy. Look forward to your upcoming CLT post.
Rud ==> Yes, you and I have written about satellite SLR and its true uncertainty so many times that I even forget to mention it — it is like a boogeyman story.
I await the discussion of whether the “adjustments” made to the readings have their own uncertainties and whether those uncertainties are additive.
Ed ==> simple answer: YES “adjustments” all have their own uncertainties.
If adjustments are guesses then they have their own uncertainty which adds to the uncertainty of the original value. It’s a simple sum or difference: (v_orig +/- u_orig) + (v_adj +/- u_adj). As Kip has pointed out the final uncertainty will be the root-sum-square of the two uncertainties.
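A tiny worked example of that sum, with hypothetical numbers, and assuming the two uncertainties are independent so that root-sum-square applies:

```python
import math

# Hypothetical: an original reading and a later adjustment, each with its own uncertainty
v_orig, u_orig = 13.0, 0.5    # recorded value, +/- 0.5
v_adj,  u_adj  = -0.2, 0.1    # adjustment, +/- 0.1

v_final = v_orig + v_adj
u_final = math.sqrt(u_orig**2 + u_adj**2)   # root-sum-square of the two uncertainties
print(f"{v_final:.2f} +/- {u_final:.2f}")   # 12.80 +/- 0.51
```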
Kip, you say:
That’s not true at all. In fact, in statistics it generally means that there is about a 2/3 chance that the value is between 2 and 3 cm. This means that there is about a 1/3 chance that it is NOT in that range.
It seems that you are discussing “triangular fuzzy numbers”, a very valuable way to express uncertainty. These are expressed as 3 values: a least possible value, a most probable value, and a greatest possible value.
These act in the way you describe above, where the value is indeed between the least possible and the greatest possible value.
The rules for arithmetic on triangular numbers are simple. Let me show by example:
Triangular number T1: [1, 3, 6] (1st number is least possible, 2nd is most probable, 3rd is greatest possible)
Triangular number T2: [2, 3, 8]
Sum: [3,6,14]
Multiplication [2,9,48]
Division: T1/T2 = [1/8, 3/3, 6/2] = [1/8, 1, 3]
Division: T2/T1 = [2/6, 3/3, 8/1] = [1/3, 1, 8]
In some ways I find triangular fuzzy numbers to be a superior way to handle uncertainty.
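A small Python sketch of the triangular arithmetic described above (positive numbers assumed; the helper names are mine, not from any standard fuzzy-arithmetic library):

```python
def t_add(t1, t2):
    # Sum of triangular numbers: add corresponding components
    return tuple(a + b for a, b in zip(t1, t2))

def t_mul(t1, t2):
    # Product (positive fuzzy numbers assumed): multiply corresponding components
    return tuple(a * b for a, b in zip(t1, t2))

def t_div(t1, t2):
    # Quotient: least/greatest, most-probable/most-probable, greatest/least
    (a1, b1, c1), (a2, b2, c2) = t1, t2
    return (a1 / c2, b1 / b2, c1 / a2)

T1 = (1, 3, 6)
T2 = (2, 3, 8)
print(t_add(T1, T2))   # (3, 6, 14)
print(t_mul(T1, T2))   # (2, 9, 48)
print(t_div(T1, T2))   # (0.125, 1.0, 3.0)   i.e. [1/8, 1, 3]
print(t_div(T2, T1))   # (0.333..., 1.0, 8.0) i.e. [1/3, 1, 8]
```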
Next, you seem to be overlooking the concept of symmetrical errors. IF (and it’s a big if) the errors are symmetrical, we can indeed use repeated measurements to reduce the uncertainty of the result.
Finally, in general errors of the usual type (value ± one sigma) add in quadrature, not directly as you say.
My best to you, and thanks for your post.
w.
“That’s not true at all. In fact, in statistics it generally means that there is about a 2/3 chance that the value is between 2 and 3 cm. This means that there is about a 1/3 chance that it is NOT in that range.”
Willis, you are describing a measurement distribution that is Gaussian where ≈68% of the values will be within one standard deviation of the mean.
The only way to assume this is the case is to have multiple measurements of the same thing. Then the best estimate of the true value is the mean and the standard deviation will tell you where most of the possible values lie.
If you are combining a number of individual, single measurements of different things (such as minimum and maximum daily temperatures) there is really no “true value” and there is no guarantee the measurement distribution will be Gaussian at all. It could be very skewed or even multi-modal, as is the case where you are jamming winter temps (NH) together with summer temps (SH), which even have different variances. In the case of a single measurement there are no actual data values upon which to base a distribution. For a single measurement all you can do is estimate the uncertainty interval based on things like manufacturer’s specs and judgement about things that would affect the individual microclimate (e.g. elevation, pressure, humidity, terrain, geography, etc).
Daytime temps are primarily sinusoidal, nighttime temps are primarily an exponential decay. The actual daily average is *not* well defined by the (Tmax+Tmin)/2 mid-range, leading to a lot of uncertainty in what the actual daily average really is. I very seldom see this analyzed anywhere, although I have one paper that addresses it. If you’d like to look at it I’ll see if I can find it.
Global air temperatures are almost certainly skewed, with a long tail on the cold side, as I have illustrated in a previous WUWT submission.
The UAH LT distributions are highly skewed.
“The UAH LT distributions are highly skewed.”
Not their trends.
https://www.woodfortrees.org/plot/uah6
R^2 ~= 0.992. Fun fact: if you plot this, you need to skinny up one of the marker fills to even see most of the second data series.
Now, show us the “skew”…..
Show me the Gaussian, blob:
“Show me the Gaussian, blob:”
Unlike your irrelevant pix, I actually will. Oh, BTW, I typo’d the R^2 in my last post. Instead of ~0.992 it should be ~0.998.
Irrelevant? Are you serious? These are the standard deviations of the 1991-2020 monthly averages that go straight into the UAH baseline.
Try again, blob.
“These are the standard deviations of the 1991-2020 monthly averages that go straight into the UAH baseline.”
Fooled me. No titles, and no explanation of relevance, especially w.r.t. “skewness”. OTOH, my plot definitively shows non-skewed, almost perfect normality. Maybe you can try harder.
Oh, FYI, your shorter, less physically/statistically significant data set also has a correlation between its trend residuals and their best-fit cumulative normal distribution of ~0.997. And the skewness of those residuals is ~0.12, i.e. not “highly skewed”. The 1980-present data has a skewness of ~0.16 – also not “highly skewed”.
Do the work….
In my evaluation of your less statistically/physically significant time series I made an evaluative mistake. Yes, the residuals from that data are indeed negatively skewed to ~-0.6. The longer, more physically/statistically significant time period I chose gave the tiny skew I referenced before. The trend residuals from both time periods had excellent normality.
FYI, the trend from 1980-present is 1.34 +/-0.06 degC/century. Your 1991-2020 trend is actually higher, 1.5 +/- 0.1 degC/century.
Moral of the story. Data cherry picks can often be spotted….
There are none so blind as those who refuse to see.
Has no clue what is behind the UAH trendlines; not even worth the time it would take to type a reply.
“Not worth the time”. Seems to come up a lot when you’re definitively outed.
Hah! The delusions are strong in this one.
willis ==> You are talking about one thing while I am writing about another. Don’t tell me you are an arithmetic denier as well.
You are talking some version of standard-deviation statistical definitions while I am talking simple arithmetic. I give an example, I give exact, correct definitions, I give the rules for finding arithmetic means with their absolute uncertainty, I show how to do it using arithmetic correctly….I even illustrate it for those having trouble reading words, and then write it out in simple words as well.
And yet, you run off into stats-land unnecessarily, in which you choose statistical definitions for use in an arithmetic problem, where they don’t apply.
One does not need statistical approaches to solve arithmetic problems.
Maybe the concepts are just too simple for you. They really are that simple. Even Gavin Schmidt understands it the way I explain.
Even a sixth grader can understand the principles at work here — and understand the illustration as well.
The KEY to understanding this is that (for instance) temperature records have been recorded, historically, as whole degrees. More modernly, they are rounded to whole degrees. Each and every whole degree record (be it for six minutes or any other time period) is absolutely uncertain to +/- 0.5°. Always, every time, it is a real world fact and cannot be fudged or statistic’d around.
We are not talking probabilities in straight arithmetic — the answers are certain if no arithmetical error has been made. 1 + 1 is ALWAYS 2. 10 ÷ 2 is ALWAYS 5. There is no uncertainty about these results.
I will show in the next essay how Central Limit Theorem confuses things and takes some things totally out of whack when applied inappropriately.
It is totally inappropriate in this simple arithmetic problem.
Had we been talking statistics, probabilities, etc your definitions and formulas would of course be correct — just not in the cases I use as examples — which are real world CliSci examples in which they are inappropriate as well.
Kip, I’m gonna pass. You comparing me to a sixth grader is a non-starter. Come back when you can leave out the insults.
w.
Good call Willis. I know we don’t see eye to eye very much but I absolutely agree with you on that comment.
Kip, consider there are intelligent knowledgeable people commenting on this thread and some of your tone and comments are…less than helpful in making your case.
Kip,
The rounding to whole degrees and the resulting ±0.5 C is a component of the total uncertainty characterized by a uniform or rectangular distribution. That is a distribution in which all values are equally likely. For example, 0 C is just as likely as -0.5 or +0.5 C.
So for the scenario of 1 + 1 the answer is 2 but the uncertainty is sqrt((0.5/sqrt(3))^2 + (0.5/sqrt(3))^2) = 0.4. We state this in standard form as 2 ± 0.4. Note that the divide by sqrt(3) is the shortcut for the standard deviation of a rectangular distribution. Anyway, I encourage you to confirm this with the NIST uncertainty machine. Select two input quantities (x0 and x1). Let x0 and x1 both be rectangular distribution with endpoints 0.5 and 1.5. Then set the output quantity to x0+x1 and click Run.
The salient point is that your 0.5 uncertainty in this case is inherently probabilistic. Just because it is a rectangular uncertainty and not a normal uncertainty in no way makes it any less probabilistic.
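The same combination can also be checked with a few lines of Monte Carlo instead of the NIST machine; a sketch in Python/numpy, using the endpoints given above:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1_000_000

# Two quantities, each 1 with a rectangular (uniform) uncertainty of +/- 0.5
x0 = rng.uniform(0.5, 1.5, N)
x1 = rng.uniform(0.5, 1.5, N)
s = x0 + x1

print(f"mean of sum: {s.mean():.3f}")                        # ~2.000
print(f"std of sum:  {s.std():.3f}")                         # ~0.408
print(f"analytic sqrt(2*(0.5**2/3)): {np.sqrt(2*(0.5**2/3)):.3f}")
```

A histogram of s, if you plot one, comes out triangular rather than rectangular, which is the point made a little further down the thread.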
bdgwx ==> Try diagramming it — use the range, add the other range, find the uncertainty of the sum of the ranges as numbers.
The NIST uncertainty machine will do the diagramming for us.
Text: 1±0.5 (rectangular) + 1±0.5 (rectangular) = 2±0.4 (standard)
Diagram: [linked plots of the two rectangular input distributions and the combined output distribution]
The configuration that you can plug into the machine is as follows: [screenshot of the NIST uncertainty machine inputs]
Kip, you are not realising that they are probability density distributions, not hard numbers.
When you combine two rectangular distributions of approximately equal size the result is a triangular distribution.
This is meaningless word salad. A combined uncertainty has NO distribution associated with it.
It depends. Generally, uncorrelated measurement uncertainties are added in quadrature. They are added if correlated. If one is looking to establish an upper-bound, then simple addition is the best choice.
Kip, saying that observing a thing twice makes you more uncertain about it than if you had observed it only once is so conceptually nonsensical that it should put a stop to the discussion right there. If you think this your understanding is flatly wrong.
The mean of a series of measurements of some quantity is the best estimate of the value of the quantity and our best estimate becomes more certain the more times we measure it (assuming that the errors in our measurements are random). If we measured it infinitely many times there would be no uncertainty at all.
AlanJ December 10, 2022 12:17 pm
Well, kinda.
Suppose we have an object that is laser-measured to be exactly 110.31465 mm in length.
We ask one person to measure it with a ruler marked in mm. They say “a bit more than 110 mm, call it 110.2”.
We then repeat that process using 100 different people. We average all their answers and we get say 110.25163. Better than a single measurement.
BUT, even if we measure it an infinite number of times, we cannot get to “no uncertainty at all”. We are limited by the precision of our measuring device, which is only marked every one mm.
My real-world rule of thumb is that repeated measurements can buy us an extra decimal of accuracy beyond the units of our measuring device, but not more than that. Given that, I think we can use averages to get e.g. individual station average max temperature to the nearest tenth of a °C, but not more than that.
But hey, I was born yesterday, what do I know?
w.
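For what it's worth, Willis's ruler thought experiment is easy to simulate. This is a sketch under my own assumptions, not his: each person's eyeball error is unbiased with a standard deviation of about 0.3 mm, and the reading is recorded to one decimal place.

```python
import numpy as np

rng = np.random.default_rng(7)
true_length = 110.31465   # mm, as in the example above

def one_trial(n_people):
    # Each person eyeballs the fraction between mm marks (assumed unbiased,
    # sd ~0.3 mm) and reports to one decimal place, e.g. "110.2"
    readings = np.round(true_length + rng.normal(0, 0.3, n_people), 1)
    return readings.mean()

single  = np.array([one_trial(1)   for _ in range(2000)])
hundred = np.array([one_trial(100) for _ in range(2000)])
print(f"typical error, 1 person:   {np.abs(single  - true_length).mean():.3f} mm")
print(f"typical error, 100 people: {np.abs(hundred - true_length).mean():.3f} mm")
```

On those assumptions the 100-person average is typically good to a few hundredths of a millimetre, roughly the "one extra decimal" rule of thumb; whether averaging keeps paying off beyond that depends on quantization and per-person bias, which is exactly what is in dispute in this thread.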
If you have one board 2′ +/- 1″ and a second board that is 10′ +/- 2″ and you join them to reach over a span what does their average tell you and what is the uncertainty of that average? How does that increase the resolution of your measurement?
Assuming the uncertainties are normally distributed, the length of them when joined together is 12′ +/- SQRT(1^2 + 2^2).
The uncertainty is 2.24 inches (at the same level of confidence as your individual uncertainties)
If the uncertainties are ranges with rectangular distributions, you would divide them by SQRT(3) before doing the above sum.
The average is of no use in this particular case and I don’t know what you mean.
w. ==> Yeah, what do you know? Which of the CliSci measurements, for instance, is measuring the same thing multiple times? Temperature? Sea Level? Humidity? Any weather station metric?
Answer: none of the above. They are all measurements of differing times and air volumes; the values change between measurements…the only “same thing” is the location, which is not a metric at all and doesn’t count.
The rules that apply to multiple measurements of the same thing are the ones that allow us to use the “law of large numbers” or other statistical approaches (the Central Limit Theorem).
Misapply either of those two (LLN or CLT) and one gets non-physical results.
Now, sometimes one wants those non-physical results for some purpose — say a mean of apples and oranges — in which case — have at it. But at your own risk.
“The rules that apply to multiple measurement of the same thing, which allows us to use the “law of large numbers” or other statistical approaches (Central Limit Theorem).”
Where does it say that the CLT only applies to measurements of the same thing? The CLT applies to any sample of random variables. This usually is used for samples taken at random from a population. The elements in the population do not have to be the same thing.
Thanks, Bellman. If I hadn’t been so pissed at Kip for accusing me of being dumber than a sixth grader I’d have pointed that out.
w.
w. ==> No personal insult intended — just the point that the arithmetic involved is at the sixth grade level. No need for higher maths, no need for statistics — just addition and division.
If you weren’t intending an insult, you’ve failed totally. However, your backhanded apology is accepted. Thanks.
In any case, here’s a question for you.
Suppose we want to know the 2021 average maximum daily temperature at some fixed spot. We can get an exact answer (± some instrumental uncertainty) by measuring it every day.
One person measures the maximum temperature in that spot on 12 randomly chosen days of the year and averages them.
Another person measures the maximum temperature on 180 randomly chosen days of the year and averages them.
Which person’s average is more likely to be closer to the actual average, and why?
w.
w. ==> Why would one want “2021 average maximum daily temperature”? That would be my first question. It is non-physical…it represents nothing real. In that sense, it doesn’t matter. …. but yes, since it is a daily high, and if we have a measurement for every day, then the average is arithmetic — simple arithmetic and “We can get an exact answer (± some instrumental uncertainty)”
My next question would be: Is that information useful in any practical sense? Is it some conceptual statistical animal?
So, with all that out of the way, the method you would use to calculate your “2021 average maximum daily temperature” without actual daily records is a model…and the model will tell you what it is told to say, which is that more values will produce a narrower variance around a central point, which is the purpose of the model. That’s how the model works.
What we will not know after all that is what the “average maximum daily high temperature” was in 2021. We will have a statistical estimate of the mean and its variance.
Only when you have all the daily values can you have the true arithmetic mean (± some instrumental uncertainty). Without the values, you have an estimate.
And if that is what you were hoping to find, you’re golden.
Whether you are then more knowledgeable about temperatures in 2021 in that fixed spot is still an open question. But you’d have a number labelled “average of daily high temperatures at Fixed Spot”!
Kip Hansen December 10, 2022 6:11 pm
To tell if it is warmer regarding max temps than say 2020 or 1920, obviously.
I asked a ways upthread what you meant by “non-physical”, no answer. No average is “physical”, it’s a number … and?
Of course an average can be very useful, whether of temperature or any other value.
You’ll have to define what you are calling a “model”, and what difference it makes to call something a “model” rather than calling it “a simple average”, for that to make sense.
Not true in the slightest. It is an EXPERIMENT, not a model. Take 12 values and average them. Take 180 values and average them. Which one is closest to the actual mean? Repeat it 100 times. What do you find out?
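If you want to run it rather than argue about it, here's a minimal sketch. The “year” of Tmax values is synthetic (a seasonal sine curve plus noise, chosen only so the script is self-contained):

import numpy as np

rng = np.random.default_rng(7)

# Synthetic year of daily Tmax values: seasonal cycle plus day-to-day noise (assumed).
days = np.arange(365)
tmax = 15 + 12 * np.sin(2 * np.pi * (days - 100) / 365) + rng.normal(0, 3, 365)
true_mean = tmax.mean()

def trial(n_days):
    # Average Tmax over n_days randomly chosen days; return error vs the full-year mean.
    sample = rng.choice(tmax, size=n_days, replace=False)
    return abs(sample.mean() - true_mean)

errors_12  = [trial(12)  for _ in range(100)]
errors_180 = [trial(180) for _ in range(100)]

print("typical error, 12 days: ", round(np.mean(errors_12), 2), "deg C")
print("typical error, 180 days:", round(np.mean(errors_180), 2), "deg C")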
Hey, I thought you were the one that was claiming that the law of large numbers doesn’t work with temperature measurements because they’re not measuring the same thing at the same time … but now you say it does work.
Which one is it?
Thanks,
w.
w. ==> “To tell if it is warmer regarding max temps than say 2020 or 1920, obviously.” But you won’t know that — the average won’t tell you that. It will only tell you if your “just a number” average is larger or smaller. Now, such a senseless result may be of interest to some, but it is not a real thing in the physical universe, as you say, it’s just a number.
For instance, it tells nothing about the amount of heat energy found in the air at 2 meters height at Fixed Place, Nebraska.
I’m not all in on the usefulness of averages for averages’ sake.
You started this with a question about sampling the Tmax and finding an average then finding more Tmaxs and averaging those — and throwing those into a distribution etc etc. That’s the model.
That process (which is a model) produces the same result using entirely random numbers over the same range as the measurements. The measurements don’t matter at all in that process/model, only the range. Terrifically useful! It saves us all that effort of taking measurements.
“Hey, I thought you were the one that was claiming that the law of large numbers doesn’t work with temperature measurements because they’re not measuring the same thing at the same time … ” It doesn’t work for real things — you can get the answer you are looking for by plugging in random numbers — no real measurements need apply.
A model that works for anything at all cannot be considered to be working.
Waaah!
Bell ==> The Central Limit Theorem doesn’t have the same limitations as the Law of Large Numbers, but it has its own set of limitations.
“The elements in the population do not have to be the same thing.” I have serious doubts that the Central Limit Theorem can be applied to data sets of heterogeneous measurements — say the weight of pet rabbits in New Zealand mixed with the number of honey bees per hive in Australia mixed with the BMIs of sixth graders in the United States.
Maybe that’s not what you meant….
In any case, the CLT is not a universally applicable tool. Hang in there, that’s my next post in this series.
“Maybe that’s not what you meant…”
Of course that’s not what I meant. You can’t average things that have different dimensions, and even if you could average the weight of rabbits with the number of bees the average would be meaningless.
This is the sort of nonsensical argument that always seems to be used by people who just don’t want to believe that statistics can have any use. Point to some meaningless average, the masses of planets in the solar system, or even phone numbers, and claim that because those averages are meaningless, then so must be all averages.
Bellman ==> I tease to prompt you to explain what you do mean when you say:
“The elements in the population do not have to be the same thing.”
What parameters exist for appropriate use of the CLT?
“I tease to prompt you to explain what you do mean when you say:“The elements in the population do not have to be the same thing.”“
I mean they don’t have to be the same thing. If you want a philosophical argument about when things are or are not the same thing you need to ask those who insist there is a whole different set of uncertainty calculations for measuring the same versus different things.
With regard to the central limit theorems, I think the requirement is generally that you are summing independent random variables from the same population. That would usually mean they are the same type of thing but different. E.g. if you are taking people’s heights, each height is from a different person, but they are all heights. It makes no sense to add people’s heights to the heights of buildings, and even less sense to add heights to weights, but I guess the CLT would still work; it’s just a meaningless sum.
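If a concrete illustration helps, here's a minimal sketch: independent draws from one (deliberately skewed) population, with the distribution of the sample means examined:

import numpy as np

rng = np.random.default_rng(3)

# One population, deliberately skewed (exponential). Each sample is n independent
# draws from this same population -- the usual setting for the CLT.
population_scale = 2.0
n = 50
n_samples = 10_000

sample_means = rng.exponential(population_scale, size=(n_samples, n)).mean(axis=1)

# The population is skewed, but the sample means cluster symmetrically
# around the population mean with spread roughly scale / sqrt(n).
print("population mean:", population_scale)
print("mean of sample means:", round(sample_means.mean(), 3))
print("spread of sample means:", round(sample_means.std(), 3),
      "(compare", round(population_scale / np.sqrt(n), 3), ")")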
Of what value is it to average temperatures with wind speed? Just because something can be done doesn’t mean that it is appropriate or has any value.
That was what I was saying. Just because you can average things doesn’t mean the result is automatically meaningful. And if the things you are averaging use different units it’s obvious you can’t average them. What would be the unit of an average of °C and km / h?
But some think it’s meaningless, or that statistics don’t apply if you average “different things” by which they mean anything that is not the same object. You can average multiple measurements of the length of the same piece of wood, but can’t average the lengths of two different pieces of wood.
“You can average multiple measurements of the length of the same piece of wood, but can’t average the lengths of two different pieces of wood.”
Welcome to the club of metrology!
You can’t average temperatures taken at different times either, they are just like two different pieces of wood.
Averaging measurements of the same thing is done to get the best estimate of the true value of that “same thing”. When you are averaging measurements of different things there can’t be a “true value” so the averaging tells you literally nothing in the real world.
You *can* average the heights of a population of a population. But you don’t get a “true value” that allows you to order one size of T-shirt to fit everyone. Nor does the average fully describe the population, you also need at least one of range, variance, or standard deviation of the population. Range is a primary descriptor because it is part of every distribution, be it skewed, multi-modal, or anything else.
Funny how you *NEVER* see any of these in regard to temperature.
“Welcome to the club of metrology!”
You are quoting me out of context. I was clearly saying that’s what some here believe, not what I think.
“You can’t average temperatures taken at different times either, they are just like two different pieces of wood. ”
Sure you can. It’s just arithmetic, or are you one of Kip’s arithmetic deniers?
“When you are averaging measurements of different things there can’t be a “true value” so the averaging tells you literally nothing in the real world. ”
You really need to define what you think a “true value” is. As I keep trying to explain to you, the purpose of taking a mean is to get the best estimate of the true mean value. Just because you can’t hold it in your hand does not make it a false value.
Then you need to explain what you mean by “literally” and “nothing”. Because I can think of many examples where the average of different things literally tells you something about the real world.
“You *can* average the heights of a population of a population. ”
What do you mean by a population of a population? A population made up of sub-populations?
“But you don’t get a “true value” that allows you to order one size of T-shirt to fit everyone.”
There’s no such thing as a t-shirt that fits everyone. Why would you think an average would give you one. Would measuring a single person give you a t-shirt that fits everyone either? If you are only allowed one size of t-shirt, which would be the best estimate for the size that would fit most people, an average size of the population, or a size based on the first person you pulled off the street?
“Nor does the average fully describe the population, you also need at least one of range, variance, or standard deviation of the population.”
Nobody says the average fully describes the population. You are falling back on your strawmen again.
Free clue—Kip is using combined temperature measurement uncertainties. This is why he states that the values are already known.
He’s using interval arithmetic and claiming anyone who uses statistics is wrong.
Blindly applying the propagation equation to an average in order to reach the lowest possible numbers is wrong. Jim has used the GUM innumerable times to tell you this, yet you persist in believing things that are not true.
Your choice.
All I’ve been trying to do when arguing with you, Tim, and Jim, is to point out that it is simply wrong to calculate the absolute uncertainty of the sum and claim that it is the absolute uncertainty of the mean. That therefore uncertainties do not increase with sample size.
I’ve also pointed out that in general, using the correct equations the uncertainties will decrease with sample size, precisely because as Kip says you have to divide the uncertainty of the sum by the sample size.
But I’ve also said to you on many occasions that this does not mean real world uncertainties can be reduced to zero with infinite sample size. There will always be systematic errors, and other real-world issues.
The point of these never-ending discussions is not about how to exactly calculate real-world temperature data sets. It’s just to explain why you keep getting the wrong results because you don’t understand your own equations. No matter which method you use, there is no way the uncertainty of the mean grows with sample size.
u(mean) / mean = u(sum) / sum
does not mean that
u(mean) = u(sum)
no matter how many times Tim says it does.
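A tiny numeric check of that algebra, with made-up numbers purely for illustration:

# Made-up illustration: N = 10 values, sum = 100, mean = 10,
# with u(sum) = 5 taken as given.
N, total, u_sum = 10, 100.0, 5.0
mean = total / N

u_mean = u_sum * mean / total      # from u(mean)/mean = u(sum)/sum
print(u_mean)                      # 0.5 -> equals u(sum)/N, not u(sum)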
And you are still wrong, averaging different things does not reduce uncertainty.
Never mind for now if it can reduce uncertainty. I’m asking you if you think it will inevitably increase uncertainty, with the larger the sample size the less certainty. That’s what I’ve been arguing with Tim et al for the past two years.
I’ll try again. Do you personally agree or disagree that you can take the uncertainty of the sum to be the uncertainty of the average?
Look at GUM, Annex H, Equation H.36.
If you have ONE measurement each of different things, then for every individual element you add, the uncertainty will grow.
H.36
u_total^2 = s_avg^2(x1)/m + s_avg^2(x2)/n
This becomes for a single measurement of each object
u_total^2 = u(x1)^2/1 + u(x2)^2/1 + …
Every individual measurement you add grows the total uncertainty.
Are you now going to tell us that the GUM is wrong?
What are you on about now? The equation
u_total^2 = u(x1)^2/1 + u(x2)^2/1 + …..
Is just saying to get the uncertainty of the total, add the individual uncertainties in quadrature. Just as we’ve always said. The total uncertainty grows as you add more items, and as always you fail to see the step that happens when you convert the total to a mean and divide by the number of measurements.
“Is just saying to get the uncertainty of the total”
THAT IS THE WHOLE POINT!
You keep trying to push the idea that the AVERAGE UNCERTAINTY is the uncertainty of the average!
It’s *NOT*. When you divide by a constant it adds nothing to the uncertainty nor does it reduce it in any way.
The uncertainty of the average *is* the uncertainty of the total.
If q = x + y then the uncertainty of q is not [u(x) + u(y)]/2!
The uncertainty of q = x + y is u(q) = u(x) + u(y)!
(or root-sum-square if it is assumed that some cancellation of uncertainty exists).
Again: Average uncertainty is meaningless!
if q_avg = (x + y + z + w)/4
the uncertainty of q_avg is u(x) + u(y) + u(z) + u(w) + u(4)
Since the uncertainty of 4 is zero the uncertainty of the average is the *sum* of all the other uncertainties!
u(q) = u(q_avg) = u(x) + u(y) + u(z) + u(w)
Convert it to root-sum-square if you want but u(q) will still equal u(q_avg).
This is all right out of Taylor, Chapter 3 and the GUM Eq H.36.
“THAT IS THE WHOLE POINT!”
Amazing. I’ve just responded to a comment where you denied you had ever said the uncertainty of the mean is the uncertainty of the sum and now you say that’s the whole point. I’m not sure you understand what you are saying at this point.
“You keep trying to push the idea that the AVERAGE UNCERTAINTY is the uncertainty of the average!”
It’s pointless arguing with you in this state. Half your comments now are just you yelling variations of this, and it doesn’t matter how many times I’ve explained to you why this isn’t true, you’ll just keep repeating it.
“It’s *NOT*. When you divide by a constant it adds nothing to the uncertainty nor does it reduce it in any way.”
So ask Kip to explain to you why this is wrong. You never take any notice when I do.
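For anyone who would rather test this numerically than take either of us at our word, here is a minimal Monte Carlo sketch of the q_avg = (x + y + z + w)/4 example. It assumes independent random errors with equal standard uncertainties; correlated or systematic errors would behave differently:

import numpy as np

rng = np.random.default_rng(0)
u = 0.5            # assumed standard uncertainty of each of x, y, z, w
n_trials = 1_000_000

# Simulated errors of four independent measurements.
errors = rng.normal(0, u, size=(n_trials, 4))

spread_sum  = errors.sum(axis=1).std()    # spread of q = x + y + z + w
spread_mean = errors.mean(axis=1).std()   # spread of q_avg = (x + y + z + w) / 4

print("spread of the sum: ", round(spread_sum, 3))    # ~ u * sqrt(4) = 1.0
print("spread of the mean:", round(spread_mean, 3))   # ~ u / sqrt(4) = 0.25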
“Amazing. I’ve just responded to a comment where you denied you had ever said the uncertainty of the mean is the uncertainty of the sum and now you say that’s the whole point. I’m not sure you understand what you are saying at this point.”
I have ALWAYS said the uncertainty of the mean is *NOT* the average uncertainty, it is the uncertainty of the sum!
Put down the bottle!
“So ask Kip to explain to you why this is wrong. You never take any notice when I do.”
I assure you that Kip will agree with me.
Again:
q = x – y
u(q) = u(x) + u(y)
q(avg) = (x + y) / 2
u(q_avg) = u(x) + u(y) + u(2) = u(x) + u(y)
Thus u(q) = u(q_avg)
u(avg) = [ u(x) + u(y) ] / 2
Thus u(avg) ≠ u(q_avg)
The average uncertainty is *NOT* the uncertainty of the average.
When is this ever going to sink in?
“I assure you that Kip will agree with me. ”
It doesn’t matter what you showed. The average uncertainty is not the uncertainty of the average.
I’ve shown you that and you haven’t refuted the math in any way shape or form. All you’ve done is said I should have used relative uncertainty instead of just uncertainty. That does *NOT* mean that the average uncertainty equals the uncertainty of the average.
Bullshit. You want to divide everything by root(N) so you can get those impossibly tiny numbers.
Are you really this dense? Or is this all an act?
And thus applying the GUM propagation equation to the average formula is inappropriate!
You were the one who told me I had to use that formula. If it’s now inappropriate, which formula should I use.
Use GUM, Eq H.36.
That’s about subtracting one value from another. It’s the uncertainty of the sum not the average. And it’s saying to add in quadrature.
Rule 1: Uncertainties add whether you are subtracting or adding!
So ADD the uncertainties in quadrature. All that means is that the uncertainty doesn’t grow as fast as with direct addition.
You just keep digging yourself deeper and deeper into your delusional box. You’ve now been reduced to saying that uncertainties don’t add when you subtract.
Just how deep are you going to dig before you stop?
Again:
q = x – y
u(q) = u(x) + u(y)
q(avg) = (x + y) / 2
u(q_avg) = u(x) + u(y) + u(2) = u(x) + u(y)
Thus u(q) = u(q_avg)
u(avg) = [ u(x) + u(y) ] / 2
Thus u(avg) ≠ u(q_avg)
The average uncertainty is *NOT* the uncertainty of the average.
Keep digging. You are only making a fool of yourself.
I refer the honorable member to my previous answer, and the 10000 other times I’ve tried to explain it to you.
Two wrongs don’t make a right. It doesn’t matter how many times you’ve posted incorrect information, it still remains incorrect.
until you can show my math to be wrong your are just whining.
The problem is it doesn’t matter how many times I show you are wrong. The argument continues until you understand why you are wrong – and that will never happen because you are incapable of believing you could be wrong.
And there’s no way of demonstrating to you that you are wrong, because every experiment I suggest is just dismissed. You have created the perfect unfalsifiable hypothesis. You must be right because no one can prove you wrong, and no one can prove you wrong because you are right.
Not quite. It means the uncertainty of the average *is* the uncertainty of the sum. u(q) = u(q_avg).
u(q_avg) ≠ u(q) / N
u(average) = u(q) / N
u(q_avg) ≠ u(average)
u(q)/N is the average uncertainty. It is taking the uncertainty of the measurements and making them all equal. You still get the same total uncertainty.
It is *still* not the uncertainty of the average. The term “average uncertainty” should be banned in science.
Along with “standard error”.
And absolutely no understanding that the “standard error” is a measure of how close you are to the population mean and *NOT* a measure of the uncertainty of the population mean!
“…a measure of how close you are to the population mean and *NOT* a measure of the uncertainty of the population mean!”
Really not sure how you define uncertainty of the mean, except by how close it’s likely to be to the population mean.
Is the population mean always 100% accurate?
If yes, then you’ve learned nothing about the propagation of uncertainty.
If no, then why does it matter how close you are to the population mean? You are just getting close to an inaccurate value!
“Is the population mean always 100% accurate?”
Yes. By definition it’s the true value of the thing we are trying to find.
“If yes, then you’ve learned nothing about the propagation of uncertainty.”
The population mean does not include uncertainties. It’s the mean of all true values in the population.
Where do you get these “true values”? In the looking glass?
A darker place I think.
Indeed.
The population mean is *ONLY* an estimate of true value, and ONLY when the restrictions are met: 1) symmetric distribution, 2) totally random error.
If there is any systematic bias then the average is not a true value. If the population consists of independent, single measurements of different things then there is no guarantee that either of the two restrictions above are met. I.e. the average can *NOT* be assumed to be a true value.
How many examples must be provided to you before you understand this?
The average of a 2′ board and a 10′ board first doesn’t exist physically and second has measurement uncertainty that is not captured by the average of the stated values.
“The population mean does not include uncertainties. It’s the mean of all true values in the population.”
Again – PUT DOWN THE BOTTLE!
If the mean of the stated values is always a true value then why do people worry about measurement uncertainty at all? Just leave the measurement uncertainty totally out and use the stated values for everything.
I know that is what statisticians and climate scientists do. Believe me, engineers do not. Nor do most physical scientists.
“The population mean is *ONLY* an estimate of true value”
If the population mean is not the true value, what on earth do you think is the true value?
“If the population consists of independent, single measurements of different things then there is no guarantee that either of the two restrictions above are met.”
I think this may just be down to terminology. The population is not something you are measuring. You are only measuring a sample. The population consists of everything in the population, which may be a physical thing like all the trees in a forest or all the people in the world, but is better thought of as an abstract set representing all possible things that could exist in the population. Any sample is an imperfect estimate of the population. Imperfect both because it’s a random sample and because the measurements will be uncertain.
“How many examples must be provided to you before you understand this?”
Please, not another of your idiotic examples, which prove nothing.
“The average of a 2′ board and a 10′ board first doesn’t exist physically and second has measurement uncertainty that is not captured by the average of the stated values.”
Really? You’re saying 6′ boards don’t exist? Not that it has anything to do with what you were just saying.
“Again – PUT DOWN THE BOTTLE!”
But I want my milk!
“If the mean of the stated values is always a true value then why do people worry about measurement uncertainty at all?”
If you ever thought about what you were saying you might not have to keep asking so many dumb questions. I am saying the population mean has no uncertainties. But you don’t know what that is, so you take a sample and the mean of that sample is an imperfect, uncertain estimate of the true population mean.
Yowza, back to uncertainty 101, again?
True values are unknowable (except perhaps in climastrology).
That’s my point. The true population mean is unknowable. That’s why you have to estimate it from a sample.
Oh, MALARKY!
There were 1500 2-dr sedan Chevy Belairs made in 1957. That’s the entire population. It would certainly have been possible to measure something on each one of them and calculate a population mean. For example, wheel horsepower and torque, clutch free travel, etc.
Even the GUM says exactly what you just said!
It looks like you’re considering discrete values and Tim is discussing continuous.
Discrete values don’t have uncertainties, but measurements from a continuum are limited by the resolution of the measuring instrument.
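A small sketch of that distinction, with made-up numbers (the 1-unit resolution is just an assumption for illustration):

import numpy as np

rng = np.random.default_rng(5)

# Discrete attribute (e.g. number of doors): counting it for every member
# of the population gives the population mean exactly.
doors = rng.integers(2, 5, size=1500)
print("exact population mean of a counted quantity:", doors.mean())

# Continuous attribute: even measuring every member, each reading is only
# known to the instrument's resolution (assumed 1 unit here).
true_values = rng.normal(150.0, 5.0, size=1500)
readings = np.round(true_values)          # resolution-limited readings
print("census mean of readings:   ", round(readings.mean(), 3))
print("census mean of true values:", round(true_values.mean(), 3))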
“If the population mean is not the true value, what on earth do you think is the true value?”
If the population average is the true value then why do you say the average uncertainty is the uncertainty of the average? The “true value” should have no uncertainty, right?
“The population is not something you are measuring.”
The population is made up of measurements.
“Any sample is an imperfect estimate of the population. Imperfect both because it’s a random sample and because the measurements will be uncertain.”
Correct. But that means as a corollary that the average uncertainty is *not* the uncertainty of the mean. The standard deviation of the sample means is only an indicator of how close you are to the population mean. But if the measurements are uncertain then how can getting close to the mean imply anything about the uncertainty of that mean?
“Really? You’re saying 6′ boards don’t exist? Not that it has anything to do with what you were just saying.”
NO. 6′ boards don’t exist in your sample. So you’ve calculated something that doesn’t exist. The average value of a six sided die is 3.5. Can you roll a 3.5? Does 3.5 exist in the physical world?
“I am saying the population mean has no uncertainties.”
I’ll ask again. If the population mean has no uncertainties then why do you keep talking about the average uncertainty?
“If the population average is the true value then why do you say the average uncertainty is the uncertainty of the average? ”
I don’t say it. The voices in your head say it.
“The “true value” should have no uncertainty, right?”
That might depend on what definition of uncertainty you are using.
But what I’m saying is you are wrong to claim “The population mean is *ONLY* an estimate of true value”.
“The population is made up of measurements.”
No it isn’t. At least not how I see it.
“But that means as a corollary that the average uncertainty is *not* the uncertainty of the mean.”
The same lie twice in one comment – you really need help.
“The standard deviation of the sample means is only an indicator of how close you are to the population mean.”
Hence an indicator of the uncertainty.
“But if the measurements are uncertain then how can getting close to the mean imply anything about the uncertainty of that mean?”
The closer your sample mean is likely to be to the population mean the less uncertainty. Or, to use the GUM’s definition, the confidence interval characterizes the dispersion of values that could reasonably be attributed to the population mean.
“I’ll ask again. If the population mean has no uncertainties then why do you keep talking about the average uncertainty?”
Strike three. The only one who keeps talking about the average uncertainty is you.
Find a good neurosurgeon and have him/her remove the statistics sampling tumor that seems to keep growing larger and larger.
The true population mean is knowable if you can use a census. Samples are far less difficult, and, within limits, provide a “near enough” estimate.
Switching this back to a global average temperature. The population is the grid mesh. In many cases (like HadCRUTv5, BEST, ERA, etc.) each grid cell has a value. In that case the average is the population mean since all cells are included. The problem is that each cell has its own uncertainty that propagates into the population mean. It is a case where the population mean is not known for certain.
For the global average surface temperature, the grid mesh is a sample, or, if you squint hard enough, a very low spatial resolution population.
I don’t know what would be a sufficiently high resolution, but probably on the order of 10km spatial separation.
Yes, each reading taken from a continuum is a spot estimate of the value, subject to resolution bounds, so the population mean is also subject to resolution bounds.
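To put rough numbers on the grid-mesh point above, here is a hedged sketch (illustrative per-cell uncertainties, area weighting ignored) of the two limiting cases: per-cell errors fully uncorrelated versus fully correlated:

import numpy as np

# Assumed per-cell standard uncertainties for an illustrative 36 x 72 grid mesh.
n_cells = 36 * 72
u_cell = np.full(n_cells, 0.5)     # 0.5 deg C per cell, purely illustrative

# Limiting case 1: cell errors uncorrelated -> quadrature sum, divided by N.
u_mean_uncorrelated = np.sqrt(np.sum(u_cell**2)) / n_cells

# Limiting case 2: cell errors fully correlated -> they do not cancel at all.
u_mean_correlated = np.mean(u_cell)

print("uncorrelated limit:", round(u_mean_uncorrelated, 4), "deg C")
print("fully correlated limit:", round(u_mean_correlated, 4), "deg C")

Where real grid-cell errors fall between those two limits is exactly the point in dispute.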
This is a good straight line to feed into some pedantry (who, me? Pedantic?). A population is the full set of the items of interest. If that is “every thermometer on Earth,” that’s what it is, and every subset is a sample. A corollary is that if any of those thermometers changes, you have a different population. There may be a very broad overlap, but they are different populations.
Another pedantic point is that of the widely used measures of centrality, the mode(s) and median (if we aren’t being lazy) are members of the population (or sample), but the arithmetic mean almost certainly isn’t. The arithmetic mean is the ratio of the sum of values over the number of values, so the SigFig rules don’t strictly apply. Uncertainty bounds do, but not the significant digits.
</pedantry>
None, Zip, Zero, Nada, to paraphrase Rush Limbaugh.
I miss him too. Think of how different his views would have been if his 4th try at rehab woulda’ took.
“Kip, saying that observing a thing twice makes you more uncertain about it than if you had observed it only once is so conceptually nonsensical “
Alan, you are *NOT* observing the same thing twice. You are only observing it once. Tmax is one measurand. Tmin is a second, different measurand.
Therefore you are *NOT* building a random distribution of measurements around a mean associated with a single measurand.
AlanJ ==> ADDING two values which each have a given absolute uncertainty (see essay for the definition) results in a sum that is less certain than the two values added. You can draw an illustration to prove it to yourself.
No one, certainly not me, said “observing a thing twice makes you more uncertain about it than if you had observed it only once”.
Adding measurements and dividing them by the number of measurements is the process of finding an arithmetic mean. If these measurements have absolute uncertainty values, one must know how to add and divide the sums, which is the subject of this essay.
It would be terrific if we could measure the temperature at a single weather station a thousand times at each instant…boy, then we could have some really good records. But we don’t; we measure different times, different volumes of air, at each measurement. Worse, most of the historical record gives only whole-degree recordings…or values rounded to the nearest degree, which means we only know the temperature to within a range of a whole degree (or the given value +/- 0.5°).
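For anyone who wants to see the arithmetic spelled out, here is a minimal sketch of the procedure as described in the essay: whole-degree records treated as value +/- 0.5°, the absolute uncertainties added for the sum, and both the sum and its uncertainty divided by the count (the values here are made up):

# Whole-degree temperature records, each carrying +/- 0.5 degrees of original
# measurement uncertainty (illustrative values only).
records = [21.0, 23.0, 19.0, 22.0]
u_each = 0.5

total = sum(records)
u_total = u_each * len(records)        # absolute uncertainties add: 4 x 0.5 = 2.0

mean = total / len(records)
u_mean = u_total / len(records)        # the sum and its uncertainty are both divided by the count

print(f"mean = {mean:.2f} +/- {u_mean:.1f} degrees")   # the mean keeps the original +/- 0.5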
Having a more precise measurement does not mean we have more information. If a bug flies into the Stevenson screen and impacts the temperature sensor with its wings, at the 6th digit to the right of the decimal point, that doesn’t mean we have any more insight on the causes or consequences of long-term climate changes. There is an optimal level of accuracy and precision for answering the question. However, I don’t think any of the alarmists have given much thought to just what the optimum is. They are trying to make do with a data collection system that wasn’t intended for climatology.
Or if a mud dauber wasp builds a nest on the air intake opening. Or if cottonwood tree spring fluff gets into the air intake and/or onto the sensor itself. Or any of a myriad of other things.
I’m reminded of the old joke: “A man who owns one watch always knows what time it is. A man with two watches is never sure of the correct time.”
Good one!