Guest Essay by Kip Hansen —10 December 2022

“In mathematics, the ± sign [or more easily, +/-] is used when we have to show the two possibilities of the desired value, one that can be obtained by addition and the other by subtraction. [It] means there are two possible answers of the initial value. In science it is significantly used to show the standard deviation, experimental errors and measurement errors.” [ source ] While this is a good explanation, it is not entirely correct. It isn’t that there are two possible answers, it is that the answer could be as much as or as little as the “two possible values of the initial value” – between the one with the absolute uncertainty added and the one with the absolute uncertainty subtracted.
[ Long Essay Warning: This is 3300 words – you might save it for when you have time to read it in its entirety – with a comforting beverage in your favorite chair in front of the fireplace or heater.]
When it appears as “2.5 +/- 0.5 cm”, it indicates that the central value “2.5” is not necessarily the actual value, but rather that the value (the true or correct value) lies between “2.5 + 0.5” and “2.5 – 0.5”, or, fully stated, “The value lies between 2 cm and 3 cm”. This is often noted to be true to a certain percentage of probability, such as 90% or 95% (90% or 95% confidence intervals). The rub is that the actual, accurate, precise value is not known; it is uncertain. We can only correctly state that the value lies somewhere in that range, and even then only “most of the time”. If the answer is given to 95% probability, then 1 out of 20 times the value might not lie within the upper and lower limits of the range, and if to 90% certainty, then 1 out of 10 times the true value may well lie outside the range.
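For readers who like to see that spelled out in code, here is a minimal sketch (in Python, with a made-up helper name) of what the notation states: the central value plus the absolute uncertainty gives the upper limit of the range, and minus gives the lower limit.

```python
def as_range(value, uncertainty):
    """Turn 'value +/- uncertainty' into its (lower, upper) limits."""
    return (value - uncertainty, value + uncertainty)

# "2.5 +/- 0.5 cm" states a range, not a single number:
low, high = as_range(2.5, 0.5)
print(f"The value lies between {low} cm and {high} cm")  # between 2.0 cm and 3.0 cm
```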
This is important. When dealing with measurements in the physical world, the moment the word “uncertainty” is used, and especially in science, a vast topic has been condensed into a single word. And, a lot of confusion.
Many of the metrics presented in many scientific fields are offered as averages, as arithmetic or probabilistic averages (usually ‘means’). And thus, when any indication of uncertainty or error is included, it is often not the uncertainty of the metric’s mean value, but the uncertainty of the mean of the values. This oddity alone is responsible for a lot of the confusion in science.
That sounds funny, doesn’t it? But there is a difference that becomes important. The mean value of a set of measurements is given by the formula:

mean = (x₁ + x₂ + … + xₙ) / n    (add the n observations, then divide the total by n)
So the average, the arithmetic mean, by that formula itself carries with it the uncertainty of the original measurements (observations). If the original observations look like this: 2 cm +/- 0.5 cm, then the value of the mean will have the same form: 1.7 cm +/- the uncertainty. We’ll see how this is properly calculated below.
In modern science, there has developed a tendency to substitute for that the “uncertainty of the mean”, with a different definition, something like “how certain are we that that value IS the mean?”. Again, more on this later.
Example: Measurements of high school football fields, made rather roughly to the nearest foot or two (0.3 to 0.6 meters), say by counting the yardline tick marks on the field’s edge, give a real measurement uncertainty of +/- 24 inches. Some would average many such measurements of many high school football fields, made by a similar process, and report the uncertainty of the mean as reduced to a few inches. This may seem trivial, but it is not. And it is not rare; it is more often the standard. The pretense that the measurement uncertainty (sometimes stated as original measurement error) can be reduced by an entire order of magnitude by restating it as the “uncertainty of the mean” is a poor excuse for science. If one needs to know how certain we are about the sizes of those football fields, then we need to know the real original measurement uncertainty.
The trick here is switching from stating the mean with its actual original measurement uncertainty (original measurement error) to stating it with the uncertainty of the mean. The new, much smaller uncertainty of the mean is a result of one of two things: 1) it is a product of division, or 2) probability (the Central Limit Theorem).
Case #1, the football field example, is an instance of a product of division. In this case, the uncertainty is no longer about the length of any of the football fields. It is only about how certain we are of the arithmetic mean, which is usually only a function of how many football fields were included in the calculation. The original measurement uncertainty has been divided by the number of fields measured, in a mockery of the Central Limit Theorem.
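As a concrete sketch (numbers invented: 25 fields, each length known only to +/- 2 feet), here is the kind of arithmetic behind the much smaller reported figure; this version uses the common standard-error recipe of dividing by the square root of the number of fields:

```python
import math

n_fields = 25                    # invented: number of fields averaged
per_field_uncertainty_ft = 2.0   # each field's length known only to +/- 2 ft

# The essay's point: every individual field is still only known to +/- 2 ft.
print(f"Per-field measurement uncertainty: +/- {per_field_uncertainty_ft} ft")

# The commonly reported 'uncertainty of the mean' shrinks with the count of fields:
reported = per_field_uncertainty_ft / math.sqrt(n_fields)
print(f"Reported 'uncertainty of the mean': +/- {reported:.1f} ft (about 5 inches)")
```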
Case #2: Probability and the Central Limit Theorem. I’ll have to leave that topic for another part in this series – so, have patience and stay tuned.
Now, maybe arithmetical means are all you are concerned about: you are not doing anything practical, you just want to know, in general, how long and wide high school football fields are. You aren’t going to actually order astro-turf to cover the field at the local high school; you just want a ball-park figure (sorry…). In that case, you can go with the mean of field sizes, which is about 57,600 sq. ft (about 5,351 sq. meters), unconcerned with the original measurement uncertainty, and then on to the mean cost of Astro-turfing a field. But since “Installation of an artificial turf football field costs between $750,000 to $1,350,000” [ source ], it is obvious that you’d better get out there with surveying-quality measurement tools and measure your desired field’s exact dimensions, including all the area around the playing field itself that you need to cover. As you can see, the cost estimates have a range of over half a million dollars.
We’d write that cost estimate as a mean with an absolute uncertainty: $1,050,000 (+/- $300,000). How much your real cost would be depends on a lot of factors. At the moment, with no further information and details, that’s what we have: the best estimate of cost is in there somewhere, between $750,000 and $1,350,000, but we don’t know where. The mean $1,050,000 is not “more accurate” or “less uncertain”. The correct answer, with the available data, is the RANGE.
Visually, this idea is easily illustrated with regards to GISTEMPv4:

The absolute uncertainty in GISTEMPv4 was supplied by Gavin Schmidt. The black trace, which is a mean value, is not the real value. The real value for the year 1880 is a range—about 287.25° +/- 0.5°. Spelled out properly, the GISTEMP in 1880 was somewhere between 286.75°C and 287.75°C. That’s all we can say. GISTEMPv4 mean for 1980, one hundred years later, still fits inside that range with the uncertainty ranges of both years overlapping by about 0.3°C; meaning it is possible that the mean temperature had not risen at all. In fact, uncertainty ranges for Global Temperature overlap until about 2014/2015.
(Correction: Embarrassingly, I have inadvertently used degrees C in the above paragraph when it should be K — which in proper notation doesn’t require a degree symbol. The values are eyeballed from the graph. Some mitigation is Gavin uses K and C in his original quote below. The graph should also be mentally adjusted to K. h/t to oldcocky! )
The quote from Gavin Schmidt on this exact point:
“But think about what happens when we try and estimate the absolute global mean temperature for, say, 2016. The climatology for 1981-2010 is 287.4±0.5K, and the anomaly for 2016 is (from GISTEMP w.r.t. that baseline) 0.56±0.05ºC. So our estimate for the absolute value is (using the first rule shown above) is 287.96±0.502K, and then using the second, that reduces to 288.0±0.5K. The same approach for 2015 gives 287.8±0.5K, and for 2014 it is 287.7±0.5K. All of which appear to be the same within the uncertainty. Thus we lose the ability to judge which year was the warmest if we only look at the absolute numbers.” [ source – repeating the link ]
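For those who want to check the arithmetic in that quote, here is a short sketch of it as I read it: Gavin's “first rule” combines the two uncertainties in quadrature (root-sum-square), and his “second” rounds the result to the precision of the larger uncertainty. Note this is his recipe, not the straight addition of uncertainties used elsewhere in this essay.

```python
import math

baseline, u_baseline = 287.4, 0.5   # K, climatology for 1981-2010
anomaly,  u_anomaly  = 0.56, 0.05   # K, 2016 anomaly w.r.t. that baseline

# "First rule": combine the two uncertainties in quadrature.
estimate = baseline + anomaly
u_estimate = math.sqrt(u_baseline**2 + u_anomaly**2)
print(f"{estimate:.2f} +/- {u_estimate:.3f} K")    # 287.96 +/- 0.502 K

# "Second rule": round to the precision justified by the uncertainty.
print(f"{estimate:.1f} +/- {u_estimate:.1f} K")    # 288.0 +/- 0.5 K
```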
To be absolutely correct, the global annual mean temperatures have far more uncertainty than is shown or admitted by Gavin Schmidt, but at least he included the known original measurement error (uncertainty) of the thermometer-based temperature record. Why is it greater than that? Because the uncertainty of a value is the cumulative uncertainty of the factors that have gone into calculating it, as we will see below (and +/- 0.5°C is only one of them).
Averaging Values that have Absolute Uncertainties
Absolute uncertainty. The uncertainty in a measured quantity is due to inherent variations in the measurement process itself. The uncertainty in a result is due to the combined and accumulated effects of these measurement uncertainties which were used in the calculation of that result. When these uncertainties are expressed in the same units as the quantity itself they are called absolute uncertainties. Uncertainty values are usually attached to the quoted value of an experimental measurement or result, one common format being: (quantity) ± (absolute uncertainty in that quantity). [ source ]
Per the formula for calculating an arithmetic mean above, first we add all the observations (measurements) and then we divide the total by the number of observations.
How do we then ADD two or more uncertain values, each with its own absolute uncertainty?
The rule is:
When you add or subtract the two (or more) values to get a final value, the absolute uncertainty [given as “+/- a numerical value”] attached to the final value is the sum of the uncertainties. [ many sources: here or here]
For example:
5.0 ± 0.1 mm + 2.0 ± 0.1 mm = 7.0 ± 0.2 mm
5.0 ± 0.1 mm – 2.0 ± 0.1 mm = 3.0 ± 0.2 mm
You see, it doesn’t matter whether you add or subtract them, the absolute uncertainties are added. This applies no matter how many items are being added or subtracted. In the above example, if 100 items (say, sea level rise at various locations), each with its own absolute measurement uncertainty of 0.1 mm, were added, then the final value would have an uncertainty of +/- 10 mm (or 1 cm).
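Here is the same rule as a few lines of Python (a minimal sketch; the combine helper is just a name I’ve made up for this example):

```python
def combine(values, uncertainties):
    """Sum the values; per the rule above, sum the absolute uncertainties too."""
    return sum(values), sum(uncertainties)

total, u = combine([5.0, 2.0], [0.1, 0.1])
print(f"{total} +/- {u:.1f} mm")      # 7.0 +/- 0.2 mm

diff, u = combine([5.0, -2.0], [0.1, 0.1])
print(f"{diff} +/- {u:.1f} mm")       # 3.0 +/- 0.2 mm (uncertainties still add)

# One hundred readings, each +/- 0.1 mm: the total carries +/- 10 mm (1 cm).
total, u = combine([5.0] * 100, [0.1] * 100)
print(f"{total} +/- {u:.0f} mm")      # 500.0 +/- 10 mm
```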
This principle is easily illustrated in a graphic:

In words: ten plus or minus one PLUS twelve plus or minus one EQUALS twenty-two plus or minus two. Ten plus or minus one really signifies the range eleven down to nine, and twelve plus or minus one signifies the range thirteen down to eleven. Adding the two higher values of the ranges, eleven and thirteen, gives twenty-four, which is twenty-two (the sum of ten and twelve on the left) plus two; adding the two lower values of the ranges, nine and eleven, gives twenty, which is twenty-two minus two. Thus our correct sum is twenty-two plus or minus two, shown at the top right.
Somewhat counter-intuitively, the same is true if one subtracts one uncertain number from another, the uncertainties (the +/-es) are added, not subtracted, giving a result (the difference) more uncertain than either the minuend (the top number) or the subtrahend (the number being subtracted from the top number). If you are not convinced, sketch out your own diagram as above for a subtraction example.
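If you would rather let a few lines of code do the sketching, here is the subtraction case worked through with the same numbers as the graphic (twelve plus or minus one, minus ten plus or minus one):

```python
# Extreme combinations of the two ranges: 12 +/- 1 minus 10 +/- 1.
minuend_lo, minuend_hi = 12 - 1, 12 + 1        # 11 .. 13
subtrahend_lo, subtrahend_hi = 10 - 1, 10 + 1  # 9 .. 11

largest = minuend_hi - subtrahend_lo    # 13 - 9  = 4
smallest = minuend_lo - subtrahend_hi   # 11 - 11 = 0

print(smallest, largest)   # 0 and 4, i.e. a difference of 2 +/- 2
```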
What are the implications of this simple mathematical fact?
When one adds (or subtracts) two values with uncertainty, one adds (or subtracts) the main values and, in either case (addition or subtraction), adds the two uncertainties (the +/-es). The uncertainty of the total (or difference) is always greater than the uncertainty of either of the original values.
How about if we multiply? And what if we divide?
If you multiply one value with absolute uncertainty by a constant (a number with no uncertainty)
The absolute uncertainty is also multiplied by the same constant.
e.g., 2 x (5.0 ± 0.1 mm) = 10.0 ± 0.2 mm
Likewise, if you wish to divide a value that has an absolute uncertainty by a constant (a number with no uncertainty), the absolute uncertainty is divided by the same amount. [ source ]
So, 10.0 mm +/- 0.2 mm divided by 2 = 5.0 mm +/- 0.1 mm.
Thus we see that the uncertainty of the arithmetical mean of the two added measurements (here we multiplied, but it is the same as adding two, or two hundred, measurements of 5.0 +/- 0.1 mm) is the same as the uncertainty in the original values, because, in this case, the uncertainty of all (both) of the measurements is the same (+/- 0.1). We need this to evaluate averaging, the finding of an arithmetical mean.
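In code, the recipe just described looks like this (a sketch assuming two hundred readings that all carry the same +/- 0.1 mm):

```python
measurements = [5.0] * 200     # two hundred readings of 5.0 mm
uncertainties = [0.1] * 200    # each reading carries +/- 0.1 mm

n = len(measurements)
mean = sum(measurements) / n                  # sum the values, divide by n
mean_uncertainty = sum(uncertainties) / n     # sum the +/-es, divide by n

print(f"{mean} +/- {mean_uncertainty:.1f} mm")   # 5.0 +/- 0.1 mm, unchanged
```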
So, now let’s see what happens when we find a mean value of some metric. I’ll use a tide gauge record, as tide gauge measurements are given in meters – they are addable (extensive property) quantities. As of October 2022, the Mean Sea Level at The Battery was 0.182 meters (182 mm, relative to the most recent Mean Sea Level datum established by NOAA CO-OPS). Notice that there is no uncertainty attached to the value. Yet even mean sea levels relative to the Sea Level datum must be uncertain to some degree. Individual tide gauge measurements have a specified uncertainty of +/- 2 cm (20 mm). (Yes, really. Feel free to read the specifications at the link.)
And yet the same specifications claim an uncertainty of only +/- 0.005 m (5 mm) for monthly means. How can this be? We just showed that adding all of the individual measurements for the month would add all the uncertainties (all the 2 cms) and then the total AND the combined uncertainty would both be divided by the number of measurements – leaving again the same 2 cm as the uncertainty attached to the mean value.
The uncertainty of the mean would not and could not be mathematically less than the uncertainty of the measurements of which it is comprised.
How have they managed to reduce the uncertainty to 25% of its real value? The clue is in the definition: they correctly label it the “uncertainty of the mean” — as in “how certain are we about the value of the arithmetical mean?” Here’s how they calculate it: [same source]
“181 one-second water level samples centered on each tenth of an hour are averaged, a three standard deviation outlier rejection test applied, the mean and standard deviation are recalculated and reported along with the number of outliers. (3 minute water level average)”
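For concreteness, here is a rough sketch in code of the procedure that specification describes (my reading of it, not NOAA’s actual software, and the sample values are invented):

```python
import random
import statistics

def three_minute_average(samples):
    """Average the one-second samples, reject outliers beyond three standard
    deviations, then recompute and report the mean, SD and outlier count."""
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)
    kept = [s for s in samples if abs(s - mean) <= 3 * sd]
    return statistics.mean(kept), statistics.stdev(kept), len(samples) - len(kept)

# 181 invented one-second water-level samples (metres) centred on a tenth-hour:
random.seed(1)
samples = [1.50 + random.gauss(0, 0.02) for _ in range(181)]
mean, sd, n_outliers = three_minute_average(samples)
print(f"mean = {mean:.3f} m, sd = {sd:.3f} m, outliers rejected = {n_outliers}")
```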
Now you see, they have ‘moved the goalposts’ and are now giving not the uncertainty of the value of the mean at all, but the “standard deviation of the mean”, where “Standard deviation is a measure of spread of numbers in a set of data from its mean value.” [ source or here ]. It is not the uncertainty of the mean. In the formula given for the arithmetic mean (a bit above), the mean is determined by a simple addition and division process. The numerical result of the formula for the absolute value (the numerical part not including the +/-) is certain—addition and division produce absolute numeric values — there is no uncertainty about that value. Neither is there any uncertainty about the numeric value of the summed uncertainties divided by the number of observations.
Let me be clear here: When one finds the mean of measurements with known absolute uncertainties, there is no uncertainty about the mean value or its absolute uncertainty. It is a simple arithmetic process.
The mean is certain. The value of the absolute uncertainty is certain. We get a result such as:
3 mm +/- 0.5 mm
Which tells us that the numeric value of the mean is a range from 3 mm plus 0.5 mm to 3 mm minus 0.5 mm or the 1 mm range: 3.5 mm to 2.5 mm.
The range cannot be further reduced to a single value with less uncertainty.
And it really is no more complex than that.
# # # # #
Author’s Comment:
I heard some sputtering and protest…But…but…but…what about the (absolutely universally applicable) Central Limit Theorem? Yes, what about it? Have you been taught that it can be applied every time one is seeking a mean and its uncertainty? Do you think that is true?
In simple pragmatic terms, I have shown above the rules for determining the mean of a value with absolute uncertainty, and shown that the correct method produces certain (not uncertain) values for both the overall value and its absolute uncertainty, and that these results represent a range.
Further along in this series, I will discuss why and under what circumstances the Central Limit Theorem shouldn’t be used at all.
Next, in Part 2, we’ll look at the cascading uncertainties of uncertainties expressed as probabilities, such as “40% chance of”.
Remember to say “to whom you are speaking”, starting your comment with their commenting handle, when addressing another commenter (or, myself). Use something like “OldDude – I think you are right….”.
Thanks for reading.
# # # # #
Epilogue and Post Script:
Readers who have tortured themselves by following all the 400+ comments below — and I assure you, I have read every single one and replied to many — can see that there has been a lot of pushback to this simplest of concepts, examples, and simple illustrations.
Much of the problem stems from what I classify as “hammer-ticians”. Folks with a fine array of hammers and a hammer for every situation. And, by gosh, if we had needed a hammer, we’d have gotten not just a hammer but a specialized hammer.
But we didn’t need a hammer for this simple work. Just a pencil, paper and a ruler.
The hammer-ticians have argued among themselves as to which hammer should have been applied to this job and how exactly to apply it.
Others have fought back against the hammer-ticians — challenging definitions taken only from the hammer-tician’s “Dictionary of Hammer-stitics” and instead suggesting using normal definitions of daily language, arithmetic and mathematics.
But this task requires no specialist definitions not given in the essay — they might as well have been arguing over what some word used in the essay would mean in Klingon and then using the Klingon definition to refute the essay’s premise.
In the end, I can’t blame them — in my youth I was indoctrinated in a very specialist professional field with very narrow views and specialized approaches to nearly everything (intelligence, threat assessment and security, if you must ask) — ruined me for life (ask my wife). It is hard for me even now to break out of that mind-set.
So, those who have heavily invested in learning formal statistics, statistical terminology and statistical procedures might be unable to break out of that narrow canyon of thinking and unable to look at a simple presentation of a pragmatic truth.
But hope waxes eternal….
Many have clung to the Central Limit Theorem — hoping that it will recover unknown information from the past and resolve uncertainties — that I will address that in the next part of this series.
Don’t hold your breath; it may take a while.
# # # # #
One will not find a set of rules or procedures to prove uncertainty of any phenomenon. Rather, the selection of rules, or the acceptance of certain rules, is a judgement. This judgement exists outside the realm of computation. The reported uncertainties in most applications are almost always completely made up, based on our judgement of the feature of interest. The application of rules in such circumstances is really just hog wash.
JCM ==> I have given the example, in the real world, of temperature records which were originally kept only in whole degrees, or later, measured in fractional degrees but rounded to whole degrees, so that our knowledge of the true values is reduced to x° +/- 0.5°.
We know that is the uncertainty of the recorded value — quite exactly.
What we don’t know are many other uncertainties that may have caused some error or uncertainty in the measured temperature (which was then recorded with an absolute [measurement] uncertainty.)
What we know for sure is that we’ll judge the uncertainty on the data values to suit the story we’re trying to tell.
I do not have lab experience, rather I am trained to sample the great outdoors.
The methods to determine the uncertainty in a controlled lab testing might be one thing…but, in environmental science it’s really a crapshoot. So, we put a lot of faith in the judgment of the analyst. Ideally the analyst was also involved with collecting the data so they can understand the circumstances of the sampling.
Sure, my temperature probe might have manufacturer specs listed on the label … but, we know for sure that’s only the beginning of what’s unknowable. There are no arithmetic tricks to tell us anything about the truth.
If I’m sampling a stream weekly at one location for pH, dissolved oxygen, and temperature – do I simply do some arithmetic based on the manufacturer’s specs of my sensor? Will this tell me how far I am from reality in monthly or annual values for that watercourse? That would be nonsense, IMO.
If you are attempting to take quantitative measurements rather than making subjective estimates, there are rules that have been developed that should be followed to ensure that the quantitative measurements are truly representative and useful. Anything less is probably a waste of time and resources.
Subjective terms which require judgement. There is no standard set of rules that will sort this out for you.
I can’t resist
Gee, 1880 was a warm year 🙂
cocky ==> Have at it — everyone gets at least one free shot … after that, I get tetchy.
But — Good Heavens — did I really make that error? Yes, the 286 etc should be “K” not “C”. The only saving grace is that the size of C degrees and K degrees are the same. You are the only commenter that caught it!
I’ll make a correction. Thanks!
Kip Hansen said: “The rule is:
When you add or subtract the two (or more) values to get a final value, the absolute uncertainty [given as “+/- a numerical value”] attached to the final value is the sum of the uncertainties. [ many sources: here or here]
For example:
5.0 ± 0.1 mm + 2.0 ± 0.1 mm = 7.0 ± 0.2 mm
5.0 ± 0.1 mm – 2.0 ± 0.1 mm = 3.0 ± 0.2 mm”
Using the procedure in section 3 of Bevington 2003 yields 0.14 mm.
Using equation 3.16 in Taylor 1997 yields 0.14 mm.
Using equation 10 in JCGM 100:2008 yields 0.14 mm.
Using the procedure in appendix a of NIST TN 1297 yields 0.14 mm.
Using the NIST Uncertainty Machine yields 0.14 mm.
This is stated using the guard digit. If we exclude the guard digit they are all 0.1 mm.
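For anyone following along at home, the arithmetic behind that 0.14 mm figure is simply the root-sum-square combination those references use (a sketch, not a quote from any of them):

```python
import math

u = math.sqrt(0.1**2 + 0.1**2)   # combine the two 0.1 mm uncertainties in quadrature
print(round(u, 2))               # 0.14 (0.1 if the guard digit is dropped)
```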
Kip Hansen said: “Thus we see that the arithmetical mean of the two added measurements (here we multiplied but it is the same as adding two–or two hundred–measurements of 5.0 +/- 0.1 mm) is the same as the uncertainty in the original values, because, in this case, the uncertainty of all (both) of the measurement is the same (+/- 0.1). We need this to evaluate averaging – the finding of a arithmetical mean.”
This statement appears to be inconsistent with the sources I cited above. All sources say the uncertainty of the mean is u(Y) = u(X) / sqrt(N) where Y = Σ[X_i, 1, N] / N and u(X) = u(X_i) for all X_i.
Kip Hansen said: “How have they managed to reduce the uncertainty to 25% of its real value?”
By using law of propagation of uncertainty. See the sources I cited above.
Kip Hansen said: “Now you see, they have ‘moved the goalposts’ and are now giving not the uncertainty of the value of mean at all, but the “standard deviation of the mean” where “Standard deviation is a measure of spread of numbers in a set of data from its mean value.”
The standard deviation is the uncertainty. Using formal terminology it is the standard uncertainty. See JCGM 200:2012 entry 2.30.
Kip Hansen said: “It is not the uncertainty of the mean.”
This statement appears to be inconsistent with the sources I cited above.
Kip Hansen said: “The numerical result of the formula for the absolute value (the numerical part not including the +/-) is certain—addition and division produce absolute numeric values — there is no uncertainty about that value.”
This statement appears to be inconsistent with the sources I cited above. There is always uncertainty regardless of measurement model even if the measurement model is simple like Y = f = Σ[X_i, 1, N] / N (aka the average of a sample of X). In this specific case u(Y) = u(X) / sqrt(N) where u(X) = u(X_i) for all X_i. See sources I cited above.
Kip Hansen said: “Neither is there any uncertainty about the numeric value of the summed uncertainties divided by the number of observations.”
This statement appears to be inconsistent with the sources I cited above. There is always uncertainty regardless of measurement model even if the measurement model is simple like Y = f = Σ[X_i, 1, N] (aka the sum of a sample of X). In this specific case u(Y) = u(X) * sqrt(N) where u(X) = u(X_i) for all X_i. See sources I cited above.
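A small worked case of the two formulas quoted in this comment, for 100 measurements each carrying 0.1 mm (the same case the essay treats with straight addition, which gives +/- 10 mm for the sum):

```python
import math

N = 100      # number of measurements
u_x = 0.1    # mm, standard uncertainty of each individual measurement

u_sum  = u_x * math.sqrt(N)   # uncertainty of the sum, per the cited sources
u_mean = u_x / math.sqrt(N)   # uncertainty of the mean, per the cited sources

print(f"u(sum)  = {u_sum:.1f} mm")    # 1.0 mm
print(f"u(mean) = {u_mean:.2f} mm")   # 0.01 mm
```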
You forgot to plug the NIST machine for the 375th time…
Errata…
The link to Taylor 1997 is here.
The link to JCGM 200:2012 is here.
The guard digit is commonly underlined or placed in brackets to denote it is not actually a significant figure.
I believe the rules allow you to use the guard digit without underlining or brackets if the most significant digit is 1. No?
As the second digit gets smaller, e.g. 0.11, it also becomes harder to justify keeping the second digit. Uncertainty has a precision just like the stated value. Stating a precision for uncertainty that cannot be justified is misleading to anyone reading it.
This is the so-called “half-digit” issue when 1 is the leading digit, it can be very tricky to implement in software.
Yep, and different people will code it differently so you can get different answers.
It’s a sliding scale so you must pick trigger points. Who sets the trigger points?
“This statement appears to be inconsistent with the sources I cited above. All sources say the uncertainty of the mean is u(Y) = u(X) / sqrt(N) where Y = Σ[X_i, 1, N] / N and u(X) = u(X_i) for all X_i.”
Once again, the uncertainty of the mean is *NOT* the accuracy of the mean. It is the interval in which the population mean could lie. The population mean is only a true value estimate if you have a random, Gaussian distribution of measurements where you get a cancellation of random measurement error. Then the standard deviation of the sample means gives an estimate of the accuracy of the calculated mean.
You simply cannot jam together temperatures taken on the same day at different times to give a Gaussian distribution. So you *must* propagate the individual uncertainties onto anything you calculate from them. That uncertainty then propagates onto anything you calculate from a data set of daily “averages”. The accuracy of the monthly mean is *NOT* defined by the variation in the daily values but by the uncertainty propagated up the chain from the individual measurements.
What you are doing is assuming the individual temperatures are 100% accurate and that daily “average” values are therefore 100% accurate and generate no uncertainty when multiple daily averages are calculated.
Your base assumption is wrong therefore any edifice you build upon that base is wrong.
May I add my 2p worth, I purchase a strip of 10 47kohm resistors marked Yellow, Violet, Orange, Brown. For this discussion it is the Brown band that is important which is 1% tolerance for each resistor, this should form a normal distribution with the median at 47 kohm with outliers no more than 470 ohms away from the median. Each resistor would be 47kohms +/- 470 ohms, with the population forming the Gaussian distribution.
If using an analogue meter, the rule of thumb I was taught many years ago is that the error is +/- half the smallest division on the meter, and when taking an average one should restrict the number of decimal places to the same as those describing the error. For example, if the thermometer is marked in 0.5 degree divisions then the error is +/- 0.25 degrees and the average should have no more than two decimal places.
Thus individual readings of a measured quantity can have a +/- absolute value but a population will have a range of errors, which would probably be in the form of n standard deviations.
John ==> And there you have it. You cannot measure your resistors any closer than 1/2 the smallest division on your meter. What the exact value is will forever be unknown once you disconnect your meter and write the value as X ohms +/- 0.y ohms. ALL of your resistor measurements will have the same absolute uncertainty deriving from the measurement instrument and method (+/- 0.y ohms, 1/2 the smallest division on the meter).
If you are manufacturing the resistors, or evaluating the resistors, you may want to know the standard deviation values….but the SDs are not measurements and tell you NOTHING about the individual resistors, and definitely not their real individual values.
“this should form a normal distribution with the median at 47 kohm with outliers no more than 470 ohms away from the median.”
Not true for a normal distribution, where there is no prescribed limit. In normal usage, two thirds of instances would be within those limits.
But the case of resistors may be special. My information is at least fifty years old, and may be out of date. But as I understand, they don’t have a machine that specifically makes 47K 1% resistors. Instead they produce a broad range of resistors, maybe near 47K, and then test and label. The ones that get the brown label are just the subset that qualify. Others get other labels.
That would mean that the distribution is more like a uniform distribution.
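If it helps, here is a quick simulation sketch of that select-and-label picture (the spread and counts are invented), showing why resistors binned out of a much broader production run end up spread roughly evenly across the tolerance band rather than bell-shaped:

```python
import random

random.seed(0)
nominal, tol = 47_000.0, 470.0    # 47 kohm, 1% tolerance band

# Produce resistors with a broad spread, then keep only those within tolerance.
produced = [random.gauss(nominal, 5_000.0) for _ in range(200_000)]
binned = [r for r in produced if abs(r - nominal) <= tol]

# Count the survivors in four equal sub-bands of the +/- 470 ohm window.
edges = [nominal - tol + i * (2 * tol / 4) for i in range(5)]
counts = [sum(1 for r in binned if lo <= r < hi) for lo, hi in zip(edges, edges[1:])]
print(counts)   # four roughly equal counts: close to uniform, not bell-shaped
```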
It seems to me that much of the discussion here does not differentiate between the error of the measurement with the variability of the system being measured. The air temperature where I grew up varied between lower than -40 to higher than 100 F. In documenting something with that range, a measurement error around a half a degree could be considered a ‘perfect’ measurement for most analyses.
I appreciate that when looking for annual trends of a couple of degrees over decades, a half a degree is important. Unless, though, it can be shown that there is a systematic bias in the same direction in all of the instruments used in the measurements, that, too, can be assumed to be negligible in most analyses. A point is that the variability of the process being measured should be analyzed differently than the variability of the measurement system, and this is often not considered.
I was in a running group discussing uncertainty with a PhD biologist in Cambridge and how NASA and NOAA and other groups neglect it, particularly with global temperature averages. He was getting increasingly arrogant about how all you had to do was increase N to get lower uncertainty. I said screw it, have all 8 billion people stick out their wet fingers and give an estimate of temperature and that should provide more certainty. Do you believe that? A robotics engineer who was also in the group heard the conversation and just said to the biologist: “No. That’s not how uncertainty works.”
The basic problem is that they don’t understand that error is not uncertainty; I keep posting this over and over, but it never sinks in. If you confront them with the nonphysical result that “uncertainty” goes to zero as N grows, they either ignore it or put up some word salad.
This is a great discussion of an interesting subject, but I think much of the back and forth has to do with different assumptions being made by different commenters on the meaning of specific terms as well as assumptions (or misinterpretation) of commenters as to what other commenters mean by the various terms they have used.
First, the algebraic formula for the average (for excel users) or the mean (for MATLAB users) of a set of numbers should perhaps not be considered an axiom from which conclusions can be derived by algebraic manipulations. The term uncertainty has many meanings to many different communities and it is not necessarily directly related to probabilities. In fact, standard statistics texts, such as Kendall’s, do not include the term in their treatments. The metrology community generally connects uncertainty to probabilities and due to the remarkable emergence of the normal distribution from many measurement processes, relies extensively on the normal distribution if no other distribution is specifically indicated (e.g. lognormal, poisson, etc.). The metrology community further separates uncertainty of a measurement into a random component (for which precision narrows if more measurements are made) and a non-random component, which does not improve with increasing measurements. This separation and the math treatment including both is extensively and definitively treated in the variety of metrology standards, which several commenters have cited (NIST, JCGM, ISO, etc.). For the random component, the algebraic formula given for the mean emerges as the consequence of the definition in statistics of algebraic mean, which is the integral over the domain of the random variable i.e. integral (xdF) where x is the random variable and F is the cdf. If F happens to be the normal cdf, then the common formula emerges as giving the maximum likelihood for the particular set of measurements (or samples) used. If the underlying distribution is not normal, then the algebraic formula is different.
Second, the algebraic formula with the plus/minus after it should also perhaps not be taken as an axiom from which to derive algebraic properties. If one is focusing on probabilities and using distributions, then the common quantities of interest are location, dispersion, and skewness (there are others, but these are the big three). Measures of location include the mean, median, mode, etc. but serve to present a single point estimate of the distribution, even if there are multiple random variables. Dispersion is generally a vector differential property and measures the scale of the distribution, i.e. how rapidly it spreads. So the two numbers, the one before the plus/minus and the one after the plus/minus have quite different interpretations and do not readily lend themselves to simple algebraic manipulations. This underlies the emergence of the partial derivatives in the error propagation protocols and the square root of the sum of squares reflects the common notion of a metric in a multidimensional space. The metrology community applies the same principle to the non-random components likely for lack of a better approach.
Third, the metrology approach assumes the quantity being measured, which they call the measurand, has a “true” value, exact in some idealized view. The notion is that if only we could increase the precision of our measurements and identify all non-random components, we could know the value of the measurand to ever more precise quantification (subject to quantum mechanical limits perhaps). Rutherford is often attributed as saying “If you need statistics, you ought to have done a better experiment.” Whether he actually said that or not, the notion underlies metrological (?) thinking, that there is some “true” value, stationary in some appropriate space, we are trying to measure. Personally, I am not convinced that taking an arithmetic average over any given set of measurements necessarily leads to an estimate of some “true” value. For example, in the many body problem, averages over some physical quantities are not particularly productive and only result in chaotic and non stationary results. An underlying notion in this discussion is that the mean of the set of temperatures does in fact have some “true” value, which could be known if only we “do a better experiment” in Rutherford’s words. That has been the subject of many conversations about extensive versus intensive variables, etc. and likely will continue to be so, but I think it should be the grain of salt one takes into any discussion of treatment of climatological uncertainties.
Another issue of vital importance: temperature measurements at a single location are essentially digitizing a time-varying signal. Because the temperature is continually changing, there is exactly one chance to measure the value at any given time before it is gone forever. There is no statistical sampling, no multiple measurements to average, and no standard deviation. The uncertainty of a single temperature result will be that of the measurement system and there is no way to reduce it with statistics. Yet there are plenty of people in this thread that believe it is possible.
The reported uncertainties on global mashups of environmental variables simply reflect the opinion of the researcher. The quantitative methods are just window dressing. We know it’s all nonsense, particularly when subsequent versions of adjusted data fall way outside the reported bands of previous generations. What a farce
And that’s only across one iteration. Does Hadcrut5 honor the reported uncertainty of Hadcrut3? certainly not! Nor does it honor Hadcrut4.
One notices in science that the further removed one becomes from the act of taking the measurement, the less respect they have for the meaning of such measurements. This appears to give license for all kinds of monkey business.
We see the limits of quantitative methods all the time, in both common research and skeptic circles. One may choose rules to reduce uncertainty when it suits your message, or one may choose rules to expand uncertainty when it suits your message. One may choose to depict uncertainty prominently when it suits your message, or leave it out of the analysis completely when it suits your message. Look for it, it’s everywhere.
This issue comes up in measurement laboratory accreditation. According to the standards that govern accreditation, especially ISO 17025, a lab is required to perform an uncertainty analysis (UA) for the numerical measurement results they report or provide as a service. The UA is examined by the accreditation agency before the accreditation is approved, principally for adherence to the GUM. Note that the lab is free to choose either overly broad or overly narrow uncertainty limits, for whatever free-market reasons. Afterwards, the lab is required to participate in intercomparison testing with other labs on a regular basis; this is the point where the uncertainty limits are tested.
But you are right, research publications are not constrained by any similar standards. Climatology is a prime example.
The methods of quality assurance in a lab setting for the outputs of manufacturing products or medical tests is a different animal completely than earth observation in space and time.
Absolutely correct, and the people engaged in such are not required to employ them.
I am troubled that this group, which probably is the best educated and brightest on the internet outside of some specialty physics groups, can’t agree on what should be fundamental principles and procedures. I suspect there may be a generational gap between skeptics like myself and the minority commenters, but I have no good evidence to support that.
I was a sophomore in high school when Sputnik was launched in 1957. Everyone was concerned about us losing the Space Race. The response was to re-do the science and math curricula, making them more rigorous. However, that didn’t last long. By the 1960s, things had started to go downhill. That was accelerated when college grades started being inflated with the good intentions of keeping college males out of the draft and sent off to Vietnam. So, I don’t have as much faith in the competence of those who graduated from high school since the 1970s. Whether that plays into the situation, I don’t know. But, I’m concerned it may.
I think faith is the keyword here, in a different context. There is transparent unabashed parading of a desire to compute a preferred result. There are no limits to the depths of ideological bias on display. I think it goes much deeper than education.
I’ve told this story before: I learned the fallacy of calculating “true values” from different things in my first electrical engineering lab in 1968. I was on a team of eight students assigned the project of each person building an amplifier and measuring it to describe its characteristics. We all thought it would be kosher to average all eight sets of measurements to get a “true value” we could give to the lab assistant. We all got F’s. Measuring different things does *not* give a true value for anything. Each individual amplifier had its own characteristics plus measurement uncertainty. If we had each put down our own measurements plus uncertainty and then included the average value plus the propagated uncertainty for all eight we would have all received A’s.
It’s truly scary to think about what they are teaching engineering students today.
It would seem reasonable for the climate community to leave themselves suitable wiggle room in their reported uncertainties if they wish to continue with the adjustments. Otherwise there is no point in reporting such bands considering they are deemed to be worthless and not respected.
What we observe is that there is no purely objective methodical tool to determine these bands. The purists will refute this – but one must understand that science is not equivalent to number theory. One must step outside the rules of computation and think sometimes. That is what makes science so fascinating.
If anyone is struggling to understand how we fool ourselves and others using all this quantitative trickery simply observe the plot which I have provided
https://wattsupwiththat.com/2022/12/09/plus-or-minus-isnt-a-question/#comment-3649437
Most users will think of this as uncertainty of global average temperature anomaly. That is effectively what it’s trying to communicate, and that is what users will take away. The blue band from the era of HadCRUT3 and the red band the era of HadCRUT4. I have no interest in semantic quibbling.
Notice that the range in red is outside the range in blue. Are the ranges indicated not meant to communicate a reasonable sense of where the ‘true’ value may be? Here is clear evidence that the procedures have failed.
Needless to say, in the era of hadCRUT3 a more truthful indication of uncertainty actually encompasses at minimum the combined ranges of blue and red. The proof is there, the methods have failed.
Using this meta analysis, it seems reasonable to me that we may thus judge the uncertainty to be at minimum twice the range commonly reported, and very likely more. We must humbly admit that we have fooled ourselves and others by the misuse and abuse of quantitative methods in our guesstimates of historical timeseries.
This relates to the weasel words “may” or “could” used in the conclusions or summaries of many (if not most) so-called scientific studies. Forgotten is that these words imply that the opposite may actually be true. After all, “may not” has essentially the same meaning as “may.”
Kip,
I am glad to see that you have been working on this. Tim and I have been writing and discussing how to address this. One of my papers is currently up to 13 pages which is too long for this.
Do not let the naysayers lead you down the incorrect path of describing the use of temperature data to calculate uncertainty. Experimental uncertainty is designed to determine uncertainty from multiple replications of the same thing, be it chemical experiments, manufactured products like springs, or physical phenomena that is constant like gravity, or mass.
Temperature readings are single readings from a single device and are not repeatable. Repeatability is needed to determine a “true value” of a single measurand, and temperature readings just don’t meet that requirement.
Consequently, one cannot determine the uncertainty of each reading (nor how it may change over time) by using multiple measurements of the same thing by the same device. One is left with single readings of unknown uncertainty. That requires the use of a nominal uncertainty specified by a standards body. The NWS and NOAA have supplied these in their documentation and those are what should be used as minimum values.
From that point standard Significant Digit Rules and other well known physical lab requirements should be followed.
It is poignant that the NIST 1900 document Example E2, determined a standard uncertainty interval of 1.8C at 95% while assuming no measurement uncertainty in a single months temperature readings. In essence, the variance supplied by the NWS for MMTS readings would expand that even further.
Jim G ==> Thanks for this. My essay is on the simplest level possible, which was perhaps a mistake — what I write is true, simple, basic and can be illustrated with simple diagrams. The more educated readers here abhor simplicity — they need to apply their higher maths and statistical knowledge and skills to every problem to justify their educational efforts. A hammer for every job.
Me? I’m just a science journalist, ultimately pragmatic. If we don’t know, we don’t know…we needn’t make things up.
If we know how uncertain a measurement was, we should just say so.
Kip Hansen said: “what I write is true”
The uncertainty of a sum is computed as the sum of the uncertainties. That just simply is not true.
And NIST 1900 E2 is an example where the uncertainty of the average is computed via u(avg) = u(x)/sqrt(N). Specifically the author does a type A evaluation to get u(x) = 4.1 C and then plugs that into the formula to get u(avg) = u(x)/sqrt(N) = 4.1 / sqrt(22) = 0.9 C. And had NIST elected to use the type B evaluation of uncertainty for u(x) instead of the type A evaluation the result would have been closer to u(avg) = 0.5 / sqrt(22) = 0.1 C.
You miss the whole issue completely. The E2 example says:
(bold by me)
The last one is the important one. Calibration uncertainty … and no other significant sources of uncertainty are in play.
There are reasons Dr. Possolo decided to proceed as he did. Remember, this is an example, not a formal scientific endeavor to analyze the temperature measurements.
I don’t understand why you posted this as a good example and are now indicating that it is incorrect.
The big thing to take away is the first thing I quoted. That the measurand is a statistical parameter of the temperature distribution. As such it should be analyzed in that format.
You need to ask yourself why he decided to analyze the uncertainty of the mean, i.e., SEM as a good indicator of how accurate the mean describes the variance in the sampled distribution.
He obviously did not want to do the treatment of evaluating a standard uncertainty based upon the NWS and NOAA specification of uncertainty. Remember, this specification is directly translated into the uncertainty of the mean since every data point carries that uncertainty.
Please note that Dr. Possolo also used a Student’s T factor to better display the uncertainty in the mean. You indicate that he calculated an SEM = 0.9°C. You forgot(?) to include the fact that he increased that to ±1.8°C by using a Student’s T factor of k = 2.08!
There is a reason for this if you would read the whole document.
Dr. Possolo explained in NIST 1900 that using “s/√n” is a rather unreliable evaluation for smaller samples (such as a month of temperatures). This means he is declaring one month of temperatures from one station to be a sample. He goes on to say that the Student’s T in fact captures all of the shades of uncertainty under consideration and fully characterizes the uncertainty associated with the average as an estimate of the true mean of that sample.
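To make the numbers being argued over here easy to check, a short sketch of the arithmetic as quoted in this exchange (the 4.1 C, the 22 readings and the k = 2.08 coverage factor are taken from the comments above, not re-derived from NIST TN 1900 itself):

```python
import math

s = 4.1    # C, type A evaluation of u(x) for the month of daily readings
n = 22     # number of daily readings in the example
k = 2.08   # Student's t coverage factor quoted above (roughly 95 %)

standard_uncertainty = s / math.sqrt(n)
expanded_uncertainty = k * standard_uncertainty

print(f"u(avg) = {standard_uncertainty:.1f} C")           # about 0.9 C
print(f"expanded (95 %) = {expanded_uncertainty:.1f} C")  # about 1.8 C
```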
Kip,
I don’t think that’s the actual reason. Our education implants absolutely unquestionable axioms so deeply that it isn’t even possible to be aware of them, let alone question them.
Unfortunately, the mathematicians, statisticians and metrologists have slightly different axioms. A claw hammer, sledge hammer and machinist’s hammer all have different use cases.
Mr. Hansen
Thank you for an interesting stimulus to think about things. I learned a lot from the back and forth. The discussions here are certainly spirited, which is good, but perhaps the goals of some here are more personal in some way than benignly scientific. The word “true” is used a good deal in the discussion but my bias is against using it. In my experience, seeking truth is more a goal of philosophers or prophets than of scientists, although I admit I speak from the bias of hard-core physics along the lines of Feynman’s approach. In that approach a scientist’s aim is to produce a theory, express it quantitatively, compare its predictions with experiment, often called Nature, and if the prediction and Nature disagree, then declare the theory wrong, Nature is always right. In addition, a theory is never assessed to be right or “true”, only not wrong and useful for the time being.
In this view, the purpose of analyzing experimental or observational results is to be able to compare them to predictions of one’s theory to determine if the theory is wrong or not, preferably with some notion of probability involved. I think part of the difficulty of climatology (compared, say, to physics or chemistry) is the lack, at present, of predictions of definitive, non-arbitrary, quantitative near-term measurement outcomes. This complicates the assessment of uncertainties in aggregated observations, since it is hard to say what precision and accuracy is good enough to distinguish a prediction from an observation, or to distinguish between different predictions. In that situation it is easy to run down many rabbit holes chasing objectives that are more illusory than useful. Personally, I think a global average of temperatures may have more in common with a stock index than a physical measurement, but that is just my bias as an outside observer.
In any event, I would recommend avoiding getting into arguments about what is true. Whenever I am tempted to do that, I remember one of my favorite poems, by Stephen Crane:
I saw a man pursuing the horizon;
Round and round they sped.
I was disturbed at this;
I accosted the man.
“It is futile,” I said,
“You can never —”
“You lie,” he cried,
And ran on.
It basically boils down to the fact that one can not create precision that was not in the original measurements. Climate science has made an art of this by using bogus statistics.
I would like to point out that using pastel colors in fonts generally makes them illegible. They may be pretty but they are difficult to read. There are many colors available on palettes that provide the high contrast which make for easy reading. Pastels don’t do it!
I have added an Epilogue and Post Script at the end of the essay, if there is anyone still following this post.
Thanks for reading and participating.
Kip,
It might be relevant to note that I covered this topic with a series of 3 WUWT essays earlier this year. My second essay attracted some 800 comments, which seems to put your essay in the class of repetitive duplication. Geoff S
Geoff ==> Thanks for bringing up your previous essays on “Uncertainty Of Measurement of Routine Temperatures” Parts 1-3.
For readers interested in that series, the link is this.
The links you give in the article are just plain incorrect.
https://www.thestudentroom.co.uk/showthread.php?t=2661762
https://sciencing.com/how-to-calculate-uncertainty-13710219.html
These sites give erroneous information on the propagation of uncertainties.
They only happen to be correct in the special case when the uncertainties are all +/-1, because of course 1 squared =1.
The fact that you present these sites as some kind of gospel while ignoring the many authoritative sites on this subject really makes me wonder.
Really?
(3.4±0.2 cm)+(2.1±0.1 cm)=(3.4+2.1)±(0.2+0.1) cm=5.5±0.3 cm
These uncertainties are both +/- 1?
Measurements of high school football fields, made rather roughly to the nearest foot or two (0.3 to 0.6 meters), say by counting the yardline tick marks on the field’s edge, give a real measurement uncertainty of +/- 24 inches.
Hang on, what? Making multiple small measurements in an attempt to determine an overall length is going to give you tolerance stack differences.
Basic Drafting (actually… basic drafting is ‘draw straight lines’… but you get my point). Selecting where you place your starting datum can significantly affect the way a part could be manufactured.
If you want to dimension a 100m long field you draw a dimension line that says 100m. Then the tolerance ONLY applies to that single dimension. You do not draw 20 each to each dimensions of 5m because in engineering terms they are not the same thing.
However, if the relationship between the tick marks IS important then you wouldn’t be using simple linear tolerance; you would apply Geo Tolerance to the feature. This would mean that the position of each tick mark now needs to be calculated relative to the others.
This sort of thing is why my no-degree bum is paid more than the junior engineers.
I sort of think my God is different from yours.
Kip,
A link that works for Part 2 of my essay series of September this year, on temperature measurement uncertainty is here. It attracted 839 comments, which is quite large for WUWT. Geoff S
https://wattsupwiththat.com/2022/09/06/uncertainty-estimates-for-routine-temperature-data-sets-part-two/
There is an uncontrollable variable here. On one occasion apparently Anthony terminated comments at around 400. It was probably justifiable because it was getting down to a back and forth pretty much just between myself and Donald K.
You are spending an awful lot of words to try to describe the fact that calculating the error of a mean value when you make multiple measurements of the same is different from when you make multiple similar measurements of something varying in space and time. For temperature that varies both spatially and temporally, you will have to make multiple measurements at the same physical location AND the same time WITH the same type of instrument in order to claim the ability to reduce the error.
When you use an awful lot of words there is a growing probability that you will make some mistake that someone who wants to distract from the actual story can latch onto.
That is a point that I have been trying to make with my distinction between stationary and non-stationary data. However, it hasn’t gotten much traction in the comments.
I think it hasn’t gained much traction with the climate alarmists. It’s not to their advantage to accept the distinction.
“ For temperature that varies both spatially and temporally, you will have to make multiple measurements at the same physical location AND the same time WITH the same type of instrument in order to claim the ability to reduce the error.”
I would venture to offer that there is *NO* field temperature measuring device that has no systematic bias. So even if you take multiple measurements at the same time and at the same location there is no guarantee that you will get perfect cancellation of the measurement uncertainties.
And don’t overlook additional uncertainty from things like multiple operators, calibration drift, temperature coefficients..these will never be cancelled by multiple measurements.
“The hammer-ticians have argued among themselves as to which hammer should have been applied to this job and how exactly to apply it.”
The hammers in this case being metrology, uncertainty and statistics. These seem appropriate hammers to use when discussing measurement uncertainty and the uncertainty of the mean.
Any port in the storm that gives you the loooooooooowwwww numbers!
Bellman ==> No, this job requires only arithmetic, a pencil (if you wish a more permanent record) and maybe a school-kit ruler.
No statistics needed or wanted.
Just because you can do a calculation on the back of an envelope doesn’t mean it’s correct.
Interval arithmetic is, I’m sure, useful for a lot of things, but it is only going to tell you what the worst case is. If I add the results of ten six-sided dice, the arithmetic will only say the result could be anywhere between 10 and 60, but it won’t tell you what the odds of rolling 60 are.
Bellman ==> A true simple histogram of a thousand rolls will tell you precisely that. Don’t need stats for that — too simple.
In fact, I am using a dice rolling example in the next essay — so hold onto your expertise in that realm.
What do you think throwing the dice 1000 times and producing a histogram is if not stats? It’s certainly not the interval arithmetic you said was all that was needed.
Rolling dice is a counting experiment. It is not a measurement experiment.
“but is only going to tell you what the worst case is.”
That’s the whole point. In the *REAL* world, you need to know what the worst case is. You do *NOT* design a bridge based on anything less than the worst case. It would subject you to criminal and civil liabilities to do so. It simply doesn’t matter if 68% of the time the bridges you design work fine. It’s the other 32% that are problematic!
“Rolling dice is a counting experiment. It is not a measurement experiment.”
I feel sorry for you if you can’t see the connection.
“In the *REAL* world, you need to know what the worst case is.”
You need to stop trying to hit everything with the same hammer. Sometimes it might be important to consider the worst case, other times it’s better to look at the most likely range. If you are building a bridge you might have to build for the heaviest plausible load, but even you probably wouldn’t worry about the 1 in 10^100 cases.
But in many cases a 95% or so confidence interval is all you need. It doesn’t matter if a few months have an anomaly outside the stated confidence interval. There’s no liability.
“I feel sorry for you if you can’t see the connection.”
There is no connection. The uncertainty of the average of a count is estimated as the sqrt(count).
“Sometimes it might be important to consider the worst case, other times it’s better to look at the the most likely range.”
Thus speaks a statistician who ignores reality.
“ If you are building a bridge you might have to build the heaviest plausible load, but even you probably wouldn’t worry about the 1 in 10^100 cases.”
You can’t even get this example right. Therein lies the absolute lack of understanding you have of the real world.
In building a bridge you are given the maximum load allowed – how many bridges have you seen with a maximum load sign?
What the engineer then has to consider is things like the shear stress the beams in the bridge can handle – including the worst case uncertainty in order to make sure the combination of all the beams will be able to support the given maximum load.
Everything you post puts you in a different world than the rest of us actually live in!
“But in many cases a 95% or so confidence interval is all you need. It doesn’t matter if a few months have an anomaly outside the stated confidence interval. There’s no liability.”
Really? No liability? So taxing trillions of dollars out of the citizenry to pay for the idiotic net-zero plans of the elites in government is not a liability for the common man? Raising the price of heating your home to outrageous levels is not a liability to the common man paying those prices?
These may not be criminal liabilities but they *are* civil liabilities and those liabilities will be laid at the feet of the politicians pushing the policies causing them at some point in our electoral process.
Plus maximums are then increased in a real design for safety factors.
And this dose of reality earned you a downvote from the trendologists. They wonder why they aren’t taken seriously.
I’m pretty sure that none of them has ever designed or made something where they *needed* to be right or other people would be impacted. I’m pretty sure most of the climate scientists were told that if they needed measurements analyzed to go find a math major – who probably has *no* idea of how to handle measurement uncertainty so they just ignore it.
Yes, but now climatology is so ingrained into society that people are being impacted in large and multiple ways by these errors.
This is why your moniker “bellcurveman” is so apropos.
Mr. Hansen,
It strikes me it would be useful to clearly state the objectives of the calculational methods you are proposing. It seems from the comment stream that people have differing perspectives on what the intended use would be for the methodology you are discussing.
I think many of the commenters here come from the scientific community (as do I). In that community, the primary purpose of analyzing data and its uncertainty is to be able to use the result of the analysis to distinguish whether a proposed theory (in this case a climatological theory) is wrong or not, or whether one proposed theory is better than another. The present practice of science does that by evaluating observed or measured data in such a manner as to be able to make a probabilistic statement about the likelihood of a specific quantitative prediction of a variable measured or observed under specified or prescribed conditions being within the uncertainty bounds of the measurements (or observations). This is essential to be able to determine whether our technical understanding of the phenomena is wrong and needs to be modified. In modern times, this is rarely possible with paper and pencil.
It sounds like your proposed calculational method is not designed to serve that purpose, which is fine, but in that case, it would be useful to the conversation to say specifically what purpose the calculations you propose would serve. In other words, what could be done with your proposed calculations that can not or is not being done now, and what would be the benefit to the progress in climatology?
fah ==> 1) I am not analyzing any data. I am simply pointing out an excruciatingly simple fact about original absolute measurement uncertainty.
2) I am not doing anything climatological– not even really weatherly. 3) I make no attempt at probabilistic statements about anything.
This essay is ONLY about the fact that many measurements (I use temperature and tide gauge sea level) have a known, numerical original measurement absolute uncertainty (or, some use, original measurement absolute error).
And what happens when we add and divide such measurement to find a mean of those uncertain original measurements.
It is not a proposed calculation method, it IS the proper arithmetic method.
Thanks for clarifying.
It is not a proposed calculation method, it IS the proper arithmetic method.
If so, show us one reputable source which backs up your method of uncertainty propagation.
Taylor Eq. 3.4 and Eq 3.18, the first is direct addition of uncertainties and the second addition by root-sum-square.
………………………………………………………..
From the MEASUREMENT STANDARDS LABORATORY OF NEW ZEALAND
8.1 Models involving only addition and subtraction
When a model only adds or subtracts terms, the uncertainty can be calculated by taking the square root of the sum of the squared standard uncertainties. That is, when the model has the form Y = w ⨁ X1 ⨁ X2 ⨁ …, where w is a number, X1, X2, … are quantities, and ‘⨁’ means either ‘+’ or ‘−’, then the standard uncertainty of the result
u(y) = sqrt[ u(x1)^2 + u(x2)^2 + … ]
is evaluated from the standard uncertainties of the estimates of X1, X2, … . Note, the number w makes no contribution to the uncertainty, because it has a known value.
………………………………………………………
Direct addition is used when no cancellation of random error is assumed and root-sum-square is used when there might be *some* cancellation but not complete cancellation.
In other words the Measurement Standards Laboratory of New Zealand does not back up direct addition. It backs up what I said.
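For anyone following along, the two combination rules being argued over can be compared numerically. A Python sketch with made-up example uncertainties (the function names are illustrative, not from any reference):

# Sketch comparing direct addition of uncertainties (worst case, no cancellation assumed)
# with root-sum-square (independent, random uncertainties). Example values are made up.
import math

def u_direct(*u):
    # worst-case combination: |u1| + |u2| + ...
    return sum(abs(v) for v in u)

def u_rss(*u):
    # root-sum-square combination: sqrt(u1^2 + u2^2 + ...)
    return math.sqrt(sum(v * v for v in u))

u1, u2 = 0.5, 0.5  # e.g. two readings each +/- 0.5
print("direct addition:", u_direct(u1, u2))  # 1.0
print("root-sum-square:", u_rss(u1, u2))     # ~0.71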
Reading comprehension block noted.
I’m assuming that this paragraph is added by Tim Gorman himself
Direct addition is used when no cancellation of random error is assumed and root-sum-square is used when there might be *some* cancellation but not complete cancellation.
and is not a quote from the New Zealand measurements standard lab.
It’s from Taylor’s 2nd edition of “An Introduction to Error Analysis”.
I’ve given you a quote from his book laying this out. Is Taylor a liar or just an ignoramus in your opinion?
You implied it was from NZ Lab.
No, I didn’t. Get a new pair of reading glasses.
Why do you think I put the NZ quote in between two horizontal lines?
It is a statement about correlated quantities and is equivalent to the discussion in D.3 and specifically equation 9 of your UKAS document. The person you are engaging with is deflecting and diverting from the conversation by moving it from random (uncorrelated) quantities to non-random (correlated) quantities and hoping no one will notice the diversion. Other tactics include conflating a sum Σ[x_n, 1, N] with an average Σ[x_n, 1, N]/N (yes, I’m serious), conflating the uncertainty of the average u(Σ[x_n, 1, N]/N) with the average uncertainty Σ[u(x_n), 1, N]/N, and pretending like Σa^2 = (Σa)^2 is a real identity when trying to work with the equations in the Taylor reference. I should point out that everything in the Taylor reference is consistent with everything in the UKAS reference and with your statements.
You forgot to whine about “contrarians”, bgwxyz.
HTH
Rest of the lies in your post ignored.
Thanks, you’ve clearly got a better handle on the maths than me, but I knew he was getting confused with correlated uncertainties.
You need new reading glasses and a remedial algebra course.
You determine uncertainty term by term, not by average.
Average uncertainty is *NOT* the uncertainty of the average.
When you find the uncertainty of a quotient you add the uncertainty of each term together.
(u(avg)/avg)^2 = [ u(x)/x ]^2 + [u(y)/y]^2 + [ u(2)/2 ] ^2.
It is *NOT* [u(x) + u(y) ]^2 / 2
It is *NOT* [ u(x) + u(y) ] /2
Which is what you are trying to convince everyone of.
The uncertainty of the denominator terms is *added* to the uncertainty of the numerator terms. The uncertainty of the denominator terms is *not* divided into the uncertainty of the numerator terms.
My remedial algebra course taught me that + is a sum and / is a quotient.
Yet you seem hellbent on treating a + (sum) as if it were a / (quotient) while simultaneously creating strawmen or explaining trivial things that everyone here already knows.
How else are you justifying the use of Taylor 3.18 and only 3.18 which says unequivocally that it only applies to * (products) and / (quotients)?
“Direct addition is used when no cancellation of random error is assumed and root-sum-square is used when there might be *some* cancellation but not complete cancellation.”
Common sense *should* tell you that when nailing together two boards of different lengths you don’t have a situation where cancellation of uncertainty can happen, even partial cancellation. If you want to ensure that beam can span a foundation then you better make sure you add the uncertainties directly and use the smallest number. Otherwise you are going to have to scab something in, thus wasting lumber.
There *are* situations where you can’t assume uncertainty cancellation. Adding in quadrature *REQUIRES* that all uncertainty intervals be independent and include only random effects. Any systematic bias in the uncertainties does not meet this requirement.
For any single measurement u_total = u_random + u_systematic.
If you do *not* know the values of u_random and u_systematic then there is no way to assume what the cancellation of uncertainty will be.
You can certainly add the uncertainties in quadrature as well as directly. The actual final uncertainty will lie somewhere in between the two values. Are you looking at something that could affect the lives or livelihood of others? Then you better use the worst case or criminal and civil liability could come back to haunt you at some future point. If you are designing a balcony off a 10th floor apartment you *better* use the worst case.
https://www.cnn.com/2015/06/16/us/california-balcony-collapse/index.html
Do you suppose the engineer that designed this balcony had some repercussions?
It’s just the equivalent of using a high confidence level for the uncertainty. For example 99.9% instead of the more usual 95%.
There are risks in the approach you support. By taking extreme limits of the total uncertainty you could find that a part does not fit, or that you have cost your company a lot of money.
Taking the extreme limits of the total uncertainty probably won’t leave you with parts that don’t fit. It might cost the company a lot of money BUT if that cost saves in liability costs in the future it is usually money well spent.
If it’s uncertainty in the heights of doors you can always just write off warranty costs and give the customer a new door. If it’s a seal on the space shuttle booster you should be a *lot* more careful.
Much of the problem here is trying to ascertain how climate science can state temperature change to the one-thousandths decimal when the resolution of the measurements is only recorded to the nearest integer. Statistics will not, in any science, allow the increase of precision beyond what was measured.
You’ve had multiple persons quoting multiple authoritative references telling you the reverse.
Quoting *what* multiple authoritative references? I haven’t seen any.
All I’ve seen is that the standard deviation of the sample means is the uncertainty of the mean – which is only true in one situation with very restrictive assumptions.
You can throw every statistics text and even NIST’s own calculator in your post and it won’t make a lick of difference with some of the participants here. If they can’t be convinced that sums are different than averages then there’s no way they’ll ever be convinced that NIST or UKAS knows what they are doing or they’ll make things up that NIST or UKAS never said or claim that you’re just abusing what they said or using their calculators wrong. Been there done that.
If they can’t be convinced that sums are different than averages, or trends.
if q = x + y (i.e. a sum)
the u(q) = u(x) + u(y)
q_avg = (x + y) / 2
u(q_avg) = u(x) + u(y) + u(2) = u(x) + u(y)
thus u(q) = u(q_avg)
The uncertainty of the sum and the uncertainty of the average ARE EQUAL!
Nothing can be allowed to threaten the holy and sacred trends.
Trendology at its acme.
They don’t understand that trends lie within uncertainty intervals. Depending on how wide the uncertainty intervals are you can have *multiple* different trends, some positive, some negative, and some zero. Which one is right?
Like they do with measurement uncertainty they just assume that all the stated values are 100% accurate and therefore the trends calculated from those stated values are 100% accurate.
And now we can add (no pun intended) another one to the list…conflation of division and addition. Notice how in q_avg = (x + y) / 2 in the post here it is being interpreted as if it were q_avg = (x + y) + 2 to get u(q_avg) = u(x) + u(y) + u(2). Nevermind that the uncertainty of q = x + y is not u(q) = u(x) + u(y) when x and y are independent anyway.
For the lurkers Taylor says (and is consistent with JCGM 100:2008) you should use equation 3.16 for addition/subtraction and 3.18 for multiplication/division for independent quantities.
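Those two rules written out as plain Python for the lurkers (a sketch only; the function names are illustrative, not Taylor’s, and the example numbers are arbitrary):

# Sketch of the two independent-quantity rules cited above:
# sum/difference -> absolute uncertainties combine in quadrature;
# product/quotient -> relative uncertainties combine in quadrature.
import math

def u_sum(u_x, u_y):
    # u(x + y) or u(x - y) for independent x and y
    return math.sqrt(u_x**2 + u_y**2)

def u_quot(x, u_x, y, u_y):
    # u(x / y) for independent x and y: relative uncertainties in quadrature,
    # scaled back to an absolute uncertainty on the quotient x / y
    rel = math.sqrt((u_x / x)**2 + (u_y / y)**2)
    return rel * abs(x / y)

print(u_sum(1.0, 2.0))                 # ~2.24
print(u_quot(120.0, 2.0, 24.0, 1.0))   # absolute uncertainty of the ratio 120/24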
Well now…what is the reality…
q_sum = x + y
dq/dx = 1
dq/dy = 1
u(q_sum) = root[ 1^2 * u^2(x) + 1^2 * u^2(y) ]
= RSS[ u(x), u(y) ]
r = 2
dr/d2 = 0
u(r) = 0
s = q_avg = q_sum / r
ds/dq_sum = 1
ds/dr = -1
u(s) = root[ 1^2 * u^2(q_sum) + (-1)^2 * u^2(r) ]
= root[ u^2(q_sum) + u^2(r) ]
= root[ u^2(q_sum) + 0 ]
= u(q_sum)
So…
u(q_avg) = u(q_sum)
Oh my! How did this happen?!??
karlomonte said: “Oh my! How did this happen?!??”
It is because you incorrectly computed ds/dq_sum = 1 and ds/dr = -1. Neither of those are correct. Try again. This time use https://www.symbolab.com/solver/partial-derivative-calculator to help you out.
karlomonte,
You’re on the verge of a major breakthrough. Don’t stop now. Compute ds/dq_sum and ds/dr correctly and plug the correct values into your derivation here. Remember that symbolab has partial derivative calculator if you need it. You’re so close!
You are as bad as bellman at reading simple algebraic equations.
No one is calculating an average as x + y + 2.
They are calculating the uncertainties as additions
u(x) + u(y) + u(2)
Did you leave your reading glasses in the car?
Attached are Eq 3.16 and 3.17 from Taylor. You are trying to gaslight everyone over what they say!
Eq. 3.18 is what is used when doing quotients and products such as when doing an average
It shows *exactly* what we are talking about as well.
u(q_avg)/q_avg = sqrt[ (u(x)/x)^2 + (u(y)/y)^2 + (u(2)/2)^2 ]
The last entry is the uncertainty of the divisor of 2. You do *NOT* divide the sum of the uncertainties of x and y by 2 in order to get the uncertainty of the average.
The uncertainty of the average is *NOT* the average uncertainty. [ (u(x)/x)^2 + (u(y)/y)^2 ] / 2 is an average relative uncertainty and is *NOT* the uncertainty of the average.
I don’t understand why so many so-called climate *scientists* have such a problem with this simple concept.
The average uncertainty is *NOT* the uncertainty of the average!
TG said: “Eq. 3.18 is what is used when doing quotients and products such as when doing an average”
Do you think an average is devoid of a sum?
TG said: “u(q_avg)/q_avg = sqrt[ (u(x)/x)^2 + (u(y)/y)^2 + (u(2)/2)^2 ]”
Taylor 3.18 says that:
u(q)/q = sqrt[ (u(x)/x)^2 + (u(y)/y)^2 + (u(2)/2)^2 ]
is for the quantity:
q = x * y / 2
That does not look like an average to me.
Why do you always insist in having your nose rubbed in it before you learn?
Break it down.
q = (x+y)/2 = (x/2) + (y/2)
Now, you have two terms whose uncertainty will add.
what is the relative uncertainty of x/2?
It’s u(x)/x + u(2)/2 = u(x)/x
Same for y: u(y)/y
So now we get:
[ u(q) / q ]^2 = [ u(x)/x ]^2 + [ u(y)/y ]^2
Now tell us how you can’t break down (x+y)/2.
And then tell us how the uncertainties of x/2 and y/2 can’t be added.
TG said: “q = (x+y)/2 = (x/2) + (y/2)”
That still contains a + (sum).
TG said: “[ u(q) / q ]^2 = [ u(x)/x ]^2 + [ u(y)/y ]^2″
Look at 3.18 above. That’s the uncertainty formula for q = x/y or q = x*y. Remember, the goal is to find u(q) when q = (x + y) / 2.
Can you fix this math mistake and resubmit for review?
Clown.
karlomonte, I think you’re on the verge of a major breakthrough yourself. Would you mind fixing the partial derivatives in your derivation and resubmitting what you get. Symbolab.com has a partial derivative calculator that you may find helpful.
Pin a bright shiny tin star on your clown hat.
You just like your guru, Nitpick Nick.
What’s truly sad is the level of basic math scientists have today.
They can’t even understand that (x+y)/2 *is* x/2 + y/2
In case you hadn’t noticed, karlo is the one who insisted they are not the same.
Your problem is that you know they are the same, but you still don’t understand the difference between adding and dividing.
You’ve lost me now – too deep in the weeds without landmarks.
Are you saying that the mistake involves the calculation of an arithmetic mean containing addition and division (trivially true), or that the uncertainty of the numerator is divided by the uncertainty of the denominator (3.18 says all uncertainties add)?
He doesn’t understand that in the equation
q = x/y
that y can be a constant!
I was saving the divide by zero for later…
That’s why all of this must be done in Kelvin to be correct.
Not necessarily.
You could use Rankine.
TG said: “He doesn’t understand that in the equation q = x/y
that y can be a constant!”
I understand what Taylor 3.9 says about it.
When q = Bx then u(q) = B * u(X).
So if B = 1/y and q = x/y then u(q) = u(x/y) = u(x) / y.
That’s essentially a scaling operation.
For the arithmetic mean, it divides the total absolute uncertainty by the number of readings to yield what Tim characterised as the average uncertainty.
Scaling. That seems like reasonable terminology. I can accept that.
What Taylor says we cannot accept is a claim that u(x/2) = u(x)/x.
Note that Taylor 3.9 and 3.16 say u(x/2) = u(x) / 2.
I’d like to claim credit, but I saw it used somewhere in my revision reading and liked it.
old cocky said: “Are you saying that the mistake involves the calculation of an arithmetic mean containing addition and division (trivially true), or that the uncertainty of the numerator is divided by the uncertainty of the denominator (3.18 says all uncertainties add)?”
The mistake is treating a + (sum) as if it were / (quotient). Those are not equivalent arithmetic operations.
Per Taylor 3.16 (and others) the uncertainty u(x + y) = sqrt[u(x)^2 + u(y)^2].
Per Taylor 3.18 (and others) the uncertainty u(x / y) = sqrt[ (u(x)/x)^2 + (u(y)/y)^2].
What we want to find is u(x/2 + y/2) or equivalently u((x+y)/2) which requires using both the sum/difference rule (Taylor 3.16) and the product/quotient rule (Taylor 3.18) because it contains both a + (sum) and a / (quotient).
Nobody is saying x/2+y/2 is not the same as (x+y)/2.
Nobody is saying that the average of x and y is not x/2+y/2 or equivalently (x+y)/2.
What is being said is that the final result in the post here is for u(x/y) and not u((x+y)/2). This can be verified easily against the Taylor 3.18 equation I posted above.
Thanks for clarifying. If I have this right, addition/subtraction use absolute uncertainty, and multiplication/division use relative uncertainty. Is that your take on it as well?
Taylor 3.18 seems to be for the case where numerator and denominator both have uncertainty. In this case, the relative uncertainties add.
An arithmetic mean seems more like a scaling operation.
old cocky said: “If I have this right, addition/subtraction use absolute uncertainty, and multiplication/division use relative uncertainty. Is that your take on it as well?”
Yes. That is what Taylor says. Note that formally it is u(x +or- y) = sqrt [ u(x)^2 + u(y)^2 ].
old cocky said: “Taylor 3.18 seems to be for the case where numerator and denominator both have uncertainty. In this case, the relative uncertainties add.”
Yes. That is what Taylor says. Note that formally it is u(x *or/ y) = sqrt[ (u(x)/x)^2 + (u(y)/y)^2] * (x / y)
old cocky said: “An arithmetic mean seems more like a scaling operation.”
It is a mix of sums and quotients.
Here is the full derivation of u(q) when q = (x + y) / 2.
Let
q = (x + y) / 2
q_sum = x + y
q = q_sum / 2
Using Taylor 3.18
(1) u(q) / q = sqrt[ (u(q_sum)/q_sum)^2 + (u(2)/2)^2 ]
(2) u(q) / q = sqrt[ (u(q_sum)/q_sum)^2 + (0/2)^2 ]
(3) u(q) / q = sqrt[ (u(q_sum)/q_sum)^2 ]
(4) u(q) / q = u(q_sum) / q_sum
(5) u(q) = u(q_sum) / q_sum * q
Using Taylor 3.16
(6) u(q_sum) = sqrt[ u(x)^2 + u(y)^2 ]
Substitute u(q_sum) in (5) with (6)
(7) u(q) = sqrt[ u(x)^2 + u(y)^2 ] / q_sum * q
(8) u(q) = sqrt[ u(x)^2 + u(y)^2 ] / (x + y) * ((x + y) / 2)
(9) u(q) = sqrt[ u(x)^2 + u(y)^2 ] * [1 / (x + y)] * [(x + y) / 2]
(10) u(q) = sqrt[ u(x)^2 + u(y)^2 ] * (1/2)
As you can see it is not an easy derivation. Because an average contains both sums and quotients you must use both Taylor 3.16 and 3.18. You’ll notice the substitution step (7) using both step (5) [Taylor 3.18] and step (6) [Taylor 3.16].
We can extend this derivation as follows.
Let
u = u(x) = u(y)
Then
(10) u(q) = sqrt[ u(x)^2 + u(y)^2 ] * (1/2)
(11) u(q) = sqrt[ u^2 + u^2 ] * (1/2)
(12) u(q) = sqrt[ 2*u^2 ] * (1/2)
(13) u(q) = sqrt[2] * u * (1/2)
(14) u(q) = u / sqrt[2]
As you can see when the uncertainty of x and y are the same it reduces to the canonical u/sqrt(N) formula.
BTW…all of these derivations were verified by symbolab.com.
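A numerical cross-check of that result for anyone who prefers simulation to algebra. A Monte Carlo sketch in Python (numpy assumed; the true values and uncertainties are made up, and it assumes independent, purely random errors just as the derivation does):

# Monte Carlo sketch checking u((x + y) / 2) = sqrt(u(x)^2 + u(y)^2) / 2 for
# independent random errors, and the u/sqrt(2) special case when u(x) = u(y).
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
x_true, y_true = 10.0, 20.0
u_x, u_y = 0.5, 0.5

x = x_true + rng.normal(0.0, u_x, N)  # simulated independent measurements of x
y = y_true + rng.normal(0.0, u_y, N)  # simulated independent measurements of y

q = (x + y) / 2
print("simulated spread of the average:", q.std())         # ~0.354
print("sqrt(u_x^2 + u_y^2) / 2:", np.hypot(u_x, u_y) / 2)  # 0.3536...
print("u / sqrt(2):", u_x / np.sqrt(2))                    # same, since u_x = u_y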
Thanks.
I think Shackleton’s ship was still stuck in the ice when I did this stuff at Uni, so I’ve been brushing up.
Up to (10), that agrees with https://users.physics.unc.edu/~deardorf/uncertainty/UNCguide.html provided the conditions are met to use root sum squares.
In the case of division by a constant, you can perform the step 9 simplification much earlier to effectively avoid the complexity of the relative uncertainty calculations, and just treat it as a scaling exercise (total absolute uncertainty divided by the number of readings)
Why did you declare u(x) = u(y) for the additional calculations, though? That will only rarely be the case.
It does give you u/sqrt(N), but that will be quite rare.
The issue that none of the trendologists will acknowledge is that the u/root(N) result is only valid for multiple measurements of the same thing. Blindly stuffing an average into the uncertainty formulas of multiple measurements of different things leads to nonphysical uncertainties. As N grows u goes to zero which is absurd. But it gets them the small numbers they need for the GAT.
karlomonte said: “The issue that none of the trendologists will acknowledge is that the u/root(N) result is only valid for multiple measurements of the same thing.”
On the contrary many of Taylor’s examples in section 3.9 and 3.10 and the problems starting on pg. 79 are of functions of different measurements of different things. Taylor makes it pretty clear. The rules he gives his readers in section 3 are meant to be used for measurements of different things even to point where those things have different units of measure.
Clown.
You basically pegged the problem. u/root(N) is the SEM. The SEM tells you how close you are to the average value. It does *NOT* tell you how accurate that average value is.
The SEM is only a valid indicator when the probability distribution is Gaussian and when all measurement uncertainty can cancel.
Neither restriction is met for temperatures.
Climate science needs to join the 21st century. Some of this can be excused because early records only recorded Tmax and Tmin (although sometimes very poorly). But many measuring stations since the 80’s make measurements much more often than twice per day. Those would give us a 30 year record to look at and analyze as something other than a pseudo-probability distribution.
I predict the whining about Kip’s next installment will be loud and long.
It’s a bit like the old saying – “if you laid all the economists in the world end to end they wouldn’t reach a conclusion”
That’s a good one.
old cocky said: “Up to (10), that agrees with https://users.physics.unc.edu/~deardorf/uncertainty/UNCguide.html provided the conditions are met to use root sum squares.”
That link uses the law of propagation of uncertainty which is JCGM 100:2008 equation 10 or Taylor equation 3.47. That is actually the method I prefer because it works for any measurement model. It’s also one of the methods the NIST uncertainty machine uses.
old cocky said: “In the case of division by a constant, you can perform the step 9 simplification much earlier to effectively avoid the complexity of the relative uncertainty calculations, and just treat it as a scaling exercise”
Absolutely. There are a lot of ways to skin the cat here. Each way gives you the same exact answer. Using Taylor 3.16 and 3.18 is a kludgy way of doing it in general. I’m just proving that TG’s preferred approach here works. Like I said above, I’d rather use Taylor 3.47 or JCGM 100:2008 equation 10 as I find it more elegant and it works more generally.
old cocky said: “Why did you declare u(x) = u(y) for the additional calculations, though? That will only rarely be the case.”
u(x) would be close if not outright equal to u(y) in many cases. This is especially true if you are measuring x and y with the same instrument.
Uncertainty adds term by term.
The terms are x/2 and y/2.
The accuracy of the average is
u(q_avg) = sum of the uncertainties of each term in the average.
u(avg)^2 = [u(x)/x]^2 + [u(y)/y]^2 (best case)
u(avg) = u(x) + u(y) (worst case)
If you have two boards, one 24″ +/- 1″ and one 120″ +/- 2″ their average length is 144/2″ = 72″. What is the uncertainty of the two boards when nailed together?
AVG = 72, uncertainty of the combined length is sqrt[1^2 + 2^2] = ±2.23″ *IF* you assume cancellation happens. The worst case uncertainty would ±1″ + ±2″ = ±3″.
The 24″ board could vary from 23″ to 25″ The 120″ board could vary from 118″ to 122 inches.
The average of the negative side would be (23 + 118)/2 = 70.5. The average of the positive side would be (25 + 122)/2 = 73.5. So your average value should be stated as 72 ± 1.5.
It would make no sense to divide the uncertainty by 2. That would unjustifiably make the uncertainty smaller than it is.
And that is what climate scientists and statisticians do, unjustifiably divide the accuracy of the mean by 2 in order to make it look smaller.
Nor does it help to calculate the SEM. The SEM of a 24″ data point and a 120″ data point is like 48. I’m not even sure what that would tell you.
I picked widely separated numbers to try and show what happens when you find the mid-range value of daily temperatures, each of which has a measurement uncertainty. First, it’s not an average temperature, it is a mid-range temperature. Second when you average day1 mid-range temp with day2 mid-range temp you *must* propagate the uncertainties of each onto the resulting average. The uncertainty is *not* the SEM of the two values, just like it isn’t in the example above.
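The arithmetic in the two-board example, spelled out as a Python sketch for anyone who wants to check the numbers (it prints both combination rules and the endpoint averages quoted above, without taking a side on which rule applies in a given measurement situation):

# Sketch of the two-board arithmetic above: 24" +/- 1" and 120" +/- 2".
import math

len1, u1 = 24.0, 1.0
len2, u2 = 120.0, 2.0

combined = len1 + len2
print("combined length:", combined)              # 144
print("quadrature:", math.sqrt(u1**2 + u2**2))   # ~2.24
print("direct addition:", u1 + u2)               # 3.0

avg = combined / 2
low_avg = ((len1 - u1) + (len2 - u2)) / 2        # 70.5
high_avg = ((len1 + u1) + (len2 + u2)) / 2       # 73.5
print("average and endpoints:", avg, low_avg, high_avg)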
TG said: “u(q_avg) = sum of the uncertainties of each term in the average.”
None of the Taylor rules say that.
TG said: “u(avg)^2 = [u(x)/x]^2 + [u(y)/y]^2 (best case)”
Taylor 3.16 says:
u(avg)^2 = u(x/2)^2 + u(y/2)^2.
What you posted is Taylor 3.18 which says:
[u(x)/x]^2 + [u(y)/y]^2 = [u(x/y) / (x/y)]^2
Notice that [u(x/y) / (x/y)]^2 (what Taylor said) does not equal u((x+y)/2)^2 (what you said).
TG said: “It would make no sense to divide the uncertainty by 2.”
And yet that is exactly what Taylor 3.9 and 3.18 say to do when q = Bx where B is known exactly and x is a measurement with uncertainty u(x) resulting in u(q) = u(x) * B.
“The worst case uncertainty would ±1″ + ±2″ = ±3″.”
“So your average value should be stated as 72 ± 1.5.”
“It would make no sense to divide the uncertainty by 2.”
That’s literally what you’ve just done. 3 / 2 = 1.5.
addition/subtraction doesn’t always have to add. Say you are calculating the length of a pole you are putting up for an antenna. The pole consists of a 20″ piece of irrigation pipe and a shorter piece of something else that is 3″ long, perhaps by welding them together. Because of sag in the measuring tape you may only be able to measure the long pipe to the nearest 1″. The shorter pipe you can measure to the nearest 0.15″. Big difference in the two absolute uncertainties. But their relative uncertainties are both the same, 0.05 (5%). Add the two absolute uncertainties in quadrature and you get sqrt[ 1^2 + 0.15^2 ] = +/- 1.01″. Do the relative uncertainties and you get sqrt[ .05^2 + .05^2 ] = .07 (7%). 23 x .07 = 1.6″. Big difference. Which one would you use?
“Which one would you use?”
The first, obviously. You don’t use relative uncertainties when adding or subtracting.
The uncertainty of the shorter piece contributes less to the uncertainty of the sum than that of the longer piece, but when you use relative uncertainty you’re saying they are each as important as the other.
Suppose the absolute uncertainties were the same, ±1 inch. Using the relative uncertainties you would be claiming its relative uncertainty of 0.33 could be projected on to the total uncertainty: sqrt[ 0.05^2 + 0.33^2 ] ~= 0.33. This would make the uncertainty of the two pieces added together 23 ± 7.59″, despite the fact that both pieces only had an uncertainty of ±1″.
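The pole example worked both ways as a Python sketch, for comparison (arithmetic only; it reproduces the two numbers quoted above):

# Sketch of the antenna-pole arithmetic: a 20" pipe measured to +/- 1" joined to
# a 3" piece measured to +/- 0.15". For a sum the absolute uncertainties combine;
# combining the relative uncertainties instead gives a different answer.
import math

L1, u1 = 20.0, 1.0
L2, u2 = 3.0, 0.15
total = L1 + L2

u_abs = math.sqrt(u1**2 + u2**2)              # ~1.01"
rel = math.sqrt((u1 / L1)**2 + (u2 / L2)**2)  # ~0.071 (each piece is 5%)
u_from_rel = rel * total                      # ~1.6"

print("total:", total)
print("absolute uncertainties in quadrature:", round(u_abs, 2))
print("relative uncertainties in quadrature, scaled by the total:", round(u_from_rel, 2))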
Typo.
Per Taylor 3.18 (and others) the uncertainty u(x / y) = sqrt[ (u(x)/x)^2 + (u(y)/y)^2].
should be
Per Taylor 3.18 (and others) the uncertainty u(x / y) = sqrt[ (u(x)/x)^2 + (u(y)/y)^2] * (x / y)
y can be a constant in which case its contribution to uncertainty is ZERO!
And if (x+y)/2 = x/2 + y/2 then the uncertainty of each term, x/2 and y/2, ADDS.
And the uncertainty of x/2 ==> u(x)/x. The uncertainty of y/2 ==> u(y)/y.
You simply have no math legs to stand on.
TG said: “y can be a constant in which case its contribution to uncertainty is ZERO!”
The example presented was q = (x + y) / 2 where x and y are measurements with uncertainty u(x) and u(y). If you want to reframe the problem as q = (x + c) / 2 where c is a constant and only x is the measurement with uncertainty u(x) then fine. Just know that we are discussing a different example.
TG said: “And if (x+y)/2 = x/2 + y/2 then the uncertainty of each term, x/2 and y/2, ADDS.”
The uncertainty of u(x/2) and u(y/2) adds in quadrature. But the uncertainty of u(x/2) is not the same as u(x) nor is u(y/2) the same as u(y).
TG said: “And the uncertainty of x/2 ==> u(x)/x. The uncertainty of y/2 ==> u(y)/y.”
That’s not what Taylor 3.18 says.
(1) u(x/2) / (x/2) = sqrt[ (u(x)/x)^2 + (u(2)/2)^2 ]
(2) u(x/2) / (x/2) = sqrt[ (u(x)/x)^2 + (0/2)^2 ]
(3) u(x/2) / (x/2) = sqrt[ (u(x)/x)^2 ]
(4) u(x/2) / (x/2) = u(x)/x
(5) u(x/2) = u(x) / x * (x / 2)
(6) u(x/2) = u(x) / 2
There is a similar derivation for u(y/2) as well.
Then Taylor 3.16 says.
(7) u((x/2) + (y/2)) = sqrt[ u(x/2)^2 + u(y/2)^2 ]
(8) u((x/2) + (y/2)) = sqrt[ (u(x)/2)^2 + (u(y)/2)^2 ]
(9) u((x/2) + (y/2)) = sqrt[ u(x)^2/4 + u(y)^2/4 ]
(10) u((x/2) + (y/2)) = sqrt[ u(x)^2 + u(y)^2 ] / 2
And if we let u = u(x) = u(y) then we have:
(11) u((x/2) + (y/2)) = sqrt[ u^2 + u^2 ] / 2
(12) u((x/2) + (y/2)) = sqrt[ 2 * u^2 ] / 2
(13) u((x/2) + (y/2)) = sqrt[2] * sqrt[u^2] / 2
(14) u((x/2) + (y/2)) = sqrt[2] * u / 2
(15) u((x/2) + (y/2)) = u / sqrt[2]
TG said: “You simply have no math legs to stand on.”
I verified these derivations with MATLAB and symbolab.com.
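The same result can be checked symbolically with the general propagation formula mentioned elsewhere in this thread (Taylor 3.47 / JCGM 100:2008 equation 10). A sketch assuming sympy is available:

# Symbolic check of the derivation above using the general propagation formula
# u(q)^2 = (dq/dx)^2 u(x)^2 + (dq/dy)^2 u(y)^2 for independent x and y.
import sympy as sp

x, y, ux, uy = sp.symbols("x y u_x u_y", positive=True)
q = (x + y) / 2

u_q = sp.sqrt(sp.diff(q, x)**2 * ux**2 + sp.diff(q, y)**2 * uy**2)
print(sp.simplify(u_q))               # sqrt(u_x**2 + u_y**2)/2
print(sp.simplify(u_q.subs(uy, ux)))  # sqrt(2)*u_x/2, i.e. u_x/sqrt(2)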
I DID a sum!
Uncertainties of a sum ADD.
q = (x+y)/2 which does go to: q = x/2 + y/2 A SUM!
You have *NOT* shown that the relative uncertainty of x/2 is different from u(x) / x.
Do you not understand that in the formula q = x/y that Y CAN BE A CONSTANT? E.g x/2!
The only thing that needs to be redone is your understanding of how to do basic algebra!
TG said: “I DID a sum!”
Taylor 3.18 is for products and quotients only.
TG said: “q = (x+y)/2 which does go to: q = x/2 + y/2 A SUM!”
I see / (quotients) in there. Do you seriously not see them?
TG said: “Do you not understand that in the formula q = x/y that Y CAN BE A CONSTANT? E.g x/2!”
x and y can be any value. That’s not the point. The point is that x/y is not the same as x/2 + y/2 or (x+y) / 2 nor is u(x/y) the same as u(x/2 + y/2).
q = x + y
u(q) = u(x) + u(y)
q_avg = (x + y) /2
u(q_avg) = u(x) + u(y) + u(2) = u(x) + u(y)
u(q_avg) = u(q)
u(avg) = [ u(x) + u(y) ] /2
u(avg) ≠ u(q_avg)
It’s really just that simple!
Because you have zero understanding of the toys you play around with.
The hammerticians in this case seem to think you can hammer integer measurements with statistics until you can say, “look, we just got precision down to the one-thousandths decimal place.” That they destroyed the information imparted by the original measurements is just so much flotsam on the floor.
I thought this site was going to institute some form of review process before contributions got published?
If so it has failed miserably on this occasion.
Ahhh, so you are one of the censorship advocates.
I’ve provided you authoritative references showing Kip is correct. The fact that you believe they are liars or are ignorant is *your* problem, not Kip’s.
I’ve yet to see an authoritative reference which backs up the method of uncertainty propagation in the article. That’s because there isn’t one.
It’s not about censorship it is about having at least some clue about what you are writing about.
What is your expertise?
Taylor (from around equation 3.15)
“Suppose, for example, that q = x + y is the sum of two lengths x and y measured with the same steel tape. Suppose further that the main source of uncertainty is our fear that the tape was designed for use at a temperature different from the present temperature. If we don’t know this temperature (and don’t have a reliable tape for comparison), we have to recognize that our tape may be longer or shorter than its calibrated length and hence may yield readings under or over the correct length. This uncertainty can be easily allowed for. The point, however, is that if the tape is too long, then we underestimate both y and x, and if the tape is too short, then we overestimate both x and y. Thus, there is no possibility for the cancellations that justified using the sum in quadrature to compute the uncertainty in q = x + y.”
tpg note, the uncertainty can be easily allowed for *IF* you know the calibration temperature and its coefficient of expansion. But, and this is a big but, how many steel tapes come with this marked on the tape so you will know both?
I’m aware you must consider Taylor a charlatan and a liar but to many of us he is considered an expert on uncertainty. When you just dismiss his expertise in favor of your own unproven knowledge of uncertainty procedures it doesn’t help your credibility at all.
They are not independent uncertainties. They are correlated and that is also dealt with in the standard sources.
On top of the bias in the measuring tape, there is also random uncertainty in reading it. Which would have a normal distribution.
It’s also clear that if you have a bias you should try and correct for it, whilst also propagating the uncertainty of that correction factor.
I never said Taylor is a charlatan. What I am saying is you have some serious misunderstandings about what he has written.
No, I understand Taylor, Bevington, et al just fine.
How do you correct for the systematic bias of temperature measurements for the instrument in Chicago? In Des Moines? In Colorado Springs?
u_total = u_systematic + u_random
If you don’t know the term u_systematic then how much cancellation can you assume for u_random? Can you even be confident that u_random is Gaussian?
You need to propagate the *entire* uncertainty interval forward if you don’t know u_systematic. It’s why the Federal Meteorology Handbook No 1 specifies +/- 0.6 C for federal temperature measurement devices. That’s an estimate of both systematic and random uncertainty for those devices.
Tell you what, watch this and tell everyone what the uncertainty of an average of two single and different temperature measurements from different times and/or locations actually is.
Uncertainties – Physics A-level & GCSE
Now let’s do some trim carpentry. You are doing an outside corner. You measure the angle and the length. You go out and mark the length using a different tape. You set your saw to the angle you measured. What is the uncertainty in the length and the angle of the finished piece? Will it match precisely? Will you need to cut another board? Why or why not?
This is the everyday stuff trades people have to deal with. Measurement uncertainty is an issue. Why do you think these folks pay dearly for high precision tools?
Lastly read this post from long ago. The same questions are still here.
https://wattsupwiththat.com/2012/01/26/decimals-of-precision-trenberths-missing-heat/
Then search for this post from E.M. Smith
January 27, 2012 2:59 am
You forgot the link to the physics paper, you just posted the title.
I don’t get how your carpentry example alters anything. All you are saying is the measurements need to be done with low uncertainty. OK I agree, but that does not alter the theory we are discussing one way or the other.
The theory we are discussing seems to be whether the uncertainty of the average is also the average uncertainty. Theory says they are two different things.
No. None of us are discussing that. We all know that u(Σxi/N) != Σu(xi)/N already. Furthermore none of us care about Σu(xi)/N. The discussion is and has been of u(Σxi/N) and why it is less than u(Σxi) where xi are independent.
It seems to you that is what we are discussing, because you keep asserting it’s what anyone believes, no matter how many time it’s pointed out we don’t.
There has always been some informal review. I’ve had an article rejected.
“The hammerticians in this case seem to think you can hammer integer measurements with statistics until you can say, “look, we just got precision down to the one-thousandths decimal place.”
Nobody claims the precision of global anomalies is down to thousandths of a degree. The quoted uncertainties are usually around ±0.05°C, becoming much larger as you go back in time.
This is just your obsession with decimal places. I would prefer they publish to as many decimal places as they want. Needless adherence to some perceived “rule” about how many decimal places you report just increases the uncertainty.
But, yes, one of the benefits of statistics is that it is possible to know a mean with more precision than any of the individual measurements. I feel sorry for anyone who can’t or won’t understand this.
“That they destroyed the information imparted by the original measurements is just so much flotsam on the floor.”
Again, taking an average does not destroy data.
You never answer the question that was posed do you? Answer this:
That they destroyed the information imparted by the original measurements is just so much flotsam on the floor.
One-hundredths or one-thousandths, ho hum.
How do you get either from integer readings of a thermometer? How do you get error intervals of one-hundredths or one-thousandths from single integer readings? How do you get one-thousandths from single measurements of even tenths readings?
You ignore that these are single readings from which you do not have data from multiple readings of the same thing. Neither the GUM nor any uncertainty text I have deals with finding uncertainty based upon single readings of DIFFERENT things. NONE!
Experimental uncertainty can be calculated from different readings of similar things such as the same experiment done multiple times and performed as similarly as possible. Multiple rods stamped from the same machine can have tolerances, i.e., error calculations, but again from the same machine. But you should know that experimental uncertainty is usually expanded as Dr. Possolo did in TN1900 for a reason. It seldom allows for uncertainty in the ranges you see with temperature. Look at Example E2 in TN1900. The temps were recorded to the hundredths decimal. What did the error calculation result in?
“You never answer the question that was posed do you?”
I try to answer as many as possible, but I’m always being bombarded with inane questions about the lengths of planks or the specifications of lawn mowers, and I don’t have infinite time. Which particular question did you think I missed? The comment I was replying to didn’t have a single question in it, just assertions.
“Answer this”
I’ll try.
“That they destroyed the information imparted by the original measurements is just so much flotsam on the floor.”
Not a question. But I already told you they don’t destroy data.
“One-hundredths or one-thousandths, ho hum.“
Still not a question.
“How do you get either from integer readings of a thermometer?”
Almost a question, but I’m not sure what the “either” refers to. The flotsam, the one-hundredths, the one-thousandths, the supposedly destroyed data.
If all you are asking is how do you get more precise measurements from low resolutions thermometers, I’ve already told you. By averaging.
“How do you get error intervals of one-hundredths or one-thousandths from single integer readings?”
Have you heard that uncertainty is not error? Someone keeps shouting it.
If you want to get a very narrow uncertainty interval from integer readings you will need a very large sample, and to define what uncertainty you are talking about. If you can assume that there are no systematic biases, that the only source of uncertainty is rounding to the nearest integer, etc. And if you are only interested in measurement uncertainty, and not the much larger uncertainty from sampling. Then the standard formula is the individual uncertainty divided by the square root of the sample size. The individual uncertainty in this case has a uniform distribution and a range of 1, so I make the standard uncertainty to be around 0.3°C.
To get the standard uncertainty of the mean down to 0.01°C would require (0.3 / 0.01)^2 samples, around 900. To get it to 0.001°C would require 90,000.
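A quick simulation of that rounding-only claim, under exactly the stated assumptions (no systematic bias, uniform rounding error, numpy assumed). A sketch, not a claim about real temperature records:

# Sketch of the rounding-only claim: readings rounded to the nearest integer,
# no systematic bias assumed. One rounded reading carries a standard uncertainty
# of 1/sqrt(12) ~= 0.29; averaging N independent readings shrinks that rounding
# contribution by sqrt(N). This illustrates only the stated assumptions and says
# nothing about systematic bias or sampling uncertainty.
import numpy as np

rng = np.random.default_rng(1)

def rounding_error_of_mean(n, trials=5_000):
    true = rng.uniform(0.0, 30.0, size=(trials, n))  # arbitrary "true" values
    err = np.round(true) - true                       # rounding error of each reading
    return err.mean(axis=1).std()                     # spread of the mean rounding error

print("one reading, 1/sqrt(12):", 1 / np.sqrt(12))    # ~0.289
for n in (1, 100, 900):
    print(n, "readings:", round(rounding_error_of_mean(n), 4))  # ~0.29, ~0.03, ~0.01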
Liar! Still clinging to your nonphysical nonsense.
Bellman has already said he doesn’t believe in significant digit rules – that they are a farce!
Therein lies the world view of a statistician who lives in another dimension than the rest of us.
I don’t remember saying they were a farce, but I do think some here take them to a farcical conclusion.
And the difference is … what?
Over your head.
Exactly!
Continued.
“You ignore that these are single readings from which you do not have data from multiple readings of the same thing.”
I’m not ignoring it, I’m assuming that is the case. (It’s possible that some thermometers do give a reading based on a short term average, but I assume that’s accounted for in the stated uncertainty.)
“Neither the GUM nor any uncertainty text I have deals with finding uncertainty based upon single readings of DIFFERENT things. NONE!”
That’s completely wrong, as has been pointed out many times. All the rules for propagating error or uncertainty are based on the assumption of functions of multiple different things. Look at the example Tim gets very excited about, the uncertainty of the volume of a cylinder based on the measurement of two different things, height and radius.
“But you should know that experimental uncertainty is usually expanded as Dr. Possolo did in TN1900 for a reason.”
If you are going to throw out random examples like this, a link would be a help. I’m guessing you mean this
https://nvlpubs.nist.gov/nistpubs/TechnicalNotes/NIST.TN.1900.pdf
But I’m not sure what point you are making. In that example, whilst there’s discussion about different types of uncertainty, the actual uncertainty is calculated from the SEM of the stated values. That is the sampling uncertainty.
In any case the SEM is expanded. Why would it not be? The point is the standard uncertainty is multiplied by a covering factor to get the 95% confidence interval. (In this using a student-t distribution because the sample size is small).
I’m really not sure what point you are making here.
“It seldom allows for uncertainty in the ranges you see with temperature.”
But that example is for just one station with partial data. Of course it will be much less certain than the average of hundreds of stations.
“The temps were recorded to the hundredths decimal.”
They were recorded to the nearest quarter of a degree, not to a hundredth of a degree. Beware basing assumption on the number of decimal places.
“What did the error calculation result in?”
Much larger, because as I said, but it’s worth repeating, this is not based on measurement uncertainty, but on the random nature of the fluctuating temperatures. across the month. It’s saying we have 22 random readings and want to find the best estimate of the average maximum temperature for that month.
Personally, I’m not sure if this makes much sense, as you have 2/3rds of the population in your reading.
And, once again, we throw away measurement uncertainty by assuming a Gaussian distribution of measurements so we can use the SEM as a measure of uncertainty.
Once again, GUM, Annex H, Equation H.36 tells you how to handle individual measurements of different things.
H.36 u_total^2 = s_avg^2(x1)/m + s_avg^2(x2)/n
where m and n are the number of measurements taken for each object.
When you have single measurements of different things then s_avg becomes the uncertainty interval for that single measurement and m,n, etc become 1.
That makes H.36
u_total^2 = u^2(x1) + u^2(x2)
And you can extend x_i to as many terms, i.e. as many different objects, as you have in the population.
This is why he is called “bellcurveman”.
Notice how he ignores the GUM when it doesn’t give the answer he wants (and needs).
He still hasn’t explained how H.36 contradicts his assumptions about all measurement uncertainty cancelling.
I did, but as usual you ignored it.
I’ve still no idea why you think it proves your point. All you are saying it reduces to is
u(x – y) = root[u(x)^2 + u(y)^2]
just as every other rule states.
As always you are pointing at equations for propagation involving addition or subtraction, and ignoring what happens when you take an average.
What do you think the word “total” means? Does it mean the same as “average”?
It doesn’t, just an angry word salad rant.
“If all you are asking is how do you get more precise measurements from low resolutions thermometers, I’ve already told you. By averaging.”
So now we are back to both ignoring measurement uncertainty propagation AND ignoring significant digit rules.
If I average measurements 1 and 4 then *YOU* would get 2.5 as the average. You would go from 1 significant digit in the elements to 2 significant digits in the average.
If you had measurements of 2.7 and 2.8 you would average them and get 2.75. You would go from 2 significant digits to 3 significant digits.
And *YOU* think this is a valid use of significant digits? That you can *really* increase the precision of measurements from the tenths to the hundredths merely by averaging?
Significant digit rules say you should have no more digits to the right of the decimal point than the smallest number of digits to the right of the decimal in all of the elements being added.
There *IS* a reason for using significant digits in the physical world, the *real* world. It gives an indication of how precise your actual measurements were. Using more significant digits in your result than are in your actual measurements is misleading people.
Of course statisticians and climate scientists seem to have no problem with misleading people.
“If I average measurements 1 and 4 then *YOU* would get 2.5 as the average. You would go from 1 significant digit in the elements to 2 significant digits in the average. ”
I could say you get 2 ½. This is one of the problems with significant figure “rules”: they are entirely dependent on using base 10. You try to give the impression that 2.5 is an order of magnitude more precise, when really it’s just the natural mid point of the two values you measured. And self-evidently that’s a better estimate of the average than 2 or 3.
“Significant digit rules say you should have no more digits to the right of the decimal point than the smallest number of digits to the right of the decimal in all of the elements being added.”
Then the rule is a ass. As I keep saying I don’t agree with that “rule”. It makes sense if you have no time or ability to do the actual uncertainty analysis, and if you are only averaging a small number of things, but it’s no substitute for the rules laid down in all your metrology works.
Oh the irony.
“I could say you get 2 ½”
How many engineering or physics papers have you read where they use fractions with their measurements?
Yes, *YOU* could say this. I don’t know of anyone in the engineering field or the physical science field that does this! Have you ever seen any measurement quoted as 1/2 x 10^-3?
“You try to give the impression that 2.5 is an order of magnitude more precise, when really it’s just the natural mid point of the two values you measured.”
You totally miss the point, as usual. 2.5 has 2 significant digits and one significant digit to the right of the decimal. What does that have to do with it being more or less precise than 1/2?
“Then the rule is a ass.”
There is a REASON why significant digit rules are used in the real world. No one really cares that you think it is an ass of a rule. I don’t know of any one in the real world of engineering or physical science that doesn’t follow the significant digit rules.
As usual, you are living in your own dimension of statistics and consider .333333…. to be infinitely precise!
“ it’s no substitute for the rules laid down in all your metrology works.”
The significant digit rules lie at the very base of metrology! No one has an infinitely precise measurement device for measuring anything!
From the mathdoctors.org site: “So significant digits are one way to keep track of how much accuracy you can expect in a calculation based on measurements, so that you can tell if an observation differs significantly from the theory being tested.”
But we should all be aware by now that you don’t really care about being able to tell if an observation differs from theory. You don’t even care about measurement uncertainty and just assume that all stated values are 100% accurate!
“The significant digit rules lie at the very base of metrology!”
So much so, that the standard guide to expressing uncertainty completely fails to mention them, and just says you should not express the uncertainty to an excessive number of digits.
“But we should all be aware by now that you don’t really care about being able to tell if an observation differs from theory.”
Of course I care. I would just prefer to tell using established statistical techniques rather than some hand waving about an arbitrary counting system.
“Sol much so, that the standard guide to expressing uncertainty, completely fails to mention them, and just says you should not express the uncertainty to an excessive number of digits.”
The very first part of Taylor’s book is about significant digits. Bevington discusses them on pages 4 and 5 of his book, right at the start.
Even the GUM says: “It usually suffices to quote u_c(y) and U [as well as the standard uncertainties u(xi) of the input estimates xi] to at most two significant digits, although in some cases it may be necessary to retain additional digits to avoid round-off errors in subsequent calculations.”
“Of course I care. I would just prefer to tell using established statistical techniques rather than some hand waving about an arbitrary counting system.”
Significant digit rules are *NOT* an “arbitrary counting system”. They are a system developed long ago to guide scientists and engineers in how to specify results.
I’m not surprised you don’t believe in these rules. It’s just one more plank in the box you live in with your delusions.
“The very first part of Taylor’s book is about significant digits. Bevington discusses them on Pages, 4 and 5 of his book, right at the start.”
Yes, but neither of them, nor the GUM, are using the rules I’m complaining about. The ones where you determine the number of figures based simply by looking at the value with the smallest number of figures, or whatever. And especially not the one you keep stating, that you cannot use more decimal places in an average than in any individual measurement.
What they all say is to calculate the propagated uncertainty, quote that uncertainty to a few significant figures, and quote the measurement to the same position as the quoted uncertainty.
As I pointed out before, Taylor at least, says that the number of figures used in an average may be more than the individual measurements.
“Significant digit rules are *NOT* an “arbitrary counting system”.”
By arbitrary counting system I meant counting in base 10.
“They are a system developed long ago to guide scientists and engineers in how to specify results. ”
The Appeal to Tradition Fallacy, you are so fond of pointing out.
“It’s just one more plank in the box you live in with your delusions.”
You’re the one bound in by rules. I’m trying to think outside the box.
“Yes, but neither of them, nor the GUM, are using the rules I’m complaining about.”
Malarkey!
“As I pointed out before, Taylor at least, says that the number of figures used in an average may be more than the individual measurements.”
Taylor says: “Experimental uncertainties should almost always be rounded to one significant figure”
He then says: “The last significant figure in any stated answer should usually be of the same order of magnitude (in the same decimal position) as the uncertainty.”
He also says: “To reduce inaccuracies caused by rounding, any numbers to be used in subsequent calculations should normally retain at least one significant figure more than is finally justified. At the end of the calculations, the final answer should be rounded to remove extra, insignificant figures.”
You continue to gaslight people. What Taylor says is in perfect alignment with the rules you are complaining about – *all* of them!
And I agree with Taylor’s suggestions, (although I’d prefer more than one significant figure for uncertainty), but this is not what you are claiming for the rules of significant figures. In particular they don’t agree with the claim that you must not report an average to more decimal places than the individual measurements.
As I said in another part of the thread. The resolution of a measurement conveys a given amount of information to other people. It is important, especially in science, not to use numbers that lead one to believe that measurements were done using a device with more resolution than was actually used. Adding additional information to a measurement or portraying a calculation as having more information than was originally measured is basically writing fiction.
You can’t take two measures to the nearest inch and do any statistics, calculations, or other procedure to make those measures any more than what is stated. Averaging them and saying the result is more accurate and has more precision is adding fictitious information.
From: Microsoft Word – Document3 (purdue.edu)
From: Microsoft Word – SIGNIFICANT FIGURES (chem21labs.com)
From: Significant Figures Lab | Middlebury College Chem 103 lab
Look at the NIST TN1900, Example 2. Why do you think the average temperature he computes has one decimal digit while the temperatures were shown as having two? Read the above references and you’ll have a good idea.
Do I need to supply more laboratory references before you believe that these are common rules used by everyone dealing with measurements?
Yes, all those are what I say I don’t agree with. Just pointing out that some educational establishments have documents saying that a mean cannot be more accurate than the least accurate measurement doesn’t make it correct. That’s just an argument from authority.
If you don’t think a mean can be more accurate (or should that be precise) than an individual measurement, why are you even taking a mean? Why take multiple measurements of the same thing if it’s not to reduce uncertainty?
Note, that your third example has a different rule for significant figures in a mean.
Using that rule could well mean that an average of measurements taken in integers could be written to one decimal place.
It’s not an argument from authority. It’s not a “because so-and-so says so”. It’s backed up by factual information as to why it’s done that way.
If you abandon the rules then .3333333… *is* INFINITELY accurate and precise. Where would *you* draw the line? It sounds like it would be just where *YOU* choose to make it.
“Using that rule could well mean that an average of measurements taken in integers could be written to one decimal place.”
No, it doesn’t.
“It’s backed up by factual information as to why it’s done that way.”
Then show me the factual evidence that proves the mean can’t be more accurate than the accuracy of any of the elements, rather than just regurgitating the rules.
“If you abandon the rules then .3333333… *is* INFINITELY accurate and precise.”
It is, but that’s got nothing to do with this argument.
“Where would *you* draw the line?”
In mathematics I’d say it was ⅓ and there would be no need to draw any line.
In the cruder sciences I’d look at how it was measured and report the digits to what was required by the uncertainty.
“No, it doesn’t.”
I see we’re into pantomime season.
Yes it does. Say I measure something a number of times to the nearest cm. Half the time it comes up 2cm the other times 3cm. The average is 2.5cm, the standard deviation is 0.5cm. The first decimal place of the standard deviation is in the 0.1 column. Therefore the last significant digit of the average should be the 0.1 column. Hence the average is 2.5cm, not 3cm.
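To make that concrete, here is a minimal Python sketch (assuming ten readings, half at 2 cm and half at 3 cm; the population standard deviation is used so it matches the 0.5 cm figure quoted above):

import statistics

readings = [2, 2, 2, 2, 2, 3, 3, 3, 3, 3]   # cm, rounded to the nearest cm
mean = statistics.mean(readings)             # 2.5 cm
sd = statistics.pstdev(readings)             # 0.5 cm (population SD)
print(mean, sd)
# The leading digit of the SD sits in the 0.1 column, so the mean is
# quoted to the 0.1 column as well: 2.5 cm, not 3 cm.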
“Then show me the factual evidence that proves the mean can’t be more accurate than the accuracy of any of the elements, rather than just regurgitating the rules.”
Go buy two boards of different lengths (both greater than 3m) from two different lumberyards. Measure them with a ruler marked in centimeters and millimeters and estimate the uncertainty of your measurements. Did you place each board exactly at the zero mark on the ruler? Did you place the ruler zero mark *exactly* at the end mark from the previous measurement? How wide was your pencil mark, 0.5mm, 0.7mm, 1mm?
Now calculate the average length of the two boards combined. Tell me how that average can be more accurate than the accuracy of the individual measurements you took.
Show your work.
As usual, rather than giving evidence, you give a simple example which cannot refute my argument. My point is that the average can have an uncertainty small enough to justify using more decimal places than the original measurements, not that it will under all circumstances. Even if your example made sense, it could not prove your general claim.
When I say an average can have better resolution than the individual measurements I’m thinking of examples such as a global temperature made up of hundreds or thousands of measurements, not two wooden boards.
“Measurement them with a ruler marked in centimeters and millimeters and estimate the uncertainty of your measurements.”
It’s really up to you to tell me what the uncertainty is, and then to tell me how many digits you want to report the individual measurements in. Most of the claims here are about the resolution of the measuring device which in this case would be 1mm, or possibly 0.1mm. But you then suggest there are bigger uncertainties, so the resolution isn’t that important.
So how many places would you report the measurement of one board to?
“Now calculate the average length of the two boards combined.”
As always you don’t explain why you want to average the two boards. I’ll assume you want an exact average, rather than a sample, though I can’t think why you would be doing this.
“Tell me how that average can be more accurate than the accuracy of the individual measurements you took.”
OK, let’s make some assumptions here. Say the standard uncertainty of each measurement was 5.0mm; the combined standard uncertainty of the mean is 5 / sqrt(2) ~= 3.5mm. So it would be more accurate than either of the individual measurements, but you would report it to the same number of decimal places. Depending on which rules you were using this could be reported to 1mm or 0.1mm.
But now what happens if you have 100 different boards, or if you measure the same board 100 times? The measurement uncertainty would now be 5 / sqrt(100) = 0.5mm, and by any of the metrological rules you could report it to an extra decimal place.
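As a rough Python sketch of that scaling (assuming independent measurements, each with the 5.0 mm standard uncertainty assumed above):

import math

u_single = 5.0                           # mm, assumed standard uncertainty of one measurement
for n in (2, 100):
    u_mean = u_single / math.sqrt(n)     # standard uncertainty of the mean of n measurements
    print(n, round(u_mean, 2))           # 2 -> 3.54 mm, 100 -> 0.5 mm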
If you want an example of how even a few measurements (just 3 measured to the nearest second) can increase the number of significant figures, see Taylor exercise 4.15, and specifically his comment about what the exercise is meant to illustrate.
While we are at it, look at exercise 2.30 to see an example of why the “well-known” rule can be misleading in the opposite direction. Giving you more precision than is warranted.
“Look at the NIST TN1900, Example 2. Why do you think the average temperature he computes has one decimal digit while the temperatures were shown as having two?”
As I pointed out before, the measurements are not being made in hundredths of a degree, but in quarter degrees. Which I think illustrates the danger of relying on the number of decimal places to indicate uncertainty.
But that’s irrelevant to this example, because the uncertainty estimate ignores the uncertainty of the individual measurements and just uses the SEM of the 22 daily measurements to estimate the uncertainty of the monthly average. (You know, like I keep suggesting and people here keep insisting is not the uncertainty of the average.)
In this case the SEM is 0.872°C, and a coverage factor of 2.08 is used to give the 95% confidence interval of roughly ±1.81°C. From this, two significant figures are used, and so the result is given to 0.1 of a degree: 25.6 ± 1.8°C, or [23.8, 27.4].
The reason this is given to 1 decimal place is for the same reason I’ve said before, it’s the last digit of the uncertainty of the mean.
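For anyone who wants to check the arithmetic, a short Python sketch (the 22 daily values and the 0.872 °C SEM are as stated for TN1900 Example 2; scipy supplies the Student-t coverage factor):

from scipy import stats

n = 22                               # daily readings in the month
sem = 0.872                          # degrees C, standard error of the mean
k = stats.t.ppf(0.975, df=n - 1)     # coverage factor, about 2.08 for 95%
U = k * sem                          # expanded uncertainty, about 1.81 C
print(round(k, 2), round(U, 2))
# Quoting U to two figures (1.8 C) puts its last digit in the 0.1 column,
# so the mean is reported to the same position: 25.6 +/- 1.8 C.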
Read the assumptions Possolo makes. He basically made the same assumption you do – measurement uncertainty all cancels out!
I’m not the one who keeps bringing this up as an example.
Being a devil’s advocate, much 19th and 20th Century automotive engineering did use halving of intervals.
I’m currently mucking about with repairs to an engine where 2 lines of holes are 1 3/16″ apart, and the outer holes of 1 line are 3 11/16″ apart. Converting those to thou for the DRO on the mill is “interesting”.
I think we are talking about two different things. Specifying distances in fractions, e.g. two holes are supposed to be 1 3/16″ apart is different than specifying measurements of that distance in terms of 1/16″ intervals. That’s a wide enough tolerance that two mating pieces, e.g. holes in the block and holes in the mounting plate that goes there on the block, might not match up! It’s why reproduction parts don’t always work like they should. I’ve even seen it with OEM parts!
I know what you mean, but early work was done using halved interval rather than decimal. The specification on that 1 3/16″ might have been +/- 1/512″ (2 thou), though late 19th and early 20th century production equipment probably couldn’t hold that.
That 1 3/16″ is 1.1875″, which gives spurious precision, so it was probably specced as 1.185″ to 1.190″, and produced using a multi-head drill.
If all the readings are to the nearest inch, then your distribution will be calibrated in inches. How do you then say the average will tell you a true value that is less than an inch increment?
Your calculator may show 9 decimal digits if the numbers come out right but do you think that is an accurate depiction of the true value?
You are basically saying that you can measure something to the nearest yard and, if you do it enough times, you can know the measurement to the nearest ten-thousandth of an inch.
Your common sense should tell you that can not be!
“If all the readings are to the nearest inch, then your distribution will be calibrated in inches. How do you then say the average will tell you a true value that is less than an inch increment?”
Why shouldn’t it be? Suppose you were taking an average of things that can only be measured in integers, such as the number of children in a family. The true average is unlikely to be an integer, and can easily be calculated to a higher precision than an integer. So why should it be different if you are rounding something to the nearest integer?
“Your calculator may show 9 decimal digits if the numbers come out right but do you think that is an accurate depiction of the true value?”
Of course not. You should use the size of the uncertainty to guide the number of places. But it’s quite likely that the uncertainty will be less than an inch.
“You are basically saying that you can measure something to the nearest yard and if you do it enough times you can know the measurement to the nearest one-ten thousandths of and inch.”
I am basically not saying that. I’m just saying an average can be more precise than the individual measurements.
The difference is that there are no resolution limits with discrete values.
My point is that if you can have a fractional average for discrete integer values, there’s no reason why you can’t also have a fractional average of continuous values that have been rounded to the nearest integer.
What’s the difference between saying the average of 2 children and 3 children is 2.5 children, and saying the average height of two children measured at 122cm and 123cm is 122.5cm?
2 children and 3 children are discrete values, so there is no uncertainty.
The arithmetic mean is just a division of the sum by the count, so 2.5 is fine.
122cm and 123cm are spot estimates taken from a continuum (1.22m and 1.23m would be better). To the resolution implied by the given figures, they each have an uncertainty of +/- 0.5 cm.
As a result, the measures of centrality also have uncertainty. Depending on the wind speed and direction, phase of the moon, etc., the total absolute uncertainty can be calculated using RSS or just plain addition. Using just plain addition (’cause I’m lazy), the total uncertainty is +/- 1 cm. We divided by 2 to get the average, so divide the uncertainty by 2 as well, to give an average of 122.5 +/- 0.5 cm.
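A quick Python sketch of that arithmetic, treating the +/- 0.5 cm implied by the resolution as the absolute uncertainty of each reading:

import math

h1, h2 = 122, 123                    # cm, each carrying +/- 0.5 cm
u = 0.5                              # cm, absolute uncertainty of one reading
mean = (h1 + h2) / 2                 # 122.5 cm
u_plain = (u + u) / 2                # plain addition, then divide by 2 -> 0.5 cm
u_rss = math.sqrt(u**2 + u**2) / 2   # RSS, then divide by 2 -> about 0.35 cm
print(mean, u_plain, round(u_rss, 2))
# Plain addition gives 122.5 +/- 0.5 cm; RSS gives 122.5 +/- 0.35 cm.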
Assuming that “average” here is arithmetic mean, that’s just the ratio of the total over the count, so it may indeed be reported to spuriously arbitrary decimal precision.
Decimal representation can introduce spurious precision, but as a matter of practicality, the number of significant digits should really be log10(count). For small counts, it would ideally be shown as the actual sum/count, but that’s almost never done.
The variance (or standard deviation if you prefer to use that) provides information about the dispersion, and ideally the sample size should also be reported.
“…but as a matter of practically, the number of significant digits should really be log10(count).”
Not entirely. In general uncertainty decreases with the square root of the count, so it should be log10(sqrt(count)).
Ooh, I wasn’t even thinking of uncertainties, just the decimal representation of fractions. You can add 1 significant figure per order of magnitude.
1/10 = 0.1
1/100 = 0.01
1/1000 = 0.001
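Purely as an illustration of the two rules of thumb above (both are heuristics, not standards), a short Python sketch comparing them:

import math

for n in (10, 100, 1000, 10000):
    print(n, math.log10(n), math.log10(math.sqrt(n)))
# log10(count) adds one digit per factor of 10 in the count;
# log10(sqrt(count)) adds one digit per factor of 100, matching the
# 1/sqrt(N) reduction in uncertainty for independent measurements.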
Bellcurveman: “Yes, its true, I have zero engineering training or experience…”
“The hammer-ticians have argued among themselves as to which hammer “
I guess it is the hammer-ticians vs the hammer-chewers…
Said the guru to the disciple…
Kip,
My overall take from my series plus your essay to date is in 3 parts.
1. It is unclear what the purpose of uncertainty estimation is. To illustrate this, I use the example of measurement of sea surface temperatures by Argo floats. Here is a link:
https://www.sciencedirect.com/science/article/pii/S0078323422000975
It claims that “The ARGO float can measure temperature in the range from –2.5°C to 35°C with an accuracy of 0.001°C.”
I have contacted bodies like the national Standards laboratories of several countries to ask what the best performance of their controlled temperature water baths is. The UK reply included:
National Physical Laboratory | Hampton Road | Teddington, Middlesex | UK | TW11 0LW
Dear Geoffrey,
“NPL has a water bath in which the temperature is controlled to ~0.001 °C, and our measurement capability for calibrations in the bath in the range up to 100 °C is 0.005 °C.”
Without a dive into the terminological jungle, readers would possibly infer that Argo in the open ocean was doing as well as the NPL whose sophisticated, world class conditions are controlled to get the best they can. It would be logical to conclude that the Argo people were delusional. It is only on deeper study that you start to find why things are said.
2. Uncertainties derived from statistics have a purpose that is different to uncertainties from practical measurements.
To use statistics, you need to be able to show that your numbers are samples from a population. A population is best considered to arise when extraneous variables are absent or minimal.
When discussing thermometers at weather stations, two different stations will produce two different statistical populations because of different types and intensities of extraneous variables.
Just as two people have different variables – one should not take the body temperature of person A by inserting a thermometer in person B.
(Posted by me in comments to your essay).
3. “uncertainty” and “error” are not the same, usually. Pat Frank has explained this many times. It would be best to ask Pat for a short dissertation about the difference. (Pat and I are friends – I would not want to upset him by misquoting his real meaning of the difference).
Geoff S
Geoff ==> all good points.
1. It is unclear what the purpose of uncertainty estimation is.
In my incredibly simple example, we just want to know how uncertain the measurements were. If they were rather wide ranges, then this must be included in subsequent analysis.
2. Uncertainties derived from statistics have a purpose that is different to uncertainties from practical measurements.
Yes, they do, and they are not the same animal. Giraffes are African animals and so are elephants — but they are not the same animal. One may use statistics in uncertainty ONLY if one knows the limitations of each of those statistical approaches.
3. “uncertainty” and “error” are not the same, usually.
Unfortunately, the terminology is confused and often at cross purposes with reality. My examples are really “original measurement uncertainty”, but the identical case is often mistakenly called “original measurement error”.
So, quite agree, but I haven’t any real idea – other than more careful and pragmatic education – on how to resolve it. Over 600 comments, mostly arguing statistics.
Measurements are typically given as “stated value +/- uncertainty”.
Statistical analysis proponents are always forced into assuming Gaussian distributions with total cancellation of measurement error. ALWAYS. It’s like assuming all the stated values are 100% accurate measurements.
Then they can use the standard deviation of the stated values as a measure of uncertainty in the data set.
When it comes to temperatures the assumption that all measurement uncertainty cancels is just wrong. Even in the same season, temperature measurements in Dallas will have a different variance in the stated values than temperature measurements in Kansas City. You can’t just jam them together by taking an average.
It just boils down to the fact that statisticians and climate scientists don’t want to have to handle measurement uncertainty so they just ignore it and use the methods they learned in STAT 101.
If you really want to see how to handle independent measurements of different things then look at Eq H.36 in the GUM. It shows how to combine their uncertainties, and it basically boils down to the much used root-sum-square of the measurement uncertainties.
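For what it’s worth, a minimal Python sketch of root-sum-square combination for single measurements of different things (the standard uncertainties are illustrative; the full H.36 expression is not reproduced here):

import math

u = [0.5, 0.3, 0.4]                       # example standard uncertainties, same units
u_c = math.sqrt(sum(x**2 for x in u))     # root-sum-square combination
print(round(u_c, 2))                      # about 0.71 -- larger than any single u_i,
                                          # and not their average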
We are not forced into assuming Gaussian distributions. Other distribution shapes are dealt with. You’ve been told this repeatedly on here.
In the current debate, about temperature rounding to the nearest whole degree, that digital rounding uncertainty will have a rectangular distribution. Again you’ve been told this several times.
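A sketch of the usual conversion for that rounding case (half-width of 0.5 degrees from rounding to the nearest whole degree, treated as a rectangular distribution):

import math

a = 0.5                  # degrees C, half-width from rounding to the nearest degree
u = a / math.sqrt(3)     # standard uncertainty of a rectangular distribution
print(round(u, 2))       # about 0.29 degrees C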
We’re not debating nailing planks of wood together, we are debating uncertainty in temperature measurement. That being the case, the uncertainty assessment method is quite standard.
There is an excellent temperature measurement uncertainty assessment in a link posted on here previously. I found it very instructive and so should you.
1) It would be more appropriate to ask them before attacking them and labeling them as “delusional”.
2) Uncertainties are inherently probabilistic.
3) You don’t need to ask Pat about this. There are plenty of materials you can reference to learn the legacy, contemporary, and preferred usages of these terms. What you should ask Pat is where he gets the equation σ_avg = sqrt[N * σ^2 / (N-1)], which I cannot find in Bevington even though that is the citation, and why it is inconsistent with NIST TN 1297, JCGM 100:2008, etc.
“2) Uncertainties are inherently probabilistic.”
The problem is that no one knows the probability distribution. Uncertainties are unknown and unknowable.
Why do you never address GUM Eq H.36 that shows how to address independent measurements of different things?
When you have single measurements H.36 just boils down to the root-sum-square of the uncertainties. It is *NOT* the average uncertainty!
Maybe you should read his paper?
I don’t want to get into a long discussion, but it is useful I think for future discussions of this issue to consider one aspect of climatology, the use of “average temperature.” As has been discussed many times here and elsewhere, a global average of local temperatures such as GISSTEMP is not technically a “temperature” as the quantity is defined in thermodynamics. There are a variety of ways to derive the notion of temperature, but all are rooted in the notion that temperature is itself an intensive, or local, quantity and is the derivative of an extensive quantity of a system, the total internal energy typically denoted U, by another extensive quantity of the system, total entropy typically denoted S. (It gets more complicated if the system is not in equilibrium, but the basic concept is the same.) That means that an average of temperatures (or anomalies) at a variety of locations around the globe is not a temperature (except in the unlikely event that they are all identical) but rather should be considered simply an index of the set of temperatures, much like a stock index is an index of prices and not the price of any particular stock or even all stocks taken as a whole. An index is not the same as a measurand in metrology, in the sense that there is no idealized actual physical value that it is expected to have. Its purpose is descriptive of the whole rather than something that can actually be measured. Its uncertainty may or may not be directly linked to the uncertainties of the underlying quantities.
An illustration may provide some food for thought about the difference between indexes and physical quantities. One thing that is useful to consider about the world is how many children are being born to women in different countries and the world in total. This property is typically characterized by the fertility of women in some area and that property is typically captured in a fertility index, sometimes called the fertility rate. This is calculated by summing up the number of children born (with absolute error or resolution per woman +-1) and dividing by the total number of women in the region being considered. Various organizations track these numbers and the CIA, for example, gives them in their fact book here
https://www.cia.gov/the-world-factbook/field/total-fertility-rate/country-comparison
Notice that the first couple of entries given are Niger 6.82 and Angola 5.83 and if one goes down to the U.S. one finds the index is 1.82. In this case we know that the number of children born to any one woman cannot be more accurate than an integer, but the indexes are reported, in this case, to hundredths of a child. It is worth thinking about the difference between an index and a physical quantity.
fah said: “An index is not the same as a measurand in metrology, in the sense that there is no idealized actual physical value that it is expected to have. Its purpose is descriptive of the whole rather than something that can actually be measured.”
The only constraint that I see JCGM 200:2012 puts on a measurand is that it be a quantity with a magnitude. There is no restriction that I see that precludes intensive or index quantities. Furthermore, NIST TN 1900 has an example where they use JCGM 100:2008 to determine the uncertainty of an average temperature. It also has an example where they determine the uncertainty of an average of 4 spatial fields of an intensive property. And JCGM 100:2008 itself has an example where the uncertainty of the “hardness index” is computed.
Absolutely correct. The techniques presented in the various metrology standards are simply selections of standard statistical techniques which are widely applied to a variety of calculations and have a firm theoretical basis. A more significant distinction, as indicated by the fertility index, and also reflected in various other indices such as consumer price indices, stock indices, money supply indices, etc. is that the precision of the index can be, and usually is, greater than the precision of the underlying measurements or quantities. For example a measurement of the number of children any particular woman has is necessarily integral, but the index itself has meaningful accuracy down to 0.01 child per woman. This is because they represent a quality of the totality of the set itself (in abstraction) and not a specific measurement of a specific physical quantity.
That index will have an uncertainty based on the size of the count. You can’t get away from uncertainty by averaging.
“There is no restriction that I see that precludes intensive or index quantities.”
An intensive property is calculated from measured extensive properties.
The length of a board is *NOT* an intensive property. It is a measured quantity.
B.2.1 (measurable) quantity
attribute of a phenomenon, body or substance that may be distinguished qualitatively and determined quantitatively
B.2.2 value (of a quantity)
magnitude of a particular quantity generally expressed as a unit of measurement multiplied by a number
B.2.5 measurement
set of operations having the object of determining a value of a quantity
B.2.6 principle of measurement
scientific basis of a measurement
B.2.8 measurement procedure
set of operations, described specifically, used in the performance of particular measurements according to a given method
B.2.9 measurand
particular quantity subject to measurement
B.2.11 result of a measurement
value attributed to a measurand, obtained by measurement
These are all from the GUM. The GUM does not cover the properties of an “index”. It is a document about making physical measurements of physical things and making judgements about the accuracy and precision of those measurements.
You still haven’t figured out what functional relationship means in terms of a measurable physical quantity. The ” hardness index” is a result of measurements.
The hardness “index” is determined by a physical measurement of how deeply a calibrated machine penetrates into a transfer block. The measurement is then converted to a hardness index. It is kinda like looking up the Kelvin temperature from a measurement of a Fahrenheit temperature. Now if you want to convert a Fahrenheit temperature into some other scale have at it. I don’t see where it would buy you anything.
If you are doing relative measurement uncertainty what happens when the stated value is 0C or 0F?
Climate science ignores this by just ignoring measurement uncertainty altogether.
“One thing that is useful to consider about the world is how many children are being born to women in different countries and the world in total.”
I agree with much of what you say. But this is comparing a “counting” distribution with a measurement distribution. The first-order estimate of the uncertainty in a *rate* over time determined from a count is the square root of the count. We don’t know how many births were counted in Niger and Angola so it is impossible to estimate the uncertainty. Example: you count 14 births in a two-week period. The rate is thus 14 for the two-week period. The uncertainty would be estimated as sqrt(14) ≈ 4. So the value given for births in two weeks would be 14 +/- 4.
If you have x births per year the rate would be x +/- sqrt(x) per year.
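A minimal Python sketch of that counting estimate (Poisson-style, square root of the count):

import math

count = 14                   # births counted in the two-week window
u = math.sqrt(count)         # first-order counting uncertainty
print(count, round(u, 1))    # 14 +/- 3.7, i.e. roughly 14 +/- 4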
Note that in this case, the fertility index is the average number of births per woman. So the numerator is the total number of births and the denominator in the calculation is the number of women in the population, which generally is quite large. I think in this case they calculate the index based on the number of births per fertile lifetime of women, but that is a minor detail. The point is that the index is trying to describe something about the total population, not any individual per se. A significant digit in the index of 0.01 in a total population of, say, 10 million women would correspond to about 100,000 births, which could well be easily within the resolution of the count. Changes in indices like this are generally used to think about population issues such as availability of healthcare, nutrition, presence of diseases, availability of water, social mores, etc. There are places where there is uncertainty in assessing how many births occur and to whom, but in general the policy makers pay attention to slight differences in these indices. This is not far removed from the global average temperature anomaly in the sense that it tells more about the population of local temperatures as a population of numbers than about any individual measurement of temperature, but I think the index is not easily related to a globally defined thermodynamic property. I apologize in advance if I don’t reply promptly, but I am immersed right now in grading final exams, late homework, lab reports, etc., and I want to get that over with.
Let’s face it – the notion of certainty for a global average temperature anomaly better than around +/- 0.5C from the 1930s or whenever, on a sub-optimal, widely spaced observation network, is frankly absurd.
It’s like we’ve completely lost our sense of physical reality. 0.5C is a tiny little magnitude. Every bit of anecdotal evidence and common sense tells us that at least +/- 0.5C is perfectly reasonable. It seems to about match the posted uncertainty on raw temperature measurements. That’s good enough for me.
Call me old fashioned but resorting to data hacking or whatever to demonstrate a hypothesis does not pass a basic stink test.
Good article. I remember doing Metrology at Uni for my Engineering. It was exactly as per this article, in that the uncertainties were summed. The slip gauges had a table of tolerances, which is the +/- of uncertainty. Also the room was air conditioned so the measurements of the slip gauges were valid, at 25C I think. There was a known uncertainty about this measurement. I think you weren’t meant to hold the steel slip gauges for too long either, as the hand warmth will expand them.
I think some people are confusing the known uncertainty with other terminology such as probability, which may seem interchangeable but are different concepts.
The temperature effect on the gauges is not a “known uncertainty”; it is a “known bias”. As such it should be corrected for.
The tolerances are indeed a kind of uncertainty, and in the absence of further information they would be treated as rectangular distributions, converted to standard uncertainties and follow the addition in quadrature method.
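A sketch of that procedure in Python, with hypothetical slip-gauge tolerance half-widths (the values are illustrative, not from any real gauge table):

import math

tolerances = [0.002, 0.001, 0.003]            # mm, hypothetical half-widths
u = [a / math.sqrt(3) for a in tolerances]    # rectangular -> standard uncertainty
u_c = math.sqrt(sum(x**2 for x in u))         # addition in quadrature
print([round(x, 5) for x in u], round(u_c, 5))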
With you through “rectangular distributions”. Essentially none of these “tolerances” are so distributed. And when added, quadrature is fine.
The tolerances could well be normally distributed. I said in the absence of any further information you would probably assign a rectangular distribution.
It does not make much difference though, and it certainly does not affect the immediate topic of discussion.
“The tolerances could well be normally distributed.”
Agree.
” I said in the absence of any further information you would probably assign a rectangular distribution.”
Disagree.
“It does not make much difference though.”
Disagree. Even those here who deny centuries’ worth of accumulated statistical knowledge would disagree as well.
“and it certainly does not affect the immediate topic of discussion.”
Maybe not. I was responding to your claim.
shawno69 said: “I remember doing Metrology at Uni for my Engineering. It was exactly as per this article, in that the sum of the uncertainties were added.”
Can you post a link to the text book you used for your metrology class? I’d like to review it and see why it is inconsistent with NIST TN 1297, JCGM 100:2008, Taylor, Bevington, UKAS, the NIST uncertainty machine, etc.