Guest Essay by Kip Hansen — 3 January 2023
“People use terms such as “sure” to describe their uncertainty about an event … and terms such as “chance” to describe their uncertainty about the world.” — Mircea Zloteanu
In many fields of science today, the word “uncertainty” is bandied about without much thought, or at least expressed thought, about which meaning of “uncertainty” is intended. This simple fact is so well known that a group in the UK, “Sense about Science”, published a booklet titled “Making Sense of Uncertainty” (.pdf). The Sense about Science group promotes evidence-based science and science policy. The Making Sense of Uncertainty booklet, published in 2013, is unfortunately only a thinly disguised effort to combat climate skepticism based on the huge uncertainties in Climate Science.
Nonetheless, it includes some basic and necessary understandings about uncertainty:
Michael Hanlon: “When the uncertainty makes the range of possibilities very broad, we should avoid trying to come up with a single, precise number because it creates a false impression of certainty – spurious precision.”
A good and valid point. But the larger problem is “trying to come up with a single … number” whether ‘spuriously precise’ or not.
David Spiegelhalter: “In clinical medicine, doctors cannot predict exactly what will happen to anyone, and so may use a phrase such as ‘of 100 people like you, 96 will survive the operation’. Sometimes there is such limited evidence, say because a patient’s condition is completely novel, that no number can be attached with any confidence.”
Not only in clinical medicine, but widely across fields of research, we find papers being published that — despite vague, even contradictory, and limited evidence with admitted weaknesses in study design — state definitive numerical findings that are no better than wild guesses. [ See studies by Jenna Jambeck on oceanic plastics. ]
And, perhaps the major understatement, and the least true viewpoint, in the booklet:
“There is some confusion between scientific and everyday uses of the words ‘uncertainty’ and ‘risk’. [This first sentence is true. – kh] In everyday language, we might say that something that is uncertain is risky. But in scientific terms, risk broadly means uncertainty that can be quantified in relation to a particular hazard – and so for a given hazard, the risk is the chance of it happening.”
A Lot of Confusion
“The risk is the chance of it happening.” Is it really? William Briggs, in his book “Uncertainty: The Soul of Modeling, Probability & Statistics”, would be prone to point out that for there to be a “chance” (meaning “a probability”) we first need a proposition, such as “The hazard (death) will happen to this patient” and clearly stated premises, most of which are assumed and not stated, such as “The patient is being treated in a modern hospital, otherwise healthy, the doctor is fully qualified and broadly experienced in the procedure, the diagnosis is correct…”. Without full exposition of the premises, no statement of probability can be made.
I recently published here two essays touching on uncertainty: Plus or Minus isn’t a Question and Limitations of the Central Limit Theorem.
Each used almost childishly simple examples to make several very basic true points about the way uncertainty is used, misused and often misunderstood. I expected a reasonable amount of push-back against this blatant pragmatism in science, but the ferocity and persistence of the opposition surprised me. If you missed these, take a look at the essays and their comment streams. Not one of the detractors was able to supply a simple example with diagrams or illustrations to back their contrary (almost always “statistical”) interpretations and solutions.
So What is the Problem Here?
1. Definition: In the world of statistics, uncertainty is defined as probability. “Uncertainty is quantified by a probability distribution which depends upon our state of information about the likelihood of what the single, true value of the uncertain quantity is.” [ source ]
[In the linked paper, uncertainty is contrasted to: “Variability is quantified by a distribution of frequencies of multiple instances of the quantity, derived from observed data.”]
2. Misapplication: The above definition becomes misapplied when we consider absolute measurement uncertainty.
Absolute error or absolute uncertainty is the uncertainty in a measurement, which is expressed using the relevant units.
The absolute uncertainty in a quantity is the actual amount by which the quantity is uncertain, e.g. if Length = 6.0 ± 0.1 cm, the absolute uncertainty in Length is 0.1 cm. Note that the absolute uncertainty of a quantity has the same units as the quantity itself.
Note: The most correct label for this is absolute measurement uncertainty. It results from the measurement process or the measurement instrument itself. When a temperature is always (and only) reported in whole degrees (or when it has been rounded to whole degrees), it has an inescapable absolute measurement uncertainty of ± 0.5°. So, the thermometer reading reported/recorded as 87° must carry its uncertainty and be shown as “87° ± 0.5°”, which is equivalent to “any value between 86.5 and 87.5”. There are an infinite number of possibilities in that range, all of which are equally possible. (The natural world does not limit temperatures to those exactly lining up with the little tick marks on thermometers.)
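The whole-degree case can be sketched in a few lines of Python. This is only an illustration: the function name rounded_reading_interval and the sample temperatures are made up for the example, and it assumes readings are rounded to the nearest multiple of the stated resolution.

```python
def rounded_reading_interval(reported, resolution=1.0):
    """Return the (low, high) bounds implied by a reading rounded to `resolution`."""
    half = resolution / 2.0
    return reported - half, reported + half


# A reading recorded as 87 degrees stands for the whole range 86.5 to 87.5.
low, high = rounded_reading_interval(87)
print(f"87 ± 0.5 covers {low} to {high}")

# Hypothetical true temperatures that would all be recorded as 87.
true_values = [86.6, 86.9, 87.0, 87.25, 87.4]
recorded = [round(t) for t in true_values]
print(recorded)
```

Every one of the hypothetical true values is recorded as the same whole number, which is why the recorded 87 cannot tell us where in the range the real temperature was.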
Dicing for Science
Let’s take a look at a simple example – throwing a single die and throwing a pair of dice.
A single die (a cube, usually with slightly rounded corners and edges) has six sides – each with a number of dots: 1, 2, 3, 4, 5 and 6. If properly manufactured, it has a perfectly even distribution of results when rolled many times. Each face of the die (number) will be found facing up as often as every other face (number).
This represents the distribution of results of 1,000 rolls of a single fair die. If we had rolled a million times or so, the distribution values of the numbers would be closer to 1-in-6 for each number.
What is the mean of the distribution? 3.5
What is the range of the result expected on a single roll? 3.5 ± 2.5
Because each roll of a die is entirely random (and within its parameters, it can only roll whole values 1 through 6), for every next roll we can predict a value of 3.5 ± 2.5 [whole numbers only]. This prediction would be 100% correct – in this sense, there is no doubt that the next roll will be in that range, as it cannot be otherwise.
Equally true, because the process can be considered entirely random, every value represented by that range “3.5 ± 2.5” [whole numbers only] has an equal probability of coming up in each and every “next roll”.
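For readers who want to check the single-die numbers, here is a minimal Python sketch. It assumes an ideal fair die; the seed value and variable names are arbitrary choices for the illustration, and the simulated counts will not match any particular chart of 1,000 rolls.

```python
import random
from collections import Counter
from fractions import Fraction

FACES = range(1, 7)  # the six faces of an ideal fair die

# Exact mean and half-range of a single roll.
exact_mean = sum(Fraction(face, 6) for face in FACES)
half_range = Fraction(max(FACES) - min(FACES), 2)
print(f"single die: {float(exact_mean)} ± {float(half_range)}")

# Simulate 1,000 rolls; the seed is arbitrary and only makes the run repeatable.
rng = random.Random(0)
counts = Counter(rng.randint(1, 6) for _ in range(1000))
for face in FACES:
    print(face, counts[face])
```

Each face should turn up roughly one time in six, and no roll can ever land outside 1 through 6.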
What if we look at rolling a pair of dice?
A pair of dice, two of the dice described above, rolled simultaneously, have a value distribution that looks like this:
When we roll two dice, we get what looks like an unskewed “normal distribution” (strictly speaking, the sums of two fair dice form a symmetric triangular distribution). Again, if we had rolled the pair of dice a million times, the distribution would be closer to perfectly symmetric – very close to the same number for 3s and for 11s and the same number for 2s as for 12s.
What is the mean of the distribution? 7
What is the range of the result expected on a single roll? 7 ± 5
Because each roll of the dice is entirely random (within its parameters, it can only roll whole values 2 through 12), for every next roll we can predict a value of “7 ± 5”.
But, with a pair of dice, the distribution is no longer even across the whole range. The value of the sums of the two dice range from 2 through 12 [whole numbers only]. 1 is not a possible value, nor is any number above 12. The probability of rolling a 7 is far larger than that of rolling a 2 or 3 or 11 or 12.
Any dice gambler can explain why this is: there are more combinations of the values of the individual dice that add up to 7 than add up to 2 (there is only one combination for 2: two 1s and one combination for 12: two 6s).
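The gambler's explanation can be checked by listing all 36 equally likely combinations of two ideal fair dice. A minimal Python sketch (the variable name ways is just for illustration):

```python
from collections import Counter
from itertools import product

# Count how many of the 36 ordered combinations produce each sum.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    print(f"{total:2d}: {ways[total]} of 36  {'#' * ways[total]}")
```

The count rises by one from 2 up to 7 (six ways) and falls by one from 7 down to 12, with exactly one way each for 2 and for 12.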
Boxing the dice
To make the dicing example into true absolute measurement uncertainty, in which we give a stated value and its known uncertainty but do not (and cannot) know the actual (or true) value, we will place the dice inside a closed box with a lid. And then shake the box (roll the dice). [Yes, Schrödinger’s cat and all that.] Putting the dice in a lidded box means that we can only give the value as a set of all the possible values, or, the mean ± the known uncertainties given above.
So, then we can look at our values for a pair of dice as the sum of the two ranges for a single die:
The arithmetic sum of 3.5 ± 2.5 plus 3.5 ± 2.5 is clearly 7 ± 5. (see Plus or Minus isn’t a Question).
The above is the correct handling of addition of Absolute Measurement Uncertainty.
It would be exactly the same if adding two Tide Gauge Measurements, which have an absolute measurement uncertainty of ± 2 cm, or adding two temperatures that have been rounded to a whole degree. One sums the value and sums the uncertainties. (Many references for this. Try here.)
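As a sketch of the rule just described (sum the values, sum the uncertainties), here is a small Python example. The function name add_absolute is invented for the illustration, and the two tide gauge readings are hypothetical numbers, not real data.

```python
from itertools import product


def add_absolute(a, b):
    """Add two (value, uncertainty) pairs: sum the values and sum the uncertainties."""
    (value_a, unc_a), (value_b, unc_b) = a, b
    return value_a + value_b, unc_a + unc_b


# Two boxed dice, each known only as 3.5 ± 2.5.
die = (3.5, 2.5)
total, unc = add_absolute(die, die)
print(f"{total} ± {unc}")  # 7.0 ± 5.0

# Check against every possible pair of faces: the sums run from 2 to 12.
sums = [x + y for x, y in product(range(1, 7), repeat=2)]
print(min(sums), max(sums), total - unc, total + unc)

# Hypothetical tide gauge readings in cm, each ± 2 cm.
print(add_absolute((512.0, 2.0), (498.0, 2.0)))  # (1010.0, 4.0)
```

The range 7 ± 5 exactly matches the smallest and largest sums the two dice can actually produce.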
Statisticians (as a group) insist that this is not correct – “Wrong” as one savvy commenter noted. Statisticians insist that the correct sum would be:
7 ± 3.5
One of the commenters on Plus or Minus gave this statistical view: “the uncertainties add IN QUADRATURE. For example, (25.30+/- 0.20) + (25.10 +/- 0.30) = 50.40 +/- SQRT(0.20^2 + 0.30^2) = 50.40 +/-0.36 … You would report the result as 50.40 +/- 0.36”
Stated in words: Sum the values with the uncertainty given as the “square root of the sum of the squares of the uncertainties”.
So, let’s try to apply this to our simple dicing problem using two dice:
(3.5 ± 2.5) + (3.5 ± 2.5) = 7 ± SQRT (2.5^2 + 2.5^2) = 7 ± SQRT(6.25 + 6.25) = 7 ± (SQRT 12.5) = 7 ± 3.5
[The more precise √12.5 is 3.535533905932738…]
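The quadrature arithmetic, both the commenter's example and the dice version, can be reproduced with a short Python sketch. The function name add_in_quadrature is an illustrative choice; math.hypot simply computes the square root of the sum of the squares.

```python
import math


def add_in_quadrature(a, b):
    """Add two (value, uncertainty) pairs, combining the uncertainties in quadrature."""
    (value_a, unc_a), (value_b, unc_b) = a, b
    return value_a + value_b, math.hypot(unc_a, unc_b)


# The commenter's example: (25.30 ± 0.20) + (25.10 ± 0.30)
value, unc = add_in_quadrature((25.30, 0.20), (25.10, 0.30))
print(f"{value:.2f} ± {unc:.2f}")  # 50.40 ± 0.36

# The same rule applied to two dice, each 3.5 ± 2.5
dice_value, dice_unc = add_in_quadrature((3.5, 2.5), (3.5, 2.5))
print(f"{dice_value:.1f} ± {dice_unc:.2f}")  # 7.0 ± 3.54
```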
Oh, my. That is quite different from the result of following the rules for adding absolute uncertainties.
Yet, we can see in the blue diagram box that the correct solution including the full range of the uncertainty is 7 ± 5.
So, where do the approaches diverge?
Incorrect assumptions: The statistical approach uses a definition that does not agree with the real physical world: “Uncertainty is quantified by a probability distribution”.
Here is how a statistician looks at the problem:
However, when dealing with absolute measurement uncertainty (or in the dicing example, absolute known uncertainty – the uncertainty is known because of the nature of the system), the application of the statistician’s “adding in quadrature” rule gives us a result not in agreement with reality:
One commenter to the essay Limitations of the Central Limit Theorem, justified this absurdity with this: “there is near zero probability that both measurements would deviate by the full uncertainty value in the same direction.”
In our dicing example, if we applied that viewpoint, the ones and sixes of the single dice in a pair would have a ‘near zero’ probability of coming up together (in a roll of two dice) to produce sums of 2 and 12. 2 and 12 represent the mean ± the full uncertainty value of plus or minus 5.
Yet, our distribution diagram of dice rolls shows that, while less common, 2s and 12s are not even rare. And yet, using the ‘adding in quadrature’ rule for adding two values with absolute uncertainties, 2s and 12s can just be ignored. We can ignore the 3s and 11s too.
Any dicing gambler knows that this is just not true: for fair dice, the combined probability of rolling 2, or 3, or 11, or 12 is 6-in-36, about 17%, or 1-in-6. Ignoring a chance of 1-in-6, for example “there is a 1-in-6 chance that the parachute will malfunction”, is foolish.
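The share of rolls that falls outside the quadrature range can be worked out exactly for two ideal fair dice (a sampled distribution may come out a little higher or lower). A minimal Python sketch, with illustrative variable names:

```python
import math
from fractions import Fraction
from itertools import product

sums = [a + b for a, b in product(range(1, 7), repeat=2)]  # 36 equally likely outcomes
half_width = math.sqrt(2.5**2 + 2.5**2)                    # the quadrature result, about 3.54

# Sums lying outside 7 ± 3.54
outside = [s for s in sums if abs(s - 7) > half_width]
p_outside = Fraction(len(outside), len(sums))

print(sorted(set(outside)))          # [2, 3, 11, 12]
print(p_outside, float(p_outside))   # 1/6, about 0.167
```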
If we used the statisticians’ regularly recommended “1 Standard Deviation” (approximately 68%, divided equally to both sides of the mean), we would have to eliminate from our “uncertainty” all of the 2s, 3s, 11s, and 12s, and nearly all of the 4s and 10s as well, which together make up about a third (roughly 1-in-3) of the actual expected rolls.
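The one-standard-deviation band can also be checked directly. This sketch assumes ideal fair dice and uses the population standard deviation of the 36 equally likely sums; the variable names are illustrative.

```python
import statistics
from itertools import product

sums = [a + b for a, b in product(range(1, 7), repeat=2)]

mean = statistics.fmean(sums)   # 7.0
sd = statistics.pstdev(sums)    # about 2.42

inside = [s for s in sums if abs(s - mean) <= sd]
excluded = sorted(set(sums) - set(inside))

print(f"7 ± {sd:.2f} keeps sums {min(inside)} through {max(inside)}")
print(f"kept: {len(inside)} of 36, excluded sums: {excluded}")
```

Under these assumptions the band keeps only the sums 5 through 9, which is 24 of the 36 outcomes, and leaves out 2, 3, 4, 10, 11 and 12.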
Remember – in this example, we have turned ordinary uncertainty about a random event (roll of the dice) into “absolute measurement uncertainty” by placing our dice in a box with a lid, preventing us from knowing the actual value of the dice roll but allowing us to know the full range of uncertainty involved in the “measurement” (roll of the dice). This is precisely what happens when a measurement is “rounded”: we lose information about the measured value and end up with a “value range”. Rounding to the “nearest dollar” leaves an uncertainty of ± $0.50; rounding to the nearest whole degree leaves an uncertainty of ± 0.5°; rounding to the nearest millennium leaves an uncertainty of ± 500 years. Measurements made with an imprecise tool or procedure produce equally durable values with a known uncertainty.
This kind of uncertainty cannot be eliminated through statistics.
1. We always seem to demand a number from research — “just one number is best”. This is a lousy approach to almost every research question. The “single number fallacy” (recently, this very moment, coined by myself, I think. Correct me if I am wrong.) is “the belief that complex, complicated and even chaotic subjects and their data can be reduced to a significant and truthful single number.”
2. The insistence that all “uncertainty” is a measure of probability is a skewed view of reality. We can be uncertain for many reasons: “We just don’t know.” “We have limited data.” “We have contradictory data.” “We don’t agree about the data.” “The data itself is uncertain because it results from truly random events.” “Our measurement tools and procedures themselves are crude and uncertain.” “We don’t know enough.” – – – – This list could go on for pages. Almost none of those circumstances can be corrected by pretending the uncertainty can be represented as probabilities and reduced using statistical approaches.
3. Absolute Measurement Uncertainty is durable – it can be diluted only by better and/or more precise measurement.
4. Averages (finding means and medians) tend to disguise and obscure original measurement uncertainty. Averages are not themselves measurements, and do not properly represent reality. They are a valid view of some data — but often hide the fuller picture. (see The Laws of Averages)
5. Only very rarely do we see original measurement uncertainty properly considered in research findings – instead researchers have been taught to rely on the pretenses of statistical approaches to make their results look more precise, more statistically significant and thus “more true”.
# # # # #
Hey, I would love to be proved wrong on this point, really. But so far, not a single person has presented anything other than a “my statistics book says….”. Who am I to argue with their statistics books?
But I posit that their statistics books are not speaking about the same subject (and brook no other views). It takes quite a search to even find the correct method that should be used to add two values that have absolute measurement uncertainty stated (as in 10 cm ± 1 cm plus 20 cm ± 5 cm). There are just too many similar words and combinations of words that “seem the same” to internet search engines. The best I have found are physics YouTubes.
So, my challenge to challengers: Provide a childishly simple example, such as I have used, of two measurements with stated absolute measurement uncertainties being added to one another. Give the arithmetic and a visual example of the addition with uncertainties (on a scale, a ruler, a thermometer, in counting bears, poker chips, whatever), and show them being added physically. If your illustration is valid and you can arrive at a different result than I do, then you win! Try it with the dice. Or a numerical example like the one used in Plus or Minus.
Thanks for reading.
# # # # #